VTK
/build/vtk7-ISktbs/vtk7-7.1.1+dfsg1/Documentation/Doxygen/ArrayDispatch-VTK-7-1.md
Go to the documentation of this file.
1 @page VTK-7-1-ArrayDispatch vtkArrayDispatch and Related Tools
2 @tableofcontents
3 
4 # Background # {#VTKAD-Background}
5 
6 VTK datasets store most of their important information in subclasses of
7 `vtkDataArray`. Vertex locations (`vtkPoints::Data`), cell topology
8 (`vtkCellArray::Ia`), and numeric point, cell, and generic attributes
9 (`vtkFieldData::Data`) are the dataset features accessed most frequently by VTK
10 algorithms, and these all rely on the `vtkDataArray` API.
11 
12 # Terminology # {#VTKAD-Terminology}
13 
14 This page uses the following terms:
15 
16 A __ValueType__ is the element type of an array. For instance, `vtkFloatArray`
17 has a ValueType of `float`.
18 
19 An __ArrayType__ is a subclass of `vtkDataArray`. It specifies not only a
20 ValueType, but an array implementation as well. This becomes important as
21 `vtkDataArray` subclasses will begin to stray from the typical
22 "array-of-structs" ordering that has been exclusively used in the past.
23 
24 A __dispatch__ is a runtime-resolution of a `vtkDataArray`'s ArrayType, and is
25 used to call a section of executable code that has been tailored for that
26 ArrayType. Dispatching has compile-time and run-time components. At
27 compile-time, the possible ArrayTypes to be used are determined and a worker
28 code template is generated for each type. At run-time, the type of a specific
29 array is determined and the proper worker instantiation is called.
30 
31 __Template explosion__ refers to a sharp increase in the size of a compiled
32 binary that results from instantiating a template function or class on many
33 different types.
34 
35 ## vtkDataArray ## {#VTKAD-vtkDataArray}
36 
37 The data array type hierarchy in VTK has a unique feature when compared to
38 typical C++ containers: a non-templated base class. All arrays containing
39 numeric data inherit `vtkDataArray`, a common interface that sports a very
40 useful API. Without knowing the underlying ValueType stored in data array, an
41 algorithm or user may still work with any `vtkDataArray` in meaningful ways:
42 The array can be resized, reshaped, read, and rewritten easily using a generic
43 API that substitutes double-precision floating point numbers for the array's
44 actual ValueType. For instance, we can write a simple function that computes
45 the magnitudes for a set of vectors in one array and store the results in
46 another using nothing but the typeless `vtkDataArray` API:
47 
48 ~~~{.cpp}
49 // 3 component magnitude calculation using the vtkDataArray API.
50 // Inefficient, but easy to write:
51 void calcMagnitude(vtkDataArray *vectors, vtkDataArray *magnitude)
52 {
53  vtkIdType numVectors = vectors->GetNumberOfTuples();
54  for (vtkIdType tupleIdx = 0; tupleIdx < numVectors; ++tupleIdx)
55  {
56  // What data types are magnitude and vectors using?
57  // We don't care! These methods all use double.
58  magnitude->SetComponent(tupleIdx, 0,
59  std::sqrt(vectors->GetComponent(tupleIdx, 0) *
60  vectors->GetComponent(tupleIdx, 0) +
61  vectors->GetComponent(tupleIdx, 1) *
62  vectors->GetComponent(tupleIdx, 1) +
63  vectors->GetComponent(tupleIdx, 2) *
64  vectors->GetComponent(tupleIdx, 2));
65  }
66 }
67 ~~~
68 
69 ## The Costs of Flexiblity ## {#VTKAD-TheCostsOfFlexiblity}
70 
71 However, this flexibility comes at a cost. Passing data through a generic API
72 has a number of issues:
73 
74 __Accuracy__
75 
76 Not all ValueTypes are fully expressible as a `double`. The truncation of
77 integers with > 52 bits of precision can be a particularly nasty issue.
78 
79 __Performance__
80 
81 __Virtual overhead__: The only way to implement such a system is to route the
82 `vtkDataArray` calls through a run-time resolution of ValueTypes. This is
83 implemented through the virtual override mechanism of C++, which adds a small
84 overhead to each API call.
85 
86 __Missed optimization__: The virtual indirection described above also prevents
87 the compiler from being able to make assumptions about the layout of the data
88 in-memory. This information could be used to perform advanced optimizations,
89 such as vectorization.
90 
91 So what can one do if they want fast, optimized, type-safe access to the data
92 stored in a `vtkDataArray`? What options are available?
93 
94 ## The Old Solution: vtkTemplateMacro ## {#VTKAD-vtkTemplateMacro}
95 
96 The `vtkTemplateMacro` is described in this section. While it is no longer
97 considered a best practice to use this construct in new code, it is still
98 usable and likely to be encountered when reading the VTK source code. Newer
99 code should use the `vtkArrayDispatch` mechanism, which is detailed later. The
100 discussion of `vtkTemplateMacro` will help illustrate some of the practical
101 issues with array dispatching.
102 
103 With a few minor exceptions that we won't consider here, prior to VTK 7.1 it
104 was safe to assume that all numeric `vtkDataArray` objects were also subclasses
105 of `vtkDataArrayTemplate`. This template class provided the implementation of
106 all documented numeric data arrays such as `vtkDoubleArray`, `vtkIdTypeArray`,
107 etc, and stores the tuples in memory as a contiguous array-of-structs (AOS).
108 For example, if we had an array that stored 3-component tuples as floating
109 point numbers, we could define a tuple as:
110 
111 ~~~{.cpp}
112 struct Tuple { float x; float y; float z; };
113 ~~~
114 
115 An array-of-structs, or AOS, memory buffer containing this data could be
116 described as:
117 
118 ~~~{.cpp}
119 Tuple ArrayOfStructsBuffer[NumTuples];
120 ~~~
121 
122 As a result, `ArrayOfStructsBuffer` will have the following memory layout:
123 
124 ~~~{.cpp}
125 { x1, y1, z1, x2, y2, z2, x3, y3, z3, ...}
126 ~~~
127 
128 That is, the components of each tuple are stored in adjacent memory locations,
129 one tuple after another. While this is not exactly how `vtkDataArrayTemplate`
130 implemented its memory buffers, it accurately describes the resulting memory
131 layout.
132 
133 `vtkDataArray` also defines a `GetDataType` method, which returns an enumerated
134 value describing a type. We can used to discover the ValueType stored in the
135 array.
136 
137 Combine the AOS memory convention and `GetDataType()` with a horrific little
138 method on the data arrays named `GetVoidPointer()`, and a path to efficient,
139 type-safe access was available. `GetVoidPointer()` does what it says on the
140 tin: it returns the memory address for the array data's base location as a
141 `void*`. While this breaks encapsulation and sets off warning bells for the
142 more pedantic among us, the following technique was safe and efficient when
143 used correctly:
144 
145 ~~~{.cpp}
146 // 3-component magnitude calculation using GetVoidPointer.
147 // Efficient and fast, but assumes AOS memory layout
148 template <typename ValueType>
149 void calcMagnitudeWorker(ValueType *vectors, ValueType *magnitude,
150  vtkIdType numVectors)
151 {
152  for (vtkIdType tupleIdx = 0; tupleIdx < numVectors; ++tupleIdx)
153  {
154  // We now have access to the raw memory buffers, and assuming
155  // AOS memory layout, we know how to access them.
156  magnitude[tupleIdx] =
157  std::sqrt(vectors[3 * tupleIdx + 0] *
158  vectors[3 * tupleIdx + 0] +
159  vectors[3 * tupleIdx + 1] *
160  vectors[3 * tupleIdx + 1] +
161  vectors[3 * tupleIdx + 2] *
162  vectors[3 * tupleIdx + 2]);
163  }
164 }
165 
166 void calcMagnitude(vtkDataArray *vectors, vtkDataArray *magnitude)
167 {
168  assert("Arrays must have same datatype!" &&
169  vtkDataTypesCompare(vectors->GetDataType(),
170  magnitude->GetDataType()));
171  switch (vectors->GetDataType())
172  {
173  vtkTemplateMacro(calcMagnitudeWorker<VTK_TT*>(
174  static_cast<VTK_TT*>(vectors->GetVoidPointer(0)),
175  static_cast<VTK_TT*>(magnitude->GetVoidPointer(0)),
176  vectors->GetNumberOfTuples());
177  }
178 }
179 ~~~
180 
181 The `vtkTemplateMacro`, as you may have guessed, expands into a series of case
182 statements that determine an array's ValueType from the `int GetDataType()`
183 return value. The ValueType is then `typedef`'d to `VTK_TT`, and the macro's
184 argument is called for each numeric type returned from `GetDataType`. In this
185 case, the call to `calcMagnitudeWorker` is made by the macro, with `VTK_TT`
186 `typedef`'d to the array's ValueType.
187 
188 This is the typical usage pattern for `vtkTemplateMacro`. The `calcMagnitude`
189 function calls a templated worker implementation that uses efficient, raw
190 memory access to a typesafe memory buffer so that the worker's code can be as
191 efficient as possible. But this assumes AOS memory ordering, and as we'll
192 mention, this assumption may no longer be valid as VTK moves further into the
193 field of in-situ analysis.
194 
195 But first, you may have noticed that the above example using `vtkTemplateMacro`
196 has introduced a step backwards in terms of functionality. In the
197 `vtkDataArray` implementation, we didn't care if both arrays were the same
198 ValueType, but now we have to ensure this, since we cast both arrays' `void`
199 pointers to `VTK_TT`*. What if vectors is an array of integers, but we want to
200 calculate floating point magnitudes?
201 
202 ## vtkTemplateMacro with Multiple Arrays ## {#VTKAD-Dual-vtkTemplateMacro}
203 
204 The best solution prior to VTK 7.1 was to use two worker functions. The first
205 is templated on vector's ValueType, and the second is templated on both array
206 ValueTypes:
207 
208 ~~~{.cpp}
209 // 3-component magnitude calculation using GetVoidPointer and a
210 // double-dispatch to resolve ValueTypes of both arrays.
211 // Efficient and fast, but assumes AOS memory layout, lots of boilerplate
212 // code, and the sensitivity to template explosion issues increases.
213 template <typename VectorType, typename MagnitudeType>
214 void calcMagnitudeWorker2(VectorType *vectors, MagnitudeType *magnitude,
215  vtkIdType numVectors)
216 {
217  for (vtkIdType tupleIdx = 0; tupleIdx < numVectors; ++tupleIdx)
218  {
219  // We now have access to the raw memory buffers, and assuming
220  // AOS memory layout, we know how to access them.
221  magnitude[tupleIdx] =
222  std::sqrt(vectors[3 * tupleIdx + 0] *
223  vectors[3 * tupleIdx + 0] +
224  vectors[3 * tupleIdx + 1] *
225  vectors[3 * tupleIdx + 1] +
226  vectors[3 * tupleIdx + 2] *
227  vectors[3 * tupleIdx + 2]);
228  }
229 }
230 
231 // Vector ValueType is known (VectorType), now use vtkTemplateMacro on
232 // magnitude:
233 template <typename VectorType>
234 void calcMagnitudeWorker1(VectorType *vectors, vtkDataArray *magnitude,
235  vtkIdType numVectors)
236 {
237  switch (magnitude->GetDataType())
238  {
239  vtkTemplateMacro(calcMagnitudeWorker2(vectors,
240  static_cast<VTK_TT*>(magnitude->GetVoidPointer(0)), numVectors);
241  }
242 }
243 
244 void calcMagnitude(vtkDataArray *vectors, vtkDataArray *magnitude)
245 {
246  // Dispatch vectors first:
247  switch (vectors->GetDataType())
248  {
249  vtkTemplateMacro(calcMagnitudeWorker1<VTK_TT*>(
250  static_cast<VTK_TT*>(vectors->GetVoidPointer(0)),
251  magnitude, vectors->GetNumberOfTuples());
252  }
253 }
254 ~~~
255 
256 This works well, but it's a bit ugly and has the same issue as before regarding
257 memory layout. Double dispatches using this method will also see more problems
258 regarding binary size. The number of template instantiations that the compiler
259 needs to generate is determined by `I = T^D`, where `I` is the number of
260 template instantiations, `T` is the number of types considered, and `D` is the
261 number of dispatches. As of VTK 7.1, `vtkTemplateMacro` considers 14 data
262 types, so this double-dispatch will produce 14 instantiations of
263 `calcMagnitudeWorker1` and 196 instantiations of `calcMagnitudeWorker2`. If we
264 tried to resolve 3 `vtkDataArray`s into raw C arrays, 2744 instantiations of
265 the final worker function would be generated. As more arrays are considered,
266 the need for some form of restricted dispatch becomes very important to keep
267 this template explosion in check.
268 
269 ## Data Array Changes in VTK 7.1 ## {#VTKAD-Changes-in-VTK-71}
270 
271 Starting with VTK 7.1, the Array-Of-Structs (AOS) memory layout is no longer
272 the only `vtkDataArray` implementation provided by the library. The
273 Struct-Of-Arrays (SOA) memory layout is now available throught the
274 `vtkSOADataArrayTemplate` class. The SOA layout assumes that the components of
275 an array are stored separately, as in:
276 
277 ~~~{.cpp}
278 struct StructOfArraysBuffer
279 {
280  float *x; // Pointer to array containing x components
281  float *y; // Same for y
282  float *z; // Same for z
283 };
284 ~~~
285 
286 The new SOA arrays were added to improve interoperability between VTK and
287 simulation packages for live visualization of in-situ results. Many simulations
288 use the SOA layout for their data, and natively supporting these arrays in VTK
289 will allow analysis of live data without the need to explicitly copy it into a
290 VTK data structure.
291 
292 As a result of this change, a new mechanism is needed to efficiently access
293 array data. `vtkTemplateMacro` and `GetVoidPointer` are no longer an acceptable
294 solution -- implementing `GetVoidPointer` for SOA arrays requires creating a
295 deep copy of the data into a new AOS buffer, a waste of both processor time and
296 memory.
297 
298 So we need a replacement for `vtkTemplateMacro` that can abstract away things
299 like storage details while providing performance that is on-par with raw memory
300 buffer operations. And while we're at it, let's look at removing the tedium of
301 multi-array dispatch and reducing the problem of 'template explosion'. The
302 remainder of this page details such a system.
303 
304 # Best Practices for vtkDataArray Post-7.1 # {#VTKAD-BestPractices}
305 
306 We'll describe a new set of tools that make managing template instantiations
307 for efficient array access both easy and extensible. As an overview, the
308 following new features will be discussed:
309 
310 * `vtkGenericDataArray`: The new templated base interface for all numeric
311 `vtkDataArray` subclasses.
312 * `vtkArrayDispatch`: Collection of code generation tools that allow concise
313 and precise specification of restrictable dispatch for up to 3 arrays
314 simultaneously.
315 * `vtkArrayDownCast`: Access to specialized downcast implementations from code
316 templates.
317 * `vtkDataArrayAccessor`: Provides `Get` and `Set` methods for
318 accessing/modifying array data as efficiently as possible. Allows a single
319 worker implementation to work efficiently with `vtkGenericDataArray`
320 subclasses, or fallback to use the `vtkDataArray` API if needed.
321 * `VTK_ASSUME`: New abstraction for the compiler `__assume` directive to
322 provide optimization hints.
323 
324 These will be discussed more fully, but as a preview, here's our familiar
325 `calcMagnitude` example implemented using these new tools:
326 
327 ~~~{.cpp}
328 // Modern implementation of calcMagnitude using new concepts in VTK 7.1:
329 // A worker functor. The calculation is implemented in the function template
330 // for operator().
331 struct CalcMagnitudeWorker
332 {
333  // The worker accepts VTK array objects now, not raw memory buffers.
334  template <typename VectorArray, typename MagnitudeArray>
335  void operator()(VectorArray *vectors, MagnitudeArray *magnitude)
336  {
337  // This allows the compiler to optimize for the AOS array stride.
338  VTK_ASSUME(vectors->GetNumberOfComponents() == 3);
339  VTK_ASSUME(magnitude->GetNumberOfComponents() == 1);
340 
341  // These allow this single worker function to be used with both
342  // the vtkDataArray 'double' API and the more efficient
343  // vtkGenericDataArray APIs, depending on the template parameters:
344  vtkDataArrayAccessor<VectorArray> v(vectors);
345  vtkDataArrayAccessor<MagnitudeArray> m(magnitude);
346 
347  vtkIdType numVectors = vectors->GetNumberOfTuples();
348  for (vtkIdType tupleIdx = 0; tupleIdx < numVectors; ++tupleIdx)
349  {
350  // Set and Get compile to inlined optimizable raw memory accesses for
351  // vtkGenericDataArray subclasses.
352  m.Set(tupleIdx, 0, std::sqrt(v.Get(tupleIdx, 0) * v.Get(tupleIdx, 0) +
353  v.Get(tupleIdx, 1) * v.Get(tupleIdx, 1) +
354  v.Get(tupleIdx, 2) * v.Get(tupleIdx, 2)));
355  }
356  }
357 };
358 
359 void calcMagnitude(vtkDataArray *vectors, vtkDataArray *magnitude)
360 {
361  // Create our worker functor:
362  CalcMagnitudeWorker worker;
363 
364  // Define our dispatcher. We'll let vectors have any ValueType, but only
365  // consider float/double arrays for magnitudes. These combinations will
366  // use a 'fast-path' implementation generated by the dispatcher:
367  typedef vtkArrayDispatch::Dispatch2ByValueType
368  <
369  vtkArrayDispatch::AllTypes, // ValueTypes allowed by first array
370  vtkArrayDispatch::Reals // ValueTypes allowed by second array
371  > Dispatcher;
372 
373  // Execute the dispatcher:
374  if (!Dispatcher::Execute(vectors, magnitude, worker))
375  {
376  // If Execute() fails, it means the dispatch failed due to an
377  // unsupported array type. In this case, it's likely that the magnitude
378  // array is using an integral type. This is an uncommon case, so we won't
379  // generate a fast path for these, but instead call an instantiation of
380  // CalcMagnitudeWorker::operator()<vtkDataArray, vtkDataArray>.
381  // Through the use of vtkDataArrayAccessor, this falls back to using the
382  // vtkDataArray double API:
383  worker(vectors, magnitude);
384  }
385 }
386 ~~~
387 
388 # vtkGenericDataArray # {#VTKAD-vtkGenericDataArray}
389 
390 The `vtkGenericDataArray` class template drives the new `vtkDataArray` class
391 hierarchy. The ValueType is introduced here, both as a template parameter and a
392 class-scope `typedef`. This allows a typed API to be written that doesn't
393 require conversion to/from a common type (as `vtkDataArray` does with double).
394 It does not implement any storage details, however. Instead, it uses the CRTP
395 idiom to forward key method calls to a derived class without using a virtual
396 function call. By eliminating this indirection, `vtkGenericDataArray` defines
397 an interface that can be used to implement highly efficient code, because the
398 compiler is able to see past the method calls and optimize the underlying
399 memory accesses instead.
400 
401 There are two main subclasses of `vtkGenericDataArray`:
402 `vtkAOSDataArrayTemplate` and `vtkSOADataArrayTemplate`. These implement
403 array-of-structs and struct-of-arrays storage, respectively.
404 
405 # vtkTypeList # {#VTKAD-vtkTypeList}
406 
407 Type lists are a metaprogramming construct used to generate a list of C++
408 types. They are used in VTK to implement restricted array dispatching. As we'll
409 see, `vtkArrayDispatch` offers ways to reduce the number of generated template
410 instantiations by enforcing constraints on the arrays used to dispatch. For
411 instance, if one wanted to only generate templated worker implementations for
412 `vtkFloatArray` and `vtkIntArray`, a typelist is used to specify this:
413 
414 ~~~{.cpp}
415 // Create a typelist of 2 types, vtkFloatArray and vtkIntArray:
416 typedef vtkTypeList_Create_2(vtkFloatArray, vtkIntArray) MyArrays;
417 
418 Worker someWorker = ...;
419 vtkDataArray *someArray = ...;
420 
421 // Use vtkArrayDispatch to generate code paths for these arrays:
422 vtkArrayDispatch::DispatchByArray<MyArrays>(someArray, someWorker);
423 ~~~
424 
425 There's not much to know about type lists as a user, other than how to create
426 them. As seen above, there is a set of macros named `vtkTypeList_Create_X`,
427 where X is the number of types in the created list, and the arguments are the
428 types to place in the list. In the example above, the new type list is
429 typically bound to a friendlier name using a local `typedef`, which is a common
430 practice.
431 
432 The `vtkTypeList.h` header defines some additional type list operations that
433 may be useful, such as deleting and appending types, looking up indices, etc.
434 `vtkArrayDispatch::FilterArraysByValueType` may come in handy, too. But for
435 working with array dispatches, most users will only need to create new ones, or
436 use one of the following predefined `vtkTypeLists`:
437 
438 * `vtkArrayDispatch::Reals`: All floating point ValueTypes.
439 * `vtkArrayDispatch::Integrals`: All integral ValueTypes.
440 * `vtkArrayDispatch::AllTypes`: Union of Reals and Integrals.
441 * `vtkArrayDispatch::Arrays`: Default list of ArrayTypes to use in dispatches.
442 
443 The last one is special -- `vtkArrayDispatch::Arrays` is a typelist of
444 ArrayTypes set application-wide when VTK is built. This `vtkTypeList` of
445 `vtkDataArray` subclasses is used for unrestricted dispatches, and is the list
446 that gets filtered when restricting a dispatch to specific ValueTypes.
447 
448 Refining this list allows the user building VTK to have some control over the
449 dispatch process. If SOA arrays are never going to be used, they can be removed
450 from dispatch calls, reducing compile times and binary size. On the other hand,
451 a user applying in-situ techniques may want them available, because they'll be
452 used to import views of intermediate results.
453 
454 By default, `vtkArrayDispatch::Arrays` contains all AOS arrays. The `CMake`
455 option `VTK_DISPATCH_SOA_ARRAYS` will enable SOA array dispatch as well. More
456 advanced possibilities exist and are described in
457 `VTK/CMake/vtkCreateArrayDispatchArrayList.cmake`.
458 
459 # vtkArrayDownCast # {#VTKAD-vtkArrayDownCast}
460 
461 In VTK, all subclasses of `vtkObject` (including the data arrays) support a
462 downcast method called `SafeDownCast`. It is used similarly to the C++
463 `dynamic_cast` -- given an object, try to cast it to a more derived type or
464 return `NULL` if the object is not the requested type. Say we have a
465 `vtkDataArray` and want to test if it is actually a `vtkFloatArray`. We can do
466 this:
467 
468 ~~~{.cpp}
469 void DoSomeAction(vtkDataArray *dataArray)
470 {
471  vtkFloatArray *floatArray = vtkFloatArray::SafeDownCast(dataArray);
472  if (floatArray)
473  {
474  // ... (do work with float array)
475  }
476 }
477 ~~~
478 
479 This works, but it can pose a serious problem if `DoSomeAction` is called
480 repeatedly. `SafeDownCast` works by performing a series of virtual calls and
481 string comparisons to determine if an object falls into a particular class
482 hierarchy. These string comparisons add up and can actually dominate
483 computational resources if an algorithm implementation calls `SafeDownCast` in
484 a tight loop.
485 
486 In such situations, it's ideal to restructure the algorithm so that the
487 downcast only happens once and the same result is used repeatedly, but
488 sometimes this is not possible. To lessen the cost of downcasting arrays, a
489 `FastDownCast` method exists for common subclasses of `vtkAbstractArray`. This
490 replaces the string comparisons with a single virtual call and a few integer
491 comparisons and is far cheaper than the more general SafeDownCast. However, not
492 all array implementations support the `FastDownCast` method.
493 
494 This creates a headache for templated code. Take the following example:
495 
496 ~~~{.cpp}
497 template <typename ArrayType>
498 void DoSomeAction(vtkAbstractArray *array)
499 {
500  ArrayType *myArray = ArrayType::SafeDownCast(array);
501  if (myArray)
502  {
503  // ... (do work with myArray)
504  }
505 }
506 ~~~
507 
508 We cannot use `FastDownCast` here since not all possible ArrayTypes support it.
509 But we really want that performance increase for the ones that do --
510 `SafeDownCast`s are really slow! `vtkArrayDownCast` fixes this issue:
511 
512 ~~~{.cpp}
513 template <typename ArrayType>
514 void DoSomeAction(vtkAbstractArray *array)
515 {
516  ArrayType *myArray = vtkArrayDownCast<ArrayType>(array);
517  if (myArray)
518  {
519  // ... (do work with myArray)
520  }
521 }
522 ~~~
523 
524 `vtkArrayDownCast` automatically selects `FastDownCast` when it is defined for
525 the ArrayType, and otherwise falls back to `SafeDownCast`. This is the
526 preferred array downcast method for performance, uniformity, and reliability.
527 
528 # vtkDataArrayAccessor # {#VTKAD-vtkDataArrayAccessor}
529 
530 Array dispatching relies on having templated worker code carry out some
531 operation. For instance, take this `vtkArrayDispatch` code that locates the
532 maximum value in an array:
533 
534 ~~~{.cpp}
535 // Stores the tuple/component coordinates of the maximum value:
536 struct FindMax
537 {
538  vtkIdType Tuple; // Result
539  int Component; // Result
540 
541  FindMax() : Tuple(-1), Component(-1) {}
542 
543  template <typename ArrayT>
544  void operator()(ArrayT *array)
545  {
546  // The type to use for temporaries, and a temporary to store
547  // the current maximum value:
548  typedef typename ArrayT::ValueType ValueType;
549  ValueType max = std::numeric_limits<ValueType>::min();
550 
551  // Iterate through all tuples and components, noting the location
552  // of the largest element found.
553  vtkIdType numTuples = array->GetNumberOfTuples();
554  int numComps = array->GetNumberOfComponents();
555  for (vtkIdType tupleIdx = 0; tupleIdx < numTuples; ++tupleIdx)
556  {
557  for (int compIdx = 0; compIdx < numComps; ++compIdx)
558  {
559  if (max < array->GetTypedComponent(tupleIdx, compIdx))
560  {
561  max = array->GetTypedComponent(tupleIdx, compIdx);
562  this->Tuple = tupleIdx;
563  this->Component = compIdx;
564  }
565  }
566  }
567  }
568 };
569 
570 void someFunction(vtkDataArray *array)
571 {
572  FindMax maxWorker;
573  vtkArrayDispatch::Dispatch::Execute(array, maxWorker);
574  // Do work using maxWorker.Tuple and maxWorker.Component...
575 }
576 ~~~
577 
578 There's a problem, though. Recall that only the arrays in
579 `vtkArrayDispatch::Arrays` are tested for dispatching. What happens if the
580 array passed into someFunction wasn't on that list?
581 
582 The dispatch will fail, and `maxWorker.Tuple` and `maxWorker.Component` will be
583 left to their initial values of -1. That's no good. What if `someFunction` is a
584 critical path where we want to use a fast dispatched worker if possible, but
585 still have valid results to use if dispatching fails? Well, we can fall back on
586 the `vtkDataArray` API and do things the slow way in that case. When a
587 dispatcher is given an unsupported array, Execute() returns false, so let's
588 just add a backup implementation:
589 
590 ~~~{.cpp}
591 // Stores the tuple/component coordinates of the maximum value:
592 struct FindMax
593 { /* As before... */ };
594 
595 void someFunction(vtkDataArray *array)
596 {
597  FindMax maxWorker;
598  if (!vtkArrayDispatch::Dispatch::Execute(array, maxWorker))
599  {
600  // Reimplement FindMax::operator(), but use the vtkDataArray API's
601  // "virtual double GetComponent()" instead of the more efficient
602  // "ValueType GetTypedComponent()" from vtkGenericDataArray.
603  }
604 }
605 ~~~
606 
607 Ok, that works. But ugh...why write the same algorithm twice? That's extra
608 debugging, extra testing, extra maintenance burden, and just plain not fun.
609 
610 Enter `vtkDataArrayAccessor`. This utility template does a very simple, yet
611 useful, job. It provides component and tuple based `Get` and `Set` methods that
612 will call the corresponding method on the array using either the `vtkDataArray`
613 or `vtkGenericDataArray` API, depending on the class's template parameter. It
614 also defines an `APIType`, which can be used to allocate temporaries, etc. This
615 type is double for `vtkDataArray`s and `vtkGenericDataArray::ValueType` for
616 `vtkGenericDataArray`s.
617 
618 Another nice benefit is that `vtkDataArrayAccessor` has a more compact API. The
619 only defined methods are Get and Set, and they're overloaded to work on either
620 tuples or components (though component access is encouraged as it is much, much
621 more efficient). Note that all non-element access operations (such as
622 `GetNumberOfTuples`) should still be called on the array pointer using
623 `vtkDataArray` API.
624 
625 Using `vtkDataArrayAccessor`, we can write a single worker template that works
626 for both `vtkDataArray` and `vtkGenericDataArray`, without a loss of
627 performance in the latter case. That worker looks like this:
628 
629 ~~~{.cpp}
630 // Better, uses vtkDataArrayAccessor:
631 struct FindMax
632 {
633  vtkIdType Tuple; // Result
634  int Component; // Result
635 
636  FindMax() : Tuple(-1), Component(-1) {}
637 
638  template <typename ArrayT>
639  void operator()(ArrayT *array)
640  {
641  // Create the accessor:
642  vtkDataArrayAccessor<ArrayT> access(array);
643 
644  // Prepare the temporary. We'll use the accessor's APIType instead of
645  // ArrayT::ValueType, since that is appropriate for the vtkDataArray
646  // fallback:
647  typedef typename vtkDataArrayAccessor<ArrayT>::APIType ValueType;
648  ValueType max = std::numeric_limits<ValueType>::min();
649 
650  // Iterate as before, but use access.Get instead of
651  // array->GetTypedComponent. GetTypedComponent is still used
652  // when ArrayT is a vtkGenericDataArray, but
653  // vtkDataArray::GetComponent is now used as a fallback when ArrayT
654  // is vtkDataArray.
655  vtkIdType numTuples = array->GetNumberOfTuples();
656  int numComps = array->GetNumberOfComponents();
657  for (vtkIdType tupleIdx = 0; tupleIdx < numTuples; ++tupleIdx)
658  {
659  for (int compIdx = 0; compIdx < numComps; ++compIdx)
660  {
661  if (max < access.Get(tupleIdx, compIdx))
662  {
663  max = access.Get(tupleIdx, compIdx);
664  this->Tuple = tupleIdx;
665  this->Component = compIdx;
666  }
667  }
668  }
669  }
670 };
671 ~~~
672 
673 Now when we call `operator()` with say, `ArrayT=vtkFloatArray`, we'll get an
674 optimized, efficient code path. But we can also call this same implementation
675 with `ArrayT=vtkDataArray` and still get a correct result (assuming that the
676 `vtkDataArray`'s double API represents the data well enough).
677 
678 Using the `vtkDataArray` fallback path is straightforward. At the call site:
679 
680 ~~~{.cpp}
681 void someFunction(vtkDataArray *array)
682 {
683  FindMax maxWorker;
684  if (!vtkArrayDispatch::Dispatch::Execute(array, maxWorker))
685  {
686  maxWorker(array); // Dispatch failed, call vtkDataArray fallback
687  }
688  // Do work using maxWorker.Tuple and maxWorker.Component -- now we know
689  // for sure that they're initialized!
690 }
691 ~~~
692 
693 Using the above pattern for calling a worker and always going through
694 `vtkDataArrayAccessor` to `Get`/`Set` array elements ensures that any worker
695 implementation can be its own fallback path.
696 
697 # VTK_ASSUME # {#VTKAD-VTK_ASSUME}
698 
699 While performance testing the new array classes, we compared the performance of
700 a dispatched worker using the `vtkDataArrayAccessor` class to the same
701 algorithm using raw memory buffers. We managed to achieve the same performance
702 out of the box for most cases, using both AOS and SOA array implementations. In
703 fact, with `--ffast-math` optimizations on GCC 4.9, the optimizer is able to
704 remove all function calls and apply SIMD vectorized instructions in the
705 dispatched worker, showing that the new array API is thin enough that the
706 compiler can see the algorithm in terms of memory access.
707 
708 But there was one case where performance suffered. If iterating through an AOS
709 data array with a known number of components using `GetTypedComponent`, the raw
710 pointer implementation initially outperformed the dispatched array. To
711 understand why, note that the AOS implementation of `GetTypedComponent` is along
712 the lines of:
713 
714 ~~~{.cpp}
715 ValueType vtkAOSDataArrayTemplate::GetTypedComponent(vtkIdType tuple,
716  int comp) const
717 {
718  // AOSData is a ValueType* pointing at the base of the array data.
719  return this->AOSData[tuple * this->NumberOfComponents + comp];
720 }
721 ~~~
722 
723 Because `NumberOfComponents` is unknown at compile time, the optimizer cannot
724 assume anything about the stride of the components in the array. This leads to
725 missed optimizations for vectorized read/writes and increased complexity in the
726 instructions used to iterate through the data.
727 
728 For such cases where the number of components is, in fact, known at compile
729 time (due to a calling function performing some validation, for instance), it
730 is possible to tell the compiler about this fact using `VTK_ASSUME`.
731 
732 `VTK_ASSUME` wraps a compiler-specific `__assume` statement, which is used to
733 pass such optimization hints. Its argument is an expression of some condition
734 that is guaranteed to always be true. This allows more aggressive optimizations
735 when used correctly, but be forewarned that if the condition is not met at
736 runtime, the results are unpredictable and likely catastrophic.
737 
738 But if we're writing a filter that only operates on 3D point sets, we know the
739 number of components in the point array will always be 3. In this case we can
740 write:
741 
742 ~~~{.cpp}
743 VTK_ASSUME(pointsArray->GetNumberOfComponents() == 3);
744 ~~~
745 
746 in the worker function and this instructs the compiler that the array's
747 internal `NumberOfComponents` variable will always be 3, and thus the stride of
748 the array is known. Of course, the caller of this worker function should ensure
749 that this is a 3-component array and fail gracefully if it is not.
750 
751 There are many scenarios where `VTK_ASSUME` can offer a serious performance
752 boost, the case of known tuple size is a common one that's really worth
753 remembering.
754 
755 # vtkArrayDispatch # {#VTKAD-vtkArrayDispatch}
756 
757 The dispatchers implemented in the vtkArrayDispatch namespace provide array
758 dispatching with customizable restrictions on code generation and a simple
759 syntax that hides the messy details of type resolution and multi-array
760 dispatch. There are several "flavors" of dispatch available that operate on up
761 to three arrays simultaneously.
762 
763 ## Components Of A Dispatch ## {#VTKAD-ComponentsOfADispatch}
764 
765 Using the `vtkArrayDispatch` system requires three elements: the array(s), the
766 worker, and the dispatcher.
767 
768 ### The Arrays ### {#VTKAD-TheArrays}
769 
770 All dispatched arrays must be subclasses of `vtkDataArray`. It is important to
771 identify as many restrictions as possible. Must every ArrayType be considered
772 during dispatch, or is the array's ValueType (or even the ArrayType itself)
773 restricted? If dispatching multiple arrays at once, are they expected to have
774 the same ValueType? These scenarios are common, and these conditions can be
775 used to reduce the number of instantiations of the worker template.
776 
777 ### The Worker ### {#VTKAD-TheWorker}
778 
779 The worker is some generic callable. In C++98, a templated functor is a good
780 choice. In C++14, a generic lambda is a usable option as well. For our
781 purposes, we'll only consider the functor approach, as C++14 is a long ways off
782 for core VTK code.
783 
784 At a minimum, the worker functor should define `operator()` to make it
785 callable. This should be a function template with a template parameter for each
786 array it should handle. For a three array dispatch, it should look something
787 like this:
788 
789 ~~~{.cpp}
790 struct ThreeArrayWorker
791 {
792  template <typename Array1T, typename Array2T, typename Array3T>
793  void operator()(Array1T *array1, Array2T *array2, Array3T *array3)
794  {
795  /* Do stuff... */
796  }
797 };
798 ~~~
799 
800 At runtime, the dispatcher will call `ThreeWayWorker::operator()` with a set of
801 `Array1T`, `Array2T`, and `Array3T` that satisfy any dispatch restrictions.
802 
803 Workers can be stateful, too, as seen in the `FindMax` worker earlier where the
804 worker simply identified the component and tuple id of the largest value in the
805 array. The functor stored them for the caller to use in further analysis:
806 
807 ~~~{.cpp}
808 // Example of a stateful dispatch functor:
809 struct FindMax
810 {
811  // Functor state, holds results that are accessible to the caller:
812  vtkIdType Tuple;
813  int Component;
814 
815  // Set initial values:
816  FindMax() : Tuple(-1), Component(-1) {}
817 
818  // Template method to set Tuple and Component ivars:
819  template <typename ArrayT>
820  void operator()(ArrayT *array)
821  {
822  /* Do stuff... */
823  }
824 };
825 ~~~
826 
827 ### The Dispatcher ### {#VTKAD-TheDispatcher}
828 
829 The dispatcher is the workhorse of the system. It is responsible for applying
830 restrictions, resolving array types, and generating the requested template
831 instantiations. It has responsibilities both at run-time and compile-time.
832 
833 During compilation, the dispatcher will identify the valid combinations of
834 arrays that can be used according to the restrictions. This is done by starting
835 with a typelist of arrays, either supplied as a template parameter or by
836 defaulting to `vtkArrayDispatch::Arrays`, and filtering them by ValueType if
837 needed. For multi-array dispatches, additional restrictions may apply, such as
838 forcing the second and third arrays to have the same ValueType as the first. It
839 must then generate the required code for the dispatch -- that is, the templated
840 worker implementation must be instantiated for each valid combination of
841 arrays.
842 
843 At runtime, it tests each of the dispatched arrays to see if they match one of
844 the generated code paths. Runtime type resolution is carried out using
845 `vtkArrayDownCast` to get the best performance available for the arrays of
846 interest. If it finds a match, it calls the worker's `operator()` method with
847 the properly typed arrays. If no match is found, it returns `false` without
848 executing the worker.
849 
850 ## Restrictions: Why They Matter ## {#VTKAD-RestrictionsWhyTheyMatter}
851 
852 We've made several mentions of using restrictions to reduce the number of
853 template instantiations during a dispatch operation. You may be wondering if it
854 really matters so much. Let's consider some numbers.
855 
856 VTK is configured to use 13 ValueTypes for numeric data. These are the standard
857 numeric types `float`, `int`, `unsigned char`, etc. By default, VTK will define
858 `vtkArrayDispatch::Arrays` to use all 13 types with `vtkAOSDataArrayTemplate`
859 for the standard set of dispatchable arrays. If enabled during compilation, the
860 SOA data arrays are added to this list for a total of 26 arrays.
861 
862 Using these 26 arrays in a single, unrestricted dispatch will result in 26
863 instantiations of the worker template. A double dispatch will generate 676
864 workers. A triple dispatch with no restrictions creates a whopping 17,576
865 functions to handle the possible combinations of arrays. That's a _lot_ of
866 instructions to pack into the final binary object.
867 
868 Applying some simple restrictions can reduce this immensely. Say we know that
869 the arrays will only contain `float`s or `double`s. This would reduce the
870 single dispatch to 4 instantiations, the double dispatch to 16, and the triple
871 to 64. We've just reduced the generated code size significantly. We could even
872 apply such a restriction to just create some 'fast-paths' and let the integral
873 types fallback to using the `vtkDataArray` API by using
874 `vtkDataArrayAccessor`s. Dispatch restriction is a powerful tool for reducing
875 the compiled size of a binary object.
876 
877 Another common restriction is that all arrays in a multi-array dispatch have
878 the same ValueType, even if that ValueType is not known at compile time. By
879 specifying this restriction, a double dispatch on all 26 AOS/SOA arrays will
880 only produce 52 worker instantiations, down from 676. The triple dispatch drops
881 to 104 instantiations from 17,576.
882 
883 Always apply restrictions when they are known, especially for multi-array
884 dispatches. The savings are worth it.
885 
886 ## Types of Dispatchers ## {#VTKAD-TypesOfDispatchers}
887 
888 Now that we've discussed the components of a dispatch operation, what the
889 dispatchers do, and the importance of restricting dispatches, let's take a look
890 at the types of dispatchers available.
891 
892 ---
893 
894 ### vtkArrayDispatch::Dispatch ### {#VTKAD-Dispatch}
895 
896 This family of dispatchers take no parameters and perform an unrestricted
897 dispatch over all arrays in `vtkArrayDispatch::Arrays`.
898 
899 __Variations__:
900 * `vtkArrayDispatch::Dispatch`: Single dispatch.
901 * `vtkArrayDispatch::Dispatch2`: Double dispatch.
902 * `vtkArrayDispatch::Dispatch3`: Triple dispatch.
903 
904 __Arrays considered__: All arrays in `vtkArrayDispatch::Arrays`.
905 
906 __Restrictions__: None.
907 
908 __Usecase__: Used when no useful information exists that can be used to apply
909 restrictions.
910 
911 __Example Usage__:
912 
913 ~~~{.cpp}
914 vtkArrayDispatch::Dispatch::Execute(array, worker);
915 ~~~
916 
917 ---
918 
919 ### vtkArrayDispatch::DispatchByArray ### {#VTKAD-DispatchByArray}
920 
921 This family of dispatchers takes a `vtkTypeList` of explicit array types to use
922 during dispatching. They should only be used when an array's exact type is
923 restricted. If dispatching multiple arrays and only one has such type
924 restrictions, use `vtkArrayDispatch::Arrays` (or a filtered version) for the
925 unrestricted arrays.
926 
927 __Variations__:
928 * `vtkArrayDispatch::DispatchByArray`: Single dispatch.
929 * `vtkArrayDispatch::Dispatch2ByArray`: Double dispatch.
930 * `vtkArrayDispatch::Dispatch3ByArray`: Triple dispatch.
931 
932 __Arrays considered__: All arrays explicitly listed in the parameter lists.
933 
934 __Restrictions__: Array must be explicitly listed in the dispatcher's type.
935 
936 __Usecase__: Used when one or more arrays have known implementations.
937 
938 __Example Usage__:
939 
940 An example here would be a filter that processes an input array of some
941 integral type and produces either a `vtkDoubleArray` or a `vtkFloatArray`,
942 depending on some condition. Since the input array's implementation is unknown
943 (it comes from outside the filter), we'll rely on a ValueType-filtered version
944 of `vtkArrayDispatch::Arrays` for its type. However, we know the output array
945 is either `vtkDoubleArray` or `vtkFloatArray`, so we'll want to be sure to
946 apply that restriction:
947 
948 ~~~{.cpp}
949 // input has an unknown implementation, but an integral ValueType.
950 vtkDataArray *input = ...;
951 
952 // Output is always either vtkFloatArray or vtkDoubleArray:
953 vtkDataArray *output = someCondition ? vtkFloatArray::New()
954  : vtkDoubleArray::New();
955 
956 // Define the valid ArrayTypes for input by filtering
957 // vtkArrayDispatch::Arrays to remove non-integral types:
958 typedef typename vtkArrayDispatch::FilterArraysByValueType
959  <
960  vtkArrayDispatch::Arrays,
961  vtkArrayDispatch::Integrals
962  >::Result InputTypes;
963 
964 // For output, create a new vtkTypeList with the only two possibilities:
965 typedef vtkTypeList_Create_2(vtkFloatArray, vtkDoubleArray) OutputTypes;
966 
967 // Typedef the dispatch to a more manageable name:
968 typedef vtkArrayDispatch::Dispatch2ByArray
969  <
970  InputTypes,
971  OutputTypes
972  > MyDispatch;
973 
974 // Execute the dispatch:
975 MyDispatch::Execute(input, output, someWorker);
976 ~~~
977 
978 ---
979 
980 ### vtkArrayDispatch::DispatchByValueType ### {#VTKAD-DispatchByValueType}
981 
982 This family of dispatchers takes a vtkTypeList of ValueTypes for each array and
983 restricts dispatch to only arrays in vtkArrayDispatch::Arrays that have one of
984 the specified value types.
985 
986 __Variations__:
987 * `vtkArrayDispatch::DispatchByValueType`: Single dispatch.
988 * `vtkArrayDispatch::Dispatch2ByValueType`: Double dispatch.
989 * `vtkArrayDispatch::Dispatch3ByValueType`: Triple dispatch.
990 
991 __Arrays considered__: All arrays in `vtkArrayDispatch::Arrays` that meet the
992 ValueType requirements.
993 
994 __Restrictions__: Arrays that do not satisfy the ValueType requirements are
995 eliminated.
996 
997 __Usecase__: Used when one or more of the dispatched arrays has an unknown
998 implementation, but a known (or restricted) ValueType.
999 
1000 __Example Usage__:
1001 
1002 Here we'll consider a filter that processes three arrays. The first is a
1003 complete unknown. The second is known to hold `unsigned char`, but we don't
1004 know the implementation. The third holds either `double`s or `float`s, but its
1005 implementation is also unknown.
1006 
1007 ~~~{.cpp}
1008 // Complete unknown:
1009 vtkDataArray *array1 = ...;
1010 // Some array holding unsigned chars:
1011 vtkDataArray *array2 = ...;
1012 // Some array holding either floats or doubles:
1013 vtkDataArray *array3 = ...;
1014 
1015 // Typedef the dispatch to a more manageable name:
1016 typedef vtkArrayDispatch::Dispatch3ByValueType
1017  <
1018  vtkArrayDispatch::AllTypes,
1019  vtkTypeList_Create_1(unsigned char),
1020  vtkArrayDispatch::Reals
1021  > MyDispatch;
1022 
1023 // Execute the dispatch:
1024 MyDispatch::Execute(array1, array2, array3, someWorker);
1025 ~~~
1026 
1027 ---
1028 
1029 ### vtkArrayDispatch::DispatchByArrayWithSameValueType ### {#VTKAD-DispatchByArrayWithSameValueType}
1030 
1031 This family of dispatchers takes a `vtkTypeList` of ArrayTypes for each array
1032 and restricts dispatch to only consider arrays from those typelists, with the
1033 added requirement that all dispatched arrays share a ValueType.
1034 
1035 __Variations__:
1036 * `vtkArrayDispatch::Dispatch2ByArrayWithSameValueType`: Double dispatch.
1037 * `vtkArrayDispatch::Dispatch3ByArrayWithSameValueType`: Triple dispatch.
1038 
1039 __Arrays considered__: All arrays in the explicit typelists that meet the
1040 ValueType requirements.
1041 
1042 __Restrictions__: Combinations of arrays with differing ValueTypes are
1043 eliminated.
1044 
1045 __Usecase__: When one or more arrays are known to belong to a restricted set of
1046 ArrayTypes, and all arrays are known to share the same ValueType, regardless of
1047 implementation.
1048 
1049 __Example Usage__:
1050 
1051 Let's consider a double array dispatch, with `array1` known to be one of four
1052 common array types (AOS `float`, `double`, `int`, and `vtkIdType` arrays), and
1053 the other is a complete unknown, although we know that it holds the same
1054 ValueType as `array1`.
1055 
1056 ~~~{.cpp}
1057 // AOS float, double, int, or vtkIdType array:
1058 vtkDataArray *array1 = ...;
1059 // Unknown implementation, but the ValueType matches array1:
1060 vtkDataArray *array2 = ...;
1061 
1062 // array1's possible types:
1063 typedef vtkTypeList_Create_4(vtkFloatArray, vtkDoubleArray,
1064  vtkIntArray, vtkIdTypeArray) Array1Types;
1065 
1066 // array2's possible types:
1067 typedef typename vtkArrayDispatch::FilterArraysByValueType
1068  <
1069  vtkArrayDispatch::Arrays,
1070  vtkTypeList_Create_4(float, double, int, vtkIdType)
1071  > Array2Types;
1072 
1073 // Typedef the dispatch to a more manageable name:
1074 typedef vtkArrayDispatch::Dispatch2ByArrayWithSameValueType
1075  <
1076  Array1Types,
1077  Array2Types
1078  > MyDispatch;
1079 
1080 // Execute the dispatch:
1081 MyDispatch::Execute(array1, array2, someWorker);
1082 ~~~
1083 
1084 ---
1085 
1086 ### vtkArrayDispatch::DispatchBySameValueType ### {#VTKAD-DispatchBySameValueType}
1087 
1088 This family of dispatchers takes a single `vtkTypeList` of ValueType and
1089 restricts dispatch to only consider arrays from `vtkArrayDispatch::Arrays` with
1090 those ValueTypes, with the added requirement that all dispatched arrays share a
1091 ValueType.
1092 
1093 __Variations__:
1094 * `vtkArrayDispatch::Dispatch2BySameValueType`: Double dispatch.
1095 * `vtkArrayDispatch::Dispatch3BySameValueType`: Triple dispatch.
1096 * `vtkArrayDispatch::Dispatch2SameValueType`: Double dispatch using
1097 `vtkArrayDispatch::AllTypes`.
1098 * `vtkArrayDispatch::Dispatch3SameValueType`: Triple dispatch using
1099 `vtkArrayDispatch::AllTypes`.
1100 
1101 __Arrays considered__: All arrays in `vtkArrayDispatch::Arrays` that meet the
1102 ValueType requirements.
1103 
1104 __Restrictions__: Combinations of arrays with differing ValueTypes are
1105 eliminated.
1106 
1107 __Usecase__: When one or more arrays are known to belong to a restricted set of
1108 ValueTypes, and all arrays are known to share the same ValueType, regardless of
1109 implementation.
1110 
1111 __Example Usage__:
1112 
1113 Let's consider a double array dispatch, with `array1` known to be one of four
1114 common ValueTypes (`float`, `double`, `int`, and `vtkIdType` arrays), and
1115 `array2` known to have the same ValueType as `array1`.
1116 
1117 ~~~{.cpp}
1118 // Some float, double, int, or vtkIdType array:
1119 vtkDataArray *array1 = ...;
1120 // Unknown, but the ValueType matches array1:
1121 vtkDataArray *array2 = ...;
1122 
1123 // The allowed ValueTypes:
1124 typedef vtkTypeList_Create_4(float, double, int, vtkIdType) ValidValueTypes;
1125 
1126 // Typedef the dispatch to a more manageable name:
1127 typedef vtkArrayDispatch::Dispatch2BySameValueType
1128  <
1129  ValidValueTypes
1130  > MyDispatch;
1131 
1132 // Execute the dispatch:
1133 MyDispatch::Execute(array1, array2, someWorker);
1134 ~~~
1135 
1136 ---
1137 
1138 # Advanced Usage # {#VTKAD-AdvancedUsage}
1139 
1140 ## Accessing Memory Buffers ## {#VTKAD-AccessingMemoryBuffers}
1141 
1142 Despite the thin `vtkGenericDataArray` API's nice feature that compilers can
1143 optimize memory accesses, sometimes there are still legitimate reasons to
1144 access the underlying memory buffer. This can still be done safely by providing
1145 overloads to your worker's `operator()` method. For instance,
1146 `vtkDataArray::DeepCopy` uses a generic implementation when mixed array
1147 implementations are used, but has optimized overloads for copying between
1148 arrays with the same ValueType and implementation. The worker for this dispatch
1149 is shown below as an example:
1150 
1151 ~~~{.cpp}
1152 // Copy tuples from src to dest:
1153 struct DeepCopyWorker
1154 {
1155  // AoS --> AoS same-type specialization:
1156  template <typename ValueType>
1157  void operator()(vtkAOSDataArrayTemplate<ValueType> *src,
1158  vtkAOSDataArrayTemplate<ValueType> *dst)
1159  {
1160  std::copy(src->Begin(), src->End(), dst->Begin());
1161  }
1162 
1163  // SoA --> SoA same-type specialization:
1164  template <typename ValueType>
1165  void operator()(vtkSOADataArrayTemplate<ValueType> *src,
1166  vtkSOADataArrayTemplate<ValueType> *dst)
1167  {
1168  vtkIdType numTuples = src->GetNumberOfTuples();
1169  for (int comp; comp < src->GetNumberOfComponents(); ++comp)
1170  {
1171  ValueType *srcBegin = src->GetComponentArrayPointer(comp);
1172  ValueType *srcEnd = srcBegin + numTuples;
1173  ValueType *dstBegin = dst->GetComponentArrayPointer(comp);
1174 
1175  std::copy(srcBegin, srcEnd, dstBegin);
1176  }
1177  }
1178 
1179  // Generic implementation:
1180  template <typename Array1T, typename Array2T>
1181  void operator()(Array1T *src, Array2T *dst)
1182  {
1183  vtkDataArrayAccessor<Array1T> s(src);
1184  vtkDataArrayAccessor<Array2T> d(dst);
1185 
1186  typedef typename vtkDataArrayAccessor<Array2T>::APIType DestType;
1187 
1188  vtkIdType tuples = src->GetNumberOfTuples();
1189  int comps = src->GetNumberOfComponents();
1190 
1191  for (vtkIdType t = 0; t < tuples; ++t)
1192  {
1193  for (int c = 0; c < comps; ++c)
1194  {
1195  d.Set(t, c, static_cast<DestType>(s.Get(t, c)));
1196  }
1197  }
1198  }
1199 };
1200 ~~~
1201 
1202 # Putting It All Together # {#VTKAD-PuttingItAllTogether}
1203 
1204 Now that we've explored the new tools introduced with VTK 7.1 that allow
1205 efficient, implementation agnostic array access, let's take another look at the
1206 `calcMagnitude` example from before and identify the key features of the
1207 implementation:
1208 
1209 ~~~{.cpp}
1210 // Modern implementation of calcMagnitude using new concepts in VTK 7.1:
1211 struct CalcMagnitudeWorker
1212 {
1213  template <typename VectorArray, typename MagnitudeArray>
1214  void operator()(VectorArray *vectors, MagnitudeArray *magnitude)
1215  {
1216  VTK_ASSUME(vectors->GetNumberOfComponents() == 3);
1217  VTK_ASSUME(magnitude->GetNumberOfComponents() == 1);
1218 
1219  vtkDataArrayAccessor<VectorArray> v(vectors);
1220  vtkDataArrayAccessor<MagnitudeArray> m(magnitude);
1221 
1222  vtkIdType numVectors = vectors->GetNumberOfTuples();
1223  for (vtkIdType tupleIdx = 0; tupleIdx < numVectors; ++tupleIdx)
1224  {
1225  m.Set(tupleIdx, 0, std::sqrt(v.Get(tupleIdx, 0) * v.Get(tupleIdx, 0) +
1226  v.Get(tupleIdx, 1) * v.Get(tupleIdx, 1) +
1227  v.Get(tupleIdx, 2) * v.Get(tupleIdx, 2)));
1228  }
1229  }
1230 };
1231 
1232 void calcMagnitude(vtkDataArray *vectors, vtkDataArray *magnitude)
1233 {
1234  CalcMagnitudeWorker worker;
1235  typedef vtkArrayDispatch::Dispatch2ByValueType
1236  <
1237  vtkArrayDispatch::AllTypes,
1238  vtkArrayDispatch::Reals
1239  > Dispatcher;
1240 
1241  if (!Dispatcher::Execute(vectors, magnitude, worker))
1242  {
1243  worker(vectors, magnitude); // vtkDataArray fallback
1244  }
1245 }
1246 ~~~
1247 
1248 This implementation:
1249 
1250 * Uses dispatch restrictions to reduce the number of instantiated templated
1251 worker functions.
1252  * Assuming 26 types are in `vtkArrayDispatch::Arrays` (13 AOS + 13 SOA).
1253  * The first array is unrestricted. All 26 array types are considered.
1254  * The second array is restricted to `float` or `double` ValueTypes, which
1255  translates to 4 array types (one each, SOA and AOS).
1256  * 26 * 4 = 104 possible combinations exist. We've eliminated 26 * 22 = 572
1257  combinations that an unrestricted double-dispatch would have generated (it
1258  would create 676 instantiations).
1259 * The calculation is still carried out at `double` precision when the ValueType
1260 restrictions are not met.
1261  * Just because we don't want those other 572 cases to have special code
1262  generated doesn't necessarily mean that we wouldn't want them to run.
1263  * Thanks to `vtkDataArrayAccessor`, we have a fallback implementation that
1264  reuses our templated worker code.
1265  * In this case, the dispatch is really just a fast-path implementation for
1266  floating point output types.
1267 * The performance should be identical to iterating through raw memory buffers.
1268  * The `vtkGenericDataArray` API is transparent to the compiler. The
1269  specialized instantiations of `operator()` can be heavily optimized since the
1270  memory access patterns are known and well-defined.
1271  * Using `VTK_ASSUME` tells the compiler that the arrays have known strides,
1272  allowing further compile-time optimizations.
1273 
1274 Hopefully this has convinced you that the `vtkArrayDispatch` and related tools
1275 are worth using to create flexible, efficient, typesafe implementations for
1276 your work with VTK. Please direct any questions you may have on the subject to
1277 the VTK mailing lists.