40 * parallelism.
41 *
42 * <p>The abstract class {@link jdk.incubator.vector.Vector} represents an ordered immutable sequence of
43 * values of the same element type 'e' that is one of the following primitive types -
44 * byte, short, int, long, float, or double. The type variable E corresponds to the
45 * boxed element type, specifically the class that wraps a value of e in an object
46 * (such as the Integer class, which wraps a value of int).
47 *
48 * <p>Vector declares a set of vector operations (methods) that are common to
49 * all element types (such as addition). Subclasses of Vector corresponding to
50 * a specific element type declare further operations that are specific to that element type
51 * (such as access to element values in lanes, logical operations on values of integral
52 * element types, or transcendental operations on values of floating-point element
53 * types). There are six abstract subclasses of {@link jdk.incubator.vector.Vector} corresponding to the supported set of
54 * element types: {@link jdk.incubator.vector.ByteVector}, {@link jdk.incubator.vector.ShortVector},
55 * {@link jdk.incubator.vector.IntVector}, {@link jdk.incubator.vector.LongVector},
56 * {@link jdk.incubator.vector.FloatVector}, and {@link jdk.incubator.vector.DoubleVector}.
57 *
58 * In addition to element type, vectors are parameterized by their <em>shape</em>,
59 * which is their size in bits. The supported shapes are
60 * represented by the enum {@link jdk.incubator.vector.Vector.Shape}.
61 * The combination of element type and shape determines a <em>vector species</em>,
62 * represented by {@link jdk.incubator.vector.Vector.Species}. The various typed
63 * vector classes expose static constants corresponding to the supported species,
64 * and static methods on these types generally take a species as a parameter.
65 * For example,
66 * {@link jdk.incubator.vector.FloatVector#fromArray(Species<Float>, float[], int) FloatVector.fromArray()}
67 * creates and returns a float vector of the specified species, with elements
68 * loaded from the specified float array.
69 *
70 * <p>
71 * The species instance for a specific combination of element type and shape
72 * can be obtained by reading the appropriate static field, as follows:
73 * <p>
74 * {@code Vector.Species<Float> s = FloatVector.SPECIES_256;}
75 * <p>
76 *
77 * Code that is agnostic to species can request the "preferred" species for a
78 * given element type, where the optimal size is selected for the current platform:
79 * <p>
80 * {@code Vector.Species<Float> s = FloatVector.SPECIES_PREFERRED;}
81 * <p>
82 *
83 * <p>
84 * Here is an example of multiplying elements of two float arrays {@code a} and {@code b} using vector computation
85 * and storing the result in array {@code c}.
86 * <pre>{@code
87 * static final Vector.Species<Float> SPECIES = FloatVector.SPECIES_512;
88 *
89 * void vectorMultiply(float[] a, float[] b, float[] c) {
90 *     int i = 0;
91 *     // The array arguments are assumed to have the same length
92 *     for (; i < (a.length & ~(SPECIES.length() - 1));
93 *            i += SPECIES.length()) {
94 *         FloatVector va = FloatVector.fromArray(SPECIES, a, i);
95 *         FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
96 *         FloatVector vc = va.mul(vb);
97 *         vc.intoArray(c, i);
98 *     }
99 *
100 *     for (; i < a.length; i++) {
101 *         c[i] = a[i] * b[i];
102 *     }
103 * }
104 * }</pre>
105 *
106 * The scalar loop after the vector loop is required to process the tail
107 * elements, of which there are fewer than the species length.
108 *
109 * The example above uses vectors hardcoded to a concrete shape (512-bit). Instead, we could use the preferred
110 * species, as shown below, so that the code adapts dynamically to the optimal shape for the platform on which it runs.
111 *
112 * <pre>{@code
113 * static final Vector.Species<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
114 * }</pre>
115 *
116 * <h2>Vector operations</h2>
117 * We use the term <em>lanes</em> when defining operations on vectors. The number of lanes
118 * in a vector is the number of scalar elements it holds. For example, a vector of
119 * type {@code Float} and shape {@code Shape.S_256_BIT} has eight lanes.
120 * Vector operations can be grouped into various categories and their behavior
121 * generally specified as follows:
122 * <ul>
123 * <li>
124 * A vector unary operation operates on one input vector to produce a
125 * result vector.
126 * For each lane of the input vector the
127 * lane element is operated on using the specified scalar unary operation and
128 * the element result is placed into the vector result at the same lane.
129 * The following pseudocode expresses the behavior of this operation category,
130 * where {@code e} is the element type and {@code EVector} corresponds to the
131 * primitive Vector type:
132 *
133 * <pre>{@code
134 * EVector a = ...;
135 * e[] ar = new e[a.length()];
136 * for (int i = 0; i < a.length(); i++) {
137 *     ar[i] = scalar_unary_op(a.get(i));
138 * }
139 * EVector r = EVector.fromArray(a.species(), ar, 0);
140 * }</pre>
141 *
142 * Unless otherwise specified the input and result vectors will have the same
143 * element type and shape.
144 *
145 * <li>
146 * A vector binary operation operates on two input
147 * vectors to produce a result vector.
148 * For each lane of the two input vectors,
149 * a and b say, the corresponding lane elements from a and b are operated on
150 * using the specified scalar binary operation and the element result is placed
151 * into the vector result at the same lane.
152 * The following pseudocode expresses the behavior of this operation category:
153 *
154 * <pre>{@code
155 * EVector a = ...;
156 * EVector b = ...;
157 * e[] ar = new e[a.length()];
158 * for (int i = 0; i < a.length(); i++) {
159 *     ar[i] = scalar_binary_op(a.get(i), b.get(i));
160 * }
161 * EVector r = EVector.fromArray(a.species(), ar, 0);
162 * }</pre>
163 *
164 * Unless otherwise specified the two input and result vectors will have the
165 * same element type and shape.
166 *
167 * <li>
168 * Generalizing from unary and binary operations, a vector n-ary
169 * operation operates on n input vectors to produce a
170 * result vector.
171 * For each lane, the n lane elements, one from each input vector, are operated on
172 * using the specified n-ary scalar operation and the element result is placed
173 * into the vector result at the same lane.
174 *
175 * Unless otherwise specified the n input and result vectors will have the same
176 * element type and shape.
177 *
178 * <li>
179 * A vector reduction operation operates on all the lane
180 * elements of an input vector, and applies an accumulation function to all the
181 * lane elements to produce a scalar result.
182 * If the reduction operation is associative then the result may be accumulated
183 * by operating on the lane elements in any order using a specified associative
184 * scalar binary operation and identity value. Otherwise, the reduction
185 * operation specifies the behavior of the accumulation function.
186 * The following pseudocode expresses the behavior of this operation category
187 * if it is associative:
188 * <pre>{@code
189 * EVector a = ...;
190 * e r = <identity value>;
191 * for (int i = 0; i < a.length(); i++) {
192 *     r = assoc_scalar_binary_op(r, a.get(i));
193 * }
194 * }</pre>
195 *
196 * Unless otherwise specified the scalar result type and element type will be
197 * the same.
198 *
199 * <li>
200 * A vector binary test operation operates on two input vectors to produce a
201 * result mask. For each lane of the two input vectors, a and b say,
202 * the corresponding lane elements from a and b are operated on using the
203 * specified scalar binary test operation and the boolean result is placed
204 * into the mask at the same lane.
205 * The following pseudocode expresses the behavior of this operation category:
206 * <pre>{@code
207 * EVector a = ...;
208 * EVector b = ...;
209 * boolean[] ar = new boolean[a.length()];
210 * for (int i = 0; i < a.length(); i++) {
211 *     ar[i] = scalar_binary_test_op(a.get(i), b.get(i));
212 * }
213 * Mask<E> r = EVector.maskFromArray(a.species(), ar, 0);
214 * }</pre>
215 *
216 * Unless otherwise specified the two input vectors and result mask will have
217 * the same element type and shape.
218 *
219 * <li>
220 * The prior categories of operation can be said to operate within the vector
221 * lanes, where lane access is uniformly applied to all vectors, specifically
222 * the scalar operation is applied to elements taken from input vectors at the
223 * same lane, and if appropriate applied to the result vector at the same lane.
224 * A further category of operation is a cross-lane vector operation where lane
225 * access is defined by the arguments to the operation. Cross-lane operations
226 * generally rearrange lane elements, for example by permutation (commonly
227 * controlled by a {@link jdk.incubator.vector.Vector.Shuffle}) or by blending (commonly controlled by a
228 * {@link jdk.incubator.vector.Vector.Mask}). Such an operation explicitly specifies how it rearranges lane
229 * elements.
230 * </ul>
231 *
232 * Some vector operations are specified as instance methods (such as adding
233 * one vector to another); others are specified as static methods (such as
234 * loading vector values from an array).
235 * <p>
236 * If a vector operation does not belong to one of the above categories then
237 * the operation explicitly specifies how it processes the lane elements of
238 * input vectors, and where appropriate expresses the behavior using
239 * pseudocode.
240 *
241 * <p>
242 * Many vector operations provide an additional {@link jdk.incubator.vector.Vector.Mask mask}-accepting
243 * variant.
244 * The mask controls which lanes are selected for application of the scalar
245 * operation. Masks are a key component for the support of control flow in
246 * vector computations.
247 * <p>
248 * For certain operation categories the mask-accepting variants can be specified
249 * in generic terms. If a lane of the mask is set then the scalar operation is
250 * applied to the corresponding lane elements; otherwise, if a lane of the mask is not
251 * set, a default scalar operation is applied and its result is placed into
252 * the vector result at the same lane. The default operation is specified for
253 * the following operation categories:
254 * <ul>
255 * <li>
256 * For a vector n-ary operation the default operation is a function that returns
257 * its first argument, specifically the lane element of the first input vector.
258 * <li>
259 * For an associative vector reduction operation the default operation is a
260 * function that returns the identity value.
261 * <li>
262 * For a vector binary test operation the default operation is a function that
263 * returns false.
264 *</ul>
265 * Otherwise, the mask-accepting variant of the operation explicitly specifies
266 * how it processes the lane elements of input vectors, and where appropriate
267 * expresses the behavior using pseudocode.
268 *
269 * <p>
270 * For convenience, many vector operations of arity greater than one provide
271 * an additional scalar-accepting variant (such as adding a constant scalar
272 * value to all lanes of a vector). This variant accepts compatible
273 * scalar values instead of vectors for the second and subsequent input vectors,
274 * if any.
275 * Unless otherwise specified the scalar variant behaves as if each scalar value
276 * is transformed to a vector using the appropriate vector {@code broadcast} operation, and
277 * then the vector-accepting variant of the operation is applied using the transformed
278 * values.
279 *
280 * <h2> Performance notes </h2>
281 * This package depends on the runtime's ability to dynamically compile vector operations
282 * into optimal vector hardware instructions. There is a default scalar implementation
283 * for each operation which is used if the operation cannot be compiled to vector instructions.
284 *
|
40 * parallelism.
41 *
42 * <p>The abstract class {@link jdk.incubator.vector.Vector} represents an ordered immutable sequence of
43 * values of the same element type 'e' that is one of the following primitive types -
44 * byte, short, int, long, float, or double. The type variable E corresponds to the
45 * boxed element type, specifically the class that wraps a value of e in an object
46 * (such as the Integer class, which wraps a value of int).
47 *
48 * <p>Vector declares a set of vector operations (methods) that are common to
49 * all element types (such as addition). Subclasses of Vector corresponding to
50 * a specific element type declare further operations that are specific to that element type
51 * (such as access to element values in lanes, logical operations on values of integral
52 * element types, or transcendental operations on values of floating-point element
53 * types). There are six abstract subclasses of {@link jdk.incubator.vector.Vector} corresponding to the supported set of
54 * element types: {@link jdk.incubator.vector.ByteVector}, {@link jdk.incubator.vector.ShortVector},
55 * {@link jdk.incubator.vector.IntVector}, {@link jdk.incubator.vector.LongVector},
56 * {@link jdk.incubator.vector.FloatVector}, and {@link jdk.incubator.vector.DoubleVector}.
57 *
58 * In addition to element type, vectors are parameterized by their <em>shape</em>,
59 * which is their size in bits. The supported shapes are
60 * represented by the enum {@link jdk.incubator.vector.VectorShape}.
61 * The combination of element type and shape determines a <em>vector species</em>,
62 * represented by {@link jdk.incubator.vector.VectorSpecies}. The various typed
63 * vector classes expose static constants corresponding to the supported species,
64 * and static methods on these types generally take a species as a parameter.
65 * For example,
66 * {@link jdk.incubator.vector.FloatVector#fromArray(VectorSpecies, float[], int) FloatVector.fromArray()}
67 * creates and returns a float vector of the specified species, with elements
68 * loaded from the specified float array.
69 *
70 * <p>
71 * The species instance for a specific combination of element type and shape
72 * can be obtained by reading the appropriate static field, as follows:
73 * <p>
74 * {@code VectorSpecies<Float> s = FloatVector.SPECIES_256;}
75 * <p>
76 *
77 * Code that is agnostic to species can request the "preferred" species for a
78 * given element type, where the optimal size is selected for the current platform:
79 * <p>
80 * {@code VectorSpecies<Float> s = FloatVector.SPECIES_PREFERRED;}
81 * <p>
82 *
83 * <p>
84 * Here is an example of multiplying elements of two float arrays {@code a} and {@code b} using vector computation
85 * and storing the result in array {@code c}.
86 * <pre>{@code
87 * static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_512;
88 *
89 * void vectorMultiply(float[] a, float[] b, float[] c) {
90 *     int i = 0;
91 *     // The array arguments are assumed to have the same length
92 *     for (; i < (a.length & ~(SPECIES.length() - 1));
93 *            i += SPECIES.length()) {
94 *         FloatVector va = FloatVector.fromArray(SPECIES, a, i);
95 *         FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
96 *         FloatVector vc = va.mul(vb);
97 *         vc.intoArray(c, i);
98 *     }
99 *
100 *     for (; i < a.length; i++) {
101 *         c[i] = a[i] * b[i];
102 *     }
103 * }
104 * }</pre>
105 *
106 * The scalar loop after the vector loop is required to process the tail
107 * elements, of which there are fewer than the species length.
108 *
109 * The example above uses vectors hardcoded to a concrete shape (512-bit). Instead, we could use the preferred
110 * species, as shown below, so that the code adapts dynamically to the optimal shape for the platform on which it runs.
111 *
112 * <pre>{@code
113 * static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
114 * }</pre>
115 *
116 * <h2>Vector operations</h2>
117 * We use the term <em>lanes</em> when defining operations on vectors. The number of lanes
118 * in a vector is the number of scalar elements it holds. For example, a vector of
119 * type {@code Float} and shape {@code VectorShape.S_256_BIT} has eight lanes.
120 * Vector operations can be grouped into various categories and their behavior
121 * generally specified as follows:
122 * <ul>
123 * <li>
124 * A lane-wise unary operation operates on one input vector to produce a
125 * result vector.
126 * For each lane of the input vector the
127 * lane element is operated on using the specified scalar unary operation and
128 * the element result is placed into the vector result at the same lane.
129 * The following pseudocode expresses the behavior of this operation category,
130 * where {@code e} is the element type and {@code EVector} corresponds to the
131 * primitive Vector type:
132 *
133 * <pre>{@code
134 * EVector a = ...;
135 * e[] ar = new e[a.length()];
136 * for (int i = 0; i < a.length(); i++) {
137 *     ar[i] = scalar_unary_op(a.get(i));
138 * }
139 * EVector r = EVector.fromArray(a.species(), ar, 0);
140 * }</pre>
141 *
142 * Unless otherwise specified the input and result vectors will have the same
143 * element type and shape.
144 *
145 * <li>
146 * A lane-wise binary operation operates on two input
147 * vectors to produce a result vector.
148 * For each lane of the two input vectors,
149 * a and b say, the corresponding lane elements from a and b are operated on
150 * using the specified scalar binary operation and the element result is placed
151 * into the vector result at the same lane.
152 * The following pseudocode expresses the behavior of this operation category:
153 *
154 * <pre>{@code
155 * EVector a = ...;
156 * EVector b = ...;
157 * e[] ar = new e[a.length()];
158 * for (int i = 0; i < a.length(); i++) {
159 *     ar[i] = scalar_binary_op(a.get(i), b.get(i));
160 * }
161 * EVector r = EVector.fromArray(a.species(), ar, 0);
162 * }</pre>
163 *
164 * Unless otherwise specified the two input and result vectors will have the
165 * same element type and shape.
166 *
167 * <li>
168 * Generalizing from unary and binary operations, a lane-wise n-ary
169 * operation operates on n input vectors to produce a
170 * result vector.
171 * For each lane, the n lane elements, one from each input vector, are operated on
172 * using the specified n-ary scalar operation and the element result is placed
173 * into the vector result at the same lane.
174 *
175 * Unless otherwise specified the n input and result vectors will have the same
176 * element type and shape.
177 *
178 * <li>
179 * A vector reduction operation operates on all the lane
180 * elements of an input vector, and applies an accumulation function to all the
181 * lane elements to produce a scalar result.
182 * If the reduction operation is associative then the result may be accumulated
183 * by operating on the lane elements in any order using a specified associative
184 * scalar binary operation and identity value. Otherwise, the reduction
185 * operation specifies the behavior of the accumulation function.
186 * The following pseudocode expresses the behavior of this operation category
187 * if it is associative:
188 * <pre>{@code
189 * EVector a = ...;
190 * e r = <identity value>;
191 * for (int i = 0; i < a.length(); i++) {
192 *     r = assoc_scalar_binary_op(r, a.get(i));
193 * }
194 * }</pre>
195 *
196 * Unless otherwise specified the scalar result type and element type will be
197 * the same.
198 *
199 * <li>
200 * A lane-wise binary test operation operates on two input vectors to produce a
201 * result mask. For each lane of the two input vectors, a and b say,
202 * the corresponding lane elements from a and b are operated on using the
203 * specified scalar binary test operation and the boolean result is placed
204 * into the mask at the same lane.
205 * The following pseudocode expresses the behavior of this operation category:
206 * <pre>{@code
207 * EVector a = ...;
208 * EVector b = ...;
209 * boolean[] ar = new boolean[a.length()];
210 * for (int i = 0; i < a.length(); i++) {
211 *     ar[i] = scalar_binary_test_op(a.get(i), b.get(i));
212 * }
213 * VectorMask<E> r = VectorMask.fromArray(a.species(), ar, 0);
214 * }</pre>
215 *
216 * Unless otherwise specified the two input vectors and result mask will have
217 * the same element type and shape.
218 *
219 * <li>
220 * The prior categories of operation can be said to operate within the vector
221 * lanes, where lane access is uniformly applied to all vectors, specifically
222 * the scalar operation is applied to elements taken from input vectors at the
223 * same lane, and if appropriate applied to the result vector at the same lane.
224 * A further category of operation is a cross-lane vector operation where lane
225 * access is defined by the arguments to the operation. Cross-lane operations
226 * generally rearrange lane elements, for example by permutation (commonly
227 * controlled by a {@link jdk.incubator.vector.VectorShuffle}) or by blending (commonly controlled by a
228 * {@link jdk.incubator.vector.VectorMask}). Such an operation explicitly specifies how it rearranges lane
229 * elements.
230 * </ul>
231 *
232 * <p>
233 * If a vector operation does not belong to one of the above categories then
234 * the operation explicitly specifies how it processes the lane elements of
235 * input vectors, and where appropriate expresses the behavior using
236 * pseudocode.
237 *
238 * <p>
239 * Many vector operations provide an additional {@link jdk.incubator.vector.VectorMask mask}-accepting
240 * variant.
241 * The mask controls which lanes are selected for application of the scalar
242 * operation. Masks are a key component for the support of control flow in
243 * vector computations.
244 * <p>
245 * For certain operation categories the mask-accepting variants can be specified
246 * in generic terms. If a lane of the mask is set then the scalar operation is
247 * applied to the corresponding lane elements; otherwise, if a lane of the mask is not
248 * set, a default scalar operation is applied and its result is placed into
249 * the vector result at the same lane. The default operation is specified as follows:
250 * <ul>
251 * <li>
252 * For a lane-wise n-ary operation the default operation is a function that returns
253 * its first argument, specifically the lane element of the first input vector.
254 * <li>
255 * For an associative vector reduction operation the default operation is a
256 * function that returns the identity value.
257 * <li>
258 * For a lane-wise binary test operation the default operation is a function that
259 * returns false.
260 * </ul>
261 * Otherwise, the mask-accepting variant of the operation explicitly specifies
262 * how it processes the lane elements of input vectors, and where appropriate
263 * expresses the behavior using pseudocode.
264 *
265 * <p>
266 * For convenience, many vector operations of arity greater than one provide
267 * an additional scalar-accepting variant (such as adding a constant scalar
268 * value to all lanes of a vector). This variant accepts compatible
269 * scalar values instead of vectors for the second and subsequent input vectors,
270 * if any.
271 * Unless otherwise specified the scalar variant behaves as if each scalar value
272 * is transformed to a vector using the appropriate vector {@code broadcast} operation, and
273 * then the vector-accepting variant of the operation is applied using the transformed
274 * values.
275 *
276 * <h2> Performance notes </h2>
277 * This package depends on the runtime's ability to dynamically compile vector operations
278 * into optimal vector hardware instructions. There is a default scalar implementation
279 * for each operation which is used if the operation cannot be compiled to vector instructions.
280 *
|
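The default operations for mask-accepting variants and the loop bound used in `vectorMultiply` can be checked against a plain scalar model. The following is a minimal sketch, not the incubator API: `LaneModel` and its method names are hypothetical, an `int[]` stands in for a vector's lanes, a `boolean[]` stands in for a mask, and the lane count is assumed to be a power of two (which the `& ~(length - 1)` idiom requires).

```java
import java.util.Arrays;
import java.util.function.IntBinaryOperator;

// Hypothetical scalar model of the semantics described in the Javadoc;
// none of these names belong to jdk.incubator.vector.
final class LaneModel {

    // Upper bound of the vector loop: the largest multiple of laneCount
    // that is <= length. Assumes laneCount is a power of two.
    static int loopBound(int length, int laneCount) {
        return length & ~(laneCount - 1);
    }

    // Masked lane-wise binary operation: a set lane receives op(a[i], b[i]);
    // an unset lane receives the element of the first input, a[i].
    static int[] maskedBinary(int[] a, int[] b, boolean[] mask, IntBinaryOperator op) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = mask[i] ? op.applyAsInt(a[i], b[i]) : a[i];
        }
        return r;
    }

    // Masked associative reduction: an unset lane contributes the identity value.
    static int maskedReduce(int[] a, boolean[] mask, int identity, IntBinaryOperator op) {
        int r = identity;
        for (int i = 0; i < a.length; i++) {
            r = op.applyAsInt(r, mask[i] ? a[i] : identity);
        }
        return r;
    }

    // Masked lane-wise binary test (here: less-than): an unset lane yields false.
    static boolean[] maskedLessThan(int[] a, int[] b, boolean[] mask) {
        boolean[] r = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = mask[i] && a[i] < b[i];
        }
        return r;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        int[] b = {10, 20, 30, 40};
        boolean[] mask = {true, false, true, false};

        System.out.println(loopBound(19, 8));
        System.out.println(Arrays.toString(maskedBinary(a, b, mask, Integer::sum)));
        System.out.println(maskedReduce(a, mask, 0, Integer::sum));
        System.out.println(Arrays.toString(maskedLessThan(a, b, mask)));
    }
}
```

With a 19-element array and eight lanes, `loopBound` gives 16, so the vector loop covers two full iterations and the scalar tail loop handles the remaining three elements.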