87 * static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_512;
88 *
89 * void vectorMultiply(float[] a, float[] b, float[] c) {
90 * int i = 0;
91 * // It is assumed array arguments are of the same size
92 * for (; i < (a.length & ~(SPECIES.length() - 1));
93 * i += SPECIES.length()) {
94 * FloatVector va = FloatVector.fromArray(SPECIES, a, i);
95 * FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
96 * FloatVector vc = va.mul(vb);
97 * vc.intoArray(c, i);
98 * }
99 *
100 * for (; i < a.length; i++) {
101 * c[i] = a[i] * b[i];
102 * }
103 * }
104 * }</pre>
105 *
106 * The scalar computation after the vector computation is required to process the tail of
107 * elements, the length of which is smaller than the species length.
108 *
109 * The example above uses vectors hardcoded to a concrete shape (512-bit). Instead, we could use the preferred
110 * species, as shown below, so that the code adapts dynamically to the optimal shape for the platform on which it runs.
111 *
112 * <pre>{@code
113 * static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
114 * }</pre>
115 *
116 * <h2>Vector operations</h2>
117 * We use the term <em>lanes</em> when defining operations on vectors. The number of lanes
118 * in a vector is the number of scalar elements it holds. For example, a vector of
119 * element type {@code Float} and shape {@code VectorShape.S_256_BIT} has eight lanes.
120 * Vector operations can be grouped into various categories and their behavior
121 * generally specified as follows:
122 * <ul>
123 * <li>
124 * A lane-wise unary operation operates on one input vector to produce a
125 * result vector.
126 * For each lane of the input vector, the
127 * lane element is operated on using the specified scalar unary operation and
128 * the element result is placed into the vector result at the same lane.
129 * The following pseudocode expresses the behavior of this operation category,
130 * where {@code e} is the element type and {@code EVector} corresponds to the
131 * primitive Vector type:
132 *
133 * <pre>{@code
134 * EVector a = ...;
135 * e[] ar = new e[a.length()];
136 * for (int i = 0; i < a.length(); i++) {
137 * ar[i] = scalar_unary_op(a.get(i));
138 * }
139 * EVector r = EVector.fromArray(a.species(), ar, 0);
140 * }</pre>
141 *
142 * Unless otherwise specified the input and result vectors will have the same
143 * element type and shape.
144 *
145 * <li>
146 * A lane-wise binary operation operates on two input
147 * vectors to produce a result vector.
148 * For each lane of the two input vectors,
149 * a and b say, the corresponding lane elements from a and b are operated on
150 * using the specified scalar binary operation and the element result is placed
151 * into the vector result at the same lane.
152 * The following pseudocode expresses the behavior of this operation category:
153 *
154 * <pre>{@code
155 * EVector a = ...;
156 * EVector b = ...;
157 * e[] ar = new e[a.length()];
158 * for (int i = 0; i < a.length(); i++) {
159 * ar[i] = scalar_binary_op(a.get(i), b.get(i));
160 * }
161 * EVector r = EVector.fromArray(a.species(), ar, 0);
162 * }</pre>
163 *
164 * Unless otherwise specified the two input and result vectors will have the
165 * same element type and shape.
166 *
167 * <li>
168 * Generalizing from unary and binary operations, a lane-wise n-ary
169 * operation operates on n input vectors to produce a
170 * result vector.
171 * For each lane, the n lane elements, one from each input vector, are operated on
172 * using the specified n-ary scalar operation and the element result is placed
173 * into the vector result at the same lane.
174 *
175 * Unless otherwise specified the n input and result vectors will have the same
176 * element type and shape.
177 *
178 * <li>
179 * A vector reduction operation operates on all the lane
180 * elements of an input vector, and applies an accumulation function to all the
181 * lane elements to produce a scalar result.
182 * If the reduction operation is associative then the result may be accumulated
183 * by operating on the lane elements in any order using a specified associative
184 * scalar binary operation and identity value. Otherwise, the reduction
185 * operation specifies the behavior of the accumulation function.
186 * The following pseudocode expresses the behavior of this operation category
187 * if it is associative:
188 * <pre>{@code
189 * EVector a = ...;
190 * e r = <identity value>;
191 * for (int i = 0; i < a.length(); i++) {
192 * r = assoc_scalar_binary_op(r, a.get(i));
193 * }
194 * }</pre>
195 *
196 * Unless otherwise specified the scalar result type and element type will be
197 * the same.
198 *
199 * <li>
200 * A lane-wise binary test operation operates on two input vectors to produce a
201 * result mask. For each lane of the two input vectors, a and b say,
202 * the corresponding lane elements from a and b are operated on using the
203 * specified scalar binary test operation and the boolean result is placed
204 * into the mask at the same lane.
205 * The following pseudocode expresses the behavior of this operation category:
206 * <pre>{@code
207 * EVector a = ...;
208 * EVector b = ...;
209 * boolean[] ar = new boolean[a.length()];
210 * for (int i = 0; i < a.length(); i++) {
211 * ar[i] = scalar_binary_test_op(a.get(i), b.get(i));
212 * }
213 * VectorMask<E> r = VectorMask.fromArray(a.species(), ar, 0);
214 * }</pre>
215 *
216 * Unless otherwise specified the two input vectors and result mask will have
217 * the same element type and shape.
218 *
219 * <li>
220 * The prior categories of operation can be said to operate within the vector
221 * lanes, where lane access is uniformly applied to all vectors: specifically,
222 * the scalar operation is applied to elements taken from input vectors at the
223 * same lane and, if appropriate, the result is placed into the result vector at the same lane.
224 * A further category of operation is a cross-lane vector operation where lane
225 * access is defined by the arguments to the operation. Cross-lane operations
226 * generally rearrange lane elements, for example by permutation (commonly
227 * controlled by a {@link jdk.incubator.vector.VectorShuffle}) or by blending (commonly controlled by a
228 * {@link jdk.incubator.vector.VectorMask}). Such an operation explicitly specifies how it rearranges lane
229 * elements.
230 * </ul>
231 *
|
87 * static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_512;
88 *
89 * void vectorMultiply(float[] a, float[] b, float[] c) {
90 * int i = 0;
91 * // It is assumed array arguments are of the same size
92 * for (; i < (a.length & ~(SPECIES.length() - 1));
93 * i += SPECIES.length()) {
94 * FloatVector va = FloatVector.fromArray(SPECIES, a, i);
95 * FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
96 * FloatVector vc = va.mul(vb);
97 * vc.intoArray(c, i);
98 * }
99 *
100 * for (; i < a.length; i++) {
101 * c[i] = a[i] * b[i];
102 * }
103 * }
104 * }</pre>
105 *
106 * The scalar computation after the vector computation is required to process the tail of
107 * elements, the length of which is smaller than the species length. {@code VectorSpecies} also defines a
108 * {@link jdk.incubator.vector.VectorSpecies#loopBound(int) loopBound()} helper method which can be used in place of
109 * {@code (a.length & ~(SPECIES.length() - 1))} in the above code to determine the terminating condition.
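The loop bound used in the example can be checked with a little arithmetic. The following is a minimal plain-Java sketch, not part of the Vector API: `maskBound`, `floorBound`, and `multiply` are hypothetical helpers, and the `lanes` parameter stands in for `SPECIES.length()` (16 for 512-bit float vectors). It assumes the lane count is a power of two, which is what makes the mask form equal to rounding down to a multiple of the lane count.

```java
import java.util.Arrays;

// Plain-Java sketch; it does not depend on jdk.incubator.vector.
// `lanes` is an assumed stand-in for SPECIES.length().
final class LoopBoundSketch {

    // Mask form from the example: valid only when lanes is a power of two.
    static int maskBound(int length, int lanes) {
        return length & ~(lanes - 1);
    }

    // Equivalent form: largest multiple of lanes that is <= length.
    static int floorBound(int length, int lanes) {
        return length - (length % lanes);
    }

    // Same loop structure as vectorMultiply. The inner scalar loop stands in
    // for one vector multiply over `lanes` consecutive elements.
    static float[] multiply(float[] a, float[] b, int lanes) {
        float[] c = new float[a.length];
        int i = 0;
        int bound = maskBound(a.length, lanes);
        for (; i < bound; i += lanes) {
            for (int lane = 0; lane < lanes; lane++) {
                c[i + lane] = a[i + lane] * b[i + lane];
            }
        }
        // Scalar tail: fewer than `lanes` elements remain.
        for (; i < a.length; i++) {
            c[i] = a[i] * b[i];
        }
        return c;
    }

    public static void main(String[] args) {
        System.out.println(maskBound(19, 16));   // prints 16
        System.out.println(floorBound(19, 16));  // prints 16
        float[] a = {1, 2, 3, 4, 5};
        float[] b = {2, 2, 2, 2, 2};
        // One chunk of 4 lanes, then a tail of 1 element.
        System.out.println(Arrays.toString(multiply(a, b, 4)));
        // prints [2.0, 4.0, 6.0, 8.0, 10.0]
    }
}
```

As documented, `loopBound(int)` returns the largest multiple of the species length that is less than or equal to its argument, so under the power-of-two assumption both helper forms produce the same bound.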
110 *
111 * The example above uses vectors hardcoded to a concrete shape (512-bit). Instead, we could use the preferred
112 * species, as shown below, so that the code adapts dynamically to the optimal shape for the platform on which it runs.
113 *
114 * <pre>{@code
115 * static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;
116 * }</pre>
117 *
118 * <h2>Vector operations</h2>
119 * We use the term <em>lanes</em> when defining operations on vectors. The number of lanes
120 * in a vector is the number of scalar elements it holds. For example, a vector of
121 * element type {@code Float} and shape {@code VectorShape.S_256_BIT} has eight lanes.
122 * Vector operations can be grouped into various categories and their behavior
123 * generally specified as follows:
124 * <ul>
125 * <li>
126 * A lane-wise unary operation operates on one input vector to produce a
127 * result vector.
128 * For each lane of the input vector, the
129 * lane element is operated on using the specified scalar unary operation and
130 * the element result is placed into the vector result at the same lane.
131 * The following pseudocode expresses the behavior of this operation category,
132 * where {@code e} is the element type and {@code EVector} corresponds to the
133 * primitive Vector type:
134 *
135 * <pre>{@code
136 * EVector a = ...;
137 * e[] ar = new e[a.length()];
138 * for (int i = 0; i < a.length(); i++) {
139 * ar[i] = scalar_unary_op(a.lane(i));
140 * }
141 * EVector r = EVector.fromArray(a.species(), ar, 0);
142 * }</pre>
143 *
144 * Unless otherwise specified the input and result vectors will have the same
145 * element type and shape.
146 *
147 * <li>
148 * A lane-wise binary operation operates on two input
149 * vectors to produce a result vector.
150 * For each lane of the two input vectors,
151 * a and b say, the corresponding lane elements from a and b are operated on
152 * using the specified scalar binary operation and the element result is placed
153 * into the vector result at the same lane.
154 * The following pseudocode expresses the behavior of this operation category:
155 *
156 * <pre>{@code
157 * EVector a = ...;
158 * EVector b = ...;
159 * e[] ar = new e[a.length()];
160 * for (int i = 0; i < a.length(); i++) {
161 * ar[i] = scalar_binary_op(a.lane(i), b.lane(i));
162 * }
163 * EVector r = EVector.fromArray(a.species(), ar, 0);
164 * }</pre>
165 *
166 * Unless otherwise specified the two input and result vectors will have the
167 * same element type and shape.
168 *
169 * <li>
170 * Generalizing from unary and binary operations, a lane-wise n-ary
171 * operation operates on n input vectors to produce a
172 * result vector.
173 * For each lane, the n lane elements, one from each input vector, are operated on
174 * using the specified n-ary scalar operation and the element result is placed
175 * into the vector result at the same lane.
176 *
177 * Unless otherwise specified the n input and result vectors will have the same
178 * element type and shape.
179 *
180 * <li>
181 * A vector reduction operation operates on all the lane
182 * elements of an input vector, and applies an accumulation function to all the
183 * lane elements to produce a scalar result.
184 * If the reduction operation is associative then the result may be accumulated
185 * by operating on the lane elements in any order using a specified associative
186 * scalar binary operation and identity value. Otherwise, the reduction
187 * operation specifies the behavior of the accumulation function.
188 * The following pseudocode expresses the behavior of this operation category
189 * if it is associative:
190 * <pre>{@code
191 * EVector a = ...;
192 * e r = <identity value>;
193 * for (int i = 0; i < a.length(); i++) {
194 * r = assoc_scalar_binary_op(r, a.lane(i));
195 * }
196 * }</pre>
197 *
198 * Unless otherwise specified the scalar result type and element type will be
199 * the same.
200 *
201 * <li>
202 * A lane-wise binary test operation operates on two input vectors to produce a
203 * result mask. For each lane of the two input vectors, a and b say,
204 * the corresponding lane elements from a and b are operated on using the
205 * specified scalar binary test operation and the boolean result is placed
206 * into the mask at the same lane.
207 * The following pseudocode expresses the behavior of this operation category:
208 * <pre>{@code
209 * EVector a = ...;
210 * EVector b = ...;
211 * boolean[] ar = new boolean[a.length()];
212 * for (int i = 0; i < a.length(); i++) {
213 * ar[i] = scalar_binary_test_op(a.lane(i), b.lane(i));
214 * }
215 * VectorMask<E> r = VectorMask.fromArray(a.species(), ar, 0);
216 * }</pre>
217 *
218 * Unless otherwise specified the two input vectors and result mask will have
219 * the same element type and shape.
220 *
221 * <li>
222 * The prior categories of operation can be said to operate within the vector
223 * lanes, where lane access is uniformly applied to all vectors: specifically,
224 * the scalar operation is applied to elements taken from input vectors at the
225 * same lane and, if appropriate, the result is placed into the result vector at the same lane.
226 * A further category of operation is a cross-lane vector operation where lane
227 * access is defined by the arguments to the operation. Cross-lane operations
228 * generally rearrange lane elements, for example by permutation (commonly
229 * controlled by a {@link jdk.incubator.vector.VectorShuffle}) or by blending (commonly controlled by a
230 * {@link jdk.incubator.vector.VectorMask}). Such an operation explicitly specifies how it rearranges lane
231 * elements.
232 * </ul>
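The lane-wise, reduction, and test categories in the list above can be made concrete with a scalar reference model. This is a sketch under stated assumptions rather than the API itself: plain `int[]` arrays stand in for vectors, `boolean[]` stands in for a mask, and `LaneModel`, its methods, and the `IntBinaryTest` interface are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.function.IntBinaryOperator;
import java.util.function.IntUnaryOperator;

// Scalar model of the operation categories; arrays stand in for vectors.
final class LaneModel {

    // Hypothetical functional interface for a scalar binary test.
    interface IntBinaryTest {
        boolean test(int a, int b);
    }

    // Lane-wise unary: result lane i depends only on input lane i.
    static int[] unary(int[] a, IntUnaryOperator op) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = op.applyAsInt(a[i]);
        }
        return r;
    }

    // Lane-wise binary: result lane i depends on lane i of both inputs.
    static int[] binary(int[] a, int[] b, IntBinaryOperator op) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = op.applyAsInt(a[i], b[i]);
        }
        return r;
    }

    // Associative reduction: fold all lanes into one scalar from an identity.
    static int reduce(int[] a, int identity, IntBinaryOperator op) {
        int r = identity;
        for (int i = 0; i < a.length; i++) {
            r = op.applyAsInt(r, a[i]);
        }
        return r;
    }

    // Lane-wise binary test: produces one boolean per lane (a mask).
    static boolean[] test(int[] a, int[] b, IntBinaryTest op) {
        boolean[] r = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = op.test(a[i], b[i]);
        }
        return r;
    }

    public static void main(String[] args) {
        int[] a = {1, -2, 3, -4};
        int[] b = {4, 3, 2, 1};
        System.out.println(Arrays.toString(unary(a, Math::abs)));        // [1, 2, 3, 4]
        System.out.println(Arrays.toString(binary(a, b, Integer::sum))); // [5, 1, 5, -3]
        System.out.println(reduce(a, 0, Integer::sum));                  // -2
        System.out.println(Arrays.toString(test(a, b, (x, y) -> x < y)));
        // [true, true, false, true]
    }
}
```

The reduction is modeled as a left-to-right fold, which is one valid order for an associative operation; an implementation is free to combine lanes in a different order.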
233 *
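Cross-lane operations can be modeled the same way. The sketch below again uses plain arrays as stand-ins: an `int[]` of source lane indexes plays the role of a shuffle and a `boolean[]` plays the role of a mask. `CrossLaneModel` and its methods are hypothetical names, and the blend convention (a set mask lane selects the second input) is an assumption modeled on the usual reading of a masked blend.

```java
import java.util.Arrays;

// Scalar model of cross-lane operations; arrays stand in for vectors.
final class CrossLaneModel {

    // Permutation: result lane i takes its value from input lane sourceLanes[i].
    static int[] rearrange(int[] a, int[] sourceLanes) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = a[sourceLanes[i]];
        }
        return r;
    }

    // Blend: where the mask is set take the lane from b, otherwise keep a.
    static int[] blend(int[] a, int[] b, boolean[] mask) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = mask[i] ? b[i] : a[i];
        }
        return r;
    }

    public static void main(String[] args) {
        int[] a = {10, 20, 30, 40};
        int[] b = {1, 2, 3, 4};
        // Reverse the lanes of a.
        System.out.println(Arrays.toString(rearrange(a, new int[]{3, 2, 1, 0})));
        // [40, 30, 20, 10]
        // Take even lanes from b and odd lanes from a.
        System.out.println(Arrays.toString(blend(a, b, new boolean[]{true, false, true, false})));
        // [1, 20, 3, 40]
    }
}
```

Unlike the lane-wise categories, the result lane here is chosen by an argument (the index array or the mask) rather than by position alone.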
|