BFloat16 floating-point (BF16) dot product (vector). This instruction delimits the source vectors into pairs of 16-bit BF16 elements. Within each pair, the elements in the first source vector are multiplied by the corresponding elements in the second source vector. The resulting single-precision products are then summed and added destructively to the single-precision element in the destination vector which aligns with the pair of BF16 values in the first source vector. The instruction does not update the FPSCR exception status.
It has encodings from the following instruction sets: A32 ( A1 ) and T32 ( T1 ) .
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | D | 0 | 0 | Vn | Vd | 1 | 1 | 0 | 1 | N | Q | M | 0 | Vm |
if !HaveAArch32BF16Ext() then UNDEFINED; if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; integer d = UInt(D:Vd); integer n = UInt(N:Vn); integer m = UInt(M:Vm); integer regs = if Q == '1' then 2 else 1;
15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | D | 0 | 0 | Vn | Vd | 1 | 1 | 0 | 1 | N | Q | M | 0 | Vm |
if InITBlock() then UNPREDICTABLE; if !HaveAArch32BF16Ext() then UNDEFINED; if Q == '1' && (Vd<0> == '1' || Vn<0> == '1' || Vm<0> == '1') then UNDEFINED; integer d = UInt(D:Vd); integer n = UInt(N:Vn); integer m = UInt(M:Vm); integer regs = if Q == '1' then 2 else 1;
<q> |
<Qd> |
Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2. |
<Qn> |
Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2. |
<Qm> |
Is the 128-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field as <Qm>*2. |
<Dd> |
Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field. |
<Dn> |
Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field. |
<Dm> |
Is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field. |
bits(64) operand1; bits(64) operand2; bits(64) result; CheckAdvSIMDEnabled(); for r = 0 to regs-1 operand1 = Din[n+r]; operand2 = Din[m+r]; result = Din[d+r]; for e = 0 to 1 bits(16) elt1_a = Elem[operand1, 2 * e + 0, 16]; bits(16) elt1_b = Elem[operand1, 2 * e + 1, 16]; bits(16) elt2_a = Elem[operand2, 2 * e + 0, 16]; bits(16) elt2_b = Elem[operand2, 2 * e + 1, 16]; bits(32) sum = FPAdd_BF16(BFMulH(elt1_a, elt2_a), BFMulH(elt1_b, elt2_b)); Elem[result, e, 32] = FPAdd_BF16(Elem[result, e, 32], sum); D[d+r] = result;
Internal version only: isa v01_31, pseudocode v2023-06_rel, sve v2023-06_rel ; Build timestamp: 2023-07-04T18:06
Copyright © 2010-2023 Arm Limited or its affiliates. All rights reserved. This document is Non-Confidential.