[[Arm]]

-Armv8.2-A is supported starting with the GCC 7 series (excerpt from the GCC 7 release notes below)
 AArch64
 GCC has been updated to the latest revision of the procedure call standard (AAPCS64) to provide support for parameter passing when data types have been over-aligned.
 The ARMv8.3-A architecture is now supported. It can be used by specifying the -march=armv8.3-a option.
 The option -msign-return-address= is supported to enable return address protection using ARMv8.3-A Pointer Authentication Extensions. For more information on the arguments accepted by this option, please refer to AArch64-Options.
 The ARMv8.2-A architecture and the ARMv8.2-A 16-bit Floating-Point Extensions are now supported. They can be used by specifying the -march=armv8.2-a or -march=armv8.2-a+fp16 options. The 16-bit Floating-Point Extensions introduce new half-precision data processing floating-point instructions.



-FMA (fused multiply-add) for HGEMM (half-precision matrix multiplication)
 float16x8_t vfmaq_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c)
 {
   return __builtin_aarch64_fmav8hf (__b, __c, __a);
 }
-https://github.com/gcc-mirror/gcc/blob/87fb575328cc5d954b91672681aacfc383134b12/gcc/config/aarch64/arm_neon.h#L31225-L31230

-Compiler Explorer
-https://godbolt.org/z/obJxS9
 #include <arm_neon.h>
 #include <iostream>
 
 int main(int argc, char** argv)
 {
     float16_t value[] = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f, 10.0f, 20.0f, 30.0f, 40.0f, 50.0f, 60.0f, 70.0f, 80.0f, 100.0f, 100.0f, 100.0f, 100.0f, 100.0f, 100.0f, 100.0f, 100.0f, };
     float16_t resultWrite[8];
     float16x8_t a = vld1q_f16(value + 0);
     float16x8_t b = vld1q_f16(value + 8);
     float16x8_t c = vld1q_f16(value + 16);
     // Each lane computes a[i] + b[i] * c[i].
     float16x8_t result = vfmaq_f16(a, b, c);
     vst1q_f16(resultWrite, result);
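     // Print each lane; streaming the array itself would print only its address.
     for (int i = 0; i < 8; ++i)
         std::cout << static_cast<float>(resultWrite[i]) << ' ';
     std::cout << std::endl;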
     return 0;
 }


