arm - Alignment requirements for uint8x16_t being loaded from byte array?
We have an assert firing under debug builds that checks alignment. The assert is for a byte array that's loaded into a uint8x16_t using vld1q_u8. While the assert fires, we have not observed a SIGBUS.
Here's its use in the code:

    const byte* input = ...;
    ...
    assert(isalignedon(input, getalignmentof(uint8x16_t)));
    uint64x2_t message = vreinterpretq_u64_u8(vld1q_u8(input));

I also tried the following, and the assert fires for the alignment of uint8_t* as well:

    assert(isalignedon(input, getalignmentof(uint8_t*)));
    uint64x2_t message = vreinterpretq_u64_u8(vld1q_u8(input));

What are the alignment requirements for a byte array when loading it into a uint8x16_t with vld1q_u8?
In the above code, input is a function parameter. isalignedon checks the alignment of its two arguments, ensuring the first is aligned to at least the second. getalignmentof is an abstraction that retrieves the alignment of a type or variable.
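For readers without the original helpers at hand, here is a minimal sketch of what such checks might look like. The names is_aligned_on and get_alignment_of are hypothetical stand-ins, not the code from the question, and the sketch assumes the requested alignment is a power of two.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the isalignedon helper described above.
// Returns true when ptr is aligned to at least `alignment` bytes.
// Assumption: `alignment` is a power of two.
inline bool is_aligned_on(const void* ptr, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Hypothetical stand-in for getalignmentof: the alignment of a type.
template <typename T>
constexpr std::size_t get_alignment_of() {
    return alignof(T);
}
```

With helpers like these, a check against the alignment of a 16-byte vector type or of a pointer type can fail for a byte array that is perfectly valid, because a byte array is only guaranteed to be 1-byte aligned.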
uint8x16_t and uint64x2_t are 128-bit ARM NEON vector datatypes that are expected to be placed in a Q register. vld1q_u8 is a NEON pseudo instruction that is expected to be compiled into a vld1.8 instruction. vreinterpretq_u64_u8 is a NEON pseudo instruction that eases use of the datatypes.
When writing direct assembler (either inline or in external files), you can choose whether you want to specify the alignment (e.g. vld1.8 {q0}, [r0, :64]) or leave it out (e.g. vld1.8 {q0}, [r0]). If it isn't specified, the instruction doesn't require any specific alignment at all, as dric512 says.
When using vld1q_u8 via intrinsics, you never specify the alignment, so as far as I know, the compiler doesn't assume it and produces the instruction without an alignment specification. I'm not sure whether any compiler can deduce cases where the alignment is guaranteed and use the alignment specifier in those cases. (GCC, Clang and MSVC all seem to produce vld1.8 without alignment specifiers in this particular case.)
Do note that this is only an issue on 32-bit ARM; in AArch64, there's no alignment specifier for the ld1 instruction. Even there, alignment still helps, and you'll get worse performance if you use unaligned addresses.
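To illustrate the point about intrinsics, here is a small sketch that loads and stores a 16-byte block through deliberately misaligned pointers. copy_block16 and the HAVE_NEON_SKETCH macro are hypothetical names introduced for this example; the memcpy branch is only there so the sketch also builds on targets without NEON.

```cpp
#include <cstdint>
#include <cstring>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON_SKETCH 1
#endif

// Hypothetical helper (not from the original code): copies one 16-byte block.
// Neither pointer is assumed to have more than byte alignment.
inline void copy_block16(const std::uint8_t* input, std::uint8_t* output) {
#if defined(HAVE_NEON_SKETCH)
    // vld1q_u8 and vst1q_u8 take plain uint8_t pointers; no alignment
    // hint is passed to the compiler here.
    uint8x16_t block = vld1q_u8(input);
    vst1q_u8(output, block);
#else
    // Portable fallback for non-NEON targets.
    std::memcpy(output, input, 16);
#endif
}
```

Under that reading, the assert in the question could be dropped or relaxed to byte alignment, keeping stricter alignment only as a performance consideration.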