https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111933
rsaxvc at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsaxvc at gmail dot com

--- Comment #3 from rsaxvc at gmail dot com ---
(In reply to Davide Bettio from comment #2)
> ...I was writing a function for reading uint32_t and uint64_t values at any
> address...

I believe memcpy() is the right approach, as dereferencing a misaligned pointer
is undefined behaviour.

My suspicion is that treating unaligned access as unsafe is intentional for
ESP32, because some of the internal memories, such as IRAM, require strict
alignment, though most do not. Quoting from
https://blog.espressif.com/esp32-programmers-memory-model-259444d89387 :

"...IRAM has access limitations in terms of alignment of address and size. If
an unaligned access is made, it results into an exception. The ESP-IDF, after
release 4.2, handles these exceptions transparently to provide load/store as
desired by the caller. As these unaligned accesses result in exception, the
access is slower than the DRAM access. Typically each exception handling
requires approximately 167 CPU cycles (i.e. 0.7 usec per access at 240 MHz or
1 usec per access at 160 MHz)."

It does look like the equivalent 16-bit unaligned load could be faster:

#include <stdint.h>
#include <string.h>

uint16_t from_unaligned_u16(void*p){
    uint16_t ret;
    memcpy(&ret,p,sizeof(ret));
    return ret;
}

readU16: //round-trips through the stack
    entry   sp, 48
    l8ui    a8, a2, 0
    l8ui    a2, a2, 1
    s8i     a8, sp, 0
    s8i     a2, sp, 1
    l16ui   a2, sp, 0
    retw.n

uint32_t from_unaligned_u16_seq(uint8_t *p){
    uint32_t p1 = p[1];
    uint32_t p0 = p[0];
    return p0 | p1 << 8;
}

readU16Seq: //works in registers
    entry   sp, 32
    l8ui    a8, a2, 1
    l8ui    a2, a2, 0
    slli    a8, a8, 8
    or      a2, a8, a2
    retw.n

But for the 32-bit version I couldn't get anything shorter than what GCC did.
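
For reference, the 32-bit analogues of the two 16-bit functions above might look like the sketch below. The names from_unaligned_u32 and from_unaligned_u32_seq are hypothetical and simply extend the naming used above. The byte-wise variant assumes little-endian byte order, as on ESP32, while the memcpy variant uses whatever byte order the target has. This only illustrates the two approaches in C; it makes no claim about the code GCC generates for either.

```c
#include <stdint.h>
#include <string.h>

/* memcpy-based read: well-defined for any address, target byte order.
 * Hypothetical name, mirroring from_unaligned_u16 above. */
uint32_t from_unaligned_u32(const void *p){
    uint32_t ret;
    memcpy(&ret, p, sizeof(ret));
    return ret;
}

/* Byte-wise read: assembles the value from four single-byte loads.
 * Assumes little-endian layout (lowest address holds the low byte).
 * Hypothetical name, mirroring from_unaligned_u16_seq above. */
uint32_t from_unaligned_u32_seq(const uint8_t *p){
    uint32_t p0 = p[0];
    uint32_t p1 = p[1];
    uint32_t p2 = p[2];
    uint32_t p3 = p[3];
    return p0 | (p1 << 8) | (p2 << 16) | (p3 << 24);
}
```

On a little-endian target both functions return the same value for the same address. Widening each byte to uint32_t before shifting keeps the shift by 24 from overflowing a signed int.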