When testing the FindMatchLength function from the Snappy compression library on RISC-V, I observed that the first implementation (the one enabled for x86-64, PPC, and ARM on little-endian targets) significantly outperforms the fallback implementation used for all other architectures. The first implementation works on 64 bits at a time and uses specific optimizations, such as unaligned 64-bit loads and conditional moves, which appear to account for its better performance. The fallback, which is what RISC-V currently gets, relies on 32-bit operations and a simpler loop structure, and is less efficient as a result.
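To make the difference concrete, here is a minimal sketch of the two strategies. It is not Snappy's actual code: MatchLength64, MatchLength32, Load64 and Load32 are hypothetical names of mine, and the sketch assumes a little-endian target and a GCC/Clang-compatible compiler (for the `__builtin_ctz*` intrinsics). The only intended difference between the two functions is the word width per iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only, not taken from Snappy.
// memcpy is the portable way to express an unaligned load; whether it
// becomes a single load instruction depends on the compiler and target.
static inline uint64_t Load64(const char* p) {
  uint64_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

static inline uint32_t Load32(const char* p) {
  uint32_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// 64-bit strategy: compare 8 bytes per iteration.
// The caller must guarantee that s1 has at least (s2_limit - s2) readable bytes.
size_t MatchLength64(const char* s1, const char* s2, const char* s2_limit) {
  assert(s2 <= s2_limit);
  size_t matched = 0;
  while (s2_limit - s2 >= 8) {
    uint64_t x = Load64(s1 + matched) ^ Load64(s2);
    if (x != 0) {
      // Little-endian assumption: the lowest set bit of the XOR lies in
      // the first differing byte, so ctz / 8 is the count of equal bytes.
      return matched + (static_cast<size_t>(__builtin_ctzll(x)) >> 3);
    }
    s2 += 8;
    matched += 8;
  }
  // Fewer than 8 bytes left: finish byte by byte.
  while (s2 < s2_limit && s1[matched] == *s2) {
    ++s2;
    ++matched;
  }
  return matched;
}

// 32-bit strategy: same idea, but only 4 bytes per iteration.
size_t MatchLength32(const char* s1, const char* s2, const char* s2_limit) {
  assert(s2 <= s2_limit);
  size_t matched = 0;
  while (s2_limit - s2 >= 4) {
    uint32_t x = Load32(s1 + matched) ^ Load32(s2);
    if (x != 0) {
      return matched + (static_cast<size_t>(__builtin_ctz(x)) >> 3);
    }
    s2 += 4;
    matched += 4;
  }
  while (s2 < s2_limit && s1[matched] == *s2) {
    ++s2;
    ++matched;
  }
  return matched;
}
```

Both functions return the same result for the same input, so they can be timed against each other on the same data. One caveat when doing this on RISC-V: depending on the core and the compiler flags, the memcpy-based loads may be compiled to byte-wise accesses rather than a single unaligned load, which would affect the comparison.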