Thanks for your great work!
I want to measure the decoding latency and peak memory usage of these methods. I tried implementing this myself, but my version is slower than full KV (maybe I made some mistakes). Could you please provide the corresponding script as a reference?