Extending the llama.cpp RISC-V GEMM tile from 4×8 to 4×16: register-pressure analysis, LLVM IR metrics, and cycle-accurate gem5 simulation showing a 31% improvement in cycles per output value.