Experiments comparing a range of RISC-V vectorised ops against the scalar op in llama.cpp.
Posts for: #llama.cpp
Optimising LLM Inference on RISC-V with Custom GEMM Kernels
Extending the llama.cpp RISC-V GEMM tile from 4×8 to 4×16: register pressure analysis, LLVM IR metrics, and cycle-accurate gem5 simulation showing a 31% improvement in cycles per output value.
RISC-V LLM Code Reference
A concise reference for RISC-V vector intrinsics, QEMU emulation, gem5 simulation, and llama.cpp kernel patterns used in LLM inference optimisation.