SSM conv kernel — RVV vectorised

ggml_compute_forward_ssm_conv_f32_rvv · RISC-V Vector · LMUL=m4 · vl=4 · SpacemiT X60 VLEN=256

Legend: conv_x active column · weight active column · vsum lanes · output stored

conv_x (src0)

vlse32, stride = ncs × 4 bytes: gathers the same column from all vl rows

weights (src1)

vlse32, stride = nc × 4 bytes: gathers the same weight column from all vl rows

output (dst)

vse32: one instruction writes all vl lanes
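
The three panels above (strided load of conv_x, strided load of the weights, single store of the result) can be sketched in portable C by emulating each vector instruction lane by lane. This is an illustration only, not the actual RVV kernel: the names strided_load_f32, ssm_conv_rows_emulated and SKETCH_MAX_VL are hypothetical, the layout (row i of conv_x starts at offset i × ncs, row i of the weights at offset i × nc) is inferred from the strides quoted above, and the sketch covers a single token of a single sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on lanes for this sketch:
 * VLEN=256, SEW=32, LMUL=4 gives 256 / 32 * 4 = 32 lanes. */
#define SKETCH_MAX_VL 32

/* Portable stand-in for a vlse32 strided load: lane k receives
 * base[k * stride_elems], i.e. a byte stride of stride_elems * 4. */
static void strided_load_f32(float *lanes, const float *base,
                             size_t stride_elems, size_t vl) {
    for (size_t lane = 0; lane < vl; ++lane) {
        lanes[lane] = base[lane * stride_elems];
    }
}

/* Emulates the vectorised step for vl rows at once.
 * conv_x:  vl rows of ncs floats (assumed layout)
 * weights: vl rows of nc floats  (assumed layout)
 * dst:     vl outputs, one per row */
static void ssm_conv_rows_emulated(float *dst, const float *conv_x,
                                   const float *weights,
                                   size_t nc, size_t ncs, size_t vl) {
    float vsum[SKETCH_MAX_VL] = {0.0f};   /* the "vsum register" */
    float vx[SKETCH_MAX_VL];
    float vw[SKETCH_MAX_VL];

    assert(vl <= SKETCH_MAX_VL);

    for (size_t i0 = 0; i0 < nc; ++i0) {
        /* Gather column i0 of every row: stride ncs for conv_x, nc for weights. */
        strided_load_f32(vx, conv_x + i0, ncs, vl);
        strided_load_f32(vw, weights + i0, nc, vl);
        /* Per-lane multiply-accumulate, standing in for a vector FMA. */
        for (size_t lane = 0; lane < vl; ++lane) {
            vsum[lane] += vx[lane] * vw[lane];
        }
    }

    /* Unit-stride store of all lanes, standing in for vse32. */
    for (size_t lane = 0; lane < vl; ++lane) {
        dst[lane] = vsum[lane];
    }
}
```

Calling ssm_conv_rows_emulated(out, x, w, 3, 5, 2) processes two rows with a three-tap window; each column step touches one element per row, which is why both loads need a stride rather than a contiguous access.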

vsum register

LMUL=m4 · vl=4 lanes · f32

RVV instruction

—

Scalar equivalent

—
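
As a rough guide to what the scalar panel corresponds to, here is a minimal sketch of a row-by-row dot product over the convolution window. It is an assumption-laden illustration rather than the ggml source: the function name ssm_conv_rows_scalar is hypothetical, the row layout is inferred from the strides given for conv_x (ncs floats per row) and the weights (nc floats per row), and only one token of one sequence is handled.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference: one dot product of length nc per row.
 * conv_x:  nr rows of ncs floats (assumed layout)
 * weights: nr rows of nc floats  (assumed layout)
 * dst:     nr outputs, one per row */
static void ssm_conv_rows_scalar(float *dst, const float *conv_x,
                                 const float *weights,
                                 size_t nc, size_t ncs, size_t nr) {
    /* The window of nc taps must fit inside a row of ncs elements. */
    assert(nc <= ncs);

    for (size_t i1 = 0; i1 < nr; ++i1) {
        const float *s = conv_x + i1 * ncs;   /* row i1 of conv_x  */
        const float *c = weights + i1 * nc;   /* row i1 of weights */
        float sumf = 0.0f;
        for (size_t i0 = 0; i0 < nc; ++i0) {
            sumf += s[i0] * c[i0];
        }
        dst[i1] = sumf;
    }
}
```

The vectorised form swaps the loop order: instead of finishing one row before starting the next, it walks the nc columns once and updates vl row sums per step.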
