Collection of kernel accelerators optimised for LLM execution

ECASLab/hls-fpga-accelerators

hls-fpga-accelerators

Collection of kernel accelerators optimised for LLM execution

Compilation

Matrix Multiplication

cd matmul
vitis_hls -f matmul.tcl

Possible adjustments through environment variables:

| Environment Variable | Possible Values | Default |
| --- | --- | --- |
| DATATYPE | FLOAT4, FLOAT8, FLOAT16, FLOAT32, FIXED8, FIXED16 | FIXED16 |
| BUS | 64, 128, 256, 512, 1024, 2048 | 512 |
| B_COLS | Powers of two, from 64 upward | 4096 |
| C_COLS | Powers of two, from 64 upward | 4096 |
| PART | xcu250-figd2104-2L-e, xck26-sfvc784-2LV-c | xcu250-figd2104-2L-e |

The xcu250-figd2104-2L-e part is an Alveo U250, whereas xck26-sfvc784-2LV-c is a Kria K26.

Function signature:

void matmul(RawDataT *a, RawDataT *b, RawDataT *c, int a_rows, int b_cols, int c_cols);
  • a: memory-mapped matrix A
  • b: memory-mapped matrix B (assumed transposed)
  • c: memory-mapped matrix C
  • a_rows: rows of matrix A
  • b_cols: columns of matrix A, i.e. the shared inner dimension (also the columns of B as stored transposed)
  • c_cols: columns of matrix C
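
To make the argument layout concrete, here is a minimal software reference model of the product. It is a sketch, not the HLS kernel: `matmul_ref` is a hypothetical name, `float` stands in for `RawDataT`, and row-major storage is assumed. The one detail taken from the list above is that B is read as already transposed, so both operands are traversed along rows of length `b_cols`.

```cpp
#include <cassert>
#include <cstddef>

// Reference model only (hypothetical helper, not part of the repository).
// Assumed layout, all row-major:
//   A: a_rows x b_cols
//   B: stored transposed, i.e. c_cols x b_cols
//   C: a_rows x c_cols
void matmul_ref(const float *a, const float *b, float *c,
                int a_rows, int b_cols, int c_cols) {
  assert(a_rows >= 0 && b_cols >= 0 && c_cols >= 0);
  const std::size_t rows = static_cast<std::size_t>(a_rows);
  const std::size_t inner = static_cast<std::size_t>(b_cols);
  const std::size_t cols = static_cast<std::size_t>(c_cols);
  for (std::size_t i = 0; i < rows; ++i) {
    for (std::size_t j = 0; j < cols; ++j) {
      float acc = 0.0f;
      // Row i of A dotted with row j of the transposed B.
      for (std::size_t k = 0; k < inner; ++k) {
        acc += a[i * inner + k] * b[j * inner + k];
      }
      c[i * cols + j] = acc;
    }
  }
}
```

A model like this can serve as the golden output when checking the accelerator result in a testbench.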

Matrix/Vector Elementwise

cd elementwise
vitis_hls -f elementwise.tcl

Possible adjustments through environment variables:

| Environment Variable | Possible Values | Default |
| --- | --- | --- |
| DATATYPE | FLOAT4, FLOAT8, FLOAT16, FLOAT32, FIXED8, FIXED16 | FIXED16 |
| BUS | 64, 128, 256, 512, 1024, 2048 | 512 |
| M_COLS | Powers of two, from 64 upward | 4096 |
| M_ROWS | Powers of two, from 64 upward | 4096 |
| PART | xcu250-figd2104-2L-e, xck26-sfvc784-2LV-c | xcu250-figd2104-2L-e |

The xcu250-figd2104-2L-e part is an Alveo U250, whereas xck26-sfvc784-2LV-c is a Kria K26.

Function signature:

void elementwise(RawDataT *in1, RawDataT *in2, RawDataT *out, uint64_t size,
                 int op);
  • in1: memory-mapped matrix A
  • in2: memory-mapped matrix B
  • out: memory-mapped matrix C
  • size: total number of elements of the matrix: cols * rows
  • op: operation selector (0: add, 1: multiply)
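
The semantics of `op` can be sketched with a small software reference model. This is an illustration rather than the kernel itself: `elementwise_ref` is a hypothetical name and `float` stands in for `RawDataT`.

```cpp
#include <cassert>
#include <cstdint>

// Reference model only (hypothetical helper, not part of the repository).
// op follows the list above: 0 adds, 1 multiplies, element by element.
void elementwise_ref(const float *in1, const float *in2, float *out,
                     std::uint64_t size, int op) {
  assert(op == 0 || op == 1);
  for (std::uint64_t i = 0; i < size; ++i) {
    out[i] = (op == 0) ? in1[i] + in2[i] : in1[i] * in2[i];
  }
}
```

Since `size` is `cols * rows`, the matrix is treated as one flat array and its shape does not affect the result.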

Unary

cd unary
vitis_hls -f unary.tcl

Possible adjustments through environment variables:

| Environment Variable | Possible Values | Default |
| --- | --- | --- |
| DATATYPE | FLOAT4, FLOAT8, FLOAT16, FLOAT32, FIXED8, FIXED16 | FIXED16 |
| BUS | 64, 128, 256, 512, 1024, 2048 | 512 |
| M_COLS | Powers of two, from 64 upward | 4096 |
| M_ROWS | Powers of two, from 64 upward | 4096 |
| PART | xcu250-figd2104-2L-e, xck26-sfvc784-2LV-c | xcu250-figd2104-2L-e |
| IMPLEXP | LUT, STD | LUT |

The xcu250-figd2104-2L-e part is an Alveo U250, whereas xck26-sfvc784-2LV-c is a Kria K26.

IMPLEXP selects the implementation of the exponential: STD uses the standard HLS library, whereas LUT uses an approximate LUT-based interpolation.
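
To illustrate the idea behind a LUT-based exponential, the sketch below tabulates exp(x) on a uniform grid and interpolates linearly between neighbouring entries. The range, the table size, and the names `exp_lut` and `make_exp_table` are all assumptions chosen for illustration; the repository's actual LUT layout may differ.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>

// Illustrative parameters only (assumptions, not taken from the repository).
constexpr float kXMin = -8.0f;
constexpr float kXMax = 8.0f;
constexpr std::size_t kSteps = 256;
constexpr float kStep = (kXMax - kXMin) / kSteps;  // grid spacing: 0.0625

// Precompute exp(x) at every grid point, including both ends of the range.
std::array<float, kSteps + 1> make_exp_table() {
  std::array<float, kSteps + 1> table{};
  for (std::size_t i = 0; i <= kSteps; ++i) {
    table[i] = std::exp(kXMin + static_cast<float>(i) * kStep);
  }
  return table;
}

// Approximate exp(x) by linear interpolation between the two nearest entries.
// Inputs outside [kXMin, kXMax] are clamped; NaN inputs are not handled.
float exp_lut(float x) {
  static const std::array<float, kSteps + 1> table = make_exp_table();
  x = std::min(std::max(x, kXMin), kXMax);
  const float pos = (x - kXMin) / kStep;
  std::size_t idx = static_cast<std::size_t>(pos);
  if (idx >= kSteps) idx = kSteps - 1;  // keep idx + 1 inside the table
  const float frac = pos - static_cast<float>(idx);
  return table[idx] + frac * (table[idx + 1] - table[idx]);
}
```

With this spacing, the relative error of the interpolation stays below roughly 0.05% inside the range. A finer grid trades memory for accuracy.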

Function signature:

void unary(RawDataT *in, RawDataT *out, uint64_t size, int op);
  • in: memory-mapped matrix A
  • out: memory-mapped matrix C
  • size: total number of elements of the matrix: cols * rows
  • op: 0: none, 1: ReLU, 2: SiLU
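
The three `op` codes can be sketched with a software reference model. This is an illustration under stated assumptions: `unary_ref` is a hypothetical name, `float` stands in for `RawDataT`, "none" is taken to mean a pass-through copy, and SiLU uses the usual definition x * sigmoid(x) with the standard-library exponential.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Reference model only (hypothetical helper, not part of the repository).
// op follows the list above: 0 copies the input (assumed meaning of "none"),
// 1 applies ReLU, 2 applies SiLU.
void unary_ref(const float *in, float *out, std::uint64_t size, int op) {
  assert(op >= 0 && op <= 2);
  for (std::uint64_t i = 0; i < size; ++i) {
    const float x = in[i];
    switch (op) {
      case 1:  // ReLU: max(x, 0)
        out[i] = x > 0.0f ? x : 0.0f;
        break;
      case 2:  // SiLU: x * sigmoid(x) = x / (1 + exp(-x))
        out[i] = x / (1.0f + std::exp(-x));
        break;
      default:  // pass-through
        out[i] = x;
        break;
    }
  }
}
```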

RMSNORM

cd rmsnorm
vitis_hls -f rmsnorm.tcl

Possible adjustments through environment variables:

| Environment Variable | Possible Values | Default |
| --- | --- | --- |
| DATATYPE | FLOAT4, FLOAT8, FLOAT16, FLOAT32, FIXED8, FIXED16 | FIXED16 |
| BUS | 64, 128, 256, 512, 1024, 2048 | 512 |
| M_COLS | Powers of two, from 64 upward | 4096 |
| M_ROWS | Powers of two, from 64 upward | 4096 |
| PART | xcu250-figd2104-2L-e, xck26-sfvc784-2LV-c | xcu250-figd2104-2L-e |

The xcu250-figd2104-2L-e part is an Alveo U250, whereas xck26-sfvc784-2LV-c is a Kria K26.

It is better to use FLOAT data types given the nature of the normalisation.

Function signature:

void rmsnorm(RawDataT *in, RawDataT *out, uint64_t size);
  • in: memory-mapped matrix A
  • out: memory-mapped matrix C
  • size: total number of elements of the matrix: cols * rows
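
For reference, RMS normalisation can be sketched in software as follows. Several points here are assumptions rather than facts about the kernel: the normalisation is taken to run independently over each row, the row length is passed as an explicit `cols` argument (the kernel presumably fixes it at build time through M_COLS), `eps` is a hypothetical stabiliser, no learned scale is applied, and `float` stands in for `RawDataT`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Reference model only (hypothetical helper, not part of the repository).
// Each row of `cols` elements is divided by its root mean square:
//   y[c] = x[c] / sqrt(mean(x^2) + eps)
void rmsnorm_ref(const float *in, float *out, std::uint64_t size,
                 std::uint64_t cols, float eps = 1e-6f) {
  assert(cols > 0 && size % cols == 0);
  for (std::uint64_t row = 0; row < size / cols; ++row) {
    const float *x = in + row * cols;
    float *y = out + row * cols;
    float sum_sq = 0.0f;
    for (std::uint64_t c = 0; c < cols; ++c) {
      sum_sq += x[c] * x[c];
    }
    const float inv_rms =
        1.0f / std::sqrt(sum_sq / static_cast<float>(cols) + eps);
    for (std::uint64_t c = 0; c < cols; ++c) {
      y[c] = x[c] * inv_rms;
    }
  }
}
```

The sum of squares and the reciprocal square root span a wide dynamic range, which is consistent with the advice above to prefer the FLOAT data types.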

Softmax

cd softmax
vitis_hls -f softmax.tcl

Possible adjustments through environment variables:

| Environment Variable | Possible Values | Default |
| --- | --- | --- |
| DATATYPE | FLOAT4, FLOAT8, FLOAT16, FLOAT32 | FLOAT16 |
| BUS | 64, 128, 256, 512, 1024, 2048 | 512 |
| M_COLS | Powers of two, from 64 upward | 4096 |
| M_ROWS | Powers of two, from 64 upward | 4096 |
| PART | xcu250-figd2104-2L-e, xck26-sfvc784-2LV-c | xcu250-figd2104-2L-e |

The xcu250-figd2104-2L-e part is an Alveo U250, whereas xck26-sfvc784-2LV-c is a Kria K26.

It is better to use FLOAT data types given the nature of the normalisation.

Function signature:

void softmax(RawDataT *in, RawDataT *out, uint64_t size);
  • in: memory-mapped matrix A
  • out: memory-mapped matrix C
  • size: total number of elements of the matrix: cols * rows
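
A software reference for softmax can be sketched as follows. As before, this rests on assumptions: the softmax is taken to run independently over each row, the row length is passed as an explicit `cols` argument (the kernel presumably fixes it through M_COLS), and `float` stands in for `RawDataT`. Subtracting the row maximum before exponentiating is the common numerically stable formulation; whether the kernel does the same is not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Reference model only (hypothetical helper, not part of the repository).
// Row-wise softmax: y[c] = exp(x[c] - max(x)) / sum_k exp(x[k] - max(x)).
void softmax_ref(const float *in, float *out, std::uint64_t size,
                 std::uint64_t cols) {
  assert(cols > 0 && size % cols == 0);
  for (std::uint64_t row = 0; row < size / cols; ++row) {
    const float *x = in + row * cols;
    float *y = out + row * cols;
    // The row maximum keeps every exponent at or below zero (no overflow).
    float max_v = x[0];
    for (std::uint64_t c = 1; c < cols; ++c) {
      max_v = std::max(max_v, x[c]);
    }
    float sum = 0.0f;
    for (std::uint64_t c = 0; c < cols; ++c) {
      y[c] = std::exp(x[c] - max_v);
      sum += y[c];
    }
    for (std::uint64_t c = 0; c < cols; ++c) {
      y[c] /= sum;
    }
  }
}
```

Each output row sums to one, which gives a cheap sanity check when comparing against the accelerator output.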

Authors
