SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Updated Jul 3, 2024 - Python
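The low-bit quantization these toolkits implement reduces weights from float32 to formats like INT8. As a minimal illustration of the core idea (symmetric per-tensor INT8 quantization, not any specific library's API — the function names here are hypothetical):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor INT8 quantization: scale = max|w| / 127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map INT8 codes back to approximate float32 values."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.01, 1.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)  # close to w, stored in 8 bits per weight
```

Production toolkits add per-channel scales, calibration, and quantization-aware training on top of this basic scheme.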
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Neural Network Compression Framework for enhanced OpenVINO™ inference
LLMC is an elegant tool for LLM compression.
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
Code to reproduce the experiments of the ICLR24-paper: "Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging"
[ECCV 2024] 3D Small Object Detection with Dynamic Spatial Pruning
Config driven, easy backup cli for restic.
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support LLaMA, Llama-2, BLOOM, Vicuna, Baichuan, etc.
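Structural pruning, as in LLM-Pruner and Torch-Pruning above, removes whole units (channels, heads, layers) rather than scattered individual weights, so the pruned model is genuinely smaller and faster without sparse kernels. A minimal sketch of the idea, keeping the output channels with the largest L2 norm (illustrative only, not the paper's importance criterion):

```python
import numpy as np

def prune_channels(w, n_keep):
    """Keep the n_keep output channels (rows) with largest L2 norm; drop the rest."""
    norms = np.linalg.norm(w, axis=1)
    keep = np.sort(np.argsort(norms)[::-1][:n_keep])  # strongest rows, original order
    return w[keep], keep

w = np.array([[1.0, 0.0], [0.1, 0.1], [3.0, 4.0]], dtype=np.float32)
pruned, kept = prune_channels(w, 2)  # drops the weakest row (index 1)
```

Real structural pruners must also propagate the removed channels through dependent layers (the following layer's input dimension shrinks to match), which is the hard part these libraries automate.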
[CVPR'24] Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression
Sparsity-aware deep learning inference runtime for CPUs
Efficient computing methods developed by Huawei Noah's Ark Lab
Your local Flux surgeon
FasterAI: Prune and Distill your models with FastAI and PyTorch
Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"
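Unstructured sparsity methods like OWL build on simple magnitude pruning: zero out the smallest-magnitude fraction of weights. A minimal NumPy sketch of that baseline (illustrative; OWL's contribution is choosing a *different* sparsity ratio per layer based on outliers, which this sketch does not do):

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude `sparsity` fraction of weights."""
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    thresh = np.sort(np.abs(w).ravel())[k - 1]  # k-th smallest magnitude
    mask = np.abs(w) > thresh
    return w * mask

w = np.array([[0.1, -2.0], [0.05, 1.5]], dtype=np.float32)
pruned = magnitude_prune(w, 0.5)  # half the entries become zero
```

The zeroed weights only translate into speedups on sparsity-aware runtimes (such as the CPU inference runtime listed above) or hardware with sparse-kernel support.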
Chess engine
Prune is a simple tool that lets you remove archives in a folder, deleting any archives not matching the specified retention options.
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.