Quantization: https://github.com/eladhoffer/quantized.pytorch (implemented in PyTorch)
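
For orientation, the simplest flavor of what such repos implement is post-training dynamic quantization; a minimal sketch using PyTorch's built-in `torch.quantization` API (not the linked repo's own interface):

```python
import torch
import torch.nn as nn

# Toy float32 model; dynamic quantization targets Linear/LSTM-style layers.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Weights become int8; activations are quantized on the fly at inference.
qmodel = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 128)
print(qmodel(x).shape)  # same interface as the float model, smaller weights
```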

Do we really need model compression? http://mitchgordon.me/machine/learning/2020/01/13/do-we-really-need-model-compression.html

[ECCV 2018] PyTorch implementation of AMC: AutoML for Model Compression and Acceleration on Mobile Devices. https://github.com/mit-han-lab/amc

Collection of papers and notes on neural network quantization https://github.com/xu3kev/neural-networks-quantization-notes

Pytorch Implementation of Neural Architecture Optimization https://github.com/renqianluo/NAO_pytorch

XNNPACK: efficient floating-point neural network inference operators for mobile and browser https://github.com/google/XNNPACK

TensorRT deep learning optimization (TensorFlow integration) https://github.com/ardianumam/Tensorflow-TensorRT

Pruning: https://github.com/jacobgil/pytorch-pruning https://jacobgil.github.io/deeplearning/pruning-deep-learning (an algorithm from 2016)
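
As a concrete reference point for these pruning posts, magnitude pruning can be sketched with PyTorch's built-in `torch.nn.utils.prune` module (a later, official API, not the 2016 algorithm itself):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(16, 32, kernel_size=3)

# Zero out the 30% of weights with the smallest L1 magnitude.
prune.l1_unstructured(conv, name="weight", amount=0.3)
print(float((conv.weight == 0).float().mean()))  # ~0.3 sparsity

# Bake the mask into the weights and drop the reparametrization.
prune.remove(conv, "weight")
```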

https://github.com/Eric-mingjie/rethinking-network-pruning provides six network pruning methods in PyTorch; see also https://github.com/alexfjw/prunnable-layers-pytorch

Popular deep learning networks implemented with the TensorRT network-definition API https://github.com/wang-xinyu/tensorrtx

Extreme model compression with quantization noise training https://github.com/pytorch/fairseq/tree/master/examples/quant_noise

(Survey) New neural network architectures for mobile devices / edge computing https://machinethink.net/blog/mobile-architectures/

Pruning AI networks without impacting performance https://github.com/DNNToolBox/Net-Trim-v1

AutoKeras: Keras-based AutoML library https://github.com/jhfjhfj1/autokeras

Pruning & quantization: Distiller, Intel's open-source model compression toolkit for PyTorch, under active development. https://github.com/NervanaSystems/distiller Tutorial: https://github.com/NervanaSystems/distiller/wiki/Tutorial:-Using-Distiller-to-prune-a-PyTorch-language-model

Advisor: automated hyperparameter search, pluggable into many different frameworks https://github.com/tobegit3hub/advisor

Slimmable Neural Networks https://github.com/JiahuiYu/slimmable_networks https://github.com/JiahuiYu/slimmable_networks/tree/detection

Trained Rank Pruning for Efficient Deep Neural Networks https://github.com/yuhuixu1993/Trained-Rank-Pruning

Curated list of AutoML resources https://github.com/dragen1860/awesome-AutoML

Big list of papers on deep network model compression and acceleration https://github.com/sun254/awesome-model-compression-and-acceleration

Optuna hyperparameter optimization framework https://github.com/pfnet/optuna
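
Optuna's core loop is define-by-run; a minimal sketch, with a toy quadratic standing in for a real training run:

```python
import optuna

def objective(trial):
    # Sample hyperparameters; in practice these would configure a model.
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    layers = trial.suggest_int("layers", 1, 4)
    # Stand-in for the validation loss of an actual training run.
    return (lr - 0.01) ** 2 + 0.1 * layers

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```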

NNI: Microsoft's AutoML toolkit (automatic neural architecture search / hyperparameter optimization) https://github.com/Microsoft/nni https://github.com/Microsoft/nni/blob/master/docs/GetStarted.md

Talos: hyperparameter tuning for Keras models https://github.com/autonomio/talos

AMLA: AutoML framework for neural networks https://github.com/CiscoAI/amla

PyTorch implementation of NEAT (NeuroEvolution of Augmenting Topologies) https://github.com/uber-research/PyTorch-NEAT/

Milano: automated model hyperparameter search tool https://github.com/NVIDIA/Milano

A brief introduction to neural network quantization methods http://chenrudan.github.io/blog/2018/10/02/networkquantization.html
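
The common core of the methods covered there is uniform affine quantization: map a float tensor to integers through a scale and zero-point. A self-contained sketch:

```python
import numpy as np

def quantize(x, num_bits=8):
    # Affine (asymmetric) scheme: q = round(x / scale) + zero_point.
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = int(round(-x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return q.astype(np.uint8), scale, zero_point

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

x = np.random.randn(1000).astype(np.float32)
q, s, z = quantize(x)
print(np.abs(x - dequantize(q, s, z)).max())  # error is at most ~scale/2
```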

Rethinking the Value of Network Pruning https://github.com/Eric-mingjie/rethinking-network-pruning

Bender: online hyperparameter optimization platform https://github.com/Dreem-Organization/benderopt

Tensorflow Implementation of ChannelNets (NIPS18) https://arxiv.org/abs/1809.01330 https://github.com/HongyangGao/ChannelNets

Neural Architecture Optimization https://github.com/renqianluo/NAO

Introducing AdaNet: fast and flexible AutoML with learning guarantees https://github.com/tensorflow/adanet https://mp.weixin.qq.com/s?__biz=MzU1OTMyNDcxMQ==&mid=2247485095&idx=1&sn=990ca481921261e9ff5d850bd3753d6e&chksm=fc184defcb6fc4f986862449c630c80e9f285647a6bcac4a9e03c1108d11554a427d62da148e&scene=0&xtrack=1#rd

Sonnet: a TensorFlow-based library for building complex neural networks, created by Malcolm Reynolds at DeepMind. https://github.com/deepmind/sonnet

MindsDB: framework for simplified application of neural networks https://github.com/mindsdb/mindsdb

PocketFlow: Tencent's automated model compression (AutoMC) framework https://github.com/Tencent/PocketFlow

QNNPACK: quantized neural network operator library optimized for mobile https://github.com/pytorch/QNNPACK

NVIDIA TensorRT: high-performance C++ inference library for NVIDIA GPUs and deep learning accelerators https://github.com/NVIDIA/TensorRT

PeleeNet: An efficient DenseNet architecture for mobile devices https://github.com/Robert-JunWang/PeleeNet

List of recent advances in deep network compression/acceleration https://github.com/MingSun-Tse/EfficientDNNs

NeuroX: toolkit for finding and analyzing important neurons in neural networks https://github.com/fdalvi/NeuroX

Hyperas: a simple Keras + Hyperopt wrapper for convenient hyperparameter optimization http://maxpumperla.com/hyperas/

Auto-PyTorch: machine learning automation for PyTorch (automatic architecture search, hyperparameter optimization) https://github.com/automl/Auto-PyTorch

Tensorized Embedding Layers for Efficient Model Compression https://github.com/KhrulkovV/tt-pytorch

Model compression implemented in PyTorch https://github.com/666DZY666/model-compression

DeepSpeed: Microsoft's deep learning optimization library, making distributed training easier, more efficient, and more effective https://github.com/microsoft/DeepSpeed https://github.com/microsoft/DeepSpeedExamples
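
A rough sketch of how DeepSpeed wraps a PyTorch model (the config keys are abbreviated and the `config=` keyword has varied across versions; scripts are normally run through the `deepspeed` launcher on GPU):

```python
import torch
import torch.nn as nn
import deepspeed

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

ds_config = {
    "train_batch_size": 32,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
    "zero_optimization": {"stage": 1},  # ZeRO: partition optimizer state
}

# The returned engine handles the optimizer, gradient accumulation, ZeRO, etc.
engine, _, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

x = torch.randn(32, 784).to(engine.device)
y = torch.randint(0, 10, (32,)).to(engine.device)
loss = nn.functional.cross_entropy(engine(x), y)
engine.backward(loss)
engine.step()
```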

Adlik: end-to-end optimization framework for deep learning models. Its model compiler supports pruning, quantization, structural compression, and other optimization techniques, and applies readily to models built with TensorFlow, Keras, PyTorch, etc.; the serving platform provides optimized runtimes tailored to the deployment environment. https://github.com/Adlik/Adlik

Code for paper "Learning to Reweight Examples for Robust Deep Learning" https://github.com/uber-research/learning-to-reweight-examples

Code for "Discrimination-aware Channel Pruning for Deep Neural Networks" https://github.com/SCUT-AILab/DCP

ZeroQ: A Novel Zero Shot Quantization Framework https://github.com/amirgholami/ZeroQ

Graph Transforms to Quantize and Retrain Deep Neural Nets in TensorFlow. https://arxiv.org/abs/1903.08066 https://github.com/Xilinx/graffitist

A PyTorch implementation of "Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights" https://github.com/Mxbonn/INQ-pytorch

Training code for Quantization Networks, introduced in the CVPR 2019 paper of the same name. https://github.com/aliyun/alibabacloud-quantization-networks

Code released for "FNNP: Fast Neural Network Pruning Using Adaptive Batch Normalization" https://github.com/anonymous47823493/FNNP

Code for the NeurIPS'19 paper "Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks" https://github.com/youzhonghui/gate-decorator-pruning

List of papers/code on deep neural network compression https://github.com/csyhhu/Awesome-Deep-Neural-Network-Compression

Big list of model compression papers and resources https://github.com/ChanChiChoi/awesome-model-compression

Collection of resources on neural network compression and speed-up https://github.com/mrgloom/Network-Speed-and-Compression

Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization at CVPR'19 https://github.com/joe-siyuan-qiao/NeuralRejuvenation-CVPR19

AutoML for object detection / semantic segmentation https://github.com/NoamRosenberg/AutoML

A research guide to neural network pruning techniques https://pan.baidu.com/s/1onGzAUw4pKrySM1uS1HTeg

List of papers, tools, and learning materials on machine learning model compression https://github.com/cedrickchee/awesome-ml-model-compression

Pruning deep neural networks https://towardsdatascience.com/pruning-deep-neural-network-56cae1ec5505

Compressing neural networks for image classification and detection https://arxiv.org/abs/1907.05686 https://ai.facebook.com/blog/compressing-neural-networks-for-image-classification-and-detection/

Partial Channel Connections for Memory-Efficient Differentiable Architecture Search https://github.com/yuhuixu1993/PC-DARTS

Code for: "And the bit goes down: Revisiting the quantization of neural networks" https://github.com/facebookresearch/kill-the-bits

Multi-GPU DARTS implementation on a recent PyTorch version https://arxiv.org/abs/1806.09055 https://github.com/alphadl/darts.pytorch1.1 https://github.com/quark0/darts

Big list of neural architecture search resources https://github.com/D-X-Y/awesome-NAS

AIMET: model efficiency library providing advanced quantization and compression techniques for trained neural network models https://github.com/quic/aimet

RegNet neural architecture search implemented with PyTorch & AutoTorch https://github.com/zhanghang1989/RegNet-Search-PyTorch

Neural Network Pruning (blog post) https://nathanhubens.github.io/posts/deep%20learning/2020/05/22/pruning.html

hpman: a hyperparameter manager for deep learning https://github.com/megvii-research/hpman

Model Optimization 101 (slide deck) https://docs.google.com/presentation/d/1tCbwcls4c_Imx0tC3kOW3yINSIbcHuKgcHimCAT3B0Q/edit#slide=id.p

Neural network compression framework for PyTorch https://github.com/openvinotoolkit/nncf_pytorch

QTool: a neural network quantization toolbox https://github.com/blueardour/model-quantization

Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression. CVPR2020. https://github.com/ofsoundof/group_sparsity

Lossless CNN Channel Pruning via Gradient Resetting and Convolutional Re-parameterization https://github.com/DingXiaoH/ResRep

A whirlwind tour of AutoML (Zhihu) https://zhuanlan.zhihu.com/p/212512984

Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks https://github.com/uber-research/permute-quantize-finetune

Memory Optimization for Deep Networks https://github.com/utsaslab/MONeT

A new take on knowledge distillation inspired by the Ship of Theseus https://weibo.com/ttarticle/p/show?id=2309404569773894664467#_0

Intel® Low Precision Optimization Tool https://github.com/intel/lpot

micronet: library for deep network model compression and deployment https://github.com/666DZY666/micronet

Using ideas from product quantization for state-of-the-art neural network compression. https://github.com/uber-research/permute-quantize-finetune
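
The underlying idea: split each weight matrix into subvectors, k-means each block, and store only codebooks plus small integer indices. A rough NumPy sketch of plain product quantization (not the paper's permutation or fine-tuning steps):

```python
import numpy as np

def product_quantize(W, d_sub=4, k=256, iters=20):
    # Split columns into blocks of width d_sub; k-means each block's rows.
    n, d = W.shape
    assert d % d_sub == 0 and k <= n
    codes, books = [], []
    for j in range(0, d, d_sub):
        X = W[:, j:j + d_sub]
        C = X[np.random.choice(n, k, replace=False)]  # init centroids
        for _ in range(iters):
            a = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
            for c in range(k):
                if (a == c).any():
                    C[c] = X[a == c].mean(axis=0)
        codes.append(a)   # n indices per block (fits in uint8 when k <= 256)
        books.append(C)   # k x d_sub centroids per block
    return codes, books

def reconstruct(codes, books):
    return np.hstack([C[a] for a, C in zip(codes, books)])

W = np.random.randn(1024, 16).astype(np.float32)
codes, books = product_quantize(W)
print(np.abs(W - reconstruct(codes, books)).mean())  # reconstruction error
```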

A GPU algorithm for sparse matrix-matrix multiplication https://github.com/oresths/tSparse

Parallel Hyperparameter Optimization in Python https://github.com/ARM-software/mango

Source code for the paper "Robust Quantization: One Model to Rule Them All" https://github.com/moranshkolnik/RobustQuantization

Code and checkpoints of compressed networks for the paper titled "HYDRA: Pruning Adversarially Robust Neural Networks" (NeurIPS 2020) https://github.com/inspire-group/hydra

GPU implementation of XNOR networks at the inference level. https://github.com/metcan/Binary-Convolutional-Neural-Network-Inference-on-GPU
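
The core XNOR-Net trick is to approximate each weight filter as a sign pattern times a per-filter scale, so dot products reduce to XNOR + popcount on binary hardware. A float simulation of the weight binarization:

```python
import torch
import torch.nn.functional as F

def binarize_conv_weight(w):
    # w: (out_ch, in_ch, kH, kW); alpha = mean |w| per output filter,
    # giving the XNOR-Net approximation w ~ alpha * sign(w).
    alpha = w.abs().mean(dim=(1, 2, 3), keepdim=True)
    return alpha * torch.sign(w)

w = torch.randn(8, 3, 3, 3)
x = torch.randn(1, 3, 32, 32)
y_full = F.conv2d(x, w, padding=1)
y_bin = F.conv2d(x, binarize_conv_weight(w), padding=1)
print((y_full - y_bin).abs().mean())  # approximation error
```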

Towards Optimal Structured CNN Pruning via Generative Adversarial Learning (GAL) https://github.com/ShaohuiLin/GAL

TensorRT implementation of "RepVGG: Making VGG-style ConvNets Great Again" https://github.com/upczww/TensorRT-RepVGG

Pushing the Limit of Post-Training Quantization by Block Reconstruction https://github.com/yhhhli/BRECQ

Sparsify: easy-to-use AutoML interface for sparsity-driven neural network optimization https://github.com/neuralmagic/sparsify

DeepSparse Engine: CPU inference engine that delivers unprecedented performance on sparsified models https://github.com/neuralmagic/deepsparse

Resource collection on hardware-friendly algorithm design for neural network quantization and low-bit fixed-point training https://github.com/A-suozhang/awesome-quantization-and-fixed-point-training

Course materials on optimization for machine learning github.com/rishabhk108/AdvancedOptML

《 BNN - BN = ? Training Binary Neural Networks without Batch Normalization》(CVPRW 2021) github.com/VITA-Group/BNN_NoBN

《AngularGrad: A New Optimization Technique for Angular Convergence of Convolutional Neural Networks》(2021) github.com/mhaut/AngularGrad

Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better https://www.arxiv-vanity.com/papers/2106.08962

Forward: Tencent's inference acceleration framework, a library for high-performance deep learning inference on NVIDIA GPUs. github.com/Tencent/Forward

Triton: lets researchers without CUDA experience write efficient GPU code github.com/openai/triton
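
Triton's programming model in a nutshell, via the canonical vector-add kernel (adapted from the project's tutorial; needs a CUDA GPU):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)               # each program handles one block
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements               # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

n = 4096
x = torch.rand(n, device="cuda")
y = torch.rand(n, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(n, 1024),)
add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
assert torch.allclose(out, x + y)
```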

mmsegmentation-distiller: knowledge distillation toolbox built on mmsegmentation github.com/pppppM/mmsegmentation-distiller

TorchDistiller: open-source PyTorch code collection for knowledge distillation, focused on perception tasks including semantic segmentation, depth estimation, object detection, and instance segmentation github.com/irfanICMLL/TorchDistiller
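
The piece these distillation toolboxes share is the Hinton-style soft-target loss: KL divergence between temperature-softened teacher and student logits, blended with the usual hard-label loss. A minimal sketch:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft-target term, scaled by T^2 so gradient magnitudes stay
    # comparable across temperatures (Hinton et al., 2015).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student = torch.randn(16, 10, requires_grad=True)
teacher = torch.randn(16, 10)
labels = torch.randint(0, 10, (16,))
kd_loss(student, teacher, labels).backward()
```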

Accelerated SE(3)-Transformer training with NVIDIA's open-source modules: "uses 9x less memory and runs 21x faster than the baseline official implementation" github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/DrugDiscovery/SE3Transformer

Collection of papers on knowledge distillation for computer vision github.com/lilujunai/Awesome-Knowledge-Distillation-for-CV

DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers github.com/changlin31/DS-Net

PP-LCNet: A Lightweight CPU Convolutional Neural Network https://arxiv.org/abs/2109.15099

TinyNeuralNetwork: efficient, easy-to-use deep learning model compression framework with architecture search, pruning, quantization, and model conversion github.com/alibaba/TinyNeuralNetwork

Model Compression Research Package: library for researching neural network compression and acceleration methods github.com/IntelLabs/Model-Compression-Research-Package

Training machine learning models more efficiently with dataset distillation https://ai.googleblog.com/2021/12/training-machine-learning-models-more.html

Intel® Neural Compressor: neural network compression library for Intel CPUs/GPUs github.com/intel/neural-compressor

Automated model compression toolkit for PyTorch: tools for automatic analysis and modification of PyTorch model structures, including a library of compression algorithms driven by that analysis github.com/THU-MIG/torch-model-compression

YOLOv5-Compression: YOLOv5-series compression toolbox covering multiple backbones (TPH-YOLOv5, GhostNet, ShuffleNetV2, MobileNetV3-Small, EfficientNet-Lite, PP-LCNet, Swin-Transformer YOLO), modules (CBAM, DCN), pruning (EagleEye, Network Slimming), and quantization (MQBench). github.com/Gumpest/YOLOv5-Multibackbone-Compression

OpenDelta: open-source framework for parameter-efficient tuning github.com/thunlp/OpenDelta

NNCF: Neural Network Compression Framework github.com/openvinotoolkit/nncf

scs4onnx: ONNX model compression tool github.com/PINTO0309/scs4onnx

Efficient Deep Learning: a collection of tricks for accelerating the deep learning pipeline github.com/Mountchicken/Efficient-Deep-Learning

ATOM (Automated Tool for Optimized Modelling): a Python package for fast exploration of machine learning pipelines github.com/tvdboom/ATOM

Awesome Model Quantization: a list of papers, docs, and code about model quantization, by Haotong Qin. GitHub: github.com/htqin/awesome-model-quantization

Model Compression Toolkit (MCT): Sony's open-source project for neural network model optimization under efficient, constrained hardware. GitHub: github.com/sony/model_optimization

Awesome AutoDL: curated list of automated deep learning resources (neural architecture search and hyperparameter optimization), by D-X-Y. GitHub: github.com/D-X-Y/Awesome-AutoDL

VoltaML: lightweight open-source library for accelerating ML/DL models, converting and running them in high-performance inference runtimes such as TensorRT, TorchScript, ONNX, and TVM, and deploying to target CPU and GPU devices with a single line of code. GitHub: github.com/VoltaML/voltaML

One line of code to boost Hugging Face Transformers performance: "BetterTransformer, Out of the Box Performance for Hugging Face Transformers" by Younes Belkada medium.com/pytorch/bettertransformer-out-of-the-box-performance-for-huggingface-transformers-3fbe27d50ab2

voltaML-fast-stable-diffusion: lightweight library to accelerate Stable Diffusion and Dreambooth into the fastest inference models (up to 10x) with a single line of code. GitHub: github.com/VoltaML/voltaML-fast-stable-diffusion

Intel® Extension for Transformers: toolkit for accelerating Transformer-based models on Intel platforms, applying the rich set of model compression techniques from Intel Neural Compressor (quantization, pruning, distillation, etc.) to significantly improve inference efficiency. GitHub: github.com/intel/intel-extension-for-transformers

Large Transformer Model Inference Optimization (Lil'Log) https://lilianweng.github.io/posts/2023-01-10-inference-optimization/

mperf: operator performance tuning toolbox for mobile/embedded platforms, by MegEngine. GitHub: github.com/MegEngine/mperf

Dipoorlet: offline quantization tool that quantizes ONNX models against a given calibration dataset. It supports multiple activation calibration algorithms (MSE, MinMax, Hist), weight transforms for better quantization results (BiasCorrection, WeightEqualization), and recent offline fine-tuning algorithms for higher quantized accuracy (AdaRound, BRECQ, QDrop). It can also emit the quantization parameters required by several deployment platforms and produces a detailed quantization analysis to help locate accuracy bottlenecks. Installation and use are straightforward: prepare a calibration set and run Dipoorlet in a PyTorch distributed or cluster environment. ModelTC GitHub: github.com/ModelTC/Dipoorlet

Autodistill: uses foundation models to train supervised models, going from unlabeled images to a custom model running inference on an edge device with no human labeling in between. It currently supports vision tasks such as object detection and instance segmentation, with planned extensions to language and other model types. GitHub: github.com/autodistill/autodistill