PaperByConference.md

2020

ICLR

Pruning

  • Comparing Rewinding and Fine-tuning in Neural Network Pruning
  • A Signal Propagation Perspective for Pruning Neural Networks at Initialization
  • Data-Independent Neural Pruning via Coresets
  • One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation
  • Lookahead: A Far-sighted Alternative of Magnitude-based Pruning
  • Dynamic Model Pruning with Feedback

Quantization

  • Linear Symmetric Quantization of Neural Networks for Low-precision Integer Hardware
  • AutoQ: Automated Kernel-Wise Neural Network Quantization
  • Additive Powers-of-Two Quantization: A Non-uniform Discretization for Neural Networks
  • Learned Step Size Quantization
  • Sampling-Free Learning of Bayesian Quantized Neural Networks
  • Gradient $\ell_1$ Regularization for Quantization Robustness
  • BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations
  • Training binary neural networks with real-to-binary convolutions
  • Critical initialisation in continuous approximations of binary neural networks

2019

NeurIPS

  • Efficient and Effective Quantization for Sparse DNNs
  • Focused Quantization for Sparse CNNs [paper]
  • Point-Voxel CNN for Efficient 3D Deep Learning [paper]
  • Model Compression with Adversarial Robustness: A Unified Optimization Framework [paper]

Quantization

  • MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization [paper] [codes]
  • Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization [paper]

Post-training Quantization

  • Post-training 4-bit quantization of convolutional networks for rapid-deployment

Gradient Compression

  • Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations [paper]
  • PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization [paper]

Pruning

  • AutoPrune: Automatic Network Pruning by Regularizing Auxiliary Parameters

Unstructured Pruning

  • Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask
  • Global Sparse Momentum SGD for Pruning Very Deep Neural Networks [paper][codes]

Structured Pruning

  • Channel Gating Neural Network
  • Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks

Distillation

  • Positive-Unlabeled Compression on the Cloud [paper]

Factorization

  • Einconv: Exploring Unexplored Tensor Decompositions for Convolutional Neural Networks [paper] [codes]
  • A Tensorized Transformer for Language Modeling [paper]

Efficient Model Design

  • Shallow RNN: Accurate Time-series Classification on Resource Constrained Devices [paper]
  • CondConv: Conditionally Parameterized Convolutions for Efficient Inference [paper]

Dynamic Inference

  • SCAN: A Scalable Neural Networks Framework Towards Compact and Efficient Models [paper]

Neural Architecture Search

  • Constrained deep neural network architecture search for IoT devices accounting for hardware calibration [paper]
  • DATA: Differentiable ArchiTecture Approximation [paper]
  • Efficient Forward Architecture Search [paper]

Cost-Saving Training (Energy, Memory, Time)

  • Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks [paper]
  • E2-Train: Training State-of-the-art CNNs with Over 80% Energy Savings [paper]
  • Backprop with Approximate Activations for Memory-efficient Network Training [paper]

Theory

  • Dimension-Free Bounds for Low-Precision Training [paper]
  • A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off [paper]

CVPR

  • Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss
  • Simultaneously Optimizing Weight and Quantizer of Ternary Neural Network using Truncated Gaussian Approximation
  • Structured Pruning of Neural Networks with Budget-Aware Regularization
  • Towards Optimal Structured CNN Pruning via Generative Adversarial Learning
  • Centripetal SGD for Pruning Very Deep Convolutional Networks with Complicated Structure
  • Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration
  • ECC: Platform-Independent Energy-Constrained Deep Neural Network Compression via a Bilinear Regression Model
  • Cascaded Projection: End-to-End Network Compression and Acceleration
  • Accelerating Convolutional Neural Networks via Activation Map Compression
  • Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking
  • Factorized Convolutional Neural Networks
  • Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression
  • A Main/Subsidiary Network Framework for Simplifying Binary Neural Networks
  • Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?
  • Cross Domain Model Compression by Structurally Weight Sharing

ICML

  • Improving Neural Network Quantization without Retraining using Outlier Channel Splitting
  • Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
  • Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization
  • Variational inference for sparse network reconstruction from count data
  • Collaborative Channel Pruning for Deep Networks

ICLR

  • Understanding Straight-Through Estimator in Training Activation Quantized Neural Nets
  • Minimal Random Code Learning: Getting Bits back from Compressed Model Parameters

2018

NeurIPS

  • Scalable Methods for 8-bit Training of Neural Networks
  • Heterogeneous Bitwidth Binarization in Convolutional Neural Networks
  • HitNet: Hybrid Ternary Recurrent Neural Network

ICML

  • WSNet: Compact and Efficient Networks Through Weight Sampling

ICLR

  • Espresso: Efficient Forward Propagation for BCNNs
  • An Empirical study of Binary Neural Networks' Optimisation
  • Learning Discrete Weights Using the Local Reparameterization Trick
  • On the Universal Approximability and Complexity Bounds of Quantized ReLU Neural Networks
  • Learning To Share: Simultaneous Parameter Tying and Sparsification in Deep Learning

ECCV

  • Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm
  • Value-aware Quantization for Training and Inference of Neural Networks
  • LSQ++: Lower running time and higher recall in multi-codebook quantization
  • LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks

CVPR

  • NISP: Pruning Networks using Neuron Importance Score Propagation
  • SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks

IJCAI

  • Improving Deep Neural Network Sparsity through Decorrelation Regularization

2016

ICML

  • Fixed Point Quantization of Deep Convolutional Networks

2014

NIPS

  • Expectation Backpropagation: Parameter-Free Training of Multilayer Neural Networks with Continuous or Discrete Weights