    Repositories list

    • SDP4Bit

      Public
      The official implementation of the paper "SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training".
      Python
      Apache License 2.0
      5000 Updated Dec 25, 2024
    • hipSZ

      Public
      A portable implementation of SZ lossy compression for AMD GPUs and Hygon DCUs.
      C++
      Other
      0700 Updated Dec 21, 2024
    • PhoenixOS

      Public
      Fast OS-level support for GPU checkpoint and restore
      C
      Apache License 2.0
      11000 Updated Dec 20, 2024
    • Artifacts of SC'23 paper "AMRIC: A Novel In Situ Lossy Compression Framework for Efficient I/O in Adaptive Mesh Refinement Applications"
      C++
      2500 Updated Aug 5, 2024
    • Artifacts of SC'24 paper "A High-Performance Data Loading Framework for Distributed DNN Training with Remote Storage".
      Python
      0200 Updated Apr 11, 2024
    • GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs
      Cuda
      11400 Updated Mar 17, 2024
    • FCBench

      Public
      FCBench: Benchmarking of Lossless Compression for Floating-Point Data
      C
      0210 Updated Feb 9, 2024
    • Artifact of EuroSys'24 paper "Concealing Compression-accelerated I/O for HPC Applications through In Situ Task Scheduling"
      C
      1000 Updated Sep 30, 2023
    • 2000 Updated Jun 24, 2023
    • Cuda
      2520 Updated Jun 15, 2023
    • Artifacts of HPDC'22 paper "TAC: An error-bounded lossy compressor for high-dimensional adaptive mesh refinement (AMR) data"
      C++
      1100 Updated Apr 11, 2023
    • HEAT: A High-Performance Training System for Collaborative Filtering Based Recommendation on CPUs
      C++
      1100 Updated Apr 11, 2023
    • A C++ Library for Influence Maximization
      C++
      Other
      11000 Updated Aug 23, 2022
    • Artifacts of PACT’22 paper “HBMAX: Optimizing Memory Efficiency for Parallel Influence Maximization on Multicore Architectures”
      C++
      Other
      0100 Updated Aug 3, 2022
    • Artifacts of VLDB'22 paper "COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression"
      C++
      Other
      2910 Updated Aug 2, 2022
    • HDF5 with SZ lossy compression
      C
      Other
      2100 Updated Jul 21, 2022
    • Highly optimized Huffman encoder and decoder
      Cuda
      1000 Updated Apr 14, 2022
    • PatternTrain: A Fast and Accurate Deep CNN Training Framework via Dynamic Fine-Grained Pattern-Based Pruning
      Python
      0200 Updated Mar 1, 2022
    • Artifacts of IPDPS '22 paper "Optimizing Huffman Decoding for Error-Bounded Lossy Compression on GPUs"
      C++
      2100 Updated Mar 1, 2022
    • cuSZ

      Public
      A GPU accelerated error-bounded lossy compression for scientific data.
      Cuda
      Other
      28300 Updated Feb 23, 2022
    • flower

      Public
      Flower - A Friendly Federated Learning Framework
      Python
      Apache License 2.0
      924000 Updated Oct 31, 2021
    • IA-SpGEMM

      Public
      An Input-aware Auto-tuning Framework for Parallel Sparse Matrix-Matrix Multiplication
      C
      16100 Updated Aug 14, 2021
    • kronmult

      Public
      C++
      6000 Updated May 12, 2021
    • Some gadgets implemented in MATLAB.
      MATLAB
      BSD 2-Clause "Simplified" License
      0000 Updated May 10, 2021
    • SZ_HLS

      Public
      An implementation of SZ lossy compression in Vivado HLS for Xilinx FPGAs.
      C++
      Other
      2000 Updated Jan 9, 2021
    • Precompiled nvcomp libraries with different CUDA versions.
      Makefile
      MIT License
      1000 Updated Jan 8, 2021
    • LCFI

      Public
      LCFI is an LLVM based lossy compression fault injection tool modified from LLFI.
      C++
      Other
      2000 Updated Nov 2, 2020
    • DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression
      Python
      Other
      2000 Updated Oct 7, 2020
    • Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA
      Cuda
      MIT License
      11000 Updated Jul 28, 2020