HPDPS Lab

29 repositories

SDP4Bit
Public
The official implementation of the paper "SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training".
Python
•
Apache License 2.0
•5•0•0•0•Updated Dec 25, 2024
hipSZ
Public
A portable implementation of SZ lossy compression for AMD GPUs and Hygon DCUs.
C++
•
Other
•0•7•0•0•Updated Dec 21, 2024
PhoenixOS
Public
Fast OS-level support for GPU checkpoint and restore
C
•
Apache License 2.0
•11•0•0•0•Updated Dec 20, 2024
SC23-AMRIC
Public
Artifacts of SC'23 paper "AMRIC: A Novel In Situ Lossy Compression Framework for Efficient I/O in Adaptive Mesh Refinement Applications"
C++
•2•5•0•0•Updated Aug 5, 2024
SC24-SOLAR
Public
Artifacts of SC'24 paper "A High-Performance Data Loading Framework for Distributed DNN Training with Remote Storage".
Python
•0•2•0•0•Updated Apr 11, 2024
ICS23-GPULZ
Public
GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs
gpu lzss lossless-data-compression
Cuda
•1•14•0•0•Updated Mar 17, 2024
FCBench
Public
FCBench: Benchmarking of Lossless Compression for Floating-Point Data
C
•0•2•1•0•Updated Feb 9, 2024
EuroSys24-AsyncSchedule4IO
Public
Artifact of EuroSys'24 paper "Concealing Compression-accelerated I/O for HPC Applications through In Situ Task Scheduling"
C
•1•0•0•0•Updated Sep 30, 2023
SC23-AMRIC-Image
Public
2•0•0•0•Updated Jun 24, 2023
PPOPP23-TDC
Public
Cuda
•2•5•2•0•Updated Jun 15, 2023
HPDC22-TAC
Public
Artifacts of HPDC'22 paper "TAC: An error-bounded lossy compressor for high-dimensional adaptive mesh refinement (AMR) data"
C++
•1•1•0•0•Updated Apr 11, 2023
ICS23-HEAT
Public
HEAT: A High-Performance Training System for Collaborative Filtering Based Recommendation on CPUs
C++
•1•1•0•0•Updated Apr 11, 2023
PACT22-HBMax
Public
A C++ Library for Influence Maximization
C++
•
Other
•11•0•0•0•Updated Aug 23, 2022
hbmax-pact
Public
Artifacts of PACT’22 paper “HBMAX: Optimizing Memory Efficiency for Parallel Influence Maximization on Multicore Architectures”
C++
•
Other
•0•1•0•0•Updated Aug 3, 2022
VLDB22-COMET
Public
Artifacts of VLDB'22 paper "COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression"
C++
•
Other
•2•9•1•0•Updated Aug 2, 2022
SC22-HDF5-SZ
Public
HDF5 with SZ lossy compression
C
•
Other
•2•1•0•0•Updated Jul 21, 2022
opthuffmancodec
Public
Highly optimized Huffman encoder and decoder
Cuda
•1•0•0•0•Updated Apr 14, 2022
ICS21-ClickTrain
Public
PatternTrain: A Fast and Accurate Deep CNN Training Framework via Dynamic Fine-Grained Pattern-Based Pruning
training deep-learning dnn pruning
Python
•0•2•0•0•Updated Mar 1, 2022
ipdps22-opthuffdec
Public
Artifacts of IPDPS '22 paper "Optimizing Huffman Decoding for Error-Bounded Lossy Compression on GPUs"
C++
•2•1•0•0•Updated Mar 1, 2022
cuSZ
Public
A GPU-accelerated error-bounded lossy compressor for scientific data.
Cuda
•
Other
•28•3•0•0•Updated Feb 23, 2022
flower
Public
Flower - A Friendly Federated Learning Framework
Python
•
Apache License 2.0
•924•0•0•0•Updated Oct 31, 2021
IA-SpGEMM
Public
An Input-aware Auto-tuning Framework for Parallel Sparse Matrix-Matrix Multiplication
C
•16•1•0•0•Updated Aug 14, 2021
kronmult
Public
C++
•6•0•0•0•Updated May 12, 2021
matGadgets
Public
Some gadgets implemented in MATLAB.
MATLAB
•
BSD 2-Clause "Simplified" License
•0•0•0•0•Updated May 10, 2021
SZ_HLS
Public
An implementation of SZ lossy compression in Vivado HLS for Xilinx FPGAs.
C++
•
Other
•2•0•0•0•Updated Jan 9, 2021
precompiled-exolibs
Public
Precompiled nvcomp libraries with different CUDA versions.
Makefile
•
MIT License
•1•0•0•0•Updated Jan 8, 2021
LCFI
Public
LCFI is an LLVM-based lossy compression fault injection tool modified from LLFI.
C++
•
Other
•2•0•0•0•Updated Nov 2, 2020
HPDC19-DeepSZ
Public
DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression
Python
•
Other
•2•0•0•0•Updated Oct 7, 2020
JPDC-TSM2X
Public
Implementation of TSM2L and TSM2R -- High-Performance Tall-and-Skinny Matrix-Matrix Multiplication Algorithms for CUDA
Cuda
•
MIT License
•11•0•0•0•Updated Jul 28, 2020