DD-AVX Library: Library of High Precision Sparse Matrix Operations Accelerated by SIMD

About

DD-AVX_v3 is a SIMD-accelerated, high-precision BLAS Lv.1 and Sparse BLAS library with a simple interface.

BLAS Lv.1 and Sparse BLAS operations can freely mix double and double-double precision operands.
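For readers new to double-double arithmetic, here is a minimal sketch of adding a double to a double-double value with an error-free transformation (Knuth's TwoSum). The names DD, two_sum, dd_add_d, and dd_sum are hypothetical and exist only for this illustration. They are not the interface of this library or of the QD library, and the sketch uses no SIMD.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a double-double value: the number is hi + lo.
// This is NOT the dd_real class provided by the QD library.
struct DD {
    double hi;
    double lo;
};

// Knuth's TwoSum: s = fl(a + b) and e is the exact rounding error,
// so that a + b == s + e holds exactly.
inline DD two_sum(double a, double b) {
    double s = a + b;
    double bb = s - a;
    double e = (a - (s - bb)) + (b - bb);
    return DD{s, e};
}

// Mixed-precision add: double-double x plus double y.
inline DD dd_add_d(DD x, double y) {
    DD s = two_sum(x.hi, y);    // add the high parts, keep the error
    double lo = s.lo + x.lo;    // fold in the old low part
    return two_sum(s.hi, lo);   // renormalize into (hi, lo)
}

// Sum n doubles into a double-double accumulator.
inline DD dd_sum(const double* v, std::size_t n) {
    assert(v != nullptr || n == 0);
    DD acc{0.0, 0.0};
    for (std::size_t i = 0; i < n; ++i) {
        acc = dd_add_d(acc, v[i]);
    }
    return acc;
}
```

For the input {1.0, 1e-17, -1.0}, a plain double sum evaluated left to right returns 0, while dd_sum keeps the 1e-17 term. Compile without -ffast-math, because that option allows the compiler to optimize the error terms away.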

This library provides an easy way to implement fast and accurate Krylov subspace methods.
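As a reminder of what such a method looks like, here is a minimal sketch of the conjugate gradient (CG) method in plain C++. It assumes a small dense symmetric positive definite matrix and uses double precision only. The names Vec, Mat, dot, matvec, and conjugate_gradient are hypothetical and are not this library's interface. The dot products and vector updates are the BLAS Lv.1 steps where higher precision can reduce accumulated rounding error.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense, row by row, for brevity only

// Inner product of two vectors of equal length.
inline double dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// y = A * x for a dense matrix.
inline Vec matvec(const Mat& A, const Vec& x) {
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i) y[i] = dot(A[i], x);
    return y;
}

// Solve A x = b for symmetric positive definite A, starting from x = 0.
// Stops when the residual 2-norm is at most tol or after max_iter steps.
inline Vec conjugate_gradient(const Mat& A, const Vec& b,
                              double tol, std::size_t max_iter) {
    Vec x(b.size(), 0.0);
    Vec r = b;  // residual b - A x, with x = 0
    Vec p = r;  // first search direction
    double rr = dot(r, r);
    for (std::size_t it = 0; it < max_iter && std::sqrt(rr) > tol; ++it) {
        Vec Ap = matvec(A, p);
        double alpha = rr / dot(p, Ap);
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];   // axpy update of the solution
            r[i] -= alpha * Ap[i];  // axpy update of the residual
        }
        double rr_new = dot(r, r);
        double beta = rr_new / rr;
        for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return x;
}
```

For A = [[4, 1], [1, 3]] and b = [1, 2], the exact solution is x = [1/11, 7/11], and CG reaches it in two iterations up to rounding.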

OpenMP and SIMD AVX / AVX2 acceleration are available.

This library is an extension of Lis_DD_AVXKernals and DD-AVX_v2 (archived).

Interface

This library provides BLAS / Sparse BLAS functions for the following types.

Scalar

d_real (alias of double)
dd_real (provided by the QD Library)

Vector

d_real_vector
dd_real_vector

Sparse matrix (CRS format)

d_real_SpMat
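For reference, CRS (compressed row storage) keeps the nonzero values row by row, together with their column indices and an array of offsets marking where each row starts. The following is a minimal sketch of that layout and of a sparse matrix-vector product over it. CrsMatrix and crs_matvec are hypothetical names for illustration; they are not the d_real_SpMat class or its member functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CRS container, for illustration only.
struct CrsMatrix {
    std::size_t rows;
    std::vector<std::size_t> row_ptr;  // size rows + 1; row i spans [row_ptr[i], row_ptr[i+1])
    std::vector<std::size_t> col_ind;  // column index of each stored value
    std::vector<double> val;           // stored nonzero values, row by row
};

// y = A * x using the CRS arrays.
inline std::vector<double> crs_matvec(const CrsMatrix& A,
                                      const std::vector<double>& x) {
    assert(A.row_ptr.size() == A.rows + 1);
    assert(A.col_ind.size() == A.val.size());
    std::vector<double> y(A.rows, 0.0);
    for (std::size_t i = 0; i < A.rows; ++i) {
        double sum = 0.0;
        for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k) {
            assert(A.col_ind[k] < x.size());
            sum += A.val[k] * x[A.col_ind[k]];
        }
        y[i] = sum;
    }
    return y;
}
```

The 3x3 matrix [[2, 0, 1], [0, 3, 0], [4, 0, 5]] is stored as row_ptr = {0, 2, 3, 5}, col_ind = {0, 2, 1, 0, 2}, and val = {2, 1, 3, 4, 5}.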

BLAS Lv.1 and Sparse BLAS functions are implemented for all combinations of these types, so every function works with both D and DD operands.

See the axpy sample code and CG method sample code for more information on how to use it.

Build and Install

This library requires the QD library for scalar operations as a submodule. The QD library is downloaded and built automatically by make.

You can specify the destination directory by setting DDAVX_DIR when running make. The QD library is installed in the same directory.

You can build and install the library with the following commands:

AVX

make avx

make install

AVX2

make avx2

make install

AVX512 (not yet implemented)

AVX512 support is planned. Once it is implemented, it will be built with the following commands.

make avx512

make install

System Requirements

g++ 7.1 or higher
GNU make

Current Status and Restrictions

This is a beta version. It has some restrictions, and changes are planned.

The detailed to-do list is discussed in the repository's Issues.

SIMD and OpenMP cannot be disabled. (To change the number of OpenMP threads, set the OMP_NUM_THREADS environment variable.)
The class design will be modified to implement element/row/column operations in the sparse matrix class.
(The SIMD_REG class is difficult to share with Scalar, so I want to change it to a REG class.)
The conversion routine to BCRS format does not currently work because it is being reworked to be multi-threaded.

Document

The documentation can be generated using Doxygen.

Testing

Tests for each feature are in the test directory. Build and run them with:

cd test/

make

make test
