Update Changelog for 0.3.22 (#3964)
martin-frbg authored Mar 26, 2023
1 parent 1c2a60e commit c05da59
Showing 1 changed file with 76 additions and 0 deletions.
76 changes: 76 additions & 0 deletions Changelog.txt
@@ -1,4 +1,80 @@
OpenBLAS ChangeLog
====================================================================
Version 0.3.22
26-Mar-2023

general:
- Updated the included LAPACK to Reference-LAPACK release 3.11.0
plus post-release corrections and improvements
- Added initial support for processing with the EMSCRIPTEN JavaScript
converter (yielding a single-threaded build only)
- Added a threshold for multithreading in SYMM, SYMV and SYR2K
- Increased the threshold for multithreading in SYRK
- OpenBLAS no longer decreases the global OMP_NUM_THREADS when it
exceeds the maximum thread count the library was compiled for.
- fixed ?GETF2 potentially returning NaN with tiny matrix elements
- fixed openblas_set_num_threads to work in USE_OPENMP builds
- fixed cpu core counting in USE_OPENMP builds returning the number
of OMP "places" rather than cores
- fixed interpretation of USE_PERL=0 in build scripts
- fixed linking of the library with libm in CMAKE builds
- fixed startup delays resulting from a wrong default setting of
NO_WARMUP in CMAKE builds
- fixed inconsistent defaults for overriding of LAPACK SPMV, SPR,
SYMV, SYR functions in gmake and CMAKE builds
- fixed stride calculation in the optimized small-matrix path of
complex SYR
- fixed compilation of ReLAPACK with CMAKE
- fixed pkgconfig file contents for INTERFACE64 builds
- fixed building of Reference-LAPACK with recent gfortran
- fixed building with only a subset of precision types on Windows
- added new environment variable OPENBLAS_DEFAULT_NUM_THREADS
- added a GEMV-based implementation of GEMMT
- added support for building under QNX
- updated support for (cross-)building for ALPHA targets

x86_64:
- added autodetection of Intel Raptor Lake cpu models
- added SSCAL microkernels for Haswell and newer targets
- improved the performance of the Haswell DSCAL microkernel
- added CSCAL and ZSCAL microkernels for SkylakeX targets
- fixed detection of gfortran and Cray CCE compilers
- fixed detection of recent versions of the Intel Fortran compiler
- fixed compilation with LLVM to no longer run out of AVX512 registers
- fixed cpu type option setting with recent NVIDIA HPC compiler versions
- fixed compilation for/on AMD Ryzen 4 cpus
- fixed compilation of AVX2-capable targets with Apple Clang
- fixed runtime selection of COOPERLAKE in DYNAMIC_ARCH builds
- worked around gcc/llvm using risky FMA operations in CSCAL/ZSCAL
- worked around miscompilations of GEMV, SYMV and ZDOT kernels
by gcc12's tree-vectorizer on OSX and Windows

ARM:
- fixed cross-compilation to ARMV5 and ARMV6 targets with CMAKE

ARMV8:
- fixed cross-compilation to CortexA53 with CMAKE
- fixed compilation with CMAKE and "Arm Compiler for Linux 22.1"
- added cpu autodetection for Cortex X3 and A715
- fixed conditional compilation of SVE-capable targets in DYNAMIC_ARCH
- sped up SVE kernels by removing unnecessary prefetches
- improved the GEMM performance of Neoverse V1
- added SVE kernels for SDOT and DDOT
- added an SBGEMM kernel for Neoverse N2
- improved cpu-specific compiler option selection for Neoverse cpus
- added support for setting CONSISTENT_FPCSR

MIPS64:
- improved MSA capability detection and handling
- added a MIPS64_GENERIC build target
- fixed corner cases in DNRM2

LOONGARCH64:
- fixed handling of the INTERFACE64 option

RISCV:
- fixed handling of the INTERFACE64 option

====================================================================
Version 0.3.21
07-Aug-2022