pkgsrc.se | The NetBSD package collection

./devel/libdeflate, Optimized deflate/zlib/gzip library


Branch: CURRENT, Version: 1.23, Package name: libdeflate-1.23, Maintainer: bsiegert

libdeflate is a library for fast, whole-buffer DEFLATE-based compression and
decompression.

The supported formats are:

- DEFLATE (raw)
- zlib (a.k.a. DEFLATE with a zlib wrapper)
- gzip (a.k.a. DEFLATE with a gzip wrapper)
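The three formats share the same DEFLATE payload and differ only in the wrapper around it. The following is a minimal sketch of that relationship. It uses Python's standard `zlib` module rather than libdeflate, and the helper name `compress_with_wrapper` is made up for illustration.

```python
import struct
import zlib


def compress_with_wrapper(data: bytes, wbits: int, level: int = 6) -> bytes:
    """Compress `data` in one shot; `wbits` selects the container format."""
    comp = zlib.compressobj(level, zlib.DEFLATED, wbits)
    return comp.compress(data) + comp.flush()


data = b"libdeflate supports three DEFLATE-based formats. " * 20

raw = compress_with_wrapper(data, -15)         # raw DEFLATE: no header, no trailer
zlib_stream = compress_with_wrapper(data, 15)  # zlib wrapper (RFC 1950)
gzip_stream = compress_with_wrapper(data, 31)  # gzip wrapper (RFC 1952)

# zlib trailer: Adler-32 of the uncompressed data, big-endian.
zlib_adler = struct.unpack(">I", zlib_stream[-4:])[0]

# gzip trailer: CRC-32, then input length modulo 2**32, both little-endian.
gzip_crc, gzip_isize = struct.unpack("<II", gzip_stream[-8:])
```

The wrappers explain why checksum speed matters for this library: every zlib stream carries an Adler-32 and every gzip stream carries a CRC-32 of the uncompressed data.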

libdeflate is heavily optimized. It is significantly faster than the zlib
library, both for compression and decompression, and especially on x86
processors. In addition, libdeflate offers optional high compression modes
that achieve a better compression ratio than zlib's "level 9".

libdeflate itself is a library, but the following command-line programs which
use this library are also provided:

* gzip (or gunzip), a program which mostly behaves like the standard equivalent,
except that it does not yet have good streaming support and therefore does not
yet support very large files
* benchmark, a program for benchmarking in-memory compression and decompression
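The difference between whole-buffer and streaming operation, which is why the bundled gzip program is limited on very large files, can be sketched as follows. This uses Python's standard `zlib` module as a stand-in, not libdeflate itself, and both function names are hypothetical.

```python
import io
import zlib


def compress_whole_buffer(data: bytes) -> bytes:
    """One-shot: the entire input and the entire output are in memory at once."""
    return zlib.compress(data, 6)


def compress_streaming(src, chunk_size: int = 64 * 1024) -> bytes:
    """Streaming: read fixed-size chunks so the input never has to fit in memory."""
    comp = zlib.compressobj(6)
    pieces = []
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # A real tool would write each piece to the output file right away;
        # this sketch collects them only to keep the example short.
        pieces.append(comp.compress(chunk))
    pieces.append(comp.flush())
    return b"".join(pieces)


payload = bytes(range(256)) * 2000
one_shot = compress_whole_buffer(payload)
streamed = compress_streaming(io.BytesIO(payload))
```

Both results decompress to the same bytes. The whole-buffer approach is simpler and leaves more room for optimization, at the cost of needing memory proportional to the file size.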

Master sites:

https://github.com/ebiggers/

Filesize: 192.89 KB

Version history:

(2024-12-24) Updated to version: libdeflate-1.23
(2024-10-08) Updated to version: libdeflate-1.22
(2024-08-05) Updated to version: libdeflate-1.21
(2024-04-05) Updated to version: libdeflate-1.20
(2023-10-19) Updated to version: libdeflate-1.19
(2023-04-26) Updated to version: libdeflate-1.18

CVS history:

2024-12-24 13:02:24 by Adam Ciarcinski | Files touched by this commit (2) | Package updated

Log message:
libdeflate: updated to 1.23

## Version 1.23

* Fixed bug introduced in 1.20 where incorrect checksums could be calculated if
  libdeflate was compiled with clang at -O0 and run on a CPU supporting AVX512.

* Fixed bug introduced in 1.20 where incorrect checksums could be calculated in
  rare cases on macOS computers that support AVX512 and are running an older
  version of macOS that contains a bug that corrupts AVX512 registers.  This
  could occur only if code outside libdeflate enabled AVX512 in the thread.

* Fixed build error when using -mno-evex512 with clang 18+ or gcc 14+.

* Increased the minimum CMake version to 3.10.

* Further optimized the x86 CRC code.

2024-10-08 17:36:45 by Thomas Klausner | Files touched by this commit (2) | Package updated

Log message:
libdeflate: update to 1.22.

## Version 1.22

* The CMake-based build system now implements a workaround for gcc being paired
  with a too-old binutils version.  This can prevent build errors.

2024-08-14 18:00:02 by Benny Siegert | Files touched by this commit (1)

Log message:
libdeflate: needs c11

Reported by Phil Krylov in PR pkg/58474.

2024-08-05 08:47:55 by Adam Ciarcinski | Files touched by this commit (2) | Package updated

Log message:
libdeflate: updated to 1.21

## Version 1.21

* Fixed build error on x86 with gcc 8.1 and gcc 8.2.
* Fixed build error on x86 when gcc 11 is paired with a binutils version that
  doesn't support AVX-VNNI, e.g. as it is on RHEL 9.
* Fixed build error on arm64 with gcc 6.
* Fixed build error on arm64 with gcc 13.1 and later with some -mcpu options.
* Enabled detection of dotprod support in Windows ARM64 builds.

2024-04-05 12:26:46 by Thomas Klausner | Files touched by this commit (2) | Package updated

Log message:
libdeflate: update to 1.20.

## Version 1.20

* Improved CRC-32 performance on recent x86 CPUs by adding
  VPCLMULQDQ-accelerated implementations using 256-bit and 512-bit vectors.

* Improved Adler-32 performance on recent x86 CPUs by adding
  VNNI-accelerated implementations using 256-bit and 512-bit vectors.

* Improved CRC-32 and Adler-32 performance on short inputs.

* Optimized the portable implementation of Adler-32.

* Added some basic optimizations for RISC-V.

* Dropped support for gcc versions older than v4.9 (released in 2014)
  and clang versions older than v3.9 (released in 2016).

* Dropped support for CRC-32 acceleration on 32-bit ARM using the ARMv8 pmull or
  crc32 instructions.  This code only worked on CPUs that also have a 64-bit
  mode, and it was already disabled on many compiler versions due to compiler
  limitations.  CRC-32 acceleration remains fully supported on 64-bit ARM.
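For readers unfamiliar with Adler-32, the checksum that the entries above accelerate, here is a minimal reference sketch in Python. It is not libdeflate's code. The second function shows the classic portable trick of deferring the modulo, with the block size 5552 taken from zlib's reference implementation; whether libdeflate's portable path uses the same constant is an assumption.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16
NMAX = 5552        # block size used by zlib's reference code to defer the modulo


def adler32_reference(data: bytes) -> int:
    """Textbook Adler-32: two running sums, reduced after every byte."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    return (b << 16) | a


def adler32_deferred(data: bytes) -> int:
    """Same result, but the modulo runs once per block instead of once per byte."""
    a, b = 1, 0
    for start in range(0, len(data), NMAX):
        for byte in data[start:start + NMAX]:
            a += byte
            b += a
        a %= MOD_ADLER
        b %= MOD_ADLER
    return (b << 16) | a
```

The vectorized implementations mentioned above compute the same two sums, many bytes at a time.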

2023-10-19 16:52:01 by Thomas Klausner | Files touched by this commit (2) | Package updated

Log message:
libdeflate: update to 1.19.

## Version 1.19

* Added new functions `libdeflate_alloc_compressor_ex()` and
  `libdeflate_alloc_decompressor_ex()`.  These functions allow specifying a
  custom memory allocator on a per-compressor basis.

* libdeflate now always generates Huffman codes with at least 2 codewords.  This
  fixes a compatibility issue where Windows Explorer's ZIP unpacker could not
  decompress DEFLATE streams created by libdeflate.  libdeflate's behavior was
  allowed by the DEFLATE RFC, but not all software was okay with it.  In rare
  cases, compression ratios can be slightly reduced by this change.

* Disabled the use of some compiler intrinsics on MSVC versions where they don't
  work correctly.

* libdeflate can now compress up to the exact size of the output buffer.

* Slightly improved compression performance at levels 1-9.

* Improved the compression ratio of very short inputs.
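The "at least 2 codewords" rule above can be illustrated with a small Huffman code-length builder. This is a hypothetical sketch, not libdeflate's algorithm: the function name and the padding strategy are assumptions, and it ignores DEFLATE's 15-bit limit on code lengths.

```python
import heapq
from itertools import count

MIN_CODEWORDS = 2  # the minimum described in the changelog entry above


def huffman_code_lengths(freqs):
    """Map each coded symbol to its code length in bits.

    `freqs[i]` is the frequency of symbol `i`. If fewer than MIN_CODEWORDS
    symbols are used, unused symbols are added with weight 1 so the code
    never degenerates to a single codeword.
    """
    weights = {sym: f for sym, f in enumerate(freqs) if f > 0}
    for sym in range(len(freqs)):
        if len(weights) >= MIN_CODEWORDS:
            break
        weights.setdefault(sym, 1)

    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(w, next(tiebreak), [sym]) for sym, w in weights.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in weights}
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (w1 + w2, next(tiebreak), syms1 + syms2))
    return lengths


single_symbol = huffman_code_lengths([0, 0, 7, 0])
mixed = huffman_code_lengths([5, 1, 1, 2])
```

With only symbol 2 in use, the sketch still emits two 1-bit codewords (for symbols 0 and 2). The padding codeword is never used, which is why the change can cost a little compression ratio in rare cases.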

2023-04-26 17:56:33 by Thomas Klausner | Files touched by this commit (3) | Package updated

Log message:
libdeflate: update to 1.18.

## Version 1.18

* Fixed a bug where the build type didn't default to "Release" when using
  CMake 3.10 or earlier.

* Fixed a bug where some optimized code wasn't used when building with
  Clang 15 or later (x86), or with Clang 16 or later (aarch64).

* Fixed build errors with some architecture and compiler combos:
  * aarch64 with Clang 16
  * armv6kz or armv7e-m with gcc
  * armhf with gcc (on Debian only)

## Version 1.17

(Apologies for another release so soon after v1.16, but the bug fix listed below
needed to go out.)

* Fixed a bug introduced in v1.16 where compression at levels 10-12 would
  sometimes produce an output larger than the size that was returned by the
  corresponding `libdeflate_*_compress_bound()` function.

* Converted the fuzzing scripts to use LLVM's libFuzzer and added them to the
  GitHub Actions workflow.  (This would have detected the above bug.)

* Further improved the support for direct compilation without using the official
  build system.  The top-level source directory no longer needs to be added to
  the include path, and building the programs no longer requires that
  `_FILE_OFFSET_BITS` and `_POSIX_C_SOURCE` be defined on the command line.

## Version 1.16

* Improved the compression ratio at levels 10-12 slightly, mainly levels 11-12.
  Some inputs (such as certain PNG files) see much improved compression ratios.
  As a trade-off, compressing at levels 11-12 is now about 5-20% slower.

* For consistency with zlib, the decompressor now returns an error on some
  invalid inputs that were accepted before.

* Fixed a build error on arm64 with gcc with certain target CPUs.  (Fixes v1.12)

* Fixed a build error on arm32 with gcc 10.1-10.3 and 11.1-11.2.  (Fixes v1.15)

* Fixed a build error on arm32 with gcc in soft float mode.  (Fixes v1.15)

* Fixed a build error in programs/gzip.c with uClibc.  (Fixes v1.15)

* Fixed the install target on Windows.  (Fixes v1.15)

## Version 1.15

* libdeflate now uses CMake instead of a plain Makefile.

* Improved MSVC support.  Enabled most architecture-specific code with MSVC,
  fixed building with clang in MSVC compatibility mode, and other improvements.

* When libdeflate is built with MinGW, the static library and import library are
  now named using the MinGW convention (`*.a` and `*.dll.a`) instead of the
  Visual Studio convention.  This affects the official Windows binaries.

## Version 1.14

Significantly improved decompression performance on all platforms.  Examples
include (measuring DEFLATE only):

| Platform                           | Speedup over v1.13 |
|------------------------------------|--------------------|
| x86_64 (Intel Comet Lake), gcc     | 1.287x             |
| x86_64 (Intel Comet Lake), clang   | 1.437x             |
| x86_64 (Intel Ice Lake), gcc       | 1.332x             |
| x86_64 (Intel Ice Lake), clang     | 1.296x             |
| x86_64 (Intel Sandy Bridge), gcc   | 1.162x             |
| x86_64 (Intel Sandy Bridge), clang | 1.092x             |
| x86_64 (AMD Zen 2), gcc            | 1.263x             |
| x86_64 (AMD Zen 2), clang          | 1.259x             |
| i386 (Intel Comet Lake), gcc       | 1.570x             |
| i386 (Intel Comet Lake), clang     | 1.344x             |
| arm64 (Apple M1), clang            | 1.306x             |
| arm64 (Cortex-A76), clang          | 1.355x             |
| arm64 (Cortex-A55), clang          | 1.190x             |
| arm32 (Cortex-A76), clang          | 1.665x             |
| arm32 (Cortex-A55), clang          | 1.283x             |

Thanks to Dougall Johnson (https://dougallj.wordpress.com/) for ideas for many
of the improvements.
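As a rough way to summarize the table above, the listed speedups can be reduced to a range and a geometric mean, the usual average for ratios. This is only arithmetic on the published numbers; it is not an upstream figure and says nothing about platforms that were not measured.

```python
import math

# Speedups over v1.13, copied from the table in the order listed.
SPEEDUPS = [
    1.287, 1.437, 1.332, 1.296, 1.162, 1.092, 1.263, 1.259,
    1.570, 1.344, 1.306, 1.355, 1.190, 1.665, 1.283,
]


def geometric_mean(values):
    """Average of ratios: exp of the mean of the logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


summary = {
    "count": len(SPEEDUPS),
    "min": min(SPEEDUPS),
    "max": max(SPEEDUPS),
    "geomean": geometric_mean(SPEEDUPS),
}
```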

## Version 1.13

* Changed the 32-bit Windows build of the library to use the default calling
  convention (cdecl) instead of stdcall, reverting a change from libdeflate 1.4.

* Fixed a couple of macOS compatibility issues with the gzip program.

## Version 1.12

This release focuses on improving the performance of the CRC-32 and Adler-32
checksum algorithms on x86 and ARM (both 32-bit and 64-bit).

* Build updates:

  * Fixed building libdeflate on Apple platforms.

  * For Visual Studio builds, Visual Studio 2015 or later is now required.

* CRC-32 algorithm updates:

  * Improved CRC-32 performance on short inputs on x86 and ARM.

  * Improved CRC-32 performance on Apple Silicon Macs by using a 12-way pmull
    implementation.  Performance on large inputs on M1 is now about 67 GB/s,
    compared to 8 GB/s before, or 31 GB/s with the Apple-provided zlib.

  * Improved CRC-32 performance on some other ARM CPUs by reworking the code so
    that multiple crc32 instructions can be issued in parallel.

  * Improved CRC-32 performance on some x86 CPUs by increasing the stride length
    of the pclmul implementation.

* Adler-32 algorithm updates:

  * Improved Adler-32 performance on some x86 CPUs by optimizing the AVX-2
    implementation.  E.g., performance on Zen 1 improved from 19 to 30 GB/s, and
    on Ice Lake from 35 to 41 GB/s (if the AVX-512 implementation is excluded).

  * Removed the AVX-512 implementation of Adler-32 to avoid CPU frequency
    downclocking, and because the AVX-2 implementation was made faster.

  * Improved Adler-32 performance on some ARM CPUs by optimizing the NEON
    implementation.  E.g., Apple M1 improved from about 36 to 52 GB/s.
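The CRC-32 that this release accelerates is the gzip checksum. A bit-at-a-time reference sketch in Python shows what the pmull, pclmul and crc32-instruction implementations compute, only far more slowly. It is not taken from libdeflate, and the function name is made up for illustration.

```python
import zlib

CRC32_POLY_REFLECTED = 0xEDB88320  # gzip/zlib CRC-32 polynomial, bit-reflected


def crc32_bitwise(data: bytes, crc: int = 0) -> int:
    """Bit-at-a-time CRC-32; pass a previous result as `crc` to continue it."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            crc = (crc >> 1) ^ (CRC32_POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


check = crc32_bitwise(b"123456789")
chained = crc32_bitwise(b"6789", crc32_bitwise(b"12345"))
```

The optimized versions process many bytes per step (carry-less multiplication folds whole vectors at once), but they must produce exactly this value.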

## Version 1.11

* Library updates:

  * Improved compression performance slightly.

  * Detect arm64 CPU features on Apple platforms, which should improve
    performance in some areas such as CRC-32 computation.

* Program updates:

  * The included `gzip` and `gunzip` programs now support the `-q` option.

  * The included `gunzip` program now passes through non-gzip data when both
    the `-f` and `-c` options are used.

* Build updates:

  * Avoided a build error on arm32 with certain gcc versions, by disabling
    building `crc32_arm()` as dynamically-dispatched code when needed.

  * Support building with the LLVM toolchain on Windows.

  * Disabled the use of the "stdcall" ABI in static library builds on Windows.

  * Use the correct `install_name` in macOS builds.

  * Support Haiku builds.

2023-02-13 09:40:35 by Thomas Klausner | Files touched by this commit (1)

Log message:
libdeflate: has a shared library too, so do not default to build dependency