Subject: CVS commit: pkgsrc/graphics/libjpeg-turbo
From: Adam Ciarcinski
Date: 2021-04-26 10:18:48
Message id: 20210426081848.E4F06FA95@cvs.NetBSD.org

Log Message:
libjpeg-turbo: updated to 2.1.0

2.1.0

Significant changes relative to 2.1 beta1:

Fixed a regression introduced by 2.1 beta1[6(b)] whereby attempting to
decompress certain progressive JPEG images with one or more component planes of
width 8 or less caused a buffer overrun.

Fixed a regression introduced by 2.1 beta1[6(b)] whereby attempting to
decompress a specially-crafted malformed progressive JPEG image caused the block
smoothing algorithm to read from uninitialized memory.

Fixed an issue in the Arm Neon SIMD Huffman encoders that caused the encoders to
generate incorrect results when using the Clang compiler with Visual Studio.

Fixed a floating point exception (CVE-2021-20205) that occurred when attempting
to compress a specially-crafted malformed GIF image with a specified image width
of 0 using cjpeg.
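
A floating point exception of this kind typically comes from an integer
division by a zero dimension. The sketch below is not the code used in cjpeg.
It is a minimal illustration, assuming a hypothetical helper named
gif_screen_size(), of rejecting an empty image as soon as the GIF logical
screen size is read, before any arithmetic depends on it:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not libjpeg-turbo code: reads the logical screen size
 * from the first 10 bytes of a GIF file.  Returns 0 on success, or -1 if the
 * header is truncated, has the wrong signature, or describes an empty image. */
int gif_screen_size(const unsigned char *buf, size_t len,
                    unsigned int *width, unsigned int *height)
{
  if (buf == NULL || len < 10)
    return -1;
  if (memcmp(buf, "GIF87a", 6) != 0 && memcmp(buf, "GIF89a", 6) != 0)
    return -1;

  /* Width and height are stored as 16-bit little-endian values. */
  *width = (unsigned int)buf[6] | ((unsigned int)buf[7] << 8);
  *height = (unsigned int)buf[8] | ((unsigned int)buf[9] << 8);

  /* Reject a zero dimension up front, so that no later computation can
   * divide by it. */
  if (*width == 0 || *height == 0)
    return -1;
  return 0;
}
```

A caller would treat a return value of -1 as a fatal input error and stop
before allocating or converting any rows.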

Fixed a regression introduced by 2.0 beta1[15] whereby attempting to generate a
progressive JPEG image on an SSE2-capable CPU using a scan script containing one
or more scans with lengths divisible by 32 and non-zero successive approximation
low bit positions would, under certain circumstances, result in an error
("Missing Huffman code table entry") and an invalid JPEG image.

Introduced a new flag (TJFLAG_LIMITSCANS in the TurboJPEG C API and
TJ.FLAG_LIMIT_SCANS in the TurboJPEG Java API) and a corresponding TJBench
command-line argument (-limitscans) that causes the TurboJPEG decompression and
transform functions/operations to return/throw an error if a progressive JPEG
image contains an unreasonably large number of scans. This allows applications
that use the TurboJPEG API to guard against an exploit of the progressive JPEG
format described in the report "Two Issues with the JPEG Standard".
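
To show the idea behind a scan limit, the following standalone sketch counts
start-of-scan (SOS) markers in a JPEG byte stream and gives up once a
caller-supplied limit is exceeded. It is an illustration only and not how
libjpeg-turbo implements the flag. The function name count_scans() and the
max_scans parameter are assumptions made for this example. An application
that uses the TurboJPEG C API would simply pass TJFLAG_LIMITSCANS in the
flags argument instead.

```c
#include <stddef.h>

/* Hypothetical helper: walks the marker segments of a JPEG stream and counts
 * SOS (0xFFDA) markers.  Returns the number of scans found before EOI, or -1
 * if the stream is malformed, truncated, or has more than max_scans scans. */
int count_scans(const unsigned char *buf, size_t len, int max_scans)
{
  size_t i = 2;
  int scans = 0;

  if (buf == NULL || len < 2 || buf[0] != 0xFF || buf[1] != 0xD8)
    return -1;                          /* missing SOI marker */

  while (i + 1 < len) {
    unsigned char marker;
    size_t seglen;

    if (buf[i] != 0xFF)
      return -1;                        /* expected a marker here */
    marker = buf[i + 1];
    if (marker == 0xFF) {               /* fill byte before a marker */
      i++;
      continue;
    }
    i += 2;
    if (marker == 0xD9)
      return scans;                     /* EOI: end of image */
    if (marker == 0x01 || (marker >= 0xD0 && marker <= 0xD7))
      continue;                         /* markers without a length field */

    if (i + 1 >= len)
      return -1;                        /* truncated length field */
    seglen = ((size_t)buf[i] << 8) | buf[i + 1];
    if (seglen < 2 || seglen > len - i)
      return -1;                        /* segment runs past the buffer */
    i += seglen;

    if (marker == 0xDA) {
      if (++scans > max_scans)
        return -1;                      /* unreasonably many scans */
      /* Skip entropy-coded data up to the next real marker.  0xFF00 is a
       * stuffed data byte and 0xFFD0-0xFFD7 are restart markers. */
      while (i + 1 < len) {
        unsigned char next = buf[i + 1];

        if (buf[i] == 0xFF && next != 0x00 && !(next >= 0xD0 && next <= 0xD7))
          break;
        i++;
      }
    }
  }
  return -1;                            /* ran out of data before EOI */
}
```

The limit is left to the caller here because a reasonable value depends on
the application.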

The PPM reader now throws an error, rather than segfaulting (due to a buffer
overrun) or generating incorrect pixels, if an application attempts to use the
tjLoadImage() function to load a 16-bit binary PPM file (a binary PPM file with
a maximum value greater than 255) into a grayscale image buffer or to load a
16-bit binary PGM file into an RGB image buffer.
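
The check described above can be sketched as follows. This is not the
tjLoadImage() implementation. The function pnm_load_allowed() and its
dst_channels parameter are hypothetical, and the sketch assumes a header
without comment lines:

```c
#include <stdio.h>

/* Hypothetical check, not the tjLoadImage() implementation.  hdr points to a
 * NUL-terminated binary PGM ("P5") or PPM ("P6") header without comments, and
 * dst_channels is 1 for a grayscale buffer or 3 for an RGB buffer.
 * Returns 0 if the load may proceed, -1 otherwise. */
int pnm_load_allowed(const char *hdr, int dst_channels)
{
  char kind;
  unsigned int width, height, maxval;
  int src_channels;

  if (sscanf(hdr, "P%c %u %u %u", &kind, &width, &height, &maxval) != 4)
    return -1;                          /* malformed header */
  if (kind == '5')
    src_channels = 1;                   /* binary PGM */
  else if (kind == '6')
    src_channels = 3;                   /* binary PPM */
  else
    return -1;                          /* not a binary PGM/PPM file */
  if (width == 0 || height == 0 || maxval == 0 || maxval > 65535)
    return -1;

  /* A maximum value above 255 means two bytes per sample.  Refuse the two
   * cross-format loads named above instead of risking a buffer overrun or
   * wrong pixels. */
  if (maxval > 255 && src_channels != dst_channels)
    return -1;
  return 0;
}
```

Failing early on the header keeps the error path separate from the pixel
conversion loops, which then only need to handle supported combinations.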

Fixed an issue in the PPM reader that caused incorrect pixels to be generated
when using the tjLoadImage() function to load a 16-bit binary PPM file into an
extended RGB image buffer.

Fixed an issue whereby, if a JPEG buffer was automatically re-allocated by one
of the TurboJPEG compression or transform functions and an error subsequently
occurred during compression or transformation, the JPEG buffer pointer passed by
the application was not updated when the function returned.
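
The pattern behind this fix can be sketched with a stand-in function. The
name compress_into() and its fail_after_grow parameter are hypothetical and
exist only to simulate a late error. The point is that the caller's pointer
and size are written back on every return path that follows a successful
re-allocation, so the caller never frees or reuses a stale pointer:

```c
#include <stdlib.h>

/* Hypothetical stand-in for a compression call that may enlarge *buf.
 * Returns 0 on success and -1 on error.  In both cases *buf and *size
 * describe the buffer that the caller now owns. */
int compress_into(unsigned char **buf, unsigned long *size,
                  unsigned long needed, int fail_after_grow)
{
  unsigned char *work = *buf;
  unsigned long work_size = *size;
  int ret = 0;

  if (needed > work_size) {
    unsigned char *grown = realloc(work, needed);

    if (grown == NULL)
      return -1;                /* realloc failed: old buffer is still valid */
    work = grown;
    work_size = needed;
  }

  if (fail_after_grow)
    ret = -1;                   /* simulated error after the re-allocation */

  /* Write back unconditionally, including on the error path. */
  *buf = work;
  *size = work_size;
  return ret;
}
```

With this pattern the caller can always free *buf after the call, whether or
not the call succeeded.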

2.0.90 (2.1 beta1)

Significant changes relative to 2.0.6:

The build system, x86-64 SIMD extensions, and accelerated Huffman codec now
support the x32 ABI on Linux, which allows for using x86-64 instructions with
32-bit pointers. The x32 ABI is generally enabled by adding -mx32 to the
compiler flags.

Caveats:

- CMake 3.9.0 or later is required in order for the build system to
  automatically detect an x32 build.
- Java does not support the x32 ABI, and thus the TurboJPEG Java API will
  automatically be disabled with x32 builds.
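
For application code that needs to know whether it was compiled for the x32
ABI, a minimal sketch follows. It relies on the __x86_64__ and __ILP32__
macros that GCC and Clang predefine when -mx32 is in effect. The function
name built_for_x32() is an assumption made for this example and is not part
of libjpeg-turbo.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if this translation unit was compiled for
 * the x32 ABI (x86-64 instruction set with 32-bit pointers), else 0. */
int built_for_x32(void)
{
#if defined(__x86_64__) && defined(__ILP32__)
  return 1;
#else
  return 0;
#endif
}
```

On an x32 build, sizeof(void *) is 4 even though the x86-64 instruction set
is in use.
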

Added Loongson MMI SIMD implementations of the RGB-to-grayscale, 4:2:2 fancy
chroma upsampling, 4:2:2 and 4:2:0 merged chroma upsampling/color conversion,
and fast integer DCT/IDCT algorithms. Relative to libjpeg-turbo 2.0.x, this
speeds up:

- the compression of RGB source images into grayscale JPEG images by
  approximately 20%
- the decompression of 4:2:2 JPEG images by approximately 40-60% when using
  fancy upsampling
- the decompression of 4:2:2 and 4:2:0 JPEG images by approximately 15-20% when
  using merged upsampling
- the compression of RGB source images by approximately 30-45% when using the
  fast integer DCT
- the decompression of JPEG images into RGB destination images by approximately
  2x when using the fast integer IDCT

The overall decompression speedup for RGB images is now approximately 2.3-3.7x
(compared to 2-3.5x with libjpeg-turbo 2.0.x).

32-bit (Armv7 or Armv7s) iOS builds of libjpeg-turbo are no longer supported,
and the libjpeg-turbo build system can no longer be used to package such builds.
32-bit iOS apps cannot run in iOS 11 and later, and the App Store no longer
allows them.

32-bit (i386) OS X/macOS builds of libjpeg-turbo are no longer supported, and
the libjpeg-turbo build system can no longer be used to package such builds.
32-bit Mac applications cannot run in macOS 10.15 "Catalina" and
later, and the App Store no longer allows them.

The SSE2 (x86 SIMD) and C Huffman encoding algorithms have been significantly
optimized, resulting in a measured average overall compression speedup of 12-28%
for 64-bit code and 22-52% for 32-bit code on various Intel and AMD CPUs, as
well as a measured average overall compression speedup of 0-23% on platforms
that do not have a SIMD-accelerated Huffman encoding implementation.

The block smoothing algorithm that is applied by default when decompressing
progressive Huffman-encoded JPEG images has been improved in the following ways:

- The algorithm is now more fault-tolerant. Previously, if a particular scan
  was incomplete, then the smoothing parameters for the incomplete scan would
  be applied to the entire output image, including the parts of the image that
  were generated by the prior (complete) scan. Visually, this had the effect of
  removing block smoothing from lower-frequency scans if they were followed by
  an incomplete higher-frequency scan. libjpeg-turbo now applies block
  smoothing parameters to each iMCU row based on which scan generated the
  pixels in that row, rather than always using the block smoothing parameters
  for the most recent scan.
- When applying block smoothing to DC scans, a Gaussian-like kernel with a 5x5
  window is used to reduce the "blocky" appearance.

Added SIMD acceleration for progressive Huffman encoding on Arm platforms. This
speeds up the compression of full-color progressive JPEGs by about 30-40% on
average (relative to libjpeg-turbo 2.0.x) when using modern Arm CPUs.

Added configure-time and run-time auto-detection of Loongson MMI SIMD
instructions, so that the Loongson MMI SIMD extensions can be included in any
MIPS64 libjpeg-turbo build.

Added fault tolerance features to djpeg and jpegtran, mainly to demonstrate
methods by which applications can guard against the exploits of the JPEG format
described in the report "Two Issues with the JPEG Standard".

- Both programs now accept a -maxscans argument, which can be used to limit the
  number of allowable scans in the input file.
- Both programs now accept a -strict argument, which can be used to treat all
  warnings as fatal.

CMake package config files are now included for both the libjpeg and TurboJPEG
API libraries. This facilitates using libjpeg-turbo with CMake's find_package()
function. For example:

find_package(libjpeg-turbo CONFIG REQUIRED)

add_executable(libjpeg_program libjpeg_program.c)
target_link_libraries(libjpeg_program PUBLIC libjpeg-turbo::jpeg)

add_executable(libjpeg_program_static libjpeg_program.c)
target_link_libraries(libjpeg_program_static PUBLIC
  libjpeg-turbo::jpeg-static)

add_executable(turbojpeg_program turbojpeg_program.c)
target_link_libraries(turbojpeg_program PUBLIC
  libjpeg-turbo::turbojpeg)

add_executable(turbojpeg_program_static turbojpeg_program.c)
target_link_libraries(turbojpeg_program_static PUBLIC
  libjpeg-turbo::turbojpeg-static)

Since the Unisys LZW patent has long expired, cjpeg and djpeg can now read/write
both LZW-compressed and uncompressed GIF files (feature ported from jpeg-6a and
jpeg-9d).

jpegtran now includes the -wipe and -drop options from jpeg-9a and jpeg-9d, as
well as the ability to expand the image size using the -crop option. Refer to
jpegtran.1 or usage.txt for more details.

Added a complete intrinsics implementation of the Arm Neon SIMD extensions, thus
providing SIMD acceleration on Arm platforms for all of the algorithms that are
SIMD-accelerated on x86 platforms. This new implementation is significantly
faster in some cases than the old GAS implementation, depending on the
algorithms used, the type of CPU core, and the compiler. GCC, as of this
writing, does not provide a full or optimal set of Neon intrinsics, so for
performance reasons, the default when building libjpeg-turbo with GCC is to
continue using the GAS implementation of the following algorithms:

- 32-bit RGB-to-YCbCr color conversion
- 32-bit fast and accurate inverse DCT
- 64-bit RGB-to-YCbCr and YCbCr-to-RGB color conversion
- 64-bit accurate forward and inverse DCT
- 64-bit Huffman encoding

A new CMake variable (NEON_INTRINSICS) can be used to override this default.

Since the new intrinsics implementation includes SIMD acceleration for merged
upsampling/color conversion, 1.5.1[5] is no longer necessary and has been
reverted.

The Arm Neon SIMD extensions can now be built using Visual Studio.

The build system can now be used to generate a universal x86-64 + Armv8
libjpeg-turbo SDK package for both iOS and macOS.

Files:
Revision  Action  File
1.22      modify  pkgsrc/graphics/libjpeg-turbo/Makefile
1.7       modify  pkgsrc/graphics/libjpeg-turbo/PLIST
1.17      modify  pkgsrc/graphics/libjpeg-turbo/distinfo
1.1       add     pkgsrc/graphics/libjpeg-turbo/patches/patch-simd_arm__aarch32_jsimd.c
1.1       add     pkgsrc/graphics/libjpeg-turbo/patches/patch-simd_arm__aarch64_jsimd.c
1.2       remove  pkgsrc/graphics/libjpeg-turbo/patches/patch-simd_arm_jsimd.c