Subject: CVS commit: pkgsrc/graphics/libjpeg-turbo
From: Thomas Klausner
Date: 2016-06-14 14:07:58
Message id: 20160614120758.3FC82FBB5@cvs.NetBSD.org

Log Message:
Updated libjpeg-turbo to 1.5.0.

1.5.0
=====

### Significant changes relative to 1.5 beta1:

1. Fixed an issue whereby a malformed motion-JPEG frame could cause the "fast
path" of libjpeg-turbo's Huffman decoder to read from uninitialized memory.

2. Added libjpeg-turbo version and build information to the global string table
of the libjpeg and TurboJPEG API libraries.  This is a common practice in other
infrastructure libraries, such as OpenSSL and libpng, because it makes it easy
to examine an application binary and determine which version of the library the
application was linked against.

3. Fixed a couple of issues in the PPM reader that would cause buffer overruns
in cjpeg if one of the values in a binary PPM/PGM input file exceeded the
maximum value defined in the file's header.  libjpeg-turbo 1.4.2 already
included a similar fix for ASCII PPM/PGM files.  Note that these issues were
not security bugs, since they were confined to the cjpeg program and did not
affect any of the libjpeg-turbo libraries.

4. Fixed an issue whereby attempting to decompress a JPEG file with a corrupt
header using the `tjDecompressToYUV2()` function would cause the function to
abort without returning an error and, under certain circumstances, corrupt the
stack.  This only occurred if `tjDecompressToYUV2()` was called prior to
calling `tjDecompressHeader3()`, or if the return value from
`tjDecompressHeader3()` was ignored (both cases represent incorrect usage of
the TurboJPEG API.)

5. Fixed an issue in the ARM 32-bit SIMD-accelerated Huffman encoder that
prevented the code from assembling properly with clang.

6. The `jpeg_stdio_src()`, `jpeg_mem_src()`, `jpeg_stdio_dest()`, and
`jpeg_mem_dest()` functions in the libjpeg API will now throw an error if a
source/destination manager has already been assigned to the compress or
decompress object by a different function or by the calling program.  This
prevents these functions from attempting to reuse a source/destination manager
structure that was allocated elsewhere, because there is no way to ensure that
it would be big enough to accommodate the new source/destination manager.

1.4.90 (1.5 beta1)
==================

### Significant changes relative to 1.4.2:

1. Added full SIMD acceleration for PowerPC platforms using AltiVec VMX
(128-bit SIMD) instructions.  Although the performance of libjpeg-turbo on
PowerPC was already good, due to the increased number of registers available
to the compiler vs. x86, it was still possible to speed up compression by about
3-4x and decompression by about 2-2.5x (relative to libjpeg v6b) through the
use of AltiVec instructions.

2. Added two new libjpeg API functions (`jpeg_skip_scanlines()` and
`jpeg_crop_scanline()`) that can be used to partially decode a JPEG image.  See
[libjpeg.txt](libjpeg.txt) for more details.

3. The TJCompressor and TJDecompressor classes in the TurboJPEG Java API now
implement the Closeable interface, so those classes can be used with a
try-with-resources statement.

4. The TurboJPEG Java classes now throw unchecked idiomatic exceptions
(IllegalArgumentException, IllegalStateException) for unrecoverable errors
caused by incorrect API usage, and those classes throw a new checked exception
type (TJException) for errors that are passed through from the C library.

5. Source buffers for the TurboJPEG C API functions, as well as the
`jpeg_mem_src()` function in the libjpeg API, are now declared as const
pointers.  This facilitates passing read-only buffers to those functions and
ensures the caller that the source buffer will not be modified.  This should
not create any backward API or ABI incompatibilities with prior libjpeg-turbo
releases.

6. The MIPS DSPr2 SIMD code can now be compiled to support either FR=0 or FR=1
FPUs.

7. Fixed additional negative left shifts and other issues reported by the GCC
and Clang undefined behavior sanitizers.  Most of these issues affected only
32-bit code, and none of them was known to pose a security threat, but removing
the warnings makes it easier to detect actual security issues, should they
arise in the future.

8. Removed the unnecessary `.arch` directive from the ARM64 NEON SIMD code.
This directive was preventing the code from assembling using the clang
integrated assembler.

9. Fixed a regression caused by 1.4.1[6] that prevented 32-bit and 64-bit
libjpeg-turbo RPMs from being installed simultaneously on recent Red Hat/Fedora
distributions.  This was due to the addition of a macro in jconfig.h that
allows the Huffman codec to determine the word size at compile time.  Since
that macro differs between 32-bit and 64-bit builds, this caused a conflict
between the i386 and x86_64 RPMs (any differing files, other than executables,
are not allowed when 32-bit and 64-bit RPMs are installed simultaneously.)
Since the macro is used only internally, it has been moved into jconfigint.h.

10. The x86-64 SIMD code can now be disabled at run time by setting the
`JSIMD_FORCENONE` environment variable to `1` (the other SIMD implementations
already had this capability.)

11. Added a new command-line argument to TJBench (`-nowrite`) that prevents the
benchmark from outputting any images.  This removes any potential operating
system overhead that might be caused by lazy writes to disk and thus improves
the consistency of the performance measurements.

12. Added SIMD acceleration for Huffman encoding on SSE2-capable x86 and x86-64
platforms.  This speeds up the compression of full-color JPEGs by about 10-15%
on average (relative to libjpeg-turbo 1.4.x) when using modern Intel and AMD
CPUs.  Additionally, this works around an issue in the clang optimizer that
prevents it (as of this writing) from achieving the same performance as GCC
when compiling the C version of the Huffman encoder
(<https://llvm.org/bugs/show_bug.cgi?id=16035>).  For the purposes of
benchmarking or regression testing, SIMD-accelerated Huffman encoding can be
disabled by setting the `JSIMD_NOHUFFENC` environment variable to `1`.

13. Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used
compression algorithms (including the slow integer forward DCT and h2v2 & h2v1
downsampling algorithms, which are not accelerated in the 32-bit NEON
implementation.)  This speeds up the compression of full-color JPEGs by about
75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on
Cortex-A53 and Cortex-A57 cores.

14. Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit
and 64-bit platforms.

    For 32-bit code, this speeds up the compression of full-color JPEGs by
about 30% on average on a typical iOS device (iPhone 4S, Cortex-A9) and by
about 6-7% on average on a typical Android device (Nexus 5X, Cortex-A53 and
Cortex-A57), relative to libjpeg-turbo 1.4.x.  Note that the larger speedup
under iOS is due to the fact that iOS builds use LLVM, which does not optimize
the C Huffman encoder as well as GCC does.

    For 64-bit code, NEON-accelerated Huffman encoding speeds up the
compression of full-color JPEGs by about 40% on average on a typical iOS device
(iPhone 5S, Apple A7) and by about 7-8% on average on a typical Android device
(Nexus 5X, Cortex-A53 and Cortex-A57), in addition to the speedup described in
[13] above.

    For the purposes of benchmarking or regression testing, SIMD-accelerated
Huffman encoding can be disabled by setting the `JSIMD_NOHUFFENC` environment
variable to `1`.

15. pkg-config (.pc) scripts are now included for both the libjpeg and
TurboJPEG API libraries on Un*x systems.  Note that if a project's build system
relies on these scripts, then it will not be possible to build that project
with libjpeg or with a prior version of libjpeg-turbo.

16. Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to
improve performance on CPUs with in-order pipelines.  This speeds up the
decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX
processor and by about 15% on average on a Cortex-A53 core.

17. Fixed an issue in the accelerated Huffman decoder that could have caused
the decoder to read past the end of the input buffer when a malformed,
specially-crafted JPEG image was being decompressed.  In prior versions of
libjpeg-turbo, the accelerated Huffman decoder was invoked (in most cases) only
if there were > 128 bytes of data in the input buffer.  However, it is possible
to construct a JPEG image in which a single Huffman block is over 430 bytes
long, so this version of libjpeg-turbo activates the accelerated Huffman
decoder only if there are > 512 bytes of data in the input buffer.

18. Fixed a memory leak in tjunittest encountered when running the program
with the `-yuv` option.

1.4.2
=====

### Significant changes relative to 1.4.1:

1. Fixed an issue whereby cjpeg would segfault if a Windows bitmap with a
negative width or height was used as an input image (Windows bitmaps can have
a negative height if they are stored in top-down order, but such files are
rare and not supported by libjpeg-turbo.)

2. Fixed an issue whereby, under certain circumstances, libjpeg-turbo would
incorrectly encode certain JPEG images when quality=100 and the fast integer
forward DCT were used.  This was known to cause `make test` to fail when the
library was built with `-march=haswell` on x86 systems.

3. Fixed an issue whereby libjpeg-turbo would crash when built with the latest
& greatest development version of the Clang/LLVM compiler.  This was caused by
an x86-64 ABI conformance issue in some of libjpeg-turbo's 64-bit SSE2 SIMD
routines.  Those routines were incorrectly using a 64-bit `mov` instruction to
transfer a 32-bit JDIMENSION argument, whereas the x86-64 ABI allows the upper
(unused) 32 bits of a 32-bit argument's register to be undefined.  The new
Clang/LLVM optimizer uses load combining to transfer multiple adjacent 32-bit
structure members into a single 64-bit register, and this exposed the ABI
conformance issue.

4. Fixed a bug in the MIPS DSPr2 4:2:0 "plain" (non-fancy and non-merged)
upsampling routine that caused a buffer overflow (and subsequent segfault) when
decompressing a 4:2:0 JPEG image whose scaled output width was less than 16
pixels.  The "plain" upsampling routines are normally only used when
decompressing a non-YCbCr JPEG image, but they are also used when decompressing
a JPEG image whose scaled output height is 1.

5. Fixed various negative left shifts and other issues reported by the GCC and
Clang undefined behavior sanitizers.  None of these was known to pose a
security threat, but removing the warnings makes it easier to detect actual
security issues, should they arise in the future.

Files:
RevisionActionfile
1.14modifypkgsrc/graphics/libjpeg-turbo/Makefile
1.5modifypkgsrc/graphics/libjpeg-turbo/PLIST
1.11modifypkgsrc/graphics/libjpeg-turbo/distinfo
1.5modifypkgsrc/graphics/libjpeg-turbo/patches/patch-aa
1.2modifypkgsrc/graphics/libjpeg-turbo/patches/patch-simd_jsimd__arm.c