Subject: CVS commit: pkgsrc/parallel/pocl
From: Thomas Klausner
Date: 2021-11-20 21:05:37
Message id: 20211120200537.455F1FAEC@cvs.NetBSD.org

Log Message:
pocl: update to 1.8.

Notable User Facing Changes
---------------------------

- support for LLVM 13
- CMake: Inter-Procedural Optimization is enabled for the runtime library code
  (libpocl.so is compiled with -flto on systems that support it).
- LTTng tracing improved - more command types are traced, and also
  some synchronous API calls (like clCreateBuffer) are traced.
- poclcc, tests and examples can be disabled with CMake options
- Valgrind support improved by making Valgrind aware of pocl's
  reference counting of cl_* objects
- kernels which are called by kernels are now force-inlined
- Support for NetBSD.
- Support for Unix systems without libdl.
- PoCL can now (optionally) respond to SIGUSR2 by printing
  some live debug information.
- improved SPIR support for CUDA devices
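
The SIGUSR2 item above follows a common Unix pattern: install a handler
that dumps a snapshot of runtime state when the signal arrives. The
sketch below is a minimal Python illustration of that pattern only, not
PoCL's implementation; the names live_stats, dump_log and
dump_debug_info are hypothetical, and the counters are made-up
placeholders.

```python
import os
import signal

# Hypothetical counters standing in for runtime state; not PoCL's real data.
live_stats = {"queued_commands": 3, "live_buffers": 2}
dump_log = []  # keeps every snapshot so callers can inspect what was printed


def dump_debug_info(signum, frame):
    # Build a one-line snapshot of the current counters and print it.
    line = "debug: " + ", ".join(
        f"{name}={value}" for name, value in sorted(live_stats.items())
    )
    dump_log.append(line)
    print(line)


# Install the handler (Unix only; must be done from the main thread).
signal.signal(signal.SIGUSR2, dump_debug_info)

# Send the signal to ourselves, as `kill -USR2 <pid>` would from a shell.
os.kill(os.getpid(), signal.SIGUSR2)
```

From another terminal the same dump would be requested with
kill -USR2 followed by the process id.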

Notable Bug Fixes
-----------------

- Fixed a potential crash on Unix systems without sysfs mounted.
- Fixed compilation errors when building on macOS.
  - Fixed POCL_FAST_INIT macro; POCL_INIT_LOCK must be invoked with only one
    argument.
  - Fix bin/poclcc to not depend on OpenCL 2.0 symbols
- Fixed miscompilation in kernel loops with multiple conditionals with barriers
  in them.

Other
-----
- Add cmake options PARALLEL_COMPILE_JOBS, PARALLEL_LINK_JOBS to
  use ninja's separate compile and link job pools.

- Improve memory architecture, buffer migration and allocation.
  Buffers are now allocated on a device when first used
  (previously each buffer was allocated on every device in context).

- the single global LLVMContext was replaced with
  multiple LLVMContexts, one per OpenCL cl_context.
  OpenCL code can now be compiled in parallel
  when using separate cl_contexts. This feature
  is disabled by default since it significantly slowed
  down PyOpenCL. This should be resolved in the future by moving
  LLVM compilation into its own threads.

- a new OpenCL extension was added to PoCL: cl_pocl_content_size.
  The extension allows the user to give an optimization hint to PoCL,
  which will be used internally by PoCL to optimize buffer transfers
  between multiple devices.
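
The allocate-on-first-use change in the memory architecture item above
can be pictured with a toy model. This is a sketch under stated
assumptions, not PoCL's code: the LazyBuffer class, its use_on method
and the device names "cpu" and "gpu" are all hypothetical, and
migration is reduced to a plain byte copy.

```python
class LazyBuffer:
    """Toy model of a buffer that gets device storage only on first use."""

    def __init__(self, size):
        self.size = size
        self.allocations = {}  # device name -> bytearray, filled in lazily
        self.latest = None     # device holding the most recent contents

    def use_on(self, device):
        # Allocate storage on this device only the first time it is used there.
        if device not in self.allocations:
            self.allocations[device] = bytearray(self.size)
        # Migrate the contents if another device holds the newest copy.
        if self.latest is not None and self.latest != device:
            self.allocations[device][:] = self.allocations[self.latest]
        self.latest = device
        return self.allocations[device]


buf = LazyBuffer(16)
created_empty = dict(buf.allocations)  # nothing is allocated at creation time

cpu_view = buf.use_on("cpu")  # first use on "cpu" allocates there
cpu_view[0] = 42
gpu_view = buf.use_on("gpu")  # first use on "gpu" allocates and migrates
```

A device that never touches the buffer never gets an entry in
allocations, which is the saving over allocating on every device in
the context up front.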

Files:
Revision  Action  File
1.6       modify  pkgsrc/parallel/pocl/Makefile
1.6       modify  pkgsrc/parallel/pocl/distinfo
1.2       modify  pkgsrc/parallel/pocl/patches/patch-lib_CL_devices_devices.c
1.2       remove  pkgsrc/parallel/pocl/patches/patch-CMakeLists.txt
1.2       remove  pkgsrc/parallel/pocl/patches/patch-config.h.in.cmake
1.2       remove  pkgsrc/parallel/pocl/patches/patch-lib_CL_devices_basic_basic.c
1.2       remove  pkgsrc/parallel/pocl/patches/patch-lib_CL_devices_common.c
1.2       remove  pkgsrc/parallel/pocl/patches/patch-lib_CL_devices_cpuinfo.c
1.2       remove  pkgsrc/parallel/pocl/patches/patch-lib_CL_devices_hsa_pocl-hsa.c
1.2       remove  pkgsrc/parallel/pocl/patches/patch-lib_CL_devices_pthread_pthread.c
1.2       remove  pkgsrc/parallel/pocl/patches/patch-lib_CL_pocl__timing.c