Log message:
math/py-numba: update to 0.58.1
This is the first version I tested with the revived py-llvmlite. This version
works with Python versions below 3.12 so far. Upstream changes since 0.55.2:
Version 0.58.1 (17 October 2023)
This is a maintenance release that adds support for NumPy 1.26 and fixes a bug.
NumPy Support
Support NumPy 1.26
Support for NumPy 1.26 is added.
(PR-#9227)
Bug Fixes
Fixed handling of float default arguments in inline closures
Float default arguments in inline closures would produce incorrect results since \
updates for Python 3.11 - these are now handled correctly again.
(PR-#9222)
Pull-Requests
PR #9220: Support passing arbitrary flags to NVVM (gmarkall)
PR #9227: Support NumPy 1.26 (PR aimed at review / merge) (Tialo gmarkall)
PR #9228: Fix #9222 - Don’t replace . with _ in func arg names in inline \
closures (gmarkall)
Authors
gmarkall
Tialo
Version 0.58.0 (20 September 2023)
This is a major Numba release. Numba now uses towncrier to create the release \
notes, so please find a summary of all noteworthy items below.
Highlights
Added towncrier
This PR adds towncrier as a GitHub workflow for checking release notes. From \
this PR onwards every PR made in Numba will require an appropriate release note \
associated with it. The reviewer may decide to skip adding release notes in \
smaller PRs with minimal impact by addition of a skip_release_notes label to the \
PR.
(PR-#8792)
The minimum supported NumPy version is 1.22.
Following NEP-0029, the minimum supported NumPy version is now 1.22.
(PR-#9093)
Add support for NumPy 1.25
Extend Numba to support new and changed features released in NumPy 1.25.
(PR-#9011)
Remove NVVM 3.4 and CTK 11.0 / 11.1 support
Support for CUDA toolkits < 11.2 is removed.
(PR-#9040)
Removal of Windows 32-bit Support
From this release onwards, Numba has discontinued support for Windows 32-bit \
operating systems.
(PR-#9083)
The minimum llvmlite version is now 0.41.0.
The minimum required version of llvmlite is now version 0.41.0.
(PR-#8916)
Added RVSDG-frontend
This PR is preliminary work on adding an RVSDG-frontend for processing \
bytecode. RVSDG (Regionalized Value-State Dependence Graph) allows us to have a \
dataflow-centric view instead of a traditional SSA-CFG view. This allows us to \
simplify the compiler in the future.
(PR-#9012)
New Features
numba.experimental.jitclass gains support for __*matmul__ methods.
numba.experimental.jitclass now has support for the following methods:
__matmul__
__imatmul__
__rmatmul__
(PR-#8892)
numba.experimental.jitclass gains support for reflected “dunder” methods.
numba.experimental.jitclass now has support for the following methods:
__radd__
__rand__
__rfloordiv__
__rlshift__
__ror__
__rmod__
__rmul__
__rpow__
__rrshift__
__rsub__
__rtruediv__
__rxor__
(PR-#8906)
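As a plain-Python illustration of what the reflected methods listed above do, here is a minimal sketch. The Meters class and its value attribute are hypothetical, and the class is left undecorated so that it runs without Numba; the assumption is that a numba.experimental.jitclass defining the same methods is dispatched to in the same way.

```python
class Meters:
    """Hypothetical value type; not part of Numba."""

    def __init__(self, value):
        self.value = value

    def __rsub__(self, other):
        # Called for `other - self` when `other` (e.g. an int) does not
        # know how to subtract a Meters instance.
        return Meters(other - self.value)

    def __rmul__(self, other):
        # Called for `other * self` under the same condition.
        return Meters(other * self.value)


difference = 10 - Meters(4)   # int.__sub__ gives up, Meters.__rsub__ runs
product = 3 * Meters(2)       # int.__mul__ gives up, Meters.__rmul__ runs
```

The reflected variant only runs when the left operand's own method returns NotImplemented, which is why these methods matter for expressions that start with a plain number.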
Add support for value max to NUMBA_OPT.
The optimisation level that Numba applies when compiling can be set through the \
environment variable NUMBA_OPT. This has historically been a value between 0 and \
3 (inclusive). Support for the value max has now been added; this is a \
Numba-specific optimisation level which indicates that the user would like Numba \
to try running the most optimisation possible, potentially trading a longer \
compilation time for better run-time performance. In practice, use of the max \
level of optimisation may or may not benefit the run-time or compile-time \
performance of user code, but it has been added to present an easy to access \
option for users to try if they so wish.
(PR-#9094)
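A minimal sketch of opting in to the new level from Python follows. It assumes that Numba reads NUMBA_OPT when it is first imported, so the variable is set beforehand; sum_of_squares is a hypothetical function, and the pass-through fallback only exists so the sketch also runs where Numba is not installed.

```python
import os

# Assumption: the environment is consulted when numba is first imported,
# so the variable has to be set before that import happens.
os.environ["NUMBA_OPT"] = "max"

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without Numba.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    # Hypothetical example function.
    total = 0
    for i in range(n):
        total += i * i
    return total


result = sum_of_squares(10)
```

Setting the variable in the shell (NUMBA_OPT=max python script.py) should be equivalent. As the note above says, whether it helps is workload dependent.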
Improvements
Updates to numba.core.pythonapi.
Support for Python C-API functions PyBytes_AsString and PyBytes_AsStringAndSize \
is added to numba.core.pythonapi.PythonAPI as bytes_as_string and \
bytes_as_string_and_size methods respectively.
(PR-#8462)
Support for isinstance is now non-experimental.
Support for the isinstance built-in function has moved from being considered an \
experimental feature to a fully supported feature.
(PR-#8911)
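A small sketch of isinstance inside a compiled function follows. The function name classify is hypothetical, and the pass-through fallback is only there so the example runs without Numba installed.

```python
try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without Numba.
    def njit(func):
        return func


@njit
def classify(x):
    # Hypothetical example: return a small integer code per argument type.
    if isinstance(x, int):
        return 1
    elif isinstance(x, float):
        return 2
    return 0


int_code = classify(3)
float_code = classify(2.5)
```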
NumPy Support
All modes are supported in numpy.correlate and numpy.convolve.
All values for the mode argument to numpy.correlate and numpy.convolve are now \
supported.
(PR-#7543)
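For readers unfamiliar with the mode argument, the following pure-Python sketch shows which slice of the full convolution each mode keeps. It is a reference illustration only, not Numba's or NumPy's implementation, and convolve_reference is a hypothetical name.

```python
def convolve_reference(a, v, mode="full"):
    """Hypothetical pure-Python reference for 1-D convolution modes."""
    if not a or not v:
        raise ValueError("inputs must be non-empty")
    if len(a) < len(v):
        a, v = v, a  # convolution is commutative; keep the longer one first
    n_long, n_short = len(a), len(v)

    # Full convolution: every position with any overlap at all.
    full = [0.0] * (n_long + n_short - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(v):
            full[i + j] += x * y

    if mode == "full":
        return full
    if mode == "same":
        # Central part, as long as the longer input.
        start = (n_short - 1) // 2
        return full[start:start + n_long]
    if mode == "valid":
        # Only positions where the inputs overlap completely.
        start = n_short - 1
        return full[start:start + n_long - n_short + 1]
    raise ValueError("mode must be 'full', 'same' or 'valid'")


full_result = convolve_reference([1, 2, 3], [0, 1, 0.5], "full")
same_result = convolve_reference([1, 2, 3], [0, 1, 0.5], "same")
valid_result = convolve_reference([1, 2, 3], [0, 1, 0.5], "valid")
```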
@vectorize accommodates arguments implementing __array_ufunc__.
Universal functions (ufuncs) created with numba.vectorize will now respect \
arguments implementing __array_ufunc__ (NEP-13) to allow pre- and \
post-processing of arguments and return values when the ufunc is called from the \
interpreter.
(PR-#8995)
Added support for np.geomspace function.
This PR improves on #4074 by adding support for np.geomspace. The current \
implementation only supports scalar start and stop parameters.
(PR-#9068)
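To illustrate the scalar start/stop case, here is a pure-Python sketch of geometric spacing. It assumes positive scalar endpoints, and geomspace_reference is a hypothetical name; it is not the code Numba or NumPy uses, and results may differ from np.geomspace in the last few bits.

```python
import math


def geomspace_reference(start, stop, num=50):
    """Hypothetical reference: num points with a constant ratio between
    neighbours, from start to stop (both assumed positive scalars)."""
    if num == 1:
        return [float(start)]
    ratio = stop / start
    return [start * ratio ** (i / (num - 1)) for i in range(num)]


points = geomspace_reference(1.0, 1000.0, num=4)
```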
Added support for np.vsplit, np.hsplit, np.dsplit.
This PR improves on #4074 by adding support for np.vsplit, np.hsplit, and np.dsplit.
(PR-#9082)
Added support for np.row_stack function.
Support is added for numpy.row_stack.
(PR-#9085)
Added support for functions np.polynomial.polyutils.trimseq, as well as \
functions polyadd, polysub, polymul from np.polynomial.polynomial.
Support is added for np.polynomial.polyutils.trimseq, \
np.polynomial.polynomial.polyadd, np.polynomial.polynomial.polysub, \
np.polynomial.polynomial.polymul.
(PR-#9087)
Added support for np.diagflat function.
Support is added for numpy.diagflat.
(PR-#9113)
Added support for np.resize function.
Support is added for numpy.resize.
(PR-#9118)
Add np.trim_zeros
Support for np.trim_zeros() is added.
(PR-#9074)
CUDA Changes
Bitwise operation ufunc support for the CUDA target.
Support is added for some ufuncs associated with bitwise operation on the CUDA \
target. Namely:
numpy.bitwise_and
numpy.bitwise_or
numpy.bitwise_not
numpy.bitwise_xor
numpy.invert
numpy.left_shift
numpy.right_shift
(PR-#8974)
Add support for the latest CUDA driver codes.
Support is added for the latest set of CUDA driver codes.
(PR-#8988)
Add NumPy comparison ufunc in CUDA
This PR adds support for comparison ufuncs for the CUDA target (e.g. \
numpy.greater, numpy.greater_equal, numpy.less_equal, etc.).
(PR-#9007)
Report absolute path of libcuda.so on Linux
numba -s now reports the absolute path to libcuda.so on Linux, to aid \
troubleshooting driver issues, particularly on WSL2 where a Linux driver can \
incorrectly be installed in the environment.
(PR-#9034)
Add debuginfo support to nvdisasm output.
Support is added for debuginfo (source line and inlining information) in \
functions that make calls through nvdisasm. For example the CUDA dispatcher \
.inspect_sass method output is now augmented with this information.
(PR-#9035)
Add CUDA SASS CFG Support
This PR adds support for getting the SASS CFG in dot language format. It adds an \
inspect_sass_cfg() method to CUDADispatcher and the -cfg flag to the nvdisasm \
command line tool.
(PR-#9051)
Support NVRTC using the ctypes binding
NVRTC can now be used when the ctypes binding is in use, enabling float16, and \
linking CUDA C / C++ sources without needing the NVIDIA CUDA Python bindings.
(PR-#9086)
Fix CUDA atomics tests with toolkit 12.2
CUDA 12.2 generates slightly different PTX for some atomics, so the relevant \
tests are updated to look for the correct instructions when 12.2 is used.
(PR-#9088)
Bug Fixes
Handling of different sized unsigned integer indexes is fixed in \
numba.typed.List.
An issue with the order of truncation/extension and casting of unsigned integer \
indexes in numba.typed.List has been fixed.
(PR-#7262)
Prevent invalid fusion
This PR fixes an issue in which an array first read in a parfor and later \
written in the same parfor would only be classified as used in the parfor. When \
a subsequent parfor also used the same array then fusion of the parfors was \
happening, which should have been forbidden given that the first parfor was \
also writing to the array. This PR treats such arrays in a parfor as being both \
used and defined so that fusion will be prevented.
(PR-#7582)
The numpy.allclose implementation now correctly handles default arguments.
The implementation of numpy.allclose is corrected to use TypingError to report \
typing errors.
(PR-#8885)
Add type validation to numpy.isclose.
Type validation is added to the implementation of numpy.isclose.
(PR-#8944)
Fix support for overloading dispatcher with non-compatible first-class functions
Fixes an error caused by not handling compilation error during casting of \
Dispatcher objects into first-class functions. With the fix, users can now \
overload a dispatcher with non-compatible first-class functions. Refer to \
https://github.com/numba/numba/issues/9071 for details.
(PR-#9072)
Support dtype keyword argument in numpy.arange with parallel=True
Fixes parfors transformation to support the use of dtype keyword argument in \
numpy.arange(..., dtype=dtype).
(PR-#9095)
Fix all @overloads to use parameter names that match public APIs.
Some of the Numba @overloads for functions in NumPy and Python’s built-ins \
were written using parameter names that did not match those used in the API they \
were overloading. As a result, calling a function with such a \
mismatch using the parameter names as keyword arguments at the call site would \
result in a compilation error. This has now been universally fixed throughout \
the code base and a unit test is running with a best-effort attempt to prevent \
reintroduction of similar mistakes in the future. Fixed functions include:
From Python built-ins:
complex
From the Python random module:
random.seed
random.gauss
random.normalvariate
random.randrange
random.randint
random.uniform
random.shuffle
From the numpy module:
numpy.argmin
numpy.argmax
numpy.array_equal
numpy.average
numpy.count_nonzero
numpy.flip
numpy.fliplr
numpy.flipud
numpy.iinfo
numpy.isscalar
numpy.imag
numpy.real
numpy.reshape
numpy.rot90
numpy.swapaxes
numpy.union1d
numpy.unique
From the numpy.linalg module:
numpy.linalg.norm
numpy.linalg.cond
numpy.linalg.matrix_rank
From the numpy.random module:
numpy.random.beta
numpy.random.chisquare
numpy.random.f
numpy.random.gamma
numpy.random.hypergeometric
numpy.random.lognormal
numpy.random.pareto
numpy.random.randint
numpy.random.random_sample
numpy.random.ranf
numpy.random.rayleigh
numpy.random.sample
numpy.random.shuffle
numpy.random.standard_gamma
numpy.random.triangular
numpy.random.weibull
(PR-#9099)
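A sketch of the kind of call site this fix concerns follows, using keyword names from the public Python random API. The function draw is hypothetical; the pass-through fallback only exists so the example runs without Numba, and with a Numba older than this release the keyword calls are expected to fail to compile.

```python
import random

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without Numba.
    def njit(func):
        return func


@njit
def draw():
    # Hypothetical example: keyword names match the public random API.
    random.seed(a=1234)
    die = random.randint(a=1, b=6)
    fraction = random.uniform(a=0.0, b=1.0)
    return die, fraction


die, fraction = draw()
```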
Changes
Support for @numba.extending.intrinsic(prefer_literal=True)
In the high level extension API, the prefer_literal option is added to the \
numba.extending.intrinsic decorator to prioritize the use of literal types when \
available. This has the same behavior as the prefer_literal option in the \
numba.extending.overload decorator.
(PR-#6647)
Deprecations
Deprecation of old-style NUMBA_CAPTURED_ERRORS
Added a deprecation schedule for NUMBA_CAPTURED_ERRORS=old_style. \
NUMBA_CAPTURED_ERRORS=new_style will become the default in future releases. \
Details are documented at \
https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-old-style-numba-captured-errors
(PR-#9090)
Pull-Requests
PR #6647: Support prefer_literal option for intrinsic decorator \
(ashutoshvarma sklam)
PR #7262: fix order of handling and casting (esc)
PR #7543: Support for all modes in np.correlate and np.convolve (jeertmans)
PR #7582: Use get_parfor_writes to detect illegal array access that prevents \
fusion. (DrTodd13)
PR #8371: Added binomial distribution (esc kc611)
PR #8462: Add PyBytes_AsString and PyBytes_AsStringAndSize (ianna)
PR #8633: DOC: Convert vectorize and guvectorize examples to doctests (Matt711)
PR #8730: Update dev-docs (sgbaird esc)
PR #8792: Added towncrier as a github workflow (kc611)
PR #8854: Updated mk_alloc to support Numba-Dpex compute follows data. \
(mingjie-intel)
PR #8861: CUDA: Don’t add device kwarg for jit registry (gmarkall)
PR #8871: Don’t return the function in CallConv.decorate_function() (gmarkall)
PR #8885: Fix np.allclose not handling default args (guilhermeleobas)
PR #8892: Add support for __*matmul__ methods in jitclass (louisamand)
PR #8895: CUDA: Enable caching functions that use CG (gmarkall)
PR #8906: Add support for reflected dunder methods in jitclass (louisamand)
PR #8911: Remove isinstance experimental feature warning (guilhermeleobas)
PR #8916: Bump llvmlite requirement to 0.41.0dev0 (sklam)
PR #8925: Update release checklist template (sklam)
PR #8937: Remove old Website development documentation (esc gmarkall)
PR #8944: Add exceptions to np.isclose (guilhermeleobas)
PR #8974: CUDA: Add binary ufunc support (Matt711)
PR #8976: Fix index URL for ptxcompiler/cubinlinker packages. (bdice)
PR #8978: Import MVC packages when using MVCLinker. (bdice)
PR #8983: Fix typo in deprecation.rst (dsgibbons)
PR #8988: support for latest CUDA driver codes #8363 (s1Sharp)
PR #8995: Allow libraries that implement __array_ufunc__ to override \
DUFunc.__c… (jpivarski)
PR #9007: CUDA: Add comparison ufunc support (Matt711)
PR #9012: RVSDG-frontend (sklam)
PR #9021: update the release checklist following 0.57.1rc1 (esc)
PR #9022: fix: update the C++ ABI repo reference (emmanuel-ferdman)
PR #9028: Replace use of imp module removed in 3.12 (hauntsaninja)
PR #9034: CUDA libs test: Report the absolute path of the loaded libcuda.so \
on Linux, + other improvements (gmarkall)
PR #9035: CUDA: Allow for debuginfo in nvdisasm output (Matt711)
PR #9037: Recognize additional functions as being pure or not having side \
effects. (DrTodd13)
PR #9039: Correct git clone link in installation instructions. (ellifteria)
PR #9040: Remove NVVM 3.4 and CTK 11.0 / 11.1 support (gmarkall)
PR #9046: copy the change log changes for 0.57.1 to main (esc)
PR #9050: Update CODEOWNERS (sklam)
PR #9051: Add CUDA CFG support (Matt711)
PR #9056: adding weekly meeting notes script (esc)
PR #9068: Adding np.geomspace (KrisMinchev)
PR #9069: Fix towncrier error due to importlib_resources upgrade (sklam)
PR #9072: Fix support for overloading dispatcher with non-compatible \
first-class functions (gmarkall sklam)
PR #9074: Add np.trim_zeros (sungraek guilhermeleobas)
PR #9082: Add np.vsplit, np.hsplit, and np.dsplit (KrisMinchev)
PR #9083: Removed windows 32 references from code and documentation (kc611)
PR #9085: Add tests for np.row_stack (KrisMinchev)
PR #9086: Support NVRTC using ctypes binding (testhound gmarkall)
PR #9087: Add trimseq from np.polynomial.polyutils and polyadd, polysub, \
polymul from np.polynomial.polynomial (KrisMinchev)
PR #9088: Fix: Issue 9063 - CUDA atomics tests failing with CUDA 12.2 (gmarkall)
PR #9090: Add deprecation notice for old_style error capturing. (esc sklam)
PR #9094: Add support for a ‘max’ level to NUMBA_OPT environment \
variable. (stuartarchibald)
PR #9095: Support dtype keyword in arange_parallel_impl (DrTodd13 sklam)
PR #9105: NumPy 1.25 support (PR #9011) continued (gmarkall apmasell)
PR #9111: Fixes ReST syntax error in PR#9099 (stuartarchibald gmarkall sklam \
apmasell)
PR #9112: Fixups for PR#9100 (stuartarchibald sklam)
PR #9113: Add support for np.diagflat (KrisMinchev)
PR #9114: update np min to 122 (stuartarchibald esc)
PR #9117: Fixed towncrier template rendering (kc611)
PR #9118: Add support for np.resize() (KrisMinchev)
PR #9120: Update conda-recipe for numba-rvsdg (sklam)
PR #9127: Fix accidental cffi test deps, refactor cffi skipping (gmarkall)
PR #9128: Merge rvsdg_frontend branch to main (esc sklam)
PR #9152: Fix old_style error capturing deprecation warnings (sklam)
PR #9159: Fix uncaught exception in find_file() (gmarkall)
PR #9173: Towncrier fixups (Continue #9158 and retarget to main branch) (sklam)
PR #9181: Remove extra decrefs in RNG (sklam)
PR #9190: Fix issue with incompatible multiprocessing context in test. \
(stuartarchibald)
Authors
apmasell
ashutoshvarma
bdice
DrTodd13
dsgibbons
ellifteria
emmanuel-ferdman
esc
gmarkall
guilhermeleobas
hauntsaninja
ianna
jeertmans
jpivarski
jtilly
kc611
KrisMinchev
louisamand
Matt711
mingjie-intel
s1Sharp
sgbaird
sklam
stuartarchibald
sungraek
testhound
Version 0.57.1 (21 June, 2023)
Pull-Requests:
PR #8964: fix missing nopython keyword in cuda random module (esc)
PR #8965: fix return dtype for np.angle (guilhermeleobas esc)
PR #8982: Don’t do the parfor diagnostics pass for the parfor gufunc. \
(DrTodd13)
PR #8996: adding a test for 8940 (esc)
PR #8958: resurrect the import, this time in the registry initialization (esc)
PR #8947: Introduce internal _isinstance_no_warn (guilhermeleobas esc)
PR #8998: Fix 8939 (second attempt) (esc)
PR #8978: Import MVC packages when using MVCLinker. (bdice)
PR #8895: CUDA: Enable caching functions that use CG (gmarkall)
PR #8976: Fix index URL for ptxcompiler/cubinlinker packages. (bdice)
PR #9004: Skip MVC test when libraries unavailable (gmarkall esc)
PR #9006: link to version support table instead of using explicit versions (esc)
PR #9005: Fix: Issue #8923 - avoid spurious device-to-host transfers in CUDA \
ufuncs (gmarkall)
Authors:
bdice
DrTodd13
esc
gmarkall
Version 0.57.0 (1 May, 2023)
This release continues to add new features, bug fixes and stability improvements \
to Numba. Please note that this release contains a significant number of both \
deprecation and pending-deprecation notices with a view to making it easier to \
develop new technology for Numba in the future. Also note that this will be the \
last release to support Windows 32-bit packages produced by the Numba team.
Highlights of core dependency upgrades:
Support for Python 3.11 (minimum is moved to 3.8)
Support for NumPy 1.24 (minimum is moved to 1.21)
Python language support enhancements:
Exception classes now support arguments that are not compile time constant.
The built-in functions hasattr and getattr are supported for compile time \
constant attributes.
The built-in functions str and repr are now implemented similarly to their \
Python implementations. Custom __str__ and __repr__ functions can be associated \
with types and work as expected.
Numba’s unicode functionality in str.startswith now supports kwargs start \
and end.
min and max now support boolean types.
Support is added for the dict(iterable) constructor.
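Some of the language enhancements above can be exercised together, as in the following sketch. The function language_features is hypothetical, and the pass-through fallback is only there so the example runs without Numba; note that str.startswith takes start and end positionally, as in CPython.

```python
try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without Numba.
    def njit(func):
        return func


@njit
def language_features(flag_a, flag_b):
    # Hypothetical example combining several items from the list above.
    version = "numba-0.57.0"
    # startswith with start and end positions (characters 6..9 are "0.57").
    has_minor = version.startswith("0.57", 6, 10)
    # min and max on boolean values.
    any_flag = max(flag_a, flag_b)
    all_flags = min(flag_a, flag_b)
    # dict built from an iterable of key/value pairs.
    lookup = dict([(1, 10.0), (2, 20.0)])
    return has_minor, any_flag, all_flags, lookup[2]


has_minor, any_flag, all_flags, value = language_features(True, False)
```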
NumPy features/enhancements:
The largest set of new features is within the numpy.random.Generator \
support: the vast majority of commonly used distributions are now supported, \
namely:
Generator.beta
Generator.chisquare
Generator.exponential
Generator.f
Generator.gamma
Generator.geometric
Generator.integers
Generator.laplace
Generator.logistic
Generator.lognormal
Generator.logseries
Generator.negative_binomial
Generator.noncentral_chisquare
Generator.noncentral_f
Generator.normal
Generator.pareto
Generator.permutation
Generator.poisson
Generator.power
Generator.random
Generator.rayleigh
Generator.shuffle
Generator.standard_cauchy
Generator.standard_exponential
Generator.standard_gamma
Generator.standard_normal
Generator.standard_t
Generator.triangular
Generator.uniform
Generator.wald
Generator.weibull
Generator.zipf
The nbytes property on NumPy ndarray types is implemented.
Nesting of nested-array types is now supported.
datetime and timedelta types can be cast to int.
F-order iteration is supported in ufunc generation for increased performance \
when using combinations of predominantly F-order arrays.
The following functions are also now supported:
np.argpartition
np.isclose
np.nan_to_num
np.newaxis
np.union1d
Highlights of core changes:
A large amount of refactoring has taken place to convert many of Numba’s \
internal implementations, of both Python and NumPy functions, from the low-level \
extension API to the high-level extension API (numba.extending).
The __repr__ method is supported for Numba types.
The default target for applicable functions in the extension API \
(numba.extending) is now "generic". This means that @overload* and \
@intrinsic functions will by default be accepted by both the CPU and CUDA \
targets.
The use of __getitem__ on Numba types is now supported in compiled code, \
i.e. types.float64[:, ::1] is now compilable.
Performance:
The performance of str.find() and str.rfind() has been improved.
Unicode support for __getitem__ now avoids allocation and returns a view.
The numba.typed.Dict dictionary now accepts an n_keys option to enable \
allocating the dictionary instance to a predetermined initial size (useful to \
avoid resizes!).
The Numba Run-time (NRT) has been improved in terms of performance and safety:
The NRT internal statistics counters are now off by default (removes \
atomic lock contentions).
Debug cache line filling is off by default.
The NRT is only compiled once a compilation starts, as opposed to at \
function decoration time; this improves import speed.
The NRT allocation calls are all made through a “checked” layer by \
default.
CUDA:
New NVIDIA hardware and software compatibility / support:
Toolkits: CUDA 11.8 and 12, with Minor Version Compatibility for 11.x.
Packaging: NVIDIA-packaged CUDA toolkit conda packages.
Hardware: Hopper, Ada Lovelace, and AGX Orin.
float16 support:
Arithmetic operations are now fully supported.
A new method, is_fp16_supported(), and device property, \
supports_float16, for checking the availability of float16 support.
Functionality:
The high-level extension API is now fully-supported in the CUDA target.
Eager compilation of multiple signatures, multiple outputs from \
generalized ufuncs, and specifying the return type of ufuncs are now supported.
A limited set of NumPy ufuncs (trigonometric functions) can now be \
called inside kernels.
Lineinfo quality improvement: enabling lineinfo no longer results in any \
changes to generated code.
Deprecations:
The numba.pycc module and everything in it is now pending deprecation.
The long-awaited full deprecation of object mode fall-back is underway. This \
change means @jit with no keyword arguments will eventually alias @njit.
The @generated_jit decorator is deprecated as the Numba extension API \
provides a better supported superset of the same functionality, particularly \
through @numba.extending.overload.
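Since a bare @jit is expected to become an alias of @njit, code that relies on nopython mode can state that explicitly today. A minimal sketch follows; the triangle functions are hypothetical, and the _passthrough fallback only exists so the example runs without Numba.

```python
try:
    from numba import jit, njit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without Numba.
    # Handles both the bare form (@njit) and the called form (@jit(...)).
    def _passthrough(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func

    jit = njit = _passthrough


@jit(nopython=True)  # explicit: no reliance on object mode fall-back
def triangle_explicit(n):
    total = 0
    for i in range(n + 1):
        total += i
    return total


@njit  # shorthand for the same request
def triangle_short(n):
    total = 0
    for i in range(n + 1):
        total += i
    return total


explicit_result = triangle_explicit(4)
short_result = triangle_short(4)
```

Both spellings request nopython compilation, so neither should be affected when the fall-back is finally removed.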
Version support/dependency changes:
The setuptools package is now an optional run-time dependency as opposed to a \
required run-time dependency.
The TBB threading-layer now requires version 2021.6 or later.
LLVM 14 is now supported on all platforms via llvmlite.
Pull-Requests:
PR #5113: Fix error handling in the Interval extending example (esc eric-wieser)
PR #5544: Add support for np.union1d (shangbol gmarkall)
PR #7009: Add writable args (dmbelov)
PR #7067: Implement np.isclose (guilhermeleobas)
PR #7255: CUDA: Support CUDA Toolkit conda packages from NVIDIA (gmarkall)
PR #7622: Support fortran loop ordering for ufunc generation (sklam)
PR #7733: fix for /tmp/tmp access issues (ChiCheng45)
PR #7884: Implement getattr builtin. (stuartarchibald)
PR #7885: Adds CUDA FP16 arithmetic operators (testhound)
PR #7920: Drop pre-3.7 code path (CPU only) (sklam)
PR #8001: CUDA fp16 math functions (testhound gmarkall)
PR #8010: Add support for fp16 comparison native operators (testhound)
PR #8024: Allow converting NumPy datetimes to int (apmasell)
PR #8038: Support for Numpy BitGenerators PR#2: Standard Distributions \
support (kc611)
PR #8040: Support for Numpy BitGenerators PR#3: Advanced Distributions \
Support. (kc611)
PR #8041: Support for Numpy BitGenerators PR#4: Generator().integers() \
Support. (kc611)
PR #8042: Support for NumPy BitGenerators PR#5: Generator Shuffling Methods. \
(kc611)
PR #8061: Migrate random glue_lowering to overload where easy (apmasell)
PR #8106: Remove injection of atomic JIT functions into NRT memsys. \
(stuartarchibald)
PR #8120: Support nesting of nested array types (gmarkall)
PR #8134: Support non-constant exception values in JIT (guilhermeleobas sklam)
PR #8147: Adds size variable at runtime for arrays that cannot be inferred \
(njriasan)
PR #8154: Testhound/native cast 8138 (testhound)
PR #8158: adding -pthread for linux-ppc64le in setup.py (esc)
PR #8164: remove myself from automatic reviewer assignment (esc)
PR #8167: CUDA: Facilitate and document passing arrays / pointers to foreign \
functions (gmarkall)
PR #8180: CUDA: Initial support for Minor Version Compatibility (gmarkall)
PR #8183: Add n_keys option to Dict.empty() (stefanfed gmarkall)
PR #8198: Update the release template to include updating the version table. \
(stuartarchibald)
PR #8200: Make the NRT use the “unsafe” allocation API by default. \
(stuartarchibald)
PR #8201: Bump llvmlite dependency to 0.40.dev0 for Numba 0.57.0dev0 \
(stuartarchibald)
PR #8207: development tag should be in monofont (esc)
PR #8212: release checklist: include a note to ping @RC_testers on discourse \
(esc)
PR #8216: chore: Set permissions for GitHub actions (naveensrinivasan)
PR #8217: Fix syntax in docs (jorgepiloto)
PR #8220: Added the interval example as doctest (kc611)
PR #8221: CUDA stubs docstring: Replace illegal escape sequence (gmarkall)
PR #8228: Fix typo in @vectorize docstring and a NumPy spelling. \
(stuartarchibald)
PR #8229: Remove mk_unique_var in inline_closurecall.py (sklam)
PR #8234: Replace @overload_glue by @overload for 20 NumPy functions \
(guilhermeleobas)
PR #8235: Make the NRT stats counters optional. (stuartarchibald)
PR #8238: Advanced Indexing Support #1 (kc611)
PR #8240: Add get_shared_mem_per_block method to Dispatcher (testhound)
PR #8241: Reorder typeof checks to avoid infinite loops on StructrefProxy \
__hash__ (DannyWeitekamp)
PR #8243: Add a note to reference/numpysupported.rst ()
PR #8245: Fix links in CONTRIBUTING.md ()
PR #8247: Fix issue 8127 (bszollosinagy)
PR #8250: Fix issue 8161 (bszollosinagy)
PR #8253: CUDA: Verify NVVM IR prior to compilation (gmarkall)
PR #8255: CUDA: Make numba.cuda.tests.doc_examples.ffi a module to fix #8252 \
(gmarkall)
PR #8256: Migrate linear algebra functions from glue_lowering (apmasell)
PR #8258: refactor np.where to use overload (guilhermeleobas)
PR #8259: Add np.broadcast_to(scalar_array, ()) (guilhermeleobas)
PR #8264: remove mk_unique_var from parfor_lowering_utils.py (guilhermeleobas)
PR #8265: Remove mk_unique_var from array_analysis.py (guilhermeleobas)
PR #8266: Remove mk_unique_var in untyped_passes.py (guilhermeleobas)
PR #8267: Fix segfault for invalid axes in np.split (aseyboldt)
PR #8271: Implement some CUDA intrinsics with @overload, \
@overload_attribute, and @intrinsic (gmarkall)
PR #8274: Update version support table doc for 0.56. (stuartarchibald)
PR #8275: Update CHANGE_LOG for 0.56.0 final (stuartarchibald)
PR #8283: Clean up / remove support for old NumPy versions (gmarkall)
PR #8287: Drop CUDA 10.2 (gmarkall)
PR #8289: Revert #8265. (stuartarchibald)
PR #8290: CUDA: Replace use of deprecated NVVM IR features, questionable \
constructs (gmarkall)
PR #8292: update checklist (esc)
PR #8294: CUDA: Add trig ufunc support (gmarkall)
PR #8295: Add get_const_mem_size method to Dispatcher (testhound gmarkall)
PR #8297: Add __name__ attribute to CUDAUFuncDispatcher and test case (testhound)
PR #8299: Fix build for mingw toolchain (Biswa96)
PR #8302: CUDA: Revert numba_nvvm intrinsic name workaround (gmarkall)
PR #8308: CUDA: Support for multiple signatures (gmarkall)
PR #8315: Add get_local_mem_per_thread method to Dispatcher (testhound)
PR #8319: Bump minimum supported Python version to 3.8 (esc stuartarchibald \
jamesobutler)
PR #8320: Add __name__ support for GUFuncs (testhound)
PR #8321: Fix literal_unroll pass erroneously exiting on non-conformant \
loop. (stuartarchibald)
PR #8325: Remove use of mk_unique_var in stencil.py (bszollosinagy)
PR #8326: Remove mk_unique_var from parfor_lowering.py (guilhermeleobas)
PR #8331: Extend docs with info on how to call C functions from Numba \
(guilhermeleobas)
PR #8334: Add dict(*iterable) constructor (guilhermeleobas)
PR #8335: Remove deprecated pycc script and related source. (stuartarchibald)
PR #8336: Fix typos of “Generalized” in GUFunc-related code (gmarkall)
PR #8338: Calculate reductions before fusion so that use of reduction vars \
can stop fusion. (DrTodd13)
PR #8339: Fix #8291 parfor leak of redtoset variable (sklam)
PR #8341: CUDA: Support multiple outputs for Generalized Ufuncs (gmarkall)
PR #8343: Eliminate references to type annotation in compile_ptx (testhound)
PR #8348: Add get_max_threads_per_block method to Dispatcher (testhound)
PR #8354: pin setuptools to < 65 and switch from mamba to conda on RTD \
(esc gmarkall)
PR #8357: Clean up the buildscripts directory. (stuartarchibald)
PR #8359: adding warnings about cache behaviour (luk-f-a)
PR #8368: Remove glue_lowering in random math that requires IR (apmasell)
PR #8376: Fix issue 8370 (bszollosinagy)
PR #8387: Add support for compute capability in IR Lowering (testhound)
PR #8388: Remove more references to the pycc binary. (stuartarchibald)
PR #8389: Make C++ extensions compile with correct compiler (apmasell)
PR #8390: Use NumPy logic for lessthan in sort to move NaNs to the back. (sklam)
PR #8401: Remove Cuda toolkit version check (testhound)
PR #8415: Refactor numba.np.arraymath methods from lower_builtins to \
overloads (kc611)
PR #8418: Fixes ravel failure on 1d arrays (#5229) (cako)
PR #8421: Update release checklist: add a task to check dependency pinnings \
on subsequent releases (e.g. PATCH) (esc)
PR #8422: Switch public CI builds to use gdb from conda packages. \
(stuartarchibald)
PR #8423: Remove public facing and CI references to 32 bit linux support. \
(stuartarchibald, in addition, we are grateful for the contribution of \
jamesobutler towards a similar goal in PR #8319)
PR #8425: Post 0.56.2 cleanup (esc)
PR #8427: Shorten the time to verify test discovery. (stuartarchibald)
PR #8429: changelog generator script (esc)
PR #8431: Replace @overload_glue by @overload for np.linspace and np.take \
(guilhermeleobas)
PR #8432: Refactor carray/farray to use @overload (guilhermeleobas)
PR #8435: Migrate np.atleast_? functions from glue_lowering to overload \
(apmasell)
PR #8438: Make the initialisation of the NRT more lazy for the njit \
decorator. (stuartarchibald)
PR #8439: Update the contributing docs to include a policy on formatting \
changes. (stuartarchibald)
PR #8440: [DOC]: Replaces icc_rt with intel-cmplr-lib-rt (oleksandr-pavlyk)
PR #8442: Implement hasattr(), str() and repr(). (stuartarchibald)
PR #8446: add version info in ImportError’s (raybellwaves)
PR #8450: remove GitHub username from changelog generation script (esc)
PR #8467: Convert implementations using generated_jit to overload (gmarkall)
PR #8468: Reference test suite in installation documentation (apmasell)
PR #8469: Correctly handle optional types in parfors lowering (apmasell)
PR #8473: change the include style in _pymodule.h and remove unused or \
duplicate headers in two header files ()
PR #8476: Make setuptools optional at runtime. (stuartarchibald)
PR #8490: Restore installing SciPy from defaults instead of conda-forge on \
public CI (esc)
PR #8494: Remove context.compile_internal where easy on \
numba/cpython/cmathimpl.py (guilhermeleobas)
PR #8495: Removes context.compile_internal where easy on \
numba/cpython/listobj.py (guilhermeleobas)
PR #8496: Rewrite most of the set API to use overloads (guilhermeleobas)
PR #8499: Deprecate numba.generated_jit (stuartarchibald)
PR #8508: This updates the release checklists to capture some more checks. \
(stuartarchibald)
PR #8513: Added support for numpy.newaxis (kc611)
PR #8517: make some typedlist C-APIs public ()
PR #8518: Adjust stencil tests to use hardcoded python source opposed to \
AST. (stuartarchibald)
PR #8520: Added noncentral-chisquared, noncentral-f and logseries \
distributions (kc611)
PR #8522: Import jitclass from numba.experimental in jitclass documentation \
(armgabrielyan)
PR #8524: Fix grammar in stencil.rst (armgabrielyan)
PR #8525: Making CUDA specific datamodel manager (sklam)
PR #8526: Fix broken url (Nimrod0901)
PR #8527: Fix grammar in troubleshoot.rst (armgabrielyan)
PR #8532: Vary NumPy version on gpuCI (gmarkall)
PR #8535: LLVM14 (apmasell)
PR #8536: Fix fusion bug. (DrTodd13)
PR #8539: Fix #8534, np.broadcast_to should update array size attr. \
(stuartarchibald)
PR #8541: Remove restoration of “free” channel in Azure CI windows \
builds. (stuartarchibald)
PR #8542: CUDA: Make arg optional for Stream.add_callback() (gmarkall)
PR #8544: Remove reliance on npy_<impl> ufunc loops. (stuartarchibald)
PR #8545: Py3.11 basic support (esc sklam)
PR #8547: [Unicode] Add more string view usages for unicode operations ()
PR #8549: Fix rstcheck in Azure CI builds, update sphinx dep and docs to \
match (stuartarchibald)
PR #8550: Changes how tests are split between test instances (apmasell)
PR #8554: Make target for @overload have ‘generic’ as default. \
(stuartarchibald gmarkall)
PR #8557: [Unicode] support startswith with args, start and end. ()
PR #8566: Update workqueue abort message on concurrent access. (stuartarchibald)
PR #8572: CUDA: Reduce memory pressure from local memory tests (gmarkall)
PR #8579: CUDA: Add CUDA 11.8 / Hopper support and required fixes (gmarkall)
PR #8580: adding note about doing a wheel test build prior to tagging (esc)
PR #8583: Skip tests that contribute to M1 RuntimeDyLd Assertion error (sklam)
PR #8587: Remove unused refcount removal code, clean core/cpu.py module. \
(stuartarchibald)
PR #8588: Remove lowering extension hooks, replace with pass infrastructure. \
(stuartarchibald)
PR #8590: Py3.11 support continues (sklam)
PR #8592: fix failure of test_cache_invalidate due to read-only install \
(tpwrules)
PR #8593: Adjusted ULP precision for noncentral distribution test (kc611)
PR #8594: Fix various CUDA lineinfo issues (gmarkall)
PR #8597: Prevent use of NumPy’s MaskedArray. (stuartarchibald)
PR #8598: Setup Azure CI to test py3.11 (sklam)
PR #8600: Chrome trace timestamp should be in microseconds not seconds. (sklam)
PR #8602: Throw error for unsupported dunder methods (apmasell)
PR #8605: Support for CUDA fp16 math functions (part 1) (testhound)
PR #8606: [Doc] Make the RewriteArrayExprs doc more precise ()
PR #8619: Added flat iteration logic for random distributions (kc611)
PR #8623: Adds support for np.nan_to_num (thomasjpfan)
PR #8624: DOC: Add guvectorize scalar return example (Matt711)
PR #8625: Refactor test_ufuncs (gmarkall)
PR #8626: [unicode-PERF]: use optimized BM algorithm to replace the \
brute-force finder (dlee992)
PR #8630: Fix #8628: Don’t test math.trunc with non-float64 NumPy scalars \
(gmarkall)
PR #8634: Add new method is_fp16_supported (testhound)
PR #8636: CUDA: Skip test_ptds on Windows (gmarkall)
PR #8639: Python 3.11 - fix majority of remaining test failures. \
(stuartarchibald)
PR #8644: Fix bare reraise support (sklam)
PR #8649: Remove numba.core.overload_glue module. (apmasell)
PR #8659: Preserve module name of jitted class (neilflood)
PR #8661: Make external compiler discovery lazy in the test suite. \
(stuartarchibald)
PR #8662: Add support for .nbytes accessor for numpy arrays (alanhdu)
PR #8666: Updates for Python 3.8 baseline/Python 3.11 migration (stuartarchibald)
PR #8673: Enable the CUDA simulator tests on Windows builds in Azure CI. \
(stuartarchibald)
PR #8675: Make always_run test decorator a tag and improve shard tests. \
(stuartarchibald)
PR #8677: Add support for min and max on boolean types. (DrTodd13)
PR #8680: Adjust flake8 config to be compatible with flake8=6.0.0 (thomasjpfan)
PR #8685: Implement __repr__ for numba types (luk-f-a)
PR #8691: NumPy 1.24 (gmarkall)
PR #8697: Close stale issues after 7 days (gmarkall)
PR #8701: Relaxed ULP testing precision for NumPy Generator tests across all \
systems (kc611)
PR #8702: Supply concrete timeline for objmode fallback deprecation/removal. \
(stuartarchibald)
PR #8706: Fix doctest for @vectorize (sklam)
PR #8711: Python 3.11 tracing support (continuation of #8670). \
(AndrewVallette sklam)
PR #8716: CI: Use set -e in “Before Install” step and fix install (gmarkall)
PR #8720: Enable coverage for subprocess testing (sklam)
PR #8723: Check for void return type in cuda.compile_ptx (brandonwillard)
PR #8726: Make Numba dependency check run ahead of Numba internal imports. \
(stuartarchibald)
PR #8728: Fix flake8 checks since upgrade to flake8=6.x (stuartarchibald)
PR #8729: Run flake8 CI step in multiple processes. (stuartarchibald)
PR #8732: Add numpy argpartition function support ()
PR #8735: Update bot to close PRs waiting on authors for more than 3 months \
(guilhermeleobas)
PR #8736: Implement np.lib.stride_tricks.sliding_window_view ()
PR #8744: Update CtypesLinker::add_cu error message to include fp16 usage \
(testhound gmarkall)
PR #8746: Fix failing test_dispatcher test case (testhound)
PR #8748: Suppress known test failures for py3.11 (sklam)
PR #8751: Recycle test runners more aggressively (apmasell)
PR #8752: Flake8 fixes for py311 branch (esc sklam)
PR #8760: Bump llvmlite PR in py3.11 branch testing (sklam)
PR #8764: CUDA tidy-up: remove some unneeded methods (gmarkall)
PR #8765: BLD: remove distutils (fangchenli)
PR #8766: Stale bot: Use abandoned - stale label for closed PRs (gmarkall)
PR #8771: Update vendored Versioneer from 0.14 to 0.28 (oscargus gmarkall)
PR #8775: Revert PR#8751 for buildfarm stability (sklam)
PR #8780: Improved documentation for Atomic CAS (MiloniAtal)
PR #8781: Ensure gc.collect() is called before checking refcount in tests. \
(sklam)
PR #8782: Changed wording of the escape error (MiloniAtal)
PR #8786: Upgrade stale GitHub action (apmasell)
PR #8788: CUDA: Fix returned dtype of vectorized functions (Issue #8400) \
(gmarkall)
PR #8790: CUDA compare and swap with index (ianthomas23)
PR #8795: Add pending-deprecation warnings for numba.pycc (stuartarchibald)
PR #8802: Move the minimum supported NumPy version to 1.21 (stuartarchibald)
PR #8803: Attempted fix to #8789 by changing compile_ptx to accept a \
signature instead of argument tuple (KyanCheung)
PR #8804: Split parfor pass into 3 parts (DrTodd13)
PR #8809: Update LLVM versions for 0.57 release (apmasell)
PR #8810: Fix llvmlite dependency in meta.yaml (sklam)
PR #8816: Fix some buildfarm test failures (sklam)
PR #8819: Support “static” __getitem__ on Numba types in @njit code. \
(stuartarchibald)
PR #8822: Merge py3.11 branch to main (esc AndrewVallette stuartarchibald sklam)
PR #8826: CUDA CFFI test: conditionally require cffi module (gmarkall)
PR #8831: Redo py3.11 sync branch with main (sklam)
PR #8833: Fix typeguard import hook location. (stuartarchibald)
PR #8836: Fix failing typeguard test. (stuartarchibald)
PR #8837: Update AzureCI matrix for Python 3.11/NumPy 1.21..1.24 \
(stuartarchibald)
PR #8839: Add Dynamic Shared Memory example. (k1m190r)
PR #8842: Fix buildscripts, setup.py, docs for setuptools becoming optional. \
(stuartarchibald)
PR #8843: Pin typeguard to 3.0.1 in AzureCI. (stuartarchibald)
PR #8848: added lifted loops to glossary term (cherieliu)
PR #8852: Disable SLP vectorisation due to miscompilations. (stuartarchibald)
PR #8855: DOC: pip into double backticks in installing.rst (F3eQnxN3RriK)
PR #8856: Update TBB to use >= 2021.6 by default. (kozlov-alexey \
stuartarchibald)
PR #8858: Update deprecation notice for objmode fallback RE @jit use. \
(stuartarchibald)
PR #8864: Remove obsolete deprecation notices (gmarkall)
PR #8866: Revise CUDA deprecation notices (gmarkall)
PR #8869: Update CHANGE_LOG for 0.57.0rc1 (stuartarchibald esc gmarkall)
PR #8870: Fix opcode “spelling” change since Python 3.11 in CUDA debug \
test. (stuartarchibald)
PR #8879: Remove use of compile_isolated from generator tests. (stuartarchibald)
PR #8880: Fix missing dependency guard on pyyaml in test_azure_config. \
(stuartarchibald)
PR #8881: Replace use of compile_isolated in test_obj_lifetime (sklam)
PR #8884: Pin llvmlite and NumPy on release branch (sklam)
PR #8887: Update PyPI supported version tags (bryant1410)
PR #8896: Remove codecov install (now deleted from PyPI) (gmarkall)
PR #8902: Enable CALL_FUNCTION_EX fix for py3.11 (sklam)
PR #8907: Work around issue #8898. Defer exp2 (and log2) calls to Numba \
internal symbols. (stuartarchibald)
PR #8909: Fix #8903. NumbaDeprecationWarnings raised from \
@{gu,}vectorize. (stuartarchibald)
PR #8929: Update CHANGE_LOG for 0.57.0 final. (stuartarchibald)
PR #8930: Fix year in change log (jtilly)
PR #8932: Fix 0.57 release changelog (sklam)
Authors:
alanhdu
AndrewVallette
apmasell
armgabrielyan
aseyboldt
Biswa96
brandonwillard
bryant1410
bszollosinagy
cako
cherieliu
ChiCheng45
DannyWeitekamp
dlee992
dmbelov
DrTodd13
eric-wieser
esc
F3eQnxN3RriK
fangchenli
gmarkall
guilhermeleobas
ianthomas23
jamesobutler
jorgepiloto
jtilly
k1m190r
kc611
kozlov-alexey
KyanCheung
luk-f-a
Matt711
MiloniAtal
naveensrinivasan
neilflood
Nimrod0901
njriasan
oleksandr-pavlyk
oscargus
raybellwaves
shangbol
sklam
stefanfed
stuartarchibald
testhound
thomasjpfan
tpwrules
Version 0.56.4 (3 November, 2022)
This is a bugfix release to fix a regression in the CUDA target in relation to \
the .view() method on CUDA device arrays that is present when using NumPy \
version 1.23.0 or later.
Pull-Requests:
PR #8537: Make ol_compatible_view accessible on all targets (gmarkall)
PR #8552: Update version support table for 0.56.4. (stuartarchibald)
PR #8553: Update CHANGE_LOG for 0.56.4 (stuartarchibald)
PR #8570: Release 0.56 branch: Fix overloads with target="generic" \
for CUDA (gmarkall)
PR #8571: Additional update to CHANGE_LOG for 0.56.4 (stuartarchibald)
Authors:
gmarkall
stuartarchibald
Version 0.56.3 (13 October, 2022)
This is a bugfix release to remove the version restriction applied to the \
setuptools package and to fix a bug in the CUDA target in relation to copying \
zero length device arrays to zero length host arrays.
Pull-Requests:
PR #8475: Remove setuptools version pin (gmarkall)
PR #8482: Fix #8477: Allow copies with different strides for 0-length data \
(gmarkall)
PR #8486: Restrict the TBB development package to supported version in \
Azure. (stuartarchibald)
PR #8503: Update version support table for 0.56.3 (stuartarchibald)
PR #8504: Update CHANGE_LOG for 0.56.3 (stuartarchibald)
Authors:
gmarkall
stuartarchibald
Version 0.56.2 (1 September, 2022)
This is a bugfix release that supports NumPy 1.23 and fixes CUDA function caching.
Pull-Requests:
PR #8239: Add decorator to run a test in a subprocess (stuartarchibald)
PR #8276: Move Azure to use macos-11 (stuartarchibald)
PR #8310: CUDA: Fix Issue #8309 - atomics don’t work on complex components \
(Graham Markall)
PR #8342: Upgrade to ubuntu-20.04 for azure pipeline CI (jamesobutler)
PR #8356: Update setup.py, buildscripts, CI and docs to require \
setuptools<60 (stuartarchibald)
PR #8374: Don’t pickle LLVM IR for CUDA code libraries (Graham Markall)
PR #8377: Add support for NumPy 1.23 (stuartarchibald)
PR #8384: Move strace() check into tests that actually need it (stuartarchibald)
PR #8386: Fix the docs for numba.get_thread_id (stuartarchibald)
PR #8407: Pin NumPy version to 1.18-1.24 (Andre Masella)
PR #8411: update version support table for 0.56.1 (esc)
PR #8412: Create changelog for 0.56.1 (Andre Masella)
PR #8413: Fix Azure CI for NumPy 1.23 and use conda-forge scipy (Siu Kwan Lam)
PR #8414: Hotfix for 0.56.2 (Siu Kwan Lam)
Authors:
Andre Masella
esc
Graham Markall
jamesobutler
Siu Kwan Lam
stuartarchibald
Version 0.56.1 (NO RELEASE)
The release was skipped due to issues during the release process.
Version 0.56.0 (25 July, 2022)
This release continues to add new features, bug fixes and stability improvements \
to Numba. Please note that this will be the last release that has support for \
Python 3.7 as the next release series (Numba 0.57) will support Python 3.11! \
Also note that this will be the last release to support linux-32 packages \
produced by the Numba team.
Python language support enhancements:
Previously missing support for large, in-line dictionaries and internal \
calls to functions with large numbers of keyword arguments in Python 3.10 has \
been added.
operator.mul now works for lists.
Literal slices, e.g. slice(1, 10, 2) can be returned from nopython mode \
functions.
The len function now works on dict_keys, dict_values and dict_items.
Numba’s set implementation now supports reference counted items e.g. strings.
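As a rough illustration of two of the items above (not taken from the upstream notes), the sketch below calls operator.mul on a list and len() on dict views inside @njit functions. The function names are made up for this example, and the try/except fallback to plain Python is only an assumption so that the snippet also runs where Numba is not installed.

```python
import operator

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch only: without Numba, run as plain Python.
    def njit(func):
        return func


@njit
def repeat_list(n):
    # operator.mul on a list, equivalent to [1, 2] * n
    return operator.mul([1, 2], n)


@njit
def view_lengths(n):
    d = dict()
    for i in range(n):
        d[i] = i * 2.0
    # len() on dict_keys, dict_values and dict_items
    return len(d.keys()), len(d.values()), len(d.items())


print(repeat_list(3))
print(view_lengths(4))
```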
Numba specific feature enhancements:
The experimental jitclass feature gains support for a large number of \
builtin methods e.g. declaring __hash__ or __getitem__ for a jitclass type.
It’s now possible to use @vectorize on an already @jit family decorated \
function.
Name mangling has been updated to emit compiled function names that exactly \
match the function name in Python. This means debuggers, like GDB, can be set to \
break directly on Python function names.
A GDB “pretty printing” support module has been added. When loaded into \
GDB, Numba’s internal representations of Python/NumPy types are rendered inside \
GDB as they would be in Python.
An experimental option is added to the @jit family decorators to entirely \
turn off LLVM’s optimisation passes for a given function (see _dbg_optnone \
kwarg in the @jit decorator family).
A new environment variable, NUMBA_EXTEND_VARIABLE_LIFETIMES, is added. If \
set, it extends the lifetime of variables to the end of their basic block, \
which permits a debugging experience in GDB similar to that found in compiled \
C/C++/Fortran code.
NumPy features/enhancements:
Initial support for passing, using and returning numpy.random.Generator \
instances has been added. This currently includes support for the random \
distribution.
The broadcasting functions np.broadcast_shapes and np.broadcast_arrays are \
now supported.
The min and max functions now work with np.timedelta64 and np.datetime64 types.
Sorting multi-dimensional arrays along the last axis is now supported in \
np.sort().
The np.clip function is updated to accept NumPy arrays for the a_min and \
a_max arguments.
The NumPy allocation routines (np.empty, np.ones etc.) support shape \
arguments specified using members of enum.IntEnums.
The function np.random.noncentral_chisquare is now supported.
The performance of functions np.full and np.ones has been improved.
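A minimal sketch of a few of the NumPy items above, assuming Numba 0.56 or later: np.clip with array bounds, np.sort on a 2D array, np.broadcast_shapes, and a numpy.random.Generator passed into a jitted function. The function names and sample data are hypothetical, and the plain-Python fallback is again only an assumption so that the snippet runs without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch only: without Numba, run as plain Python.
    def njit(func):
        return func


@njit
def clip_between(a, lo, hi):
    # a_min and a_max are given as arrays of the same shape as a
    return np.clip(a, lo, hi)


@njit
def sort_rows(a):
    # np.sort on a 2D array sorts along the last axis
    return np.sort(a)


@njit
def combined_shape():
    return np.broadcast_shapes((3, 1), (1, 4))


@njit
def draw(rng):
    # rng is a numpy.random.Generator created in the interpreter
    return rng.random()


a = np.array([[3.0, 1.0, 2.0], [9.0, 7.0, 8.0]])
lo = np.full(a.shape, 2.0)
hi = np.full(a.shape, 8.0)

print(clip_between(a, lo, hi))
print(sort_rows(a))
print(combined_shape())
print(draw(np.random.default_rng(0)))
```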
Parallel Accelerator enhancements:
The parallel=True functionality is enhanced through the addition of the \
functions numba.set_parallel_chunksize and numba.get_parallel_chunksize to \
permit a more fine grained scheduling of work defined in a parallel region. \
There is also support for adjusting the chunksize via a context manager.
The ID of a thread is now defined to be predictable and within a known \
range; it is available through calling the function numba.get_thread_id.
The performance of @stencils has been improved in both serial and parallel \
execution.
CUDA enhancements:
New functionality:
Self-recursive device functions.
Vector type support (float4, int2, etc.).
Shared / local arrays of extension types can now be created.
Support for linking CUDA C / C++ device functions into Python kernels.
PTX generation for Compute Capabilities 8.6 and 8.7 - e.g. RTX A series, \
RTX 3000 series.
Comparison operations for float16 types.
Performance improvements:
Context queries are no longer made during launch configuration.
Launch configurations are now LRU cached.
On-disk caching of CUDA kernels is now supported.
Documentation: many new examples added.
Docs:
Numba now has an official “mission statement”.
There’s now a “version support table” in the documentation to act as \
an easy to use, single reference point, for looking up information about Numba \
releases and their required/supported dependencies.
General Enhancements:
Numba imports more quickly in environments with large numbers of packages as \
it now uses importlib-metadata for querying other packages.
Emission of chrome tracing output is now supported for the internal \
compilation event handling system.
This release is tested and known to work when using the Pyston Python \
interpreter.
Pull-Requests:
PR #5209: Use importlib to load numba extensions (Stepan Rakitin Graham \
Markall stuartarchibald)
PR #5877: Jitclass builtin methods (Ethan Pronovost Graham Markall)
PR #6490: Stencil output allocated with np.empty now and new code to \
initialize the borders. (Todd A. Anderson)
PR #7005: Make numpy.searchsorted match NumPy when first argument is \
unsorted (Brandon T. Willard)
PR #7363: Update cuda.local.array to clarify “simple constant \
expression” (e.g. no NumPy ints) (Sterling Baird)
PR #7364: Removes an instance of signed integer overflow undefined \
behaviour. (Tobias Sargeant)
PR #7537: Add chrome tracing (Hadia Ahmed Siu Kwan Lam)
PR #7556: Testhound/fp16 comparison (Michael Collison Graham Markall)
PR #7586: Support for len on dict.keys, dict.values, and dict.items (Nick \
Riasanovsky)
PR #7617: Numba gdb-python extension for printing (stuartarchibald)
PR #7619: CUDA: Fix linking with PTX when compiling lazily (Graham Markall)
PR #7621: Add support for linking CUDA C / C++ with @cuda.jit kernels \
(Graham Markall)
PR #7625: Combined parfor chunking and caching PRs. (stuartarchibald Todd A. \
Anderson Siu Kwan Lam)
PR #7651: DOC: pypi and conda-forge badges (Ray Bell)
PR #7660: Add support for np.broadcast_arrays (Guilherme Leobas)
PR #7664: Flatten mangling dicts into a single dict (Graham Markall)
PR #7680: CUDA Docs: include example calling slow matmul (Graham Markall)
PR #7682: performance improvements to np.full and np.ones (Rishi Kulkarni)
PR #7684: DOC: remove incorrect warning in np.random reference (Rishi Kulkarni)
PR #7685: Don’t convert setitems that have dimension mismatches to \
parfors. (Todd A. Anderson)
PR #7690: Implemented np.random.noncentral_chisquare for all size arguments \
(Rishi Kulkarni)
PR #7695: IntEnumMember support for np.empty, np.zeros, and np.ones \
(Benjamin Graham)
PR #7699: CUDA: Provide helpful error if the return type is missing for \
declare_device (Graham Markall)
PR #7700: Support for scalar arguments in np.ascontiguousarray (Dhruv Patel)
PR #7703: Ignore unsupported types in ShapeEquivSet._getnames() (Benjamin Graham)
PR #7704: Move the type annotation pass to post legalization. (stuartarchibald)
PR #7709: CUDA: Fixes missing type annotation pass following #7704 \
(stuartarchibald)
PR #7712: Fixing issue 7693 (stuartarchibald Graham Markall luk-f-a)
PR #7714: Support for boxing SliceLiteral type (Nick Riasanovsky)
PR #7718: Bump llvmlite dependency to 0.39.0dev0 for Numba 0.56.0dev0 \
(stuartarchibald)
PR #7724: Update URLs in error messages to refer to RTD docs. (stuartarchibald)
PR #7728: Document that AOT-compiled functions do not check arg types \
(Graham Markall)
PR #7729: Handle Omitted/OmittedArgDataModel in DI generation. (stuartarchibald)
PR #7732: update release checklist following 0.55.0 RC1 (esc)
PR #7736: Update CHANGE_LOG for 0.55.0 final. (stuartarchibald)
PR #7740: CUDA Python 11.6 support (Graham Markall)
PR #7744: Fix issues with locating/parsing source during DebugInfo emission. \
(stuartarchibald)
PR #7745: Fix the release year for Numba 0.55 change log entry. (stuartarchibald)
PR #7748: Fix #7713: Ensure _prng_random_hash return has correct bitwidth \
(Graham Markall)
PR #7749: Refactor threading layer priority tests to not use stdout/stderr \
(stuartarchibald)
PR #7752: Fix #7751: Use original filename for array exprs (Graham Markall)
PR #7755: CUDA: Deprecate support for CC < 5.3 and CTK < 10.2 (Graham \
Markall)
PR #7763: Update Read the Docs configuration (automatic) (readthedocs-assistant)
PR #7764: Add dbg_optnone and dbg_extend_lifetimes flags (Siu Kwan Lam)
PR #7771: Move function unique ID to abi-tags (stuartarchibald Siu Kwan Lam)
PR #7772: CUDA: Add Support to Creating StructModel Array (Michael Wang)
PR #7776: Updates coverage.py config (stuartarchibald)
PR #7777: Remove reference existing issue from GH template. (stuartarchibald)
PR #7778: Remove long deprecated flags from the CLI. (stuartarchibald)
PR #7780: Fix sets with reference counted items (Benjamin Graham)
PR #7782: adding reminder to check on deprecations (esc)
PR #7783: remove upper limit on Python version (esc)
PR #7786: Remove dependency on intel-openmp for OSX (stuartarchibald)
PR #7788: Avoid issue with DI gen for arrayexprs. (stuartarchibald)
PR #7796: update change-log for 0.55.1 (esc)
PR #7797: prune README (esc)
PR #7799: update the release checklist post 0.55.1 (esc)
PR #7801: add sdist command and umask reminder (esc)
PR #7804: update local references from master -> main (esc)
PR #7805: Enhance source line finding logic for debuginfo (Siu Kwan Lam)
PR #7809: Updates the gdb configuration to accept a binary name or a path. \
(stuartarchibald)
PR #7813: Extend parfors test timeout for aarch64. (stuartarchibald)
PR #7814: CUDA Dispatcher refactor (Graham Markall)
PR #7815: CUDA Dispatcher refactor 2: inherit from dispatcher.Dispatcher \
(Graham Markall)
PR #7817: Update intersphinx URLs for NumPy and llvmlite. (stuartarchibald)
PR #7823: Add renamed vars to callee scope such that it is self consistent. \
(stuartarchibald)
PR #7829: CUDA: Support Enum/IntEnum in Kernel (Michael Wang)
PR #7833: Add version support information table to docs. (stuartarchibald)
PR #7835: Fix pickling error when module cannot be imported (idorrington)
PR #7836: min() and max() support for np.datetime and np.timedelta (Benjamin \
Graham)
PR #7837: Initial refactoring of parfor reduction lowering (Siu Kwan Lam)
PR #7845: change time.time() to time.perf_counter() in docs (Nopileos2)
PR #7846: Fix CUDA enum vectorize test on Windows (Graham Markall)
PR #7848: Support for int * list (Nick Riasanovsky)
PR #7850: CUDA: Pass fastmath compiler flag down to compile_ptx and \
compile_device; Improve fastmath tests (Michael Wang)
PR #7855: Ensure np.argmin/np.argmax return type is intp (stuartarchibald)
PR #7858: CUDA: Deprecate ptx Attribute and Update Tests (Graham Markall \
Michael Wang)
PR #7861: Fix a spelling mistake in README (Zizheng Guo)
PR #7864: Fix cross_iter_dep check. (Todd A. Anderson)
PR #7865: Remove add_user_function (Graham Markall)
PR #7866: Support for large numbers of args/kws with Python 3.10 (Nick \
Riasanovsky)
PR #7878: CUDA: Remove some deprecated support, add CC 8.6 and 8.7 (Graham \
Markall)
PR #7893: Use uuid.uuid4() as the key in serialization. (stuartarchibald)
PR #7895: Remove use of llvmlite.llvmpy (Andre Masella)
PR #7898: Skip test_ptds under cuda-memcheck (Graham Markall)
PR #7901: Pyston compatibility for the test suite (Kevin Modzelewski)
PR #7904: Support m1 (esc)
PR #7911: added sys import (Nightfurex)
PR #7915: CUDA: Fix test checking debug info rendering. (stuartarchibald)
PR #7918: Add JIT examples to CUDA docs (brandon-b-miller Graham Markall)
PR #7919: Disallow //= reductions in pranges. (Todd A. Anderson)
PR #7924: Retain non-modified index tuple components. (Todd A. Anderson)
PR #7939: Fix rendering in feature request template. (stuartarchibald)
PR #7940: Implemented np.allclose in numba/np/arraymath.py (Gagandeep Singh)
PR #7941: Remove debug dump output from closure inlining pass. (stuartarchibald)
PR #7946: instructions for creating a build environment were outdated (esc)
PR #7949: Add Cuda Vector Types (Michael Wang)
PR #7950: mission statement (esc)
PR #7956: Stop using pip for 3.10 on public ci (Revert “start testing \
Python 3.10 on public CI”) (esc)
PR #7957: Use cloudpickle for disk caches (Siu Kwan Lam)
PR #7958: numpy.clip accept numpy.array for a_min, a_max (Gagandeep Singh)
PR #7959: Permit a new array model to have a super set of array model \
fields. (stuartarchibald)
PR #7961: numba.typed.typeddict.Dict.get uses castedkey to avoid returning \
default value even if the key is present (Gagandeep Singh)
PR #7963: remove the roadmap from the sphinx based docs (esc)
PR #7964: Support for large constant dictionaries in Python 3.10 (Nick \
Riasanovsky)
PR #7965: Use uuid4 instead of PID in cache temp name to prevent collisions. \
(stuartarchibald)
PR #7971: lru cache for configure call (Tingkai Liu)
PR #7972: Fix fp16 support for cuda shared array (Michael Collison Graham \
Markall)
PR #7986: Small caching refactor to support target cache implementations \
(Graham Markall)
PR #7994: Supporting multidimensional arrays in quick sort (Gagandeep Singh \
Siu Kwan Lam)
PR #7996: Fix binding logic in @overload_glue. (stuartarchibald)
PR #7999: Remove @overload_glue for NumPy allocators. (stuartarchibald)
PR #8003: Add np.broadcast_shapes (Guilherme Leobas)
PR #8004: CUDA fixes for Windows (Graham Markall)
PR #8014: Fix support for {real,imag} array attrs in Parfors. (stuartarchibald)
PR #8016: [Docs] [Very Minor] Make numba.jit boundscheck doc line consistent \
(Kyle Martin)
PR #8017: Update FAQ to include details about using debug-only option \
(Guilherme Leobas)
PR #8027: Support for NumPy 1.22 (stuartarchibald)
PR #8031: Support for Numpy BitGenerators PR#1 - Core Generator Support \
(Kaustubh)
PR #8035: Fix a couple of typos RE implementation (stuartarchibald)
PR #8037: CUDA self-recursion tests (Graham Markall)
PR #8044: Make Python 3.10 kwarg peephole less restrictive (Nick Riasanovsky)
PR #8046: Fix caching test failures (Siu Kwan Lam)
PR #8049: support str(bool) syntax (LI Da)
PR #8052: Ensure pthread is linked in when building for ppc64le. (Siu Kwan Lam)
PR #8056: Move caching tests from test_dispatcher to test_caching (Graham \
Markall)
PR #8057: Fix coverage checking (Graham Markall)
PR #8064: Rename “nb:run_pass” to “numba:run_pass” and document it. \
(Siu Kwan Lam)
PR #8065: Fix PyLowering mishandling starargs (Siu Kwan Lam)
PR #8068: update changelog for 0.55.2 (esc)
PR #8077: change return type of np.broadcast_shapes to a tuple (Guilherme Leobas)
PR #8080: Fix windows test failure due to timeout when the machine is slow \
poss… (Siu Kwan Lam)
PR #8081: Fix erroneous array count in parallel gufunc kernel generation. \
(stuartarchibald)
PR #8089: Support on-disk caching in the CUDA target (Graham Markall)
PR #8097: Exclude libopenblas 0.3.20 on osx-arm64 (esc)
PR #8099: Fix Py_DECREF use in case of error state (for devicearray). \
(stuartarchibald)
PR #8102: Combine numpy run_constrained in meta.yaml to the run requirements \
(Siu Kwan Lam)
PR #8109: Pin TBB support with respect to incompatible 2021.6 API. \
(stuartarchibald)
PR #8118: Update release checklists post 0.55.2 (esc)
PR #8123: Fix CUDA print tests on Windows (Graham Markall)
PR #8124: Add explicit checks to all allocators in the NRT. (stuartarchibald)
PR #8126: Mark gufuncs as having mutable outputs (Andre Masella)
PR #8133: Fix #8132. Regression in Record.make_c_struct for handling \
nestedarray (Siu Kwan Lam)
PR #8137: CUDA: Fix #7806, Division by zero stops the kernel (Graham Markall)
PR #8142: CUDA: Fix some missed changes from dropping 9.2 (Graham Markall)
PR #8144: Fix NumPy capitalisation in docs. (stuartarchibald)
PR #8145: Allow ufunc builder to use previously JITed function (Andre Masella)
PR #8151: pin NumPy to build 0 of 1.19.2 on public CI (esc)
PR #8163: CUDA: Remove context query in launch config (Graham Markall)
PR #8165: Restrict strace based tests to be linux only via support feature. \
(stuartarchibald)
PR #8170: CUDA: Fix missing space in low occupancy warning (Graham Markall)
PR #8175: make build and upload order consistent (esc)
PR #8181: Fix various typos (luzpaz)
PR #8187: Update CHANGE_LOG for 0.55.2 (stuartarchibald esc)
PR #8189: updated version support information for 0.55.2/0.57 (esc)
PR #8191: CUDA: Update deprecation notes for 0.56. (Graham Markall)
PR #8192: Update CHANGE_LOG for 0.56.0 (stuartarchibald esc Siu Kwan Lam)
PR #8195: Make the workqueue threading backend once again fork safe. \
(stuartarchibald)
PR #8196: Fix numerical tolerance in parfors caching test. (stuartarchibald)
PR #8197: Fix isinstance warning check test. (stuartarchibald)
PR #8203: pin llvmlite 0.39 for public CI builds (esc)
PR #8255: CUDA: Make numba.cuda.tests.doc_examples.ffi a module to fix #8252 \
(Graham Markall)
PR #8274: Update version support table doc for 0.56. (stuartarchibald)
PR #8275: Update CHANGE_LOG for 0.56.0 final (stuartarchibald)
Authors:
Andre Masella
Benjamin Graham
brandon-b-miller
Brandon T. Willard
Gagandeep Singh
Dhruv Patel
LI Da
Todd A. Anderson
Ethan Pronovost
esc
Tobias Sargeant
Graham Markall
Guilherme Leobas
Zizheng Guo
Hadia Ahmed
idorrington
Michael Wang
Kaustubh
Kevin Modzelewski
luk-f-a
luzpaz
Kyle Martin
Nightfurex
Nick Riasanovsky
Nopileos2
Ray Bell
readthedocs-assistant
Rishi Kulkarni
Sterling Baird
Siu Kwan Lam
stuartarchibald
Stepan Rakitin
Michael Collison
Tingkai Liu