Log message:
Update to GMP 5.0.1
Changes in GMP 5.0.1
BUGS FIXED
- Fat builds fixed.
- Fixed a crash in huge multiplications that occurred when the sentinel of
an old FFT_TABLE2-type parameter selection table was smaller than the
operands being multiplied.
- The solib numbers now reflect the removal of the documented but preliminary
mpn_bdivmod function; we correctly flag incompatibility with GMP 4.3.
GMP 5.0.0 has this wrong, and should perhaps be uninstalled to avoid
confusion.
SPEEDUPS
- Multiplication of large numbers has indirectly been sped up through better
FFT tuning and processor recognition. Since many operations depend on
multiplication, there will be a general speedup.
FEATURES
- More Core i3, i5 and Core i7 processor models are recognised.
- Fixes and workarounds for Mac OS quirks should make this GMP version build
using many of the different versions of "Xcode".
MISC
- The amount of scratch memory needed for multiplication of huge numbers has
been reduced substantially (but is still larger than in GMP 4.3).
- Likewise, the amount of scratch memory needed for division of large numbers
has been reduced substantially.
- The FFT tuning code of tune/tuneup.c has been completely rewritten, and new,
large FFT parameter selection tables are provided for many machines.
- Upgraded to the latest autoconf, automake, libtool.
Changes in GMP 5.0.0
BUGS FIXED
- None (contains the same fixes as release 4.3.2).
SPEEDUPS
- Multiplication has been overhauled:
1. Multiplication of larger same size operands has been improved with the
addition of two new Toom functions and a new internal function
mpn_mulmod_bnm1 (computing U * V mod (B^n-1), B being the word base).
This latter function is used for the largest products, waiting for a
better Schoenhage-Strassen U * V mod (B^n+1) implementation.
2. Likewise for squaring.
3. Multiplication of different size operands has been improved with the
addition of many new Toom functions, and by selecting underlying functions
better from the main multiply functions.
- Division and mod have been overhauled:
1. Plain "schoolbook" division is reimplemented using faster quotient
approximation.
2. Division Q = N/D, R = N mod D where both the quotient and remainder are
needed now runs in time O(M(log(N))). This is an improvement of a factor
log(log(N)).
3. Division where just the quotient is needed is now O(M(log(Q))) on average.
4. Modulo operations using Montgomery REDC form now take time O(M(n)).
5. Exact division Q = N/D by means of mpz_divexact has been improved for
all sizes, and now runs in time O(M(log(N))).
- The function mpz_powm is now faster for all sizes. Its complexity has gone
from O(M(n)log(n)m) to O(M(n)m) where n is the size of the modulo argument
and m is the size of the exponent. It is also radically faster for even
modulus, since it now partially factors such a modulus and performs two
smaller modexp operations, then uses CRT.
- The internal support for multiplication yielding just the lower n limbs has
been improved by using Mulders' algorithm.
- Computation of inverses, both plain 1/N and 1/N mod B^n, has been improved
by using well-tuned Newton iterations and wrap-around multiplication using
mpn_mulmod_bnm1.
- A new algorithm makes mpz_perfect_power_p asymptotically faster.
- The function mpz_remove uses a much faster algorithm, is better tuned, and
also benefits from the division improvements.
- Intel Atom and VIA Nano specific optimisations.
- Plus hundreds of smaller improvements and tweaks!
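
The even-modulus mpz_powm strategy described above can be sketched roughly as
follows. This is a Python illustration of the idea only, not GMP's code: the
name powm_even_modulus is hypothetical, Python's built-in pow stands in for
the real modexp routines, and Python 3.8 or later is assumed for the modular
inverse via pow(x, -1, m).

```python
def powm_even_modulus(base, exp, mod):
    """Compute base**exp % mod by splitting mod = 2**k * odd and using CRT.

    Hypothetical helper for illustration; expects mod > 0 and exp >= 0.
    """
    if mod <= 0 or exp < 0:
        raise ValueError("expects mod > 0 and exp >= 0")
    # Partially factor the modulus: mod = 2**k * odd_part.
    k = (mod & -mod).bit_length() - 1
    if k == 0:
        # Odd modulus: nothing to split, do a single modexp.
        return pow(base, exp, mod)
    odd_part = mod >> k
    pow2 = 1 << k
    # Two smaller modular exponentiations.
    r_odd = pow(base, exp, odd_part)  # residue modulo the odd factor
    r_two = pow(base, exp, pow2)      # residue modulo 2**k
    # CRT: find x with x = r_odd (mod odd_part) and x = r_two (mod 2**k).
    # odd_part is invertible modulo 2**k because it is odd.
    inv = pow(odd_part, -1, pow2)
    t = ((r_two - r_odd) * inv) % pow2
    return r_odd + odd_part * t
```

The result always lies in the range 0 to mod-1, because r_odd is below
odd_part and t is below 2**k.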
FEATURES
- New mpz function: mpz_powm_sec for side-channel quiet modexp computations.
- New mpn functions: mpn_sqr, mpn_and_n, mpn_ior_n, mpn_xor_n, mpn_nand_n,
mpn_nior_n, mpn_xnor_n, mpn_andn_n, mpn_iorn_n, mpn_com, mpn_neg, mpn_copyi,
mpn_copyd, mpn_zero.
- The function mpn_tdiv_qr now allows certain argument overlap.
- Support for fat binaries for 64-bit x86 processors has been added.
- A new type, mp_bitcnt_t for bignum bit counts, has been introduced.
- Support for Windows64 through mingw64 has been added.
- The cofactors of mpz_gcdext and mpn_gcdext are now more strictly normalised,
returning to how GMP 4.2 worked. (Note that release 4.3.2 also has this
change.)
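
To make the cofactor note above concrete, here is a small Python sketch of the
classical extended Euclidean algorithm. It is not GMP's implementation, and
the function name gcdext is only borrowed for readability. The size bound
checked in the usage note (abs(s) < b/(2g) and abs(t) < a/(2g)) is an
assumption about what "strictly normalised" means here; consult the GMP manual
for mpz_gcdext for the authoritative statement and its exceptional cases.

```python
def gcdext(a, b):
    """Return (g, s, t) with a*s + b*t == g == gcd(a, b), for a, b >= 0.

    Plain iterative extended Euclid; illustrative only.
    """
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        # Each step keeps the invariants a*old_s + b*old_t == old_r
        # and a*s + b*t == r.
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t
```

For example, gcdext(240, 46) returns (2, -9, 47): 240*(-9) + 46*47 = 2, with
abs(-9) below 46/4 and abs(47) below 240/4.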
MISC
- The mpn_mul function should no longer be used for squaring; instead, use the
new mpn_sqr.
- The algorithm selection has been improved, the number of thresholds has
more than doubled, and the tuning and use of existing thresholds have been
improved.
- The tune/speed program can measure many of the new functions.
- The mpn_bdivmod function has been removed. We do not consider this an
incompatible change, since the function was marked as preliminary.
- The testsuite has been enhanced in various ways.
Changes in GMP 4.3.2
Bugs:
- Fixed bug in mpf_eq.
- Fixed overflow issues in mpz_set_str, mpz_inp_str, mpf_set_str, and mpf_get_str.
- Avoid unbounded stack allocation for unbalanced multiplication.
- Fixed bug in FFT multiplication.
Speedups:
- None, except that improved processor recognition helps affected processors.
Features:
- Recognise more "Core 2" processor variants.
- The cofactors of mpz_gcdext and mpn_gcdext are now more strictly normalised,
returning to how GMP 4.2 worked.
Log message:
Update to 4.3.1. Add gnu-gpl-v3 to LICENSE, since README claims it
also affects some files.
Changes between GMP version 4.3.0 and 4.3.1
Bugs:
* Fixed bug in mpn_gcdext, affecting also mpz_gcdext and mpz_invert.
The bug could cause a cofactor to have a leading zero limb, which
could lead to crashes or miscomputation later on.
* Fixed some minor documentation issues.
Features:
* Workarounds for various issues with Mac OS X's build tools.
* Recognise more IBM "POWER" processor variants.
Changes between GMP version 4.2.X and 4.3.0
Bugs:
* Fixed bug in mpz_perfect_power_p with recognition of negative perfect
powers that can be written both as an even and odd power.
* We might accidentally have added bugs since there is a large amount of
new code in this release.
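
The negative perfect power case mentioned above can be illustrated with a
short Python sketch. It is a naive reference check, not the algorithm GMP
uses; the helper names integer_root and is_perfect_power are hypothetical.
The point is that a negative number can only be an odd power, so -64 =
(-4)^3 qualifies even though 64 is also 8^2, while -16 does not because 16 is
only an even power (4^2, 2^4).

```python
def integer_root(n, k):
    """Return floor(n ** (1/k)) for n >= 0 and k >= 1, using binary search."""
    lo, hi = 0, 1 << (n.bit_length() // k + 1)  # hi is a safe upper bound
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** k <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def is_perfect_power(n):
    """True if n == a**b for some integers a and b > 1."""
    if n in (0, 1, -1):
        return True
    m = abs(n)
    for k in range(2, m.bit_length() + 1):
        if n < 0 and k % 2 == 0:
            continue  # an even power can never be negative
        r = integer_root(m, k)
        if r ** k == m:
            return True
    return False
```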
Speedups:
* Vastly improved assembly code for x86-64 processors from AMD and Intel.
* Major improvements also for many other processor families, such as
Alpha, PowerPC, and Itanium.
* New sub-quadratic mpn_gcd and mpn_gcdext, as well as improved basecase
gcd code.
* The multiply FFT code has been slightly improved.
* Balanced multiplication now uses 4-way Toom in addition to schoolbook,
Karatsuba, 3-way Toom, and FFT.
* Unbalanced multiplication has been vastly improved.
* Improved schoolbook division by means of faster quotient approximation.
* Several new algorithms for division and mod by single limbs, giving
many-fold speedups.
* Improved nth root computations.
* The mpz_nextprime function uses sieving and is much faster.
* Countless minor tweaks.
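
For readers unfamiliar with the multiplication algorithms named above, the
following Python sketch shows the Karatsuba step, which replaces four
half-size products with three. It is only an illustration: the function name
and the cutoff_bits parameter are hypothetical, Python's built-in
multiplication stands in for the schoolbook basecase, and GMP's real code
works on limb arrays with tuned thresholds.

```python
def karatsuba(x, y, cutoff_bits=64):
    """Multiply non-negative integers x and y using Karatsuba's split."""
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y  # stand-in for the schoolbook basecase
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    # Split each operand into high and low halves: x = x_hi * 2**half + x_lo.
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask
    hi = karatsuba(x_hi, y_hi, cutoff_bits)
    lo = karatsuba(x_lo, y_lo, cutoff_bits)
    # One extra product yields the middle term:
    # (x_hi + x_lo)(y_hi + y_lo) - hi - lo == x_hi*y_lo + x_lo*y_hi
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo, cutoff_bits) - hi - lo
    return (hi << (2 * half)) + (mid << half) + lo
```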
Features:
* Updated support for fat binaries for x86_32 includes current processors.
* Lots of new mpn internal interfaces. Some of them will become public
in a future GMP release.
* Support for the 32-bit ABI under x86-apple-darwin.
* x86 CPU recognition code should now default better for future
processors.
* The experimental nails feature does not work in this release, but
it might be re-enabled in the future.
Misc:
* The gmp_version variable now always contains three parts. For this
release, it is "4.3.0".
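
A script that compares GMP version strings might handle the three-part form
along these lines. This is a hypothetical Python helper, not part of GMP; it
assumes that strings from earlier releases could have only two parts (for
example "4.2") and pads them with a zero.

```python
def parse_gmp_version(text):
    """Split a GMP version string into a (major, minor, patch) tuple of ints."""
    parts = [int(p) for p in text.split(".")]
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"unexpected version string: {text!r}")
    # Assumption: a two-part string means patch level zero.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)
```

Tuples compare element by element, so parse_gmp_version("4.3.0") is greater
than parse_gmp_version("4.2").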