Copyright 1996, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006 Free
Software Foundation, Inc.

Copyright 2009 William Hart

This file is part of the MPIR Library.

The MPIR Library is free software; you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published
by the Free Software Foundation; either version 2.1 of the License, or (at
your option) any later version.

The MPIR Library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
License for more details.

You should have received a copy of the GNU Lesser General Public License
along with the GNU MP Library; see the file COPYING.LIB. If not, write to
the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
Boston, MA 02110-1301, USA.


Changes between MPIR 1.3.0 and MPIR 2.0.0

  License:
  * Switched to overall LGPL v3+

  Bug Fixes:
  * Fixed a bug in the probable prime code (reported by Xiangyu Liu)
  * Fixed a build issue on 32 bit p6 Apples
  * Fixed demos/pollard_rho
  * Numerous tuning bug fixes

  Speedups:
  * Sped up squaring code
  * Minor speedup to toom4 code
  * Sped up x86_64 divrem_1 when divisor is 64 bits
  * Sped up x86_64 divrem_2
  * Sped up GCD and GCDEXT by an improved nhgcd2.c
  * Sped up addmul code for Itanium (by Jason Martin)
  * Large number of new and sped up Itanium assembly functions (by
    Torbjorn Granlund)

  Features:
  * Toom8.5 code (by Marco Bodrato)
  * Schoolbook Euclidean division code (by Torbjorn Granlund)
  * Divide and conquer Euclidean division code (by Torbjorn Granlund and
    Marco Bodrato, adapted to use David Harvey's middle product based
    approximate quotient code)
  * Asymptotically fast division code (by William Hart), based on Paul
    Zimmermann's mpn_invert and some reuse of the divide and conquer code.
  * New mpn_tdiv_q and mpn_tdiv_qr code (by Torbjorn Granlund)
  * Schoolbook Hensel division code (largely by Niels Moller)
  * Divide and conquer Hensel division code (by Niels Moller, Torbjorn
    Granlund and David Harvey)
  * New mpn_divexact code and mpz_divexact to match (by Torbjorn Granlund)
  * New mpn_rootrem, mpz_rootrem and mpz_root code (by Paul Zimmermann and
    Torbjorn Granlund)
  * New mpn_neg, mpn_sqr, mpn_zero, mpn_and_n, mpn_ior_n, mpn_xor_n,
    mpn_xnor_n, mpn_nand_n, etc (by Torbjorn Granlund)
  * New string input/output code (by Torbjorn Granlund)
  * New mp_bitcnt_t type for multiple precision bit counts

  Changes:
  * Removed benchmark 0.1 code from tarball
  * Updated GMP_VERSION to "5.0.1"


Changes between MPIR 1.2.0 and MPIR 1.3.0

  Bug Fixes:
  * Fixes to the build system to better support MinGW
  * Fixed a memory leak in lehmer GCD code
  * Fixed a CPU misidentification on BSD
  * Fixed a BSD install issue
  * Fixed a make try warning on Solaris
  * Fixed make distclean to clean up properly after a fat binary build
  * Fixed a bug in make distcheck
  * Fixed mpf_eq bug (reported on GMP list)
  * Fixed non-uniformness of mpz_urandomm
  * Fixed mpf exponent printing issue (reported on GMP list)
  * Fixed bug in sparc32/v9 add/sub code
  * Fixed bug in rootrem code

  Speedups:
  * Unbalanced Toom 4 multiplication
  * Toom 53 multiplication
  * New fast single limb gcd and gcdext routines
  * Switched on ngcd based Lehmer GCD routine
  * Strassen multiplication for 2x2 matrices to speed up ngcd and ngcdext
  * Switched on new MPN_ZERO and mpn_store assembly routines in FFT code
  * Left and right shift assembly code for x86_64
  * Rewrote generic mullow and mulhi functions
  * New mpz factorial code and tuning (contributed by Robert Gerbicz)
  * Updating of 32 bit Windows support for AMD64, p3 and p4
  * Core2/penryn and nehalem mpn_store assembly code
  * Core2/penryn copyi assembly code
  * Better 32 bit k8/k10 and Nehalem assembly code
  * Initial support for VIA Nano
  * New mpn_rootrem code
  * Select better assembly code for Atom 64 bit
  * New faster mpz_tdiv_q code
  * Faster division and exact division by a single limb on x86_64
  * Core2/penryn and nehalem addlsh_n assembly code
  * K8/k10 addlsh_n, sublsh_n assembly functions, including carry in
    variants
  * K8/k10 inclsh_n, declsh_n assembly code

  Features:
  * Middle product multiplication (by David Harvey)
  * Optimised k8/k10 and Nehalem assembly code for add_err1_n, sub_err1_n
    used by mulmid
  * Speed program accepts lines of data from a text file
  * A batch script to build MPIR using MSVC using a configure/make like
    syntax
  * Complete rewrite of the benchmark program in C by Brian Gladman
  * New primality test code written by T. R. Nicely used as a benchmark
    case, adapted with the help of Jeff Gilchrist
  * mpn_lshift2 and mpn_rshift2 assembly functions
  * Latest Yasm assembler
  * sb_divappr_q, schoolbook approximate quotient
  * dc_divappr_q, divide and conquer approximate quotient (by David Harvey)
  * Script for setting all version numbers automatically when doing a
    release
  * mpn_neg_n function
  * New mpn_mulmod_2expp1 and mpn_mulmod_2expm1 functions
  * Benchmark for mpn functions
  * New k8 mpn_lshiftc assembler function
  * Macro functions inclsh1, declsh1
  * The try program now tests macro functions
  * Macros for memory managers to determine when reallocations are likely
    to occur
  * New function mpz_nthroot
  * New mpz_next_likely_prime, mpz_probable_prime_p and mpz_likely_prime
    functions
  * BPSW primality test code for integers up to GMP_LIMB_BITS, contributed
    by Peter Shrimpton
  * Factor out trial division function from primality test code
  * New mpf_rrandomb without global state
  * New mpn_urandomb, mpn_urandomm, mpn_rrandom and mpn_randomb functions
    without global state
  * New mpn_invert code (contributed by Paul Zimmermann), used in division
    code
  * New generic divrem_hensel functions
  * Implement Peter Montgomery's mpn_mod_1_k algorithms
  * Optimised AMD, core2/penryn, atom, nehalem assembly functions for
    mpn_mod_1_?
  * New assembly code for AMD divrem_hensel_qr_1, divrem_hensel_r_1
  * New AMD, core2/penryn, atom, nehalem assembly functions
    mpn_rsh_divrem_hensel_qr_1_2
  * New optimised AMD, core2/penryn, atom, nehalem assembly functions
    mpn_divrem_hensel_qr_1_2
  * New generic functions mpn_rsh_divrem_hensel_qr_1_?
  * New generic mpn_tdiv_q function (based on mulmid/dc_divappr_q code)
  * Improved Windows timing code
  * Support for new Intel family 6, model 30

  Changes:
  * Removed requirement to type make install-gmpcompat
  * Make check tests both static and dynamic libraries where code differs
  * Changed library version numbers from x.y to x.y.z when doing a new
    minor release
  * Removed numerous extremely old deprecated functions
  * Removed mpbsd support from MPIR
  * Removed ancient ansi2knr conversion
  * Added architecture directory k102 for Phenom II assembly code


Changes between MPIR 1.1.0 and MPIR 1.2.0

  Bugs:
  * None

  Speedups:
  * Add new FFT code written by Paul Zimmermann as revised by Paul
    Zimmermann, Pierrick Gaudry, Alexander Kruppa and Torbjorn Granlund,
    with numerous bug fixes due to William Hart
  * Add tuning parameters for new FFT for most modern processors
  * Write tuning code for new FFT
  * Implement Toom32, unbalanced Toom3, Toom42
  * Optimise Toom3 and Toom3 squaring code using better sequences
  * Factor out Toom4/7 interpolate sequences and switch to twos complement
  * Optimise memory usage in Toom 3, 4 and 7 routines
  * Many new highly optimised assembly routines for x86_64 architectures
  * Fast XGCD based on Moller's ngcd algorithm

  Features:
  * Modified speed program to be able to add values from columns together

  Changes:
  * None


Changes between MPIR 1.0.0 and MPIR 1.1.0

  Bugs:
  * Work around a linker bug in Apple Darwin Tiger
  * Resolve an issue causing a build failure on recent Cygwin32's
  * Fixed development test code to do proper overlap tests for functions
    with three source operands

  Speedups:
  * Added numerous assembly optimised linear division functions (Jason
    Moxham)
  * Optimised mul_2 and addmul_2 (Jason Moxham)
  * Added Toom 4 and Toom 7 multiplication for balanced operands (William
    Hart)
  * Small speedup for mpz_mul for small operands when not aliased

  Features:
  * Complete rearrangement of cpu detection code to explicitly support k8,
    k10, pentium4, prescott, netburst, netburstlahf, core2, core, penryn,
    atom, nehalem
  * Factored out x86/x86_64 detection for both ordinary and fat builds into
    cpuid.c
  * Distribute mpirbench with mpir (new make bench option)
  * Added __GMP_CC and __GMP_CFLAGS, __MPIR_CC and __MPIR_CFLAGS to
    gmp/mpir.h
  * Report when CPU is not identified (try sensible defaults)
  * Support Pentium 4's that do not support LAHF/SAHF instructions
  * Support Pathscale gcc on MIPS64
  * Addition of assembly optimised subadd_n function

  Changes:
  * Re-enabled mpbsd functionality


Changes between MPIR 0.9.0 and MPIR 1.0.0

  Bugs:
  * Building outside the source tree is now possible
  * Bug removed from Windows Assembler file dive_1.asm
  * Fat binary support for Core 2 64 bit fixed
  * x86_64 fat binary support on Sun machines with gcc fixed
  * Build failure on Sun machines using later versions of gcc fixed
  * Aliasing bug in mpz_urandomm fixed
  * Fixed numerous build bugs on OSX (reported by Michael Abshoff)

  Speedups:
  * Dramatic speedups for K8 assembly code (due primarily to Jason Moxham)
  * Assembly support for K10
  * Significant speedups for Core 2 assembly (due primarily to Jason
    Moxham)
  * Some mpn assembler functions were not being used in mpz layer due to
    missing HAVE_NATIVE flags
  * Nocona processors now use Core 2 assembly functions instead of generic
    C

  Features:
  * Emit mpir binaries and mpir.h and offer support for gmp compatibility
  * Build support for Intel Atom
  * Unrecognised Intel 64 machines now default to Core 2 assembly support
  * Some new, undocumented mpn functions
  * Try, speed and tune now available for Windows MSVC build


Changes between GMP 4.2.1 and MPIR 0.9.0

  Bugs:
  * Sun CC support
  * C99 support in gmp.h
  * Build fixes for Apple GCC compiler
  * Numerous bug fixes posted to gmp-devel for GMP 4.2.1
  * Corrections in documentation including function prototypes
  * Build fix (-fast) for cc on sparc-solaris
  * Support for Core 2 Solaris
  * Support for SiCortex MIPS
  * Distinguish and detect P4, Nocona, Prescott
  * Support numerous recent Intel family 6 and AMD Dunnington processors
  * Fixed bugs in perfect power detection

  Speedups:
  * Jason Martin's Core 2 assembly patches
  * Niels Möller's GCD patches
  * Pierrick Gaudry's AMD64 assembly patches
  * Tuning flags for P4, Prescott, Nocona and Core 2

  Features:
  * x86_64 code to Yasm format (Yasm supplied with MPIR)
  * Support for building on MSVC
  * x86_64 fat binary support

  Changes:
  * Disabled nails support
  * Removed macos port


Changes between GMP version 4.2 and 4.2.1

  Bugs:
  * Shared library numbers corrected.
  * Broken support for 32-bit AIX fixed.
  * Misc minor fixes.

  Speedups:
  * Exact division (mpz_divexact) now falls back to plain division for
    large operands.

  Features:
  * Support for some new systems.


Changes between GMP version 4.1.4 and 4.2

  Bugs:
  * Minor bug fixes and code generalizations.
  * Expanded and improved test suite.

  Speedups:
  * Many minor optimizations, too many to mention here.
  * Division now always subquadratic.
  * Computation of n-factorial much faster.
  * Added basic x86-64 assembly code.
  * Floating-point output is now subquadratic for all bases.
  * FFT multiply code now about 25% faster.
  * Toom3 multiply code faster.

  Features:
  * Much improved configure.
  * Workarounds for many more compiler bugs.
  * Temporary allocations are now made on the stack only if small.
  * New systems supported: HPPA-2.0 gcc, IA-64 HP-UX, PowerPC-64 Darwin,
    Sparc64 GNU/Linux.
  * New i386 fat binaries, selecting optimised code at runtime
    (--enable-fat).
  * New build option: --enable-profiling=instrument.
  * New memory function: mp_get_memory_functions.
  * New Mersenne Twister random numbers: gmp_randinit_mt, also now used for
    gmp_randinit_default.
  * New random functions: gmp_randinit_set, gmp_urandomb_ui,
    gmp_urandomm_ui.
  * New integer functions: mpz_combit, mpz_rootrem.
  * gmp_printf etc new type "M" for mp_limb_t.
  * gmp_scanf and friends now accept C99 hex floats.
  * Numeric input and output can now be in bases up to 62.
  * Comparisons mpz_cmp_d, mpz_cmpabs_d, mpf_cmp_d recognise infinities.
  * Conversions mpz_get_d, mpq_get_d, mpf_get_d truncate towards zero,
    previously their behaviour was unspecified.
  * Fixes for overflow issues with operands >= 2^31 bits.

  Caveats:
  * mpfr is gone, and will from now on be released only separately. Please
    see www.mpfr.org.


Changes between GMP version 4.1.3 and 4.1.4

* Bug fix to FFT multiplication code (crash for huge operands).
* Bug fix to mpf_sub (miscomputation).
* Support for powerpc64-gnu-linux.
* Better support for AMD64 in 32-bit mode.
* Upwardly binary compatible with 4.1.3, 4.1.2, 4.1.1, 4.1, 4.0.1, 4.0,
  and 3.x versions.


Changes between GMP version 4.1.2 and 4.1.3

* Bug fix for FFT multiplication code (miscomputation).
* Bug fix to K6 assembly code for gcd.
* Bug fix to IA-64 assembly code for population count.
* Portability improvements, most notably functional AMD64 support.
* mpz_export allows NULL for countp parameter.
* Many minor bug fixes.
* Upwardly binary compatible with 4.1.2, 4.1.1, 4.1, 4.0.1, 4.0, and 3.x
  versions.


Changes between GMP version 4.1.1 and 4.1.2

* Bug fixes.


Changes between GMP version 4.1 and 4.1.1

* Bug fixes.
* New systems supported: NetBSD and OpenBSD sparc64.


Changes between GMP version 4.0.1 and 4.1

* Bug fixes.
* Speed improvements.
* Upwardly binary compatible with 4.0, 4.0.1, and 3.x versions.
* Asymptotically fast conversion to/from strings (mpz, mpq, mpn levels),
  but also major speed improvements for tiny operands.
* mpn_get_str parameter restrictions relaxed.
* Major speed improvements for HPPA 2.0 systems.
* Major speed improvements for UltraSPARC systems.
* Major speed improvements for IA-64 systems (but still sub-optimal code).
* Extended test suite.
* mpfr is back, with many bug fixes and portability improvements.
* New function: mpz_ui_sub.
* New functions: mpz_export, mpz_import.
* Optimization for nth root functions (mpz_root, mpz_perfect_power_p).
* Optimization for extended gcd (mpz_gcdext, mpz_invert, mpn_gcdext).
* Generalized low-level number format, reserving a `nails' part of each
  limb. (Please note that this is really experimental; some functions are
  likely to compute garbage when nails are enabled.)
* Nails-enabled Alpha 21264 assembly code, allowing up to 75% better
  performance. (Use --enable-nails=4 to enable it.)


Changes between GMP version 4.0 and 4.0.1

* Bug fixes.


Changes between GMP version 3.1.1 and 4.0

* Bug fixes.
* Speed improvements.
* Upwardly binary compatible with 3.x versions.
* New CPU support: IA-64, Pentium 4.
* Improved CPU support: 21264, Cray vector systems.
* Support for all MIPS ABIs: o32, n32, 64.
* New systems supported: Darwin, SCO, Windows DLLs.
* New divide-and-conquer square root algorithm.
* New algorithms chapter in the manual.
* New malloc reentrant temporary memory method.
* New C++ class interface by Gerardo Ballabio (beta).
* Revamped configure, featuring ABI selection.
* Speed improvements for mpz_powm and mpz_powm_ui (mainly affecting small
  operands).
* mpz_perfect_power_p now properly recognizes 0, 1, and negative perfect
  powers.
* mpz_hamdist now supports negative operands.
* mpz_jacobi now accepts non-positive denominators.
* mpz_powm now supports negative exponents.
* mpn_mul_1 operand overlap requirements relaxed.
* Float input and output uses locale specific decimal point where
  available.
* New gmp_printf, gmp_scanf and related functions.
* New division functions: mpz_cdiv_q_2exp, mpz_cdiv_r_2exp,
  mpz_divexact_ui.
* New divisibility tests: mpz_divisible_p, mpz_divisible_ui_p,
  mpz_divisible_2exp_p, mpz_congruent_p, mpz_congruent_ui_p,
  mpz_congruent_2exp_p.
* New Fibonacci function: mpz_fib2_ui.
* New Lucas number functions: mpz_lucnum_ui, mpz_lucnum2_ui.
* Other new integer functions: mpz_cmp_d, mpz_cmpabs_d, mpz_get_d_2exp,
  mpz_init2, mpz_kronecker, mpz_lcm_ui, mpz_realloc2.
* New rational I/O: mpq_get_str, mpq_inp_str, mpq_out_str, mpq_set_str.
* Other new rational functions: mpq_abs, mpq_cmp_si, mpq_div_2exp,
  mpq_mul_2exp, mpq_set_f.
* New float tests: mpf_integer_p, mpf_fits_sint_p, mpf_fits_slong_p,
  mpf_fits_sshort_p, mpf_fits_uint_p, mpf_fits_ulong_p, mpf_fits_ushort_p.
* Other new float functions: mpf_cmp_d, mpf_get_default_prec, mpf_get_si,
  mpf_get_ui, mpf_get_d_2exp.
* New random functions: gmp_randinit_default, gmp_randinit_lc_2exp_size.
* New demo expression string parser (see demos/expr).
* New preliminary perl interface (see demos/perl).
* Tuned algorithm thresholds for many more CPUs.


Changes between GMP version 3.1 and 3.1.1

* Bug fixes for division (rare), mpf_get_str, FFT, and miscellaneous minor
  things.


Changes between GMP version 3.0 and 3.1

* Bug fixes.
* Improved `make check' running more tests.
* Tuned algorithm cutoff points for many machines. This will improve speed
  for a lot of operations, in some cases by a large amount.
* Major speed improvements: Alpha 21264.
* Some speed improvements: Cray vector computers, AMD K6 and Athlon, Intel
  P5 and Pentium Pro/II/III.
* The mpf_get_prec function now works as it did in GMP 2.
* New utilities for auto-tuning and speed measuring.
* Multiplication now optionally uses FFT for very large operands. (To
  enable it, pass --enable-fft to configure.)
* Support for new systems: Solaris running on x86, FreeBSD 5, HP-UX 11,
  Cray vector computers, Rhapsody, Nextstep/Openstep, MacOS.
* Support for shared libraries on 32-bit HPPA.
* New integer functions: mpz_mul_si, mpz_odd_p, mpz_even_p.
* New Kronecker symbol functions: mpz_kronecker_si, mpz_kronecker_ui,
  mpz_si_kronecker, mpz_ui_kronecker.
* New rational functions: mpq_out_str, mpq_swap.
* New float functions: mpf_swap.
* New mpn functions: mpn_divexact_by3c, mpn_tdiv_qr.
* New EXPERIMENTAL function layer for accurate floating-point arithmetic,
  mpfr. To try it, pass --enable-mpfr to configure. See the mpfr
  subdirectory for more information; it is not documented in the main GMP
  manual.


Changes between GMP version 3.0 and 3.0.1

* Memory leaks in gmp_randinit and mpz_probab_prime_p fixed.
* Documentation for gmp_randinit fixed. Misc documentation errors fixed.


Changes between GMP version 2.0 and 3.0

* Source level compatibility with past releases (except mpn_gcd).
* Bug fixes.
* Much improved speed thanks to both host independent and host dependent
  optimizations.
* Switch to autoconf/automake/libtool.
* Support for building libgmp as a shared library.
* Multiplication and squaring using 3-way Toom-Cook.
* Division using the Burnikel-Ziegler method.
* New functions computing binomial coefficients: mpz_bin_ui, mpz_bin_uiui.
* New function computing Fibonacci numbers: mpz_fib_ui.
* New random number generators: mpf_urandomb, mpz_rrandomb, mpz_urandomb,
  mpz_urandomm, gmp_randclear, gmp_randinit, gmp_randinit_lc_2exp,
  gmp_randseed, gmp_randseed_ui.
* New function for quickly extracting limbs: mpz_getlimbn.
* New functions performing integer size tests: mpz_fits_sint_p,
  mpz_fits_slong_p, mpz_fits_sshort_p, mpz_fits_uint_p, mpz_fits_ulong_p,
  mpz_fits_ushort_p.
* New mpf functions: mpf_ceil, mpf_floor, mpf_pow_ui, mpf_trunc.
* New mpq function: mpq_set_d.
* New mpz functions: mpz_addmul_ui, mpz_cmpabs, mpz_cmpabs_ui, mpz_lcm,
  mpz_nextprime, mpz_perfect_power_p, mpz_remove, mpz_root, mpz_swap,
  mpz_tdiv_ui, mpz_tstbit, mpz_xor.
* New mpn function: mpn_divexact_by3.
* New CPU support: DEC Alpha 21264, AMD K6 and Athlon, HPPA 2.0 and 64,
  Intel Pentium Pro and Pentium-II/III, Sparc 64, PowerPC 64.
* Almost 10 times faster mpz_invert and mpn_gcdext.
* The interface of mpn_gcd has changed.
* Better support for MIPS R4x000 and R5000 under Irix 6.
* Improved support for SPARCv8 and SPARCv9 processors.


Changes between GMP version 2.0 and 2.0.2

* Many bug fixes.


Changes between GMP version 1.3.2 and 2.0

* Division routines in the mpz class have changed. There are three classes
  of functions, which round the quotient to -infinity, 0, and +infinity,
  respectively. The first class of functions have names that begin with
  mpz_fdiv (f is short for floor), the second class' names begin with
  mpz_tdiv (t is short for trunc), and the third class' names begin with
  mpz_cdiv (c is short for ceil).

  The old division routines beginning with mpz_m are similar to the new
  mpz_fdiv, with the exception that some of the new functions return useful
  values.

  The old function names can still be used. All the old function names
  will now do floor division, not trunc division as some of them used to.
  This was changed to make the functions more compatible with common
  mathematical practice.

  The mpz_mod and mpz_mod_ui functions now compute the mathematical mod
  function. I.e., the sign of the 2nd argument is ignored.

* The mpq assignment functions do not canonicalize their results. A new
  function, mpq_canonicalize, must be called by the user if the result is
  not known to be canonical.
* The mpn functions are now documented. These functions are intended for
  very time critical applications, or applications that need full control
  over memory allocation. Note that the mpn interface is irregular and hard
  to use.
* New functions for arbitrary precision floating point arithmetic. Names
  begin with `mpf_'. Associated type mpf_t.
* New and improved mpz functions, including much faster GCD, fast exact
  division (mpz_divexact), bit scan (mpz_scan0 and mpz_scan1), and number
  theoretical functions like Jacobi (mpz_jacobi) and multiplicative inverse
  (mpz_invert).
* New variable types (mpz_t and mpq_t) are available that make the syntax
  of mpz and mpq calls nicer (no need for & before variables). The MP_INT
  and MP_RAT types are still available for compatibility.
* Uses GNU configure. This makes it possible to choose target architecture
  and CPU variant, and to compile into a separate object directory.
* Carefully optimized assembly for important inner loops. Support for DEC
  Alpha, AMD 29000, HPPA 1.0 and 1.1, Intel Pentium and generic x86, Intel
  i960, Motorola MC68000, MC68020, MC88100, and MC88110, Motorola/IBM
  PowerPC, National NS32000, IBM POWER, MIPS R3000, R4000, SPARCv7,
  SuperSPARC, generic SPARCv8, and DEC VAX. Some support also for ARM,
  Clipper, IBM ROMP (RT), and Pyramid AP/XP.
* Faster. Thanks to the assembler code, new algorithms, and general tuning.
  In particular, the speed on machines without GCC is improved.
* Support for machines without alloca.
* Now under the LGPL.

INCOMPATIBILITIES BETWEEN GMP 1 AND GMP 2

* mpq assignment functions do not canonicalize their results.
* mpz division functions round differently.
* mpz mod functions now really compute mod.
* mpz_powm and mpz_powm_ui now really use mod for reduction.



----------------
Local variables:
mode: text
fill-column: 76
End: