mpir/NEWS

Copyright 1996, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006 Free Software
Foundation, Inc.

Copyright 2009 William Hart

This file is part of the MPIR Library.

The MPIR Library is free software; you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1 of the License, or (at your
option) any later version.

The MPIR Library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
License for more details.

You should have received a copy of the GNU Lesser General Public License
along with the GNU MP Library; see the file COPYING.LIB.  If not, write to
the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
MA 02110-1301, USA.

Changes between MPIR 1.2.0 and MPIR 1.3.0

Bug Fixes:

 * Fixes to the build system to better support MinGW
 * Fixed a memory leak in lehmer GCD code
 * Fixed a CPU misidentification on BSD
 * Fixed a BSD install issue
 * Fixed a make try warning on Solaris
 * Fixed make distclean to clean up properly after a fat binary build
 * Fixed a bug in make distcheck
 * Fixed mpf_eq bug (reported on GMP list)
 * Fixed non-uniformness of mpz_urandomm
 * Fixed mpf exponent printing issue (reported on GMP list)
 * Fixed bug in sparc32/v9 add/sub code
 * Fixed bug in rootrem code

Speedups:

 * Unbalanced Toom 4 multiplication
 * Toom 53 multiplication
 * New fast single limb gcd and gcdext routines
 * Switched on ngcd based Lehmer GCD routine
 * Strassen multiplication for 2x2 matrices to speed up ngcd and ngcdext
 * Switched on new MPN_ZERO and mpn_store assembly routines in FFT code
 * Left and right shift assembly code for x86_64
 * Rewrote generic mullow and mulhi functions
 * New mpz factorial code and tuning (contributed by Robert Gerbicz)
 * Updating of 32 bit Windows support for AMD64, p3 and p4
 * Core2/penryn and nehalem mpn_store assembly code
 * Core2/penryn copyi assembly code
 * Better 32 bit k8/k10 and Nehalem assembly code
 * Initial support for via Nano
 * New mpn_rootrem code
 * Select better assembly code for Atom 64 bit
 * New faster mpz_tdiv_q code
 * Faster division and exact division by a single limb on x86_64
 * Core2/penryn and nehalem addlsh_n assembly code
 * K8/k10 addlsh_n, sublsh_n assembly functions, including carry in variants
 * K8/k10 inclsh_n, declsh_n assembly code

Features:

 * Middle product multiplication (by David Harvey)
 * Optimised k8/k10 and Nehalem assembly code for add_err1_n, sub_err1_n used by mulmid
 * Speed program accepts lines of data from a text file
 * A batch script to build MPIR using MSVC using a configure/make like syntax
 * Complete rewrite of the benchmark program in C by Brian Gladman
 * New primality test code written by T. R. Nicely used as a benchmark case, adapted with
   the help of Jeff Gilchrist
 * mpn_lshift2 and mpn_rshift2 assembly functions
 * Latest Yasm assembler
 * sb_divappr_q, schoolbook approximate quotient
 * dc_divappr_q, divide and conquer approximate quotient (by David Harvey)
 * Script for setting all version numbers automatically when doing a release
 * mpn_neg_n function
 * New mpn_mulmod_2expp1 and mpn_mulmod_2expm1 functions
 * Benchmark for mpn functions
 * New k8 mpn_lshiftc assembler function
 * Macro functions inclsh1, declsh1
 * The try program now tests macro functions
 * Macros for memory managers to determine when reallocations are likely to occur
 * New function mpz_nthroot
 * New mpz_next_likely_prime, mpz_probable_prime_p and mpz_likely_prime functions
 * BPSW primality test code for integers up to GMP_LIMB_BITS, contributed by Peter Shrimpton
 * Factor out trial division function from primality test code
 * New mpf_rrandomb without global state
 * New mpn_urandomb, mpn_urandomm, mpn_rrandom and mpn_randomb functions without global state
 * New mpn_invert code (contributed by Paul Zimmermann), used in division code
 * New generic divrem_hensel functions
 * Implement Peter Montgomery's mpn_mod_1_k algorithms
 * Optimised AMD, core2/penryn, atom, nehalem assembly functions for mpn_mod_1_?
 * New assembly code for AMD divrem_hensel_qr_1, divrem_hensel_r_1
 * New AMD, core2/penryn, atom, nehalem assembly functions  mpn_rsh_divrem_hensel_qr_1_2
 * New optimised AMD, core2/penryn, atom, nehalem assembly functions mpn_divrem_hensel_qr_1_2
 * New generic functions mpn_rsh_divrem_hensel_qr_1_?
 * New generic mpn_tdiv_q function (based on mulmid/dc_divappr_q code)
 * Improved Windows timing code
 * Support for new Intel family 6, model 30

Changes:

 * Removed requirement to type make install-gmpcompat
 * Make check tests both static and dynamic libraries where code differs
 * Changed library version numbers from x.y to x.y.z when doing a new minor release
 * Removed numerous extremely old deprecated functions
 * Removed mpbsd support from MPIR
 * Removed ancient ansi2knr conversion
 * Added architecture directory k102 for Phenom II assembly code


Changes between MPIR 1.1.0 and MPIR 1.2.0

  Bugs:
  * None

  Speedups:
  * Add new FFT code written by Paul Zimmermann as revised by Paul Zimmermann,
    Pierrick Gaudry, Alexander Kruppa and Torbjorn Granlund, with numerous bug
    fixes due to William Hart
  * Add tuning parameters for new FFT for most modern processors
  * Write tuning code for new FFT
  * Implement Toom32, unbalanced Toom3, Toom42
  * Optimise Toom3 and Toom3 squaring code using better sequences
  * Factor out Toom4/7 interpolate sequences and switch to twos complement
  * Optimise memory usage in Toom 3, 4 and 7 routines
  * Many new highly optimised assembly routines for x86_64 architectures
  * Fast XGCD based on Moller's ngcd algorithm

  Features:
  * Modified speed program to be able to add values from columns together

  Changes:
  * None

Changes between MPIR 1.0.0 and MPIR 1.1.0

  Bugs:
  * Work around a linker bug in Apple Darwin Tiger
  * Resolve an issue causing a build failure on recent Cygwin32's
  * Fixed development test code to do proper overlap tests for functions with
    three source operands

  Speedups:
  * Added numerous assembly optimised linear division functions (Jason Moxham)
  * Optimised mul_2 and addmul_2 (Jason Moxham)
  * Added Toom 4 and Toom 7 multiplication for balanced operands (William Hart)
  * Small speedup for mpz_mul for small operands when not aliased

  Features:
  * Complete rearrangement of cpu detection code to explicitly support k8, k10,
    pentium4, prescott, netburst, netburstlahf, core2, core, penryn, atom, nehalem
  * factored out x86/x86_64 detection for both ordinary and fat builds into cpuid.c
  * Distribute mpirbench with mpir (new make bench option)
  * Added __GMP_CC and __GMP_CFLAGS, __MPIR_CC and __MPIR_CFLAGS to gmp/mpir.h
  * Report when CPU is not identified (try sensible defaults)
  * Support Pentium 4's that do not support LAHF/SAHF instructions
  * Support Pathscale gcc on MIPS64
  * Addition of assembly optimised subadd_n function

  Changes:
  * Re-enabled mpbsd functionality

Changes between MPIR 0.9.0 and MPIR 1.0.0

  Bugs:
  * Building outside the source tree is now possible
  * Bug removed from Windows Assembler file dive_1.asm
  * Fat binary support for Core 2 64 bit fixed
  * x86_64 fat binary support on Sun machines with gcc fixed
  * Build failure on Sun machines using later versions of gcc fixed
  * Aliasing bug in mpz_urandomm fixed
  * Fixed numerous build bugs on OSX (reported by Michael Abshoff)

  Speedups:
  * Dramatic speedups for K8 assembly code (due primarily to Jason Moxham)
  * Assembly support for K10
  * Significant speedups for Core 2 assembly (due primarily to Jason Moxham)
  * Some mpn assembler functions were not being used in mpz layer due to
    missing HAVE_NATIVE flags
  * Nocona processors now use Core 2 assembly functions instead of generic C

  Features:
  * Emit mpir binaries and mpir.h and offer support for gmp compatibility
  * Build support for Intel Atom
  * Unrecognised Intel 64 machines now default to Core 2 assembly support
  * Some new, undocumented mpn functions
  * Try, speed and tune now available for Windows MSVC build

Changes between GMP 4.2.1 and MPIR 0.9.0

  Bugs:
  * Sun CC support
  * C99 support in gmp.h
  * Build fixes for Apple GCC compiler
  * Numerous bug fixes posted to gmp-devel for GMP 4.2.1
  * Corrections in documentation including function prototypes
  * Build fix (-fast) for cc on sparc-solaris
  * Support for Core 2 Solaris
  * Support for SiCortex MIPS
  * Distinguish and detect P4, Nocona, Prescott
  * Support numerous recent Intel family 6 and AMD Dunnington prcessors
  * Fixed bugs in perfect power detection

  Speedups:
  * Jason Martin's Core 2 assembly patches
  * Niels M<>hler's GCD patches
  * Pierrick Gaudry's AMD64 assembly patches
  * Tuning flags for P4, Prescott, Nocona and Core 2

  Features:
  * x86_64 code to Yasm format (Yasm supplied with MPIR)
  * Support for building on MSVC
  * x86_64 fat binary support

  Changes:
  * Disabled nails support
  * Removed macos port

Changes between GMP version 4.2 and 4.2.1

  Bugs:
  * Shared library numbers corrected.
  * Broken support for 32-bit AIX fixed.
  * Misc minor fixes.

  Speedups:
  * Exact division (mpz_divexact) now falls back to plain division for large
    operands.

  Features:
  * Support for some new systems.


Changes between GMP version 4.1.4 and 4.2

  Bugs:
  * Minor bug fixes and code generalizations.
  * Expanded and improved test suite.

  Speedups:
  * Many minor optimizations, too many to mention here.
  * Division now always subquadratic.
  * Computation of n-factorial much faster.
  * Added basic x86-64 assembly code.
  * Floating-point output is now subquadratic for all bases.
  * FFT multiply code now about 25% faster.
  * Toom3 multiply code faster.

  Features:
  * Much improved configure.
  * Workarounds for many more compiler bugs.
  * Temporary allocations are now made on the stack only if small.
  * New systems supported: HPPA-2.0 gcc, IA-64 HP-UX, PowerPC-64 Darwin,
    Sparc64 GNU/Linux.
  * New i386 fat binaries, selecting optimised code at runtime (--enable-fat).
  * New build option: --enable-profiling=instrument.
  * New memory function: mp_get_memory_functions.
  * New Mersenne Twister random numbers: gmp_randinit_mt, also now used for
    gmp_randinit_default.
  * New random functions: gmp_randinit_set, gmp_urandomb_ui, gmp_urandomm_ui.
  * New integer functions: mpz_combit, mpz_rootrem.
  * gmp_printf etc new type "M" for mp_limb_t.
  * gmp_scanf and friends now accept C99 hex floats.
  * Numeric input and output can now be in bases up to 62.
  * Comparisons mpz_cmp_d, mpz_cmpabs_d, mpf_cmp_d recognise infinities.
  * Conversions mpz_get_d, mpq_get_d, mpf_get_d truncate towards zero,
    previously their behaviour was unspecified.
  * Fixes for overflow issues with operands >= 2^31 bits.

  Caveats:
  * mpfr is gone, and will from now on be released only separately.  Please see
    www.mpfr.org.


Changes between GMP version 4.1.3 and 4.1.4

* Bug fix to FFT multiplication code (crash for huge operands).
* Bug fix to mpf_sub (miscomputation).
* Support for powerpc64-gnu-linux.
* Better support for AMD64 in 32-bit mode.
* Upwardly binary compatible with 4.1.3, 4.1.2, 4.1.1, 4.1, 4.0.1, 4.0,
  and 3.x versions.


Changes between GMP version 4.1.2 and 4.1.3

* Bug fix for FFT multiplication code (miscomputation).
* Bug fix to K6 assembly code for gcd.
* Bug fix to IA-64 assembly code for population count.
* Portability improvements, most notably functional AMD64 support.
* mpz_export allows NULL for countp parameter.
* Many minor bug fixes.
* mpz_export allows NULL for countp parameter.
* Upwardly binary compatible with 4.1.2, 4.1.1, 4.1, 4.0.1, 4.0, and 3.x
  versions.


Changes between GMP version 4.1.1 and 4.1.2

* Bug fixes.


Changes between GMP version 4.1 and 4.1.1

* Bug fixes.
* New systems supported: NetBSD and OpenBSD sparc64.


Changes between GMP version 4.0.1 and 4.1

* Bug fixes.
* Speed improvements.
* Upwardly binary compatible with 4.0, 4.0.1, and 3.x versions.
* Asymptotically fast conversion to/from strings (mpz, mpq, mpn levels), but
  also major speed improvements for tiny operands.
* mpn_get_str parameter restrictions relaxed.
* Major speed improvments for HPPA 2.0 systems.
* Major speed improvments for UltraSPARC systems.
* Major speed improvments for IA-64 systems (but still sub-optimal code).
* Extended test suite.
* mpfr is back, with many bug fixes and portability improvements.
* New function: mpz_ui_sub.
* New functions: mpz_export, mpz_import.
* Optimization for nth root functions (mpz_root, mpz_perfect_power_p).
* Optimization for extended gcd (mpz_gcdext, mpz_invert, mpn_gcdext).
* Generalized low-level number format, reserving a `nails' part of each
  limb.  (Please note that this is really experimental; some functions
  are likely to compute garbage when nails are enabled.)
* Nails-enabled Alpha 21264 assembly code, allowing up to 75% better
  performance.  (Use --enable-nails=4 to enable it.)


Changes between GMP version 4.0 and 4.0.1

* Bug fixes.


Changes between GMP version 3.1.1 and 4.0

* Bug fixes.
* Speed improvements.
* Upwardly binary compatible with 3.x versions.
* New CPU support: IA-64, Pentium 4.
* Improved CPU support: 21264, Cray vector systems.
* Support for all MIPS ABIs: o32, n32, 64.
* New systems supported: Darwin, SCO, Windows DLLs.
* New divide-and-conquer square root algorithm.
* New algorithms chapter in the manual.
* New malloc reentrant temporary memory method.
* New C++ class interface by Gerardo Ballabio (beta).
* Revamped configure, featuring ABI selection.
* Speed improvements for mpz_powm and mpz_powm_ui (mainly affecting small
  operands).
* mpz_perfect_power_p now properly recognizes 0, 1, and negative perfect
  powers.
* mpz_hamdist now supports negative operands.
* mpz_jacobi now accepts non-positive denominators.
* mpz_powm now supports negative exponents.
* mpn_mul_1 operand overlap requirements relaxed.
* Float input and output uses locale specific decimal point where available.
* New gmp_printf, gmp_scanf and related functions.
* New division functions: mpz_cdiv_q_2exp, mpz_cdiv_r_2exp, mpz_divexact_ui.
* New divisibility tests: mpz_divisible_p, mpz_divisible_ui_p,
  mpz_divisible_2exp_p, mpz_congruent_p, mpz_congruent_ui_p,
  mpz_congruent_2exp_p.
* New Fibonacci function: mpz_fib2_ui.
* New Lucas number functions: mpz_lucnum_ui, mpz_lucnum2_ui.
* Other new integer functions: mpz_cmp_d, mpz_cmpabs_d, mpz_get_d_2exp,
  mpz_init2, mpz_kronecker, mpz_lcm_ui, mpz_realloc2.
* New rational I/O: mpq_get_str, mpq_inp_str, mpq_out_str, mpq_set_str.
* Other new rational functions: mpq_abs, mpq_cmp_si, mpq_div_2exp,
  mpq_mul_2exp, mpq_set_f.
* New float tests: mpf_integer_p, mpf_fits_sint_p, mpf_fits_slong_p,
  mpf_fits_sshort_p, mpf_fits_uint_p, mpf_fits_ulong_p, mpf_fits_ushort_p.
* Other new float functions: mpf_cmp_d, mpf_get_default_prec, mpf_get_si,
  mpf_get_ui, mpf_get_d_2exp.
* New random functions: gmp_randinit_default, gmp_randinit_lc_2exp_size.
* New demo expression string parser (see demos/expr).
* New preliminary perl interface (see demos/perl).
* Tuned algorithm thresholds for many more CPUs.


Changes between GMP version 3.1 and 3.1.1

* Bug fixes for division (rare), mpf_get_str, FFT, and miscellaneous minor
  things.


Changes between GMP version 3.0 and 3.1

* Bug fixes.
* Improved `make check' running more tests.
* Tuned algorithm cutoff points for many machines.  This will improve speed for
  a lot of operations, in some cases by a large amount.
* Major speed improvments: Alpha 21264.
* Some speed improvments: Cray vector computers, AMD K6 and Athlon, Intel P5
  and Pentium Pro/II/III.
* The mpf_get_prec function now works as it did in GMP 2.
* New utilities for auto-tuning and speed measuring.
* Multiplication now optionally uses FFT for very large operands.  (To enable
  it, pass --enable-fft to configure.)
* Support for new systems: Solaris running on x86, FreeBSD 5, HP-UX 11, Cray
  vector computers, Rhapsody, Nextstep/Openstep, MacOS.
* Support for shared libraries on 32-bit HPPA.
* New integer functions: mpz_mul_si, mpz_odd_p, mpz_even_p.
* New Kronecker symbol functions: mpz_kronecker_si, mpz_kronecker_ui,
  mpz_si_kronecker, mpz_ui_kronecker.
* New rational functions: mpq_out_str, mpq_swap.
* New float functions: mpf_swap.
* New mpn functions: mpn_divexact_by3c, mpn_tdiv_qr.
* New EXPERIMENTAL function layer for accurate floating-point arithmetic, mpfr.
  To try it, pass --enable-mpfr to configure.  See the mpfr subdirectory for
  more information; it is not documented in the main GMP manual.


Changes between GMP version 3.0 and 3.0.1

* Memory leaks in gmp_randinit and mpz_probab_prime_p fixed.
* Documentation for gmp_randinit fixed.  Misc documentation errors fixed.


Changes between GMP version 2.0 and 3.0

* Source level compatibility with past releases (except mpn_gcd).
* Bug fixes.
* Much improved speed thanks to both host independent and host dependent
  optimizations.
* Switch to autoconf/automake/libtool.
* Support for building libgmp as a shared library.
* Multiplication and squaring using 3-way Toom-Cook.
* Division using the Burnikel-Ziegler method.
* New functions computing binomial coefficients: mpz_bin_ui, mpz_bin_uiui.
* New function computing Fibonacci numbers: mpz_fib_ui.
* New random number generators: mpf_urandomb, mpz_rrandomb, mpz_urandomb,
  mpz_urandomm, gmp_randclear, gmp_randinit, gmp_randinit_lc_2exp, gmp_randseed,
  gmp_randseed_ui.
* New function for quickly extracting limbs: mpz_getlimbn.
* New functions performing integer size tests: mpz_fits_sint_p,
  mpz_fits_slong_p, mpz_fits_sshort_p, mpz_fits_uint_p, mpz_fits_ulong_p,
  mpz_fits_ushort_p.
* New mpf functions: mpf_ceil, mpf_floor, mpf_pow_ui, mpf_trunc.
* New mpq function: mpq_set_d.
* New mpz functions: mpz_addmul_ui, mpz_cmpabs, mpz_cmpabs_ui, mpz_lcm,
  mpz_nextprime, mpz_perfect_power_p, mpz_remove, mpz_root, mpz_swap,
  mpz_tdiv_ui, mpz_tstbit, mpz_xor.
* New mpn function: mpn_divexact_by3.
* New CPU support: DEC Alpha 21264, AMD K6 and Athlon, HPPA 2.0 and 64,
  Intel Pentium Pro and Pentium-II/III, Sparc 64, PowerPC 64.
* Almost 10 times faster mpz_invert and mpn_gcdext.
* The interface of mpn_gcd has changed.
* Better support for MIPS R4x000 and R5000 under Irix 6.
* Improved support for SPARCv8 and SPARCv9 processors.


Changes between GMP version 2.0 and 2.0.2

* Many bug fixes.


Changes between GMP version 1.3.2 and 2.0

* Division routines in the mpz class have changed.  There are three classes of
  functions, that rounds the quotient to -infinity, 0, and +infinity,
  respectively.  The first class of functions have names that begin with
  mpz_fdiv (f is short for floor), the second class' names begin with mpz_tdiv
  (t is short for trunc), and the third class' names begin with mpz_cdiv (c is
  short for ceil).

  The old division routines beginning with mpz_m are similar to the new
  mpz_fdiv, with the exception that some of the new functions return useful
  values.

  The old function names can still be used.  All the old functions names will
  now do floor division, not trunc division as some of them used to.  This was
  changed to make the functions more compatible with common mathematical
  practice.

  The mpz_mod and mpz_mod_ui functions now compute the mathematical mod
  function.  I.e., the sign of the 2nd argument is ignored.

* The mpq assignment functions do not canonicalize their results.  A new
  function, mpq_canonicalize must be called by the user if the result is not
  known to be canonical.
* The mpn functions are now documented.  These functions are intended for
  very time critical applications, or applications that need full control over
  memory allocation.  Note that the mpn interface is irregular and hard to
  use.
* New functions for arbitrary precision floating point arithmetic.  Names
  begin with `mpf_'.  Associated type mpf_t.
* New and improved mpz functions, including much faster GCD, fast exact
  division (mpz_divexact), bit scan (mpz_scan0 and mpz_scan1), and number
  theoretical functions like Jacobi (mpz_jacobi) and multiplicative inverse
  (mpz_invert).
* New variable types (mpz_t and mpq_t) are available that makes syntax of
  mpz and mpq calls nicer (no need for & before variables).  The MP_INT and
  MP_RAT types are still available for compatibility.
* Uses GNU configure.  This makes it possible to choose target architecture
  and CPU variant, and to compile into a separate object directory.
* Carefully optimized assembly for important inner loops.  Support for DEC
  Alpha, Amd 29000, HPPA 1.0 and 1.1, Intel Pentium and generic x86, Intel
  i960, Motorola MC68000, MC68020, MC88100, and MC88110, Motorola/IBM
  PowerPC, National NS32000, IBM POWER, MIPS R3000, R4000, SPARCv7,
  SuperSPARC, generic SPARCv8, and DEC VAX.  Some support also for ARM,
  Clipper, IBM ROMP (RT), and Pyramid AP/XP.
* Faster.  Thanks to the assembler code, new algorithms, and general tuning.
  In particular, the speed on machines without GCC is improved.
* Support for machines without alloca.
* Now under the LGPL.

INCOMPATIBILITIES BETWEEN GMP 1 AND GMP 2

* mpq assignment functions do not canonicalize their results.
* mpz division functions round differently.
* mpz mod functions now really compute mod.
* mpz_powm and mpz_powm_ui now really use mod for reduction.


----------------
Local variables:
mode: text
fill-column: 76
End: