mpir/mpn/x86_64/amd64
2008-07-23 17:11:29 +00:00
add_n.as Moved intel format versions of add_n.as and sub_n.as to the /mpn/x86_64/amd64 directory to make way for intel format versions of original add_n.as and sub_n.as files from GMP 4.2.1. 2008-07-04 01:36:11 +00:00
addmul_1.as These files were moved in error, so I'm moving them back. See the comment in trac #59. 2008-07-04 01:43:48 +00:00
copyd.as Set native line endings for all .c, .h, as, .asm, .s, .in, .m4, .cc, am 2008-06-25 07:33:36 +00:00
copyi.as Set native line endings for all .c, .h, as, .asm, .s, .in, .m4, .cc, am 2008-06-25 07:33:36 +00:00
gmp-mparam.h Fixed the speed issues with a static library vs Pierrick Gaudry's 2008-06-04 03:47:49 +00:00
mul_basecase.as Put macros instances in all yasm assembly files for global symbol 2008-06-15 22:00:33 +00:00
README Basic GMP files with a new core2 directory and amd_64 directory with Martin's and Gaudry's patches. 2008-04-17 21:03:07 +00:00
sqr_basecase.as Set native line endings for all .c, .h, as, .asm, .s, .in, .m4, .cc, am 2008-06-25 07:33:36 +00:00
sub_n.as Moved intel format versions of add_n.as and sub_n.as to the /mpn/x86_64/amd64 directory to make way for intel format versions of original add_n.as and sub_n.as files from GMP 4.2.1. 2008-07-04 01:36:11 +00:00
submul_1.as These files were moved in error, so I'm moving them back. See the comment in trac #59. 2008-07-04 01:43:48 +00:00
udiv.as Set native line endings for all .c, .h, as, .asm, .s, .in, .m4, .cc, am 2008-06-25 07:33:36 +00:00
umul.as Set native line endings for all .c, .h, as, .asm, .s, .in, .m4, .cc, am 2008-06-25 07:33:36 +00:00
x86_64-defs.m4 Basic GMP files with a new core2 directory and amd_64 directory with Martin's and Gaudry's patches. 2008-04-17 21:03:07 +00:00

Some assembly routines for the AMD64 architecture (Opteron, Athlon64)

Author:    P. Gaudry
Date:      April 2005 -- March 2006
Copyright: LGPL

Purpose:
========

This is a patch to gmp-4.2 for the AMD64 architecture. Version 4.2 ships
with only basic assembly support for this platform; this patch gives a
substantial speed-up (see Performance below).

Only a few functions have been written:
  add_n
  sub_n
  addmul_1
  submul_1
  mul_basecase
  sqr_basecase
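
For readers unfamiliar with these names, here is a minimal portable C sketch
of what two of the routines compute. It is not the assembly from this patch:
the names ref_add_n and ref_addmul_1 and the limb_t typedef are hypothetical
stand-ins, and the 128-bit product relies on the unsigned __int128 extension
of GCC and Clang.

```c
#include <stdint.h>
#include <stddef.h>

typedef uint64_t limb_t;  /* hypothetical stand-in for a 64-bit GMP limb */

/* rp[0..n-1] = up[0..n-1] + vp[0..n-1]; returns the carry out (0 or 1). */
limb_t ref_add_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t s = up[i] + vp[i];   /* may wrap around */
        limb_t c1 = s < up[i];      /* 1 if the first addition wrapped */
        limb_t r = s + carry;
        limb_t c2 = r < s;          /* 1 if adding the carry wrapped */
        rp[i] = r;
        carry = c1 | c2;            /* at most one of the two can be set */
    }
    return carry;
}

/* rp[0..n-1] += up[0..n-1] * v; returns the limb carried out of the top. */
limb_t ref_addmul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* 64x64 -> 128 bit product plus two limbs never overflows 128 bits */
        unsigned __int128 t = (unsigned __int128)up[i] * v + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 64);
    }
    return carry;
}
```

sub_n and submul_1 are the same loops with subtraction and a borrow in place
of addition and a carry.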

The assembly code is mostly a 64-bit translation of the k7 assembly code
that is available in GMP. The main modifications are:
* The ABI for function calls is not the same: up to 6 parameters
  are passed in registers, not on the stack.
* Change movl to movq, eax to rax, etc. That's the easy part.
* In an unrolled loop, the size of the unrolled code is not the same, so
  the computation of the jump is different.
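
The last point can be illustrated in C. The sketch below is only an analogy,
not the code from this patch: it enters a 4-way unrolled addition loop
part-way through using a switch (Duff's device), so that the leftover
n mod 4 limbs are handled on the first pass. In assembly the entry point is
instead computed from the byte size of one unrolled step, which is why a
change in instruction encoding changes the computation. The names
unrolled_add_n, ADD_STEP and limb_t are hypothetical, and the register
comment assumes the System V AMD64 calling convention.

```c
#include <stdint.h>
#include <stddef.h>

typedef uint64_t limb_t;  /* hypothetical stand-in for a 64-bit GMP limb */

/* One limb of add-with-carry. In assembly, each step is a group of
   instructions with a fixed byte size. */
#define ADD_STEP()                           \
    do {                                     \
        limb_t s = *up + *vp;                \
        limb_t r = s + carry;                \
        carry = (s < *up) | (r < s);         \
        *rp++ = r; up++; vp++;               \
    } while (0)

/* Assuming the System V AMD64 convention, rp, up, vp and n would arrive
   in rdi, rsi, rdx and rcx rather than on the stack. */
limb_t unrolled_add_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    limb_t carry = 0;
    if (n == 0)
        return 0;
    size_t passes = (n + 3) / 4;  /* loop passes, first one possibly partial */
    switch (n % 4) {              /* jump into the middle of the unrolled body */
    case 0: do { ADD_STEP();
    case 3:      ADD_STEP();
    case 2:      ADD_STEP();
    case 1:      ADD_STEP();
            } while (--passes > 0);
    }
    return carry;
}
```

With n = 5, for example, the switch enters at case 1, performs one step, and
then runs one full pass of four steps.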

Changes:
========

There is almost no change compared to the patch for 4.1.4. The
multiplication has been slightly improved (around 3.15 cyc/limb), but most
of the improvement in the gmpbench score comes from modifications in the
C code of GMP between the two versions.

Disclaimer:
===========

The code has been reasonably well tested. I used the program tests/devel/try,
which checks for quite a few possible bugs. Nonetheless, there is no
warranty whatsoever.

Bugs:
=====

Please send comments and bug reports to gaudry@lix.polytechnique.fr and
*not* to the official GMP developers: they have nothing to do with this code.

Performance:
============

I get a multiply bench of around 55000 on a 2.4 GHz Opteron (41500 with
plain 4.2, roughly a 33% gain). The whole gmpbench score is about 10000
(8200 before the patch, roughly a 22% gain).

Install:
========

1) Get the gmp-4.2 archive and unpack it, thus creating a
   directory /path_to_gmp/gmp-4.2/
2) In the directory of mpn_amd64.42, run
    ./install /path_to_gmp/gmp-4.2
3) cd /path_to_gmp/gmp-4.2
4) Run ./configure with your favorite options
5) make && make check && make install