mpir/mpn/x86_64
amd64 These are the old versions of addmul and submul written by Brian 2008-06-04 04:04:49 +00:00
core2 Move calling conventions for core2 into right directory. 2008-05-30 09:27:42 +00:00
add_n.as Sped up add and sub when the loop unrolling code is used. 2008-06-02 23:08:59 +00:00
dive_1.brg Roughly speaking mpir should now build on an AMD64. At the present moment the config.guess doesn't distinguish a Core 2 from an AMD64 and so the same code is probably built on both. 2008-05-26 22:11:40 +00:00
gmp-mparam.h Fixed the speed issues with a static library vs Pierrick Gaudry's 2008-06-04 03:47:49 +00:00
lshift.as Fixed the speed issues with a static library vs Pierrick Gaudry's 2008-06-04 03:47:49 +00:00
mode1o.as Attempt to fix assembler file names. 2008-05-29 23:55:41 +00:00
mode1o.brg Roughly speaking mpir should now build on an AMD64. At the present moment the config.guess doesn't distinguish a Core 2 from an AMD64 and so the same code is probably built on both. 2008-05-26 22:11:40 +00:00
mul_1.as Slight speedup by getting alignment right. 2008-06-01 06:36:56 +00:00
README Basic GMP files with a new core2 directory and amd_64 directory with Martin's and Gaudry's patches. 2008-04-17 21:03:07 +00:00
rshift.as Fixed the speed issues with a static library vs Pierrick Gaudry's 2008-06-04 03:47:49 +00:00
sub_n.as Sped up add and sub when the loop unrolling code is used. 2008-06-02 23:08:59 +00:00
x86_64-defs.m4 Fixed the speed issues with a static library vs Pierrick Gaudry's 2008-06-04 03:47:49 +00:00

Copyright 2003, 2004, 2006 Free Software Foundation, Inc.

This file is part of the GNU MP Library.

The GNU MP Library is free software; you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1 of the License, or (at your
option) any later version.

The GNU MP Library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
License for more details.

You should have received a copy of the GNU Lesser General Public License
along with the GNU MP Library; see the file COPYING.LIB.  If not, write to
the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301, USA.





			AMD64 MPN SUBROUTINES


This directory contains mpn functions for AMD64 chips.  It might also be
useful for 64-bit Pentiums, but those chips' poor carry handling makes that
unlikely.  We'll need completely separate code for them (in a subdirectory).
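
For readers new to the mpn layer, here is a minimal portable C sketch of
what a routine such as add_n computes.  It is not the assembly in this
directory, and the names ref_add_n and limb_t are illustrative assumptions.
The carry handed from one limb to the next is the dependency chain that an
assembly version would typically keep in the processor's carry flag, which
is why carry handling matters so much for these routines.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* hypothetical name: one 64-bit limb */

/* Add the n-limb numbers {ap,n} and {bp,n} (least significant limb first),
   store the n-limb sum at rp, and return the carry out (0 or 1).
   Reference sketch only; an assembly version would use add/adc instead. */
limb_t ref_add_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t a  = ap[i];
        limb_t s  = a + bp[i];   /* may wrap around modulo 2^64 */
        limb_t c1 = s < a;       /* carry out of a + b */
        limb_t r  = s + carry;   /* add the carry in */
        limb_t c2 = r < s;       /* carry out of adding the carry in */
        rp[i] = r;
        carry = c1 | c2;         /* at most one of the two can be set */
    }
    return carry;
}
```

Each iteration needs the carry from the previous one, so the loop speed is
bounded by how quickly the processor can pass a carry along.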


		     RELEVANT OPTIMIZATION ISSUES

The only AMD64 core as of this writing is the AMD Hammer, sold under the
names Opteron and Athlon64.  The Hammer can sustain up to 3 instructions per
cycle, but in practice that is only possible for integer instructions.
However, almost any three integer instructions can issue simultaneously,
including any three ALU operations, shifts among them.  Up to two memory
operations can issue each cycle.

Scheduling typically requires that load-use instructions are split into
separate load and use instructions.  On Hammer, that requires more decode
resources and is rarely a win, since Hammer is a deep out-of-order core.


REFERENCES

"System V Application Binary Interface AMD64 Architecture Processor
Supplement", draft version 0.90, April 2003.