Copyright 2000, 2001, 2002, 2003, 2004, 2005 Free Software Foundation, Inc.

This file is part of the GNU MP Library.

The GNU MP Library is free software; you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1 of the License, or (at your
option) any later version.

The GNU MP Library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
License for more details.

You should have received a copy of the GNU Lesser General Public License
along with the GNU MP Library; see the file COPYING.LIB. If not, write to
the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301, USA.



IA-64 MPN SUBROUTINES


This directory contains mpn functions for the IA-64 architecture.


CODE ORGANIZATION

        mpn/ia64          itanium-2, and generic ia64

The code here has been optimized primarily for Itanium 2. Very few Itanium 1
chips were ever sold, and Itanium 2 is more powerful, so the latter is what
we concentrate on.



CHIP NOTES

The IA-64 ISA packs instructions in groups of three into 128-bit bundles.
Programmers/compilers need to put explicit breaks `;;' when there are WAW or
RAW dependencies, with some notable exceptions. Such "breaks" are typically
at the end of a bundle, but can be put between operations within some bundle
types too.

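As an illustration of the explicit breaks described above, here is a minimal
sketch in plain gas syntax. The register choices are arbitrary and are not
taken from any GMP routine.

        add     r14 = r15, r16 ;;       // break: the next add reads r14 (RAW)
        add     r17 = r14, r18
        add     r19 = r15, r16          // independent of r17, no break needed

Without the `;;' the first two instructions would sit in the same instruction
group while the second depends on the result of the first.
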
The Itanium 1 and Itanium 2 implementations can under ideal conditions
execute two bundles, i.e. six instructions, per cycle. The Itanium 1 allows
4 of these instructions to do integer operations, while the Itanium 2 allows
all 6 to be integer operations.

Taken cloop branches seem to insert a bubble into the pipeline most of the
time on Itanium 1.

Loads to the fp registers bypass the L1 cache and thus get extremely long
latencies, 9 cycles on the Itanium 1 and 6 cycles on the Itanium 2.

Software pipelining with the br.ctop instruction causes delays, since many
issue slots are taken up by instructions with zero predicates, and since
many extra instructions are needed to set things up. These features are
clearly designed for code density, not speed.

Misc pipeline limitations (Itanium 1):
  * The getf.sig instruction can only execute in M0.
  * At most four integer instructions/cycle.
  * Nops take up resources like any plain instructions.

Misc pipeline limitations (Itanium 2):
  * The getf.sig instruction can only execute in M0.
  * Nops take up resources like any plain instructions.


ASSEMBLY SYNTAX

.align pads with nops in a text segment, but gas 2.14 and earlier
incorrectly byte-swaps its nop bundle in big endian mode (eg. hpux), making
it come out as break instructions. We use the ALIGN() macro in
mpn/ia64/ia64-defs.m4 wherever the padding might be executed across. That
macro suppresses any .align if the problem is detected by configure. Lack of
alignment might hurt performance but will at least be correct.

foo:: to create a global symbol is not accepted by gas. Use separate
".global foo" and "foo:" instead.

.global is the standard global directive. gas accepts .globl, but hpux "as"
doesn't.

.proc / .endp generates the appropriate .type and .size information for ELF,
so the latter directives don't need to be given explicitly.

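The points above about .global and .proc / .endp can be combined into a
skeleton like the following. This is only a sketch in plain gas syntax with
a hypothetical function name foo, assuming the usual convention that
arguments arrive in r32 onwards; real GMP sources go through m4 and the
macros in ia64-defs.m4 instead.

        .text
        .global foo
        .proc   foo
foo:
        add     r8 = r32, r33           // return the sum of the first two args
        br.ret.sptk.many b0
        .endp   foo

No explicit .type or .size appears, since .proc / .endp supply them.
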
.pred.rel "mutex"... is standard for annotating predicate register
relationships. gas also accepts .pred.rel.mutex, but hpux "as" doesn't.

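A sketch of the standard form, with arbitrary registers not taken from any
GMP routine. The two predicated instructions write r8 in the same
instruction group, which is only safe because p6 and p7 are never both
true; the directive states that for the assembler. gas may well infer the
relationship from the cmp by itself, so the annotation here is purely
illustrative.

        cmp.ltu p6, p7 = r14, r15 ;;
        .pred.rel "mutex", p6, p7
   (p6) add     r8 = 1, r16             // r8 = r16 + 1 if r14 < r15 (unsigned)
   (p7) mov     r8 = r16 ;;             // r8 = r16 otherwise
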
.pred directives can't be put on a line with a label, like
".Lfoo: .pred ..."; the HP assembler on HP-UX 11.23 rejects that.
gas is happy with it, and past versions of the HP assembler had seemed ok.

// is the standard comment sequence, but we prefer "C" since it inhibits m4
macro expansion. See comments in ia64-defs.m4.


REGISTER USAGE

Special:
   r0: constant 0
   r1: global pointer (gp)
   r8: return value
   r12: stack pointer (sp)
   r13: thread pointer (tp)
Caller-saves: r8-r11 r14-r31 f6-f15 f32-f127
Caller-saves but rotating: r32-


REFERENCES

Intel Itanium Architecture Software Developer's Manual, volumes 1 to 3,
Intel document 245317-004, 245318-004, 245319-004 October 2002. Volume 1
includes an Itanium optimization guide.

Intel Itanium Processor-specific Application Binary Interface (ABI), Intel
document 245370-003, May 2001. Describes C type sizes, dynamic linking,
etc.

Intel Itanium Architecture Assembly Language Reference Guide, Intel document
248801-004, 2000-2002. Describes assembly instruction syntax and other
directives.

Itanium Software Conventions and Runtime Architecture Guide, Intel document
245358-003, May 2001. Describes calling conventions, including stack
unwinding requirements.

Intel Itanium Processor Reference Manual for Software Optimization, Intel
document 245473-003, November 2001.

Intel Itanium-2 Processor Reference Manual for Software Development and
Optimization, Intel document 251110-003, May 2004.

All the above documents can be found online at

    http://developer.intel.com/design/itanium/manuals.htm


----------------
Local variables:
mode: text
fill-column: 76
End: