Commit Graph

274 Commits

Author SHA1 Message Date
jasonmoxham
1ff359d28a divrem_2 speedup 2009-05-27 11:21:58 +00:00
gladman
2bf84f6a74 2009-05-20 20:58:09 +00:00
gladman
dea89e4760 Update windows build for latest assembler 2009-05-20 20:09:06 +00:00
jasonmoxham
61cd1223a3 New asm function mpn_divrem_euclidean_qr_2 for X86_64 2009-05-20 17:58:41 +00:00
jasonmoxham
5e1c28e1df divrem_2 generic correction 2009-05-20 17:34:55 +00:00
jasonmoxham
e788f8b9b4 mul_basecase to yasm 2009-05-20 13:03:53 +00:00
jasonmoxham
5fb3fafe42 mul_basecase for GAS, so I know what the yasm version is a conversion of 2009-05-20 00:01:35 +00:00
gladman
659df23bbc 1. Improve g2y.py
2. Add divrem changes to Windows build
3. Add new x64 mpn_mul_basecase assembler to Windows
2009-05-19 19:53:15 +00:00
jasonmoxham
01da1f0caa divrem update 2009-05-19 12:48:09 +00:00
jasonmoxham
483fe4ed21 divrem updates 2009-05-19 12:27:10 +00:00
jasonmoxham
b5980c95b1 divrem bits 2009-05-19 09:51:30 +00:00
gladman
0af849c1f2 1. Fix minor bug in Windows build
2. Change gcdext.c to avoid C99 features
2009-05-19 09:44:06 +00:00
jasonmoxham
9c68614d77 Add new function generic mpn_divrem_euclidean_qr_2 2009-05-19 09:30:34 +00:00
wbhart
a49b4e2337 More tuning values, not that they make any difference. 2009-05-19 05:44:48 +00:00
wbhart
3d66bd5322 Added tuning values for core2 and fixed a minor bug in mul_n.c. 2009-05-19 04:03:49 +00:00
wbhart
d96ef5e5d9 Reverted a change which slows things down on k8. 2009-05-19 02:04:09 +00:00
wbhart
e330cc79bc Fixed some bugs related to tuning gcdext and added tuning for toom4 and
toom7 squaring code.
2009-05-19 00:57:17 +00:00
wbhart
cfb5f9c0b2 Added copyright notice. 2009-05-18 21:00:51 +00:00
wbhart
2bf28e3c63 Added toom7 squaring and sped up multiplication slightly by better use
of the FFT.
2009-05-18 20:55:01 +00:00
wbhart
de64002818 Slight memory usage improvement for toom4 squaring code. 2009-05-18 07:28:48 +00:00
wbhart
8d8a26e60b Added toom4 squaring code - no tuning code yet! 2009-05-18 07:13:00 +00:00
wbhart
6b1c6afdbe Added half gcd implementation based on the original ngcd implementation
of Niels Möller.
2009-05-16 21:57:44 +00:00
gladman
2cd6ee00c5 2009-05-14 20:35:12 +00:00
jasonmoxham
2768eeaaf0 New asm functions mpn_store MPN_ZERO for k8/k10/nehalem 2009-05-14 20:30:27 +00:00
jasonmoxham
a1b14414c6 New generic mpn_store and tests/tune etc 2009-05-14 19:29:28 +00:00
jasonmoxham
3b7c555c8e New generic functions/macros mpn_lshift2 mpn_rshift2 and tests/speed etc 2009-05-14 02:44:19 +00:00
jasonmoxham
ccf3200d93 mul_basecase tweaks 2009-05-13 22:51:35 +00:00
jasonmoxham
6ee4e35940 New asm function mpn_mul_basecase for K8/K10/Core2/Penryn/Nehalem 2009-05-13 19:49:42 +00:00
gladman
d6962d575f 1. Add new/changed Core2 assembler files to the Windows build
2. Workaround VC++ optimisation bug in mul_fft.c
2009-05-13 09:54:24 +00:00
gladman
47c7c6b832 Further update to Windows K8 build 2009-05-12 21:33:52 +00:00
gladman
9918886c2f Update Windows K8 build to add new assembler 2009-05-12 19:37:47 +00:00
wbhart
21f51a706c Added toom32 for unbalanced multiplications. 2009-05-12 18:28:20 +00:00
wbhart
c6881fa3a9 Fixed bugs in Toom3 code. 2009-05-12 09:22:27 +00:00
wbhart
fb914ab4ac Fixed some buglets in toom4. 2009-05-11 15:16:53 +00:00
wbhart
ae48f90e2f Fix speed regression in mul.c, switch unbalanced toom back on. Add
missing toom3_interpolate prototype.
2009-05-11 12:30:17 +00:00
wbhart
4babcebbfa Turned off unbalanced multiplications as they slow things down.
2009-05-11 11:06:38 +00:00
wbhart
9e56c61071 Added toom42 and code to handle unbalanced multiplication. 2009-05-11 10:09:09 +00:00
jasonmoxham
0a1d07af4e New asm function mpn_sublsh1_n for K8/K10 2009-05-10 20:03:47 +00:00
jasonmoxham
574f3be308 New asm function mpn_divexact_byff for K8/K10/Core2/penryn/nehalem 2009-05-10 19:35:54 +00:00
jasonmoxham
359fab42b5 New asm functions mpn_rsh1add_n mpn_rsh1sub_n for K8/K10/Core2/penryn/nehalem 2009-05-10 18:46:48 +00:00
jasonmoxham
428e43b40e New asm functions mpn_addadd_n mpn_addsub_n mpn_subadd_n for K8/K10 2009-05-10 16:25:01 +00:00
wbhart
1b58a8b49e Speed toom4 up by passing some arguments to the interpolate code in the
output space so they don't have to be moved at the end.
2009-05-10 13:45:27 +00:00
wbhart
4f99bbe9fc Added missing toom3 file. 2009-05-10 07:15:02 +00:00
wbhart
9c79e0a98b Factored out mpn_toom3_sqr_n and mpn_toom3_mul_n and removed duplication
of mpn_toom3_interpolate. Rewrote mpn_toom3_sqr_n.
2009-05-10 07:12:38 +00:00
wbhart
c8aa69c789 Added toom3_mul_n with better memory usage. 2009-05-10 04:24:39 +00:00
jasonmoxham
90d8207a80 New functions mpn_sumdiff for core2/penryn/nehalem, or rather it is faster to do separate add and sub 2009-05-10 03:39:43 +00:00
jasonmoxham
b07549802a New asm functions mpn_add_n mpn_sub_n for Core2/penryn/nehalem 2009-05-10 01:26:52 +00:00
jasonmoxham
0c3c909910 New asm functions for mpn_copyi mpn_copyd for k8,k10,core2,penryn,nehalem 2009-05-10 00:20:44 +00:00
wbhart
1cc8b35cfe Another slight speedup. 2009-05-09 21:51:40 +00:00
wbhart
96e8e4e410 Added my copyright info. 2009-05-09 21:38:06 +00:00