238 Commits

Author SHA1 Message Date
wbhart
9e56c61071 Added toom42 and code to handle unbalanced multiplication. 2009-05-11 10:09:09 +00:00
jasonmoxham
0a1d07af4e New asm function mpn_sublsh1_n for K8/K10 2009-05-10 20:03:47 +00:00
jasonmoxham
574f3be308 New asm function mpn_divexact_byff for K8/K10/Core2/penryn/nehalem 2009-05-10 19:35:54 +00:00
jasonmoxham
359fab42b5 New asm functions mpn_rsh1add_n mpn_rsh1sub_n for K8/K10/Core2/penryn/nehalem 2009-05-10 18:46:48 +00:00
jasonmoxham
428e43b40e New asm functions mpn_addadd_n mpn_addsub_n mpn_subadd_n for K8/K10 2009-05-10 16:25:01 +00:00
wbhart
1b58a8b49e Speed toom4 up by passing some arguments to the interpolate code in the
output space so they don't have to be moved at the end.
2009-05-10 13:45:27 +00:00
wbhart
4f99bbe9fc Added missing toom3 file. 2009-05-10 07:15:02 +00:00
wbhart
9c79e0a98b Factored out mpn_toom3_sqr_n and mpn_toom3_mul_n and removed duplication
of mpn_toom3_interpolate. Rewrote mpn_toom3_sqr_n.
2009-05-10 07:12:38 +00:00
wbhart
c8aa69c789 Added toom3_mul_n with better memory usage. 2009-05-10 04:24:39 +00:00
jasonmoxham
90d8207a80 New functions mpn_sumdiff for core2/penryn/nehalem, or rather faster to do separate add and sub 2009-05-10 03:39:43 +00:00
jasonmoxham
b07549802a New asm functions mpn_add_n mpn_sub_n for Core2/penryn/nehalem 2009-05-10 01:26:52 +00:00
jasonmoxham
0c3c909910 New asm functions for mpn_copyi mpn_copyd for k8,k10,core2,penryn,nehalem 2009-05-10 00:20:44 +00:00
wbhart
1cc8b35cfe Another slight speedup. 2009-05-09 21:51:40 +00:00
wbhart
96e8e4e410 Added my copyright info. 2009-05-09 21:38:06 +00:00
wbhart
6ed1dd6474 Whoops I screwed up toom4 and toom7, putting them back now. 2009-05-09 21:23:15 +00:00
wbhart
7a0e036d36 Fixed toom4 and toom7 issues and added k8 tuning code. 2009-05-09 21:12:13 +00:00
wbhart
72f93a085c Added new toom3 code. 2009-05-09 20:56:34 +00:00
gladman
d942415a1c 1. Update Windows Powershell scripts in mpirbench to refer to MPIR rather than GMP
2. Update MPIR version number in Windows config files to 1.1.2
3. Add an MSVC inline definition in gmp-h.in
4. Correct locale test (as per GMP correction)
5. Add Windows x64 set/copy intrinsics to mul_fft.c (improves FFT speed score by 2%)
2009-05-09 13:26:27 +00:00
wbhart
0ba06242c6 Fixed some bugs in best_k code used by FFT. 2009-05-09 02:54:08 +00:00
wbhart
911916ce7e Fixed a carry issue with tc*_addmul which created a requirement for extra memory in toom code. 2009-05-08 14:12:47 +00:00
wbhart
5624d9a6fc New toom4 and toom7 code.
* Don't make copies before basecase multiplications
* Factor out interpolation code
* Convert interpolation code to twos complement
* Optimise code using new assembly functions where available
2009-05-08 13:21:14 +00:00
gladman
bd34c0bfc5 1. Update g2y.py, the GAS to YASM Python script
2. Provide tuning for new FFT code
3. Add some documentation to YASM assembler macros for Windows
2009-05-06 18:20:52 +00:00
wbhart
2ad5066cea Tried to clean up a little. 2009-05-05 23:52:02 +00:00
wbhart
49441a5e20 Fixed bug in mul_fft.c 2009-05-05 22:18:16 +00:00
gladman
eeaca671af Remove C99 features in mul_fft.c 2009-05-05 20:41:29 +00:00
wbhart
041df82e0d Added Zimmermann et al's FFT (after making a bug fix). 2009-05-05 12:27:29 +00:00
gladman
57f06bfe7e 2009-05-01 19:03:56 +00:00
gladman
c503ef2397 commit missing Windows assembler file to trunk 2009-04-22 08:01:40 +00:00
gladman
0e21f1f351 Commit missing asm files 2009-04-17 20:36:47 +00:00
wbhart
fd32e5fb9c Credit Bodrato in the way he requested. 2009-04-15 22:03:24 +00:00
wbhart
79141ad994 Removed some broken asserts from toom code. 2009-04-15 00:48:33 +00:00
gladman
df55ce913a Windows Core2 tune 2009-04-14 19:42:07 +00:00
jasonmoxham
255c8c255c assertion correction in divrem_euclidean 2009-04-14 17:52:38 +00:00
gladman
121ab16108 Add Windows assembler files to trunk 2009-04-14 17:35:03 +00:00
jasonmoxham
3599d92433 converted addmul_2 to yasm 2009-04-14 17:00:30 +00:00
gladman
19faf1830c Update Windows build for latest code 2009-04-14 16:39:30 +00:00
jasonmoxham
f32b00d850 div update 2009-04-13 22:50:46 +00:00
jasonmoxham
b5aef8ffc3 fat bits 2009-04-13 22:15:45 +00:00
jasonmoxham
996bd50496 add divrem_euclidean_qr_1, divexact_byBm1of to fat structure 2009-04-13 21:42:55 +00:00
jasonmoxham
38072364ee duplicate x86_64 mul_2.as to overcome fat issues 2009-04-13 20:32:16 +00:00
jasonmoxham
ec89cb8c61 removed divrem_hensel.asm 2009-04-13 20:14:03 +00:00
jasonmoxham
25ada3c3a6 make dist-hook to change yasm/Makefile.in to stop install/check 2009-04-11 20:51:08 +00:00
wbhart
577614e228 Fixed mpbsd and got rid of mocked up mpz version of tc4_divexact_ui and
tc7_divexact_ui.
2009-04-11 11:36:35 +00:00
jasonmoxham
044eb665ec removed divrem_hensel_1, save it for a rainy day :) 2009-04-11 04:08:14 +00:00
jasonmoxham
875685f2a1 update gmp-mparam's 2009-04-11 04:03:34 +00:00
jasonmoxham
057df2db7c add header 2009-04-10 23:07:41 +00:00
jasonmoxham
d2038f6348 new x86_64 addmul_2.asm :note convert to yasm, update netburst gmp-mparam.h 2009-04-10 22:58:42 +00:00
wbhart
d594e53c6b Changed athlon to k8 in line with cpuid.c. 2009-04-09 07:47:01 +00:00
jasonmoxham
525b748078 change gmplink to gmpcompat and update docs etc 2009-04-08 13:41:51 +00:00
jasonmoxham
87af550e1c copy more core2 asm function to netburst 2009-04-06 20:59:23 +00:00