Commit Graph

336 Commits

Author SHA1 Message Date
(no author)
1d88fcbc0f Fixed a tuning bug and tuned for K102. 2010-02-19 16:53:01 +00:00
(no author)
2175d5423a Retuned for core2/penryn. 2010-02-19 16:20:55 +00:00
(no author)
596ed07fbc Added some tuning for core2/penryn and some additional functions to be built by tuning code. 2010-02-19 16:09:02 +00:00
(no author)
f50d7f04ff Hopefully fixed tuning of division routines. 2010-02-19 15:22:22 +00:00
(no author)
f444a2bf6c Attempt to tune some of the division functions. 2010-02-19 12:54:56 +00:00
(no author)
2e2976dd0e Added tuning info for core2/penryn. 2010-02-18 16:03:50 +00:00
(no author)
19b37fb7c1 Added speed, tune and try code for toom8_mul/sqr and tuning for K102. 2010-02-18 14:50:40 +00:00
(no author)
fea5f0c849 Added a missing mpir.h and some proxy tuning values for core2/penryn toom8 squaring. 2010-02-18 13:46:24 +00:00
(no author)
9cb500d53b Turned on toom8 squaring code. 2010-02-18 13:40:38 +00:00
(no author)
3fad25a740 Added toom8 files. 2010-02-16 23:47:07 +00:00
(no author)
a2c42da38f Fiddled with tuning value. 2010-02-13 21:20:12 +00:00
(no author)
20ae4de5b1 Ran autoconf, connected up rootrem code and changed tuning value. 2010-02-13 21:18:11 +00:00
(no author)
1758461822 Another minor change to core2 tuning values. Seems to slightly improve timings (though almost imperceptibly). 2010-02-09 00:43:06 +00:00
(no author)
9639773959 Hand tuned SQR thresholds for core2/penryn. 2010-02-07 15:43:19 +00:00
(no author)
2edb3c830a Added a clarification to the copyright. 2010-02-07 14:07:42 +00:00
(no author)
0c7b48b1fd Convert divrem_2.asm to yasm format. 2010-02-07 14:02:39 +00:00
(no author)
a54d23d652 Convert divrem_euclidean_qr_1.asm to yasm format. 2010-02-07 13:32:16 +00:00
(no author)
0f5bb75287 New tuning values for core2/penryn. 2010-02-07 03:28:55 +00:00
(no author)
06d4a32943 Added divrem_1 and divrem_2 x86_64 assembly code from GMP, replaced divrem_euclidean_qr_1 with divrem_1. 2010-02-06 13:28:23 +00:00
(no author)
0bd0a87f7e Temporarily committing these so they can be converted to yasm format. 2010-02-06 03:16:22 +00:00
wbhart
c8d33128b5 Made a change to hopefully support unaligned memory allocation as requested by Dan Grayson. Passes make check and try mpn_lshift. 2010-01-10 23:54:55 +00:00
wbhart
0200f63f3b Added tuning params for K102. Added new fft code to main directory. 2010-01-01 14:06:41 +00:00
jasonmoxham
829dacd87a New core2/penryn asm functions popcount hamdist 2009-11-19 10:53:45 +00:00
jasonmoxham
afc620f493 new K8 asm functions mpn_popcount mpn_hamdist 2009-11-19 10:38:40 +00:00
jasonmoxham
367f00f4fe New K8 asm functions mpn_and,ior,xor 2009-11-19 10:35:17 +00:00
jasonmoxham
beb4d5b735 New K8 asm functions mpn_lshift2,3,4,5,6 2009-11-19 10:32:03 +00:00
jasonmoxham
7bd0558974 New atom asm function mpn-copyi 2009-11-19 09:09:50 +00:00
jasonmoxham
ef2b3db6ef remove # comments 2009-11-19 08:49:14 +00:00
jasonmoxham
8bcfe2975a New nehalem asm mpn_copyi mpn_copyd 2009-11-19 08:45:27 +00:00
jasonmoxham
579e36f2b1 New nehalem asm mpn_com 2009-11-19 08:44:50 +00:00
jasonmoxham
59bf8d86e4 new nehalem asm logic mpn fn's 2009-11-19 08:37:54 +00:00
jasonmoxham
dcf3afa567 convert addlsh from gas to yasm format 2009-11-18 17:43:25 +00:00
jasonmoxham
c6af9fbfc2 some more masm? movq/movd mixups 2009-10-16 00:45:14 +00:00
jasonmoxham
4ba747128d change movq to movd for old masm assembler 2009-10-15 18:21:27 +00:00
jasonmoxham
be135c7347 change asm #comment to C comment 2009-10-15 18:13:19 +00:00
jasonmoxham
c9f16233b8 add back in old fft tuning values , better than nothing 2009-10-08 22:50:40 +00:00
jasonmoxham
518226d914 atom params 2009-10-08 22:06:01 +00:00
jasonmoxham
93688a18b6 k10 params 2009-10-08 18:01:18 +00:00
jasonmoxham
2f138f7a16 core2 params 2009-10-08 15:48:27 +00:00
jasonmoxham
e49eccab57 k8 params 2009-10-08 15:30:59 +00:00
jasonmoxham
4053a62930 k102 params 2009-10-08 15:28:51 +00:00
jasonmoxham
2821267426 nehalem 64 params 2009-10-07 12:00:35 +00:00
jasonmoxham
ff4fc75bb8 New core2/penryn asm fns mod_1_? divrem_hensel_qr_1_2 rsh_divrem_hensel_qr_1_2 2009-10-05 15:02:50 +00:00
jasonmoxham
a10fd0d649 New atom asm functions mod_1_? rsh_divrem_hensel_qr_1_1 divrem_hensel_qr_1_1 2009-10-05 14:50:05 +00:00
jasonmoxham
dc1fc39381 New nehalem asm functions mod_1_? rsh_divrem_hensel_qr_1_2 divrem_hensel_qr_1_2 2009-10-05 14:19:08 +00:00
jasonmoxham
ad4a181469 New AMD asm function mpn_mod_1_3 2009-10-04 23:49:02 +00:00
jasonmoxham
bf0f5c4e6f add carry limb into the existing rsh_divrem_hensel 2009-10-04 02:16:25 +00:00
jasonmoxham
cdaad5bffc New AMD asm function mpn_rsh_divrem_hensel_qr_1_2 2009-10-03 22:21:44 +00:00
jasonmoxham
90d8b76405 New amd asm function mpn_mod_1_2 2009-10-03 00:04:27 +00:00
jasonmoxham
520fec686d tweak mod_1_1 amd asm to full speed 2009-10-02 02:57:31 +00:00
jasonmoxham
adfded6fe5 split out mpn_mod_1_? from divrem_euclidean_r and add New asm function for AMD for mod_1_1 2009-10-01 22:04:24 +00:00
jasonmoxham
07f97a0963 new AMD asm function mpn_divrem_hensel_qr_1_2 2009-09-30 23:41:09 +00:00
jasonmoxham
24d1b6c39b rename divrem_hensel amd asm to match 2009-09-30 02:52:41 +00:00
jasonmoxham
56801786a7 New asm functions for AMD divrem_hensel_qr_1 divrem_hensel_r_1 2009-09-29 23:52:09 +00:00
jasonmoxham
4783c9dc4e New AMD rsh_divrem_hensel_qr_1 asm fn 2009-09-29 21:53:03 +00:00
jasonmoxham
a2bf208858 add via nano cpuid and code path 2009-09-27 23:12:12 +00:00
jasonmoxham
f446d380ce whoops uppercase instead of lower case 2009-09-08 02:05:31 +00:00
jasonmoxham
993fab5c21 rename K10_2 to K102 as autotools doesn't like - and fat mechanism doesn't like _ 2009-09-08 01:55:29 +00:00
jasonmoxham
9010f58508 update fat to cope with K10_2 and core2,k8 etc on 32bit 2009-09-08 01:49:01 +00:00
jasonmoxham
a6542196d3 Select best asm functions from existing for Atom cpu 64bit 2009-09-06 12:49:19 +00:00
jasonmoxham
70c9a062d9 New asm functions for nehalem mpn_add_err1_n mpn_sub_err1_n 2009-09-01 15:03:33 +00:00
jasonmoxham
9a315eef2c mixed up gas and yasm syntax 2009-08-23 23:52:01 +00:00
jasonmoxham
1c4bb4fa9f didn't like it as a macro , so new amd asm functions mpn_inclsh mpn_declsh 2009-08-23 23:44:19 +00:00
jasonmoxham
a614713922 New macro/function for AMD mpn_inclsh_n 2009-08-23 23:04:14 +00:00
jasonmoxham
573b911273 New asm functions K8/K10 mpn_addlsh_n mpn_sublsh_n and carry-in variants 2009-08-23 22:20:49 +00:00
jasonmoxham
7268e5f9ac New asm function nehalem mpn_addlsh_n , delete old mpn_addlsh1_n 2009-08-23 17:57:21 +00:00
jasonmoxham
620c9e38df core2/penryn new addlsh faster than old addlsh1 , so delete 2009-08-23 16:43:52 +00:00
jasonmoxham
211e597c89 add new function core2/penryn mpn_addlsh_n 2009-08-23 15:58:03 +00:00
jasonmoxham
303f9fb219 New K8/K10 asm function mpn_sub_err1_n 2009-08-18 22:36:21 +00:00
jasonmoxham
392ea17854 New K8/K10 asm function add_err1 2009-08-18 15:37:23 +00:00
jasonmoxham
4f9d128e34 New asm functions mpn_copyi for core2/penryn 2009-08-14 09:03:07 +00:00
jasonmoxham
a69bf92c40 New asm function nehalem mpn_store 2009-08-13 09:57:49 +00:00
jasonmoxham
59b98ca38f New core2/penryn mpn_store 2009-08-13 08:59:39 +00:00
jasonmoxham
1a7781ade8 New K8 asm function mpn_lshiftc 2009-08-12 03:13:11 +00:00
jasonmoxham
942d2666ca Tweak K8 mpn_rshift 2009-08-11 02:27:30 +00:00
jasonmoxham
2d1deac90c New K8 asm functions mpn_lshift2 mpn_rshift2 2009-07-24 11:54:46 +00:00
jasonmoxham
2dae13c07c New intel x86_64 assembler code for left/right shift 2009-07-24 11:50:15 +00:00
wbhart
cc86e972e5 Merged in David Harvey's mulmid code - not actually used by anything yet. No division code. 2009-07-24 03:12:09 +00:00
wbhart
9d8438f70b Added toom53 and fiddled with the toom4 cutoff on penryn. 2009-07-23 07:48:34 +00:00
wbhart
42c02e9733 Moved new mul_basecase into netburst directory. 2009-06-07 00:32:55 +00:00
wbhart
d599623d25 Added tuning values for netburst. 2009-06-06 23:37:19 +00:00
wbhart
6b49328621 Slightly relaxed the conditions used in make tune to prevent tuning malfunctions and added make tune values for netburst. 2009-06-02 15:35:31 +00:00
wbhart
bc25f5bacd More generic x86_64 tuning values and values for fat binaries (taken from k8 values). 2009-05-31 18:47:52 +00:00
wbhart
987e5940a8 Added some generic x86_64 tuning values (just copied from K8). 2009-05-31 18:40:51 +00:00
jasonmoxham
4a767f802c update fat for divrem_2 2009-05-29 17:28:19 +00:00
wbhart
b96d7f466b Tuning parameters for Core2. 2009-05-28 09:20:35 +00:00
wbhart
b0db490a0b K8 tuning values. 2009-05-28 00:12:42 +00:00
jasonmoxham
2fff63ed30 nehalem mparam update 2009-05-27 23:02:45 +00:00
wbhart
5a67fa8b45 Added K10 tuning values. 2009-05-27 22:27:18 +00:00
wbhart
f33c6a799e Tuning parameters for penryn. 2009-05-27 19:34:25 +00:00
jasonmoxham
749c195a7c Convert new divrem to yasm format 2009-05-27 14:28:30 +00:00
jasonmoxham
1ff359d28a divrem_2 speedup 2009-05-27 11:21:58 +00:00
jasonmoxham
61cd1223a3 New asm function mpn_divrem_euclidean_qr_2 for X86_64 2009-05-20 17:58:41 +00:00
jasonmoxham
e788f8b9b4 mul_basecase to yasm 2009-05-20 13:03:53 +00:00
jasonmoxham
5fb3fafe42 mul_basecase for GAS, so I know what the yasm conversion is of 2009-05-20 00:01:35 +00:00
jasonmoxham
9c68614d77 Add new function generic mpn_divrem_euclidean_qr_2 2009-05-19 09:30:34 +00:00
wbhart
a49b4e2337 More tuning values, not that they make any difference. 2009-05-19 05:44:48 +00:00
wbhart
3d66bd5322 Added tuning values for core2 and fixed a minor bug in mul_n.c. 2009-05-19 04:03:49 +00:00
wbhart
d96ef5e5d9 Reverted a change which slows things down on k8. 2009-05-19 02:04:09 +00:00
wbhart
e330cc79bc Fixed some bugs related to tuning gcdext and added tuning for toom4 and toom7 squaring code. 2009-05-19 00:57:17 +00:00
jasonmoxham
2768eeaaf0 New asm functions mpn_store MPN_ZERO for k8/k10/nehalem 2009-05-14 20:30:27 +00:00
jasonmoxham
3b7c555c8e New generic functions/macros mpn_lshift2 mpn_rshift2 and tests/speed etc 2009-05-14 02:44:19 +00:00
jasonmoxham
ccf3200d93 mul_basecase tweaks 2009-05-13 22:51:35 +00:00
jasonmoxham
6ee4e35940 New asm function mpn_mul_basecase for K8/K10/Core2/Penryn/Nehalem 2009-05-13 19:49:42 +00:00
jasonmoxham
0a1d07af4e New asm function mpn_sublsh1_n for K8/K10 2009-05-10 20:03:47 +00:00
jasonmoxham
574f3be308 New asm function mpn_divexact_byff for K8/K10/Core2/penryn/nehalem 2009-05-10 19:35:54 +00:00
jasonmoxham
359fab42b5 New asm functions mpn_rsh1add_n mpn_rsh1sub_n for K8/K10/Core2/penryn/nehalem 2009-05-10 18:46:48 +00:00
jasonmoxham
428e43b40e New asm functions mpn_addadd_n mpn_addsub_n mpn_subadd_n for K8/K10 2009-05-10 16:25:01 +00:00
jasonmoxham
90d8207a80 New functions mpn_sumdiff for core2/penryn/nehalem , or rather faster to do separate add and sub 2009-05-10 03:39:43 +00:00
jasonmoxham
b07549802a New asm functions mpn_add_n mpn_sub_n for Core2/penryn/nehalem 2009-05-10 01:26:52 +00:00
jasonmoxham
0c3c909910 New asm functions for mpn_copyi mpn_copyd for k8,k10,core2,penryn,nehalem 2009-05-10 00:20:44 +00:00
wbhart
7a0e036d36 Fixed toom4 and toom7 issues and added k8 tuning code. 2009-05-09 21:12:13 +00:00
wbhart
72f93a085c Added new toom3 code. 2009-05-09 20:56:34 +00:00
jasonmoxham
3599d92433 converted addmul_2 to yasm 2009-04-14 17:00:30 +00:00
jasonmoxham
b5aef8ffc3 fat bits 2009-04-13 22:15:45 +00:00
jasonmoxham
996bd50496 add divrem_euclidean_qr_1 , divexact_byBm1of to fat structure 2009-04-13 21:42:55 +00:00
jasonmoxham
38072364ee duplicate x86_64 mul_2.as to overcome fat issues 2009-04-13 20:32:16 +00:00
jasonmoxham
ec89cb8c61 removed divrem_hensel.asm 2009-04-13 20:14:03 +00:00
jasonmoxham
875685f2a1 update gmp-mparam's 2009-04-11 04:03:34 +00:00
jasonmoxham
057df2db7c add header 2009-04-10 23:07:41 +00:00
jasonmoxham
d2038f6348 new x86_64 addmul_2.asm :note convert to yasm , update netburst gmp-mparam.h 2009-04-10 22:58:42 +00:00
jasonmoxham
87af550e1c copy more core2 asm function to netburst 2009-04-06 20:59:23 +00:00
jasonmoxham
f70778cb24 copy some core2 asm to netburst 2009-04-06 20:51:21 +00:00
jasonmoxham
6787300718 remove un-needed case in mul_basecase.as for x86_64 2009-04-02 00:25:40 +00:00
jasonmoxham
d6f0373c37 update gmp-mparam for k10,core2,penryn 2009-04-01 22:48:19 +00:00
jasonmoxham
5ecc4581da nehalem,k8 tune params 2009-04-01 22:13:15 +00:00
jasonmoxham
863fd95eb1 update k10,nehalem,core2,penryn gmp-mparam.h 2009-04-01 13:37:17 +00:00
wbhart
e42709e967 Added toom4 multiplication. 2009-04-01 08:21:03 +00:00
jasonmoxham
8ca3be5bef merge div-branch into trunk with svn merge -r 1782:1816 ../branches/x86_64-division/ run on my local trunk 2009-03-31 23:56:06 +00:00
jasonmoxham
587bf31b2c New assembler x86_64 mpn_mul_2 2009-03-31 22:50:46 +00:00
wbhart
32409ddc7d Removed superfluous instructions from conversion to yasm format in diveby3. 2009-03-29 19:05:14 +00:00
jasonmoxham
2235444edf x86_64 mpn_subadd_n plus tests,tune 2009-03-29 10:49:51 +00:00
jasonmoxham
5a048dae03 merged x86_64 cpuid branch into trunk with svn merge -r 1755:1779 ../branches/x86_64_cpuid/ run in my local copy of trunk 2009-03-19 19:52:22 +00:00
jasonmoxham
ef025d7676 removed space 2009-03-15 14:27:26 +00:00
jasonmoxham
f2a624baa2 remove crlf from old add/sub_n and remove yasm macros from GLOBAL_FUNC names 2009-03-15 13:29:03 +00:00
jasonmoxham
bcdb64a903 copy add/sub from mpir-0.9/mpn/x86_64/amd/ to mpn/x86_64/core2/ for the nocona with no lahf 2009-03-15 02:22:21 +00:00
jasonmoxham
599f86a919 lahf and nocona hack... aaarrrrgggghhhh 2009-03-15 01:18:54 +00:00
jasonmoxham
9466115888 Atom cpuid update 2009-03-14 00:16:29 +00:00
jasonmoxham
4e092271ed Nehalem cpuid update 2009-03-13 20:00:56 +00:00
jasonmoxham
ea9ce09036 delete amd copyi.as and copyd.as 2009-03-06 16:01:39 +00:00
jasonmoxham
5cfca1657e remove crlf from k10 asm files 2009-03-06 15:38:21 +00:00
jasonmoxham
f920a71acf remove define test for copyi/d 2009-03-06 05:17:31 +00:00
jasonmoxham
a0c2458b0b added include files 2009-03-06 05:08:36 +00:00
jasonmoxham
799f347514 cant spell 2009-03-06 05:00:44 +00:00
jasonmoxham
77060ac2f6 move amd specific copy back to amd dir , and write new fat fallback copy fn 2009-03-06 04:55:43 +00:00
jasonmoxham
3242063820 removed dos crlf from linux asm files , update configure to recognize GLOBAL_FUNC for HAVE_NATIVE_functions 2009-03-05 17:50:57 +00:00
wbhart
0de1cfd773 Changed alignb #,nop back to align # because it appears to make no difference. Got rid of relative paths for yasm_mac.inc. 2009-03-05 16:28:17 +00:00
wbhart
2831de1ed4 Jason Moxham's Core 2 assembly code to yasm format. 2009-03-05 15:48:35 +00:00
wbhart
f596e5d3ed Last of Jason Moxham's K8 assembly code converted to yasm format. 2009-03-04 22:01:05 +00:00
wbhart
47be515d09 More of Jason Moxham's code converted to yasm format. 2009-03-04 21:42:45 +00:00