Alexander Kruppa
|
e3d7be3b31
|
sublsh1_n by Nurmann, adapted to MPIR
addlsh1_n.as and sublsh1_n.as mostly unified now
|
2016-12-08 06:07:02 +01:00 |
|
Alexander Kruppa
|
85d53dbc6e
|
addlsh1_n by Nurmann, adapted to MPIR
|
2016-12-08 03:39:25 +01:00 |
|
Alexander Kruppa
|
2f172e1dce
|
mul_1 from GMP 6.1.1
|
2016-12-07 19:22:28 +01:00 |
|
Alexander Kruppa
|
4c7cdee83c
|
sqr_basecase from GMP 6.1.1
|
2016-12-07 19:02:22 +01:00 |
|
Alexander Kruppa
|
6bb39eab79
|
com_n, adapted from Nurmann's copyi code
|
2016-12-06 18:08:13 +01:00 |
|
Alexander Kruppa
|
1871f04956
|
addmul_1 and submul_1, converted from GMP
|
2016-12-05 22:55:21 +01:00 |
|
Alexander Kruppa
|
17687a2992
|
Haswell mul_basecase from GMP 6.1.1, converted to Intel syntax
|
2016-12-01 12:39:26 +01:00 |
|
Alexander Kruppa
|
e508181a75
|
Version of mpn/x86_64/sandybridge/sub_n.as, super-optimized for Haswell
New speed about 1.20c/l on Haswell, was 1.33c/l
|
2016-11-28 19:43:46 +01:00 |
|
Alexander Kruppa
|
5d75ebc8bf
|
Reduce number of registers used and use %defines for register names
|
2016-11-27 00:51:45 +01:00 |
|
Alexander Kruppa
|
d11c3ca728
|
Bugfix: operand name macros were wrong
|
2016-11-25 18:11:38 +01:00 |
|
Alexander Kruppa
|
ea49db539e
|
Revert "Temporarily removed due to bug"
This reverts commit 38e8585c05.
|
2016-11-25 18:11:21 +01:00 |
|
Alexander Kruppa
|
38e8585c05
|
Temporarily removed due to bug
|
2016-11-25 15:27:21 +01:00 |
|
Alexander Kruppa
|
8100363a85
|
Version of mpn/x86_64/sandybridge/add_n.as, super-optimized for Haswell
New speed about 1.21c/l on Haswell, was 1.33c/l
|
2016-11-25 15:25:09 +01:00 |
|
Alexander Kruppa
|
6316e39430
|
Increasing copy with AVX2 for Haswell
|
2016-11-25 11:51:54 +01:00 |
|
Alexander Kruppa
|
29577b5109
|
Decreasing copy with AVX2 for Haswell
|
2016-11-24 02:01:38 +01:00 |
|
Alexander Kruppa
|
4660be16f6
|
AVX-based rshift for 4-issue Intel cpus (Haswell and newer)
|
2016-11-22 23:18:52 +01:00 |
|
Alexander Kruppa
|
105c26c466
|
AVX-based lshift for 4-issue Intel cpus (Haswell and newer)
|
2016-11-22 21:58:43 +01:00 |
|
Alexander Kruppa
|
99a1f8d05b
|
Add vzeroupper to avoid stall on Haswell if SSE2 code follows
|
2016-11-22 15:03:02 +01:00 |
|
Alexander Kruppa
|
aa75752824
|
AVX-based lshift1 and rshift1 for 4-issue Intel cpus (Haswell and newer)
|
2016-11-18 21:54:07 +01:00 |
|
William Hart
|
8435273a1a
|
Remove sb_div* small implementation (due to bug and due to being a very minor
performance improvement).
|
2015-11-13 14:47:44 +00:00 |
|
William Hart
|
45e7dbc9b4
|
Added piledriver, ivybridge, haswell to configure and fat build.
|
2014-03-25 17:32:34 +00:00 |
|