3131 Commits

Author SHA1 Message Date
Brian Gladman
55752e8061 add the revised add_n/sub_n assembler code to the Windows build 2016-12-13 14:10:48 +00:00
Brian Gladman
df53b304fb Merge branch 'master' of https://github.com/akruppa/mpir 2016-12-13 13:21:32 +00:00
Alexander Kruppa
bd53d5749e add_n and sub_n with 8-way unrolling
1.075c/l on Haswell
2016-12-12 17:37:12 +01:00
Brian Gladman
89c11fbdfb Add the latest haswell and skylake code to the Windows x64 build 2016-12-10 14:15:40 +00:00
Brian Gladman
e2d20ad009 Merge branch 'master' of https://github.com/akruppa/mpir 2016-12-09 17:05:26 +00:00
Alexander Kruppa
cfc589609e Move to haswell/
This sumdiff_n is much slower on Haswell (2.6c/l) than on Skylake (2c/l)
but it still provides a ~3% speed up for a 1M limb FFT compared to having
no sumdiff_n at all.
2016-12-08 16:23:48 +01:00
Alexander Kruppa
e3d7be3b31 sublsh1_n by Nurmann, adapted to MPIR
addlsh1_n.as and sublsh1_n.as mostly unified now
2016-12-08 06:07:02 +01:00
Alexander Kruppa
85d53dbc6e addlsh1_n by Nurmann, adapted to MPIR 2016-12-08 03:39:25 +01:00
Alexander Kruppa
2f172e1dce mul_1 from GMP 6.1.1 2016-12-07 19:22:28 +01:00
Alexander Kruppa
4c7cdee83c sqr_basecase from GMP 6.1.1 2016-12-07 19:02:22 +01:00
Alexander Kruppa
ff7c73e955 Use local label names (.L) 2016-12-07 18:09:01 +01:00
Alexander Kruppa
95f95b17c6 Use local label names (.L)
Otherwise, profiling shows separate event counts for each jump label rather
than for the respective complete function
2016-12-07 17:46:16 +01:00
Alexander Kruppa
6bb39eab79 com_n, adapted from Nurmann's copyi code 2016-12-06 18:08:13 +01:00
Brian Gladman
e89e09a43d add latest skylake code to Windows x64 2016-12-06 12:44:17 +00:00
Brian Gladman
4c7fa87118 add assembler code for haswell, skylake and skylake_avx to the Win64 build 2016-12-06 12:01:20 +00:00
Alexander Kruppa
1871f04956 addmul_1 and submul_1, converted from GMP 2016-12-05 22:55:21 +01:00
Brian Gladman
a5193faa89 Merge branch 'master' of https://github.com/akruppa/mpir 2016-12-05 16:04:01 +00:00
Alexander Kruppa
4459641bad sumdiff_n optimized for Skylake
2c/l
2016-12-05 16:40:57 +01:00
Brian Gladman
fab2c75e45 Merge branch 'master' of https://github.com/akruppa/mpir 2016-12-05 15:05:38 +00:00
Alexander Kruppa
01b8132c41 Identify skylakeavx and skylake and set path accordingly 2016-12-05 15:12:35 +01:00
Brian Gladman
36983f9049 add Haswell mpn_mul_basecase and mpn_sub_n/nc for Win64; tidy up YASM macros 2016-12-01 16:51:05 +00:00
Brian Gladman
8b415aec99 Merge branch 'master' of https://github.com/akruppa/mpir 2016-12-01 15:22:06 +00:00
Alexander Kruppa
17687a2992 Haswell mul_basecase from GMP 6.1.1, converted to Intel syntax 2016-12-01 12:39:26 +01:00
Brian Gladman
0432e1511d update the MBBUILD script for Visual Studio 2017 2016-11-30 09:09:50 +00:00
Alexander Kruppa
e508181a75 Version of mpn/x86_64/sandybridge/sub_n.as, super-optimized for Haswell
New speed about 1.20c/l on Haswell, was 1.33c/l
2016-11-28 19:43:46 +01:00
Brian Gladman
37c147fff3 Merge branch 'master' of https://github.com/BrianGladman/mpir 2016-11-27 15:25:48 +00:00
Brian Gladman
17b81f6006 add mpn_add_n and mpn_add_nc to x64 haswell build 2016-11-27 14:06:56 +00:00
Brian Gladman
d0d949835a Merge branch 'master' of https://github.com/akruppa/mpir 2016-11-27 10:56:14 +00:00
Alexander Kruppa
5d75ebc8bf Reduce number of registers used and use %defines for register names 2016-11-27 00:51:45 +01:00
Brian Gladman
d61bdcaf09 set release build of tests for faster linking 2016-11-26 22:52:26 +00:00
Brian Gladman
77b483e79f add more win64 assembler for haswell 2016-11-26 22:35:25 +00:00
Brian Gladman
a95556b926 Merge branch 'master' of https://github.com/akruppa/mpir 2016-11-26 18:35:20 +00:00
Brian Gladman
ee198165c9 prepare to add win64 assembler code with parameters in XMM/YMM registers 2016-11-26 09:41:07 +00:00
Alexander Kruppa
d11c3ca728 Bugfix: operand name macros were wrong 2016-11-25 18:11:38 +01:00
Alexander Kruppa
ea49db539e Revert "Temporarily removed due to bug"
This reverts commit 38e8585c05.
2016-11-25 18:11:21 +01:00
Brian Gladman
ed3aa00581 Merge branch 'master' of https://github.com/akruppa/mpir 2016-11-25 15:49:11 +00:00
Alexander Kruppa
38e8585c05 Temporarily removed due to bug 2016-11-25 15:27:21 +01:00
Alexander Kruppa
8100363a85 Version of mpn/x86_64/sandybridge/add_n.as, super-optimized for Haswell
New speed about 1.21c/l on Haswell, was 1.33c/l
2016-11-25 15:25:09 +01:00
Alexander Kruppa
aac660af90 Merge branch 'master' of ../mpir.wbhart 2016-11-25 15:11:17 +01:00
Alexander Kruppa
f7f64a4ff2 Add missing colon 2016-11-25 14:55:31 +01:00
Alexander Kruppa
6316e39430 Increasing copy with AVX2 for Haswell 2016-11-25 11:51:54 +01:00
Alexander Kruppa
29577b5109 Decreasing copy with AVX2 for Haswell 2016-11-24 02:01:38 +01:00
Alexander Kruppa
4660be16f6 AVX-based rshift for 4-issue Intel cpus (Haswell and newer) 2016-11-22 23:18:52 +01:00
Alexander Kruppa
105c26c466 AVX-based lshift for 4-issue Intel cpus (Haswell and newer) 2016-11-22 21:58:43 +01:00
Alexander Kruppa
99a1f8d05b Add vzeroupper to avoid stall on Haswell if SSE2 code follows 2016-11-22 15:03:02 +01:00
Brian Gladman
a9c1f81c03 Merge branch 'master' of github.com:wbhart/mpir 2016-11-21 16:25:40 +00:00
Brian Gladman
27e58df332 Merge branch 'master' of github.com:wbhart/mpir 2016-11-21 16:25:10 +00:00
wbhart
160fd98d1a Merge pull request #179 from averkhaturau/master
c++ compilation error fixed
2016-11-21 14:30:00 +01:00
Brian Gladman
3abd2d97e0 Add Visual Studio 2017 build files 2016-11-18 23:13:26 +00:00
Brian Gladman
2221baffed make Visual Studio 2017 the default build version 2016-11-18 23:07:33 +00:00