Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simpqs failure #39

Open
hvds opened this issue May 3, 2023 · 7 comments
Open

simpqs failure #39

hvds opened this issue May 3, 2023 · 7 comments

Comments

@hvds
Copy link

hvds commented May 3, 2023

I have a note from back in December to investigate this, but never did get around to it. As I understand it, if we choose to call simpqs as part of the factorization strategy it should always succeed; however the master branch failed to find the factors in this case:

945963552037903692304185224846621632975583515796777435749818606681847712555267388667817
= 21744489429639490589994133152841 * 43503599157793016488853604280294370203685375046705264737

Maybe by coincidence, Alpertron also failed on the same number (alpertron/calculators#22).

@danaj
Copy link
Owner

danaj commented Jun 25, 2023

Sorry about the delay, I'm looking at this. I see Alpertron fixed it in their code.

There has been a newer version of SIMPQS from FLINT for a number of years, but it would take some work to port and tune. My understanding was it was performance and slightly larger numbers, rather than bug fixes.

What should really happen is I write my own code.

Tilman Neumann had a lot of great ideas and factoring code he posted on mersenneforum a few years back. All in Java but the algorithms are similar enough. See https://github.com/TilmanNeumann/java-math-library.

@hvds
Copy link
Author

hvds commented Jun 25, 2023

Thanks @danaj; just to clarify, were you able to reproduce the failure?

@danaj
Copy link
Owner

danaj commented Jun 26, 2023

Yes. It does not factor even when the number of relations is raised to 70k. It takes about 45 minutes on my laptop for an attempt. This is pretty close to the size limit of SIMPQS 1.0, and it doesn't have a lot of testing at this size. Bumping up the number of relations and sieving size doesn't help.

C-Quadratic-Sieve (another open source project) needs a higher than normal size limit, but successfully factors in 74 minutes.

SIMPQS 2.0 succeeded in 22 minutes. Combining the partial relations really speeds it up -- version 1.0 (in this code) finds hundreds of thousands of partials but doesn't do anything with them.

@hvds
Copy link
Author

hvds commented Mar 19, 2024

For the record, here's another one that fails (confirmed against master at 2389dcb):

# qs trying 9180435894215652020112626451519741807924491872283194331211065824189266475562769247511369 (89 digits)
# qs    mult 1, digits 89, sieving 384000, primes 72000
# qs: 9180435894215652020112626451519741807924491872283194331211065824189266475562769247511369
Nothing found (4664.35)

Factorization from Alpertron is 1059405511716795159376523757718030895723651 * 8665648604506982829490012915377592720602462019.

@hvds
Copy link
Author

hvds commented Mar 24, 2024

I spent some time playing with simpqs-2.0, and modifying it to avoid the need for temp files. Numbers below are for that code before and after my modifications (5 lines from simpqs with -DTIMING, 2 lines from /usr/bin/time. I think there are some obvious further improvements I can make, and some leak testing to do, then I'll start looking at the MPUG implementation and working on a PR.

The two test cases above also succeed; memory consumption reported by /usr/bin/time for those are 665,464k and 781,248k respectively, which seems reasonable to me.

I'm not sure why the advantage of the in-memory approach drops off so much for the longer test numbers, so I may be doing something silly. Bugs I had to fix in my changes before getting correct results were: ensure lists of factors are sorted; when combining a pair of partials, remove only one from the list not both; make sure there are no duplicates in the list of full relations.

I also spotted a bug in simpqs-2.0 causing it to accidentally drop one of the partial or full relationships from each batch when it sorts it for merging to the main list. Patch below.

7520587276150280509023659781989524950620229823
before:
Total time = 55,246,645,980
Polynomial generation time = 469,877,890
Small prime sieve time = 878,725,280
Large prime sieve time = 121,628,444
Evaluate sieve time = 53,776,414,366
0.75user 0.18system 0:15.39elapsed 6%CPU (0avgtext+0avgdata 7008maxresident)k
0inputs+32496outputs (0major+1154minor)pagefaults 0swaps

after:
Total time = 396,436,816
Polynomial generation time = 87,586,572
Small prime sieve time = 163,559,598
Large prime sieve time = 21,052,250
Evaluate sieve time = 124,238,396
0.15user 0.00system 0:00.16elapsed 100%CPU (0avgtext+0avgdata 10204maxresident)k
0inputs+0outputs (0major+1985minor)pagefaults 0swaps

1000000000000000000000000000045999999999999999999999999999373
before:
Total time = 50,537,408,096
Polynomial generation time = 3,156,098,148
Small prime sieve time = 11,119,522,776
Large prime sieve time = 5,269,858,896
Evaluate sieve time = 30,991,928,276
6.68user 0.18system 0:14.14elapsed 48%CPU (0avgtext+0avgdata 9780maxresident)k
0inputs+119576outputs (0major+1852minor)pagefaults 0swaps

after:
Total time = 15,873,578,206
Polynomial generation time = 2,196,816,748
Small prime sieve time = 7,927,351,906
Large prime sieve time = 3,785,608,606
Evaluate sieve time = 1,963,800,946
4.49user 0.00system 0:04.50elapsed 99%CPU (0avgtext+0avgdata 29972maxresident)k
0inputs+0outputs (0major+6904minor)pagefaults 0swaps

9999999999999999999999999999890000000120999999999999999999999999998669
before:
Total time = 205,680,758,142
Polynomial generation time = 19,810,142,278
Small prime sieve time = 36,756,219,680
Large prime sieve time = 102,059,949,106
Evaluate sieve time = 47,054,447,078
49.91user 0.38system 0:57.52elapsed 87%CPU (0avgtext+0avgdata 24808maxresident)k
0inputs+411544outputs (0major+5674minor)pagefaults 0swaps

after: 
Total time = 174,057,510,324
Polynomial generation time = 19,373,312,874
Small prime sieve time = 36,913,119,792
Large prime sieve time = 103,397,882,308
Evaluate sieve time = 14,373,195,350
48.77user 0.02system 0:48.80elapsed 99%CPU (0avgtext+0avgdata 87828maxresident)k
0inputs+0outputs (0major+22903minor)pagefaults 0swaps

100000000000000000000000000000000000001039999999999999999999999999999999999997943
before: 
Total time = 1,374,383,504,964
Polynomial generation time = 65,218,945,418
Small prime sieve time = 162,918,790,978
Large prime sieve time = 800,168,532,664
Evaluate sieve time = 346,077,235,904
353.68user 2.21system 6:24.35elapsed 92%CPU (0avgtext+0avgdata 107000maxresident)k
0inputs+2383584outputs (0major+40691minor)pagefaults 0swaps

after:
Total time = 1,219,654,563,794
Polynomial generation time = 62,417,257,240
Small prime sieve time = 161,890,229,918
Large prime sieve time = 782,612,127,516
Evaluate sieve time = 212,734,949,120
340.82user 0.14system 5:40.96elapsed 100%CPU (0avgtext+0avgdata 424288maxresident)k
0inputs+0outputs (0major+114870minor)pagefaults 0swaps

99999999999999999999999999999999999941006299999999999999999999999999999999996283
before:
Total time = 1,231,884,543,838
Polynomial generation time = 54,746,126,150
Small prime sieve time = 141,500,148,740
Large prime sieve time = 665,539,795,562
Evaluate sieve time = 370,098,473,386
305.83user 2.20system 5:44.72elapsed 89%CPU (0avgtext+0avgdata 106952maxresident)k
0inputs+2396216outputs (0major+41141minor)pagefaults 0swaps

after:
Total time = 1,141,988,383,074
Polynomial generation time = 57,283,683,134
Small prime sieve time = 148,617,383,468
Large prime sieve time = 707,621,162,434
Evaluate sieve time = 228,466,154,038
319.19user 0.15system 5:19.34elapsed 100%CPU (0avgtext+0avgdata 435068maxresident)k
0inputs+0outputs (0major+117724minor)pagefaults 0swaps
    bugfix: sort_lp_file() loses an entry due to off-by-one
--- a/lprels.c
+++ b/lprels.c
@@ -187,6 +187,8 @@ long sort_lp_file(char *filename)
       cur_line = buf; /* back up to the beginning of the line */
     }
   } /* for */
+  /* (hv) bugfix: code below expects i to be the number of entries */
+  ++i;
 
   fclose(TMP);
 

@hvds
Copy link
Author

hvds commented Mar 28, 2024

While working on this I've found a bug in the existing MPUG implementation: my first 80-digit testcase above failed to factorize because primecounts[1] = 83598 got truncated to fit in a 16-bit unsigned short, so MPUG instead treated this as a factor of p^18062, and ends up finding gcd(n, temp) = 1. Restoring declaration of primecounts to unsigned long fixes it.

I've applied the patch below to the checkout (from db88b86) my maths code is working against - I suspect that may fix all the factorization failures I've been seeing.

--- a/simpqs.c
+++ b/simpqs.c
@@ -1034,7 +1034,7 @@ static int mainRoutine(
     unsigned long npartials = 0;
     unsigned long relsFound = 0;
     unsigned long  * relations;
-    unsigned short * primecount;
+    unsigned long  * primecount;
     unsigned char  * sieve;
     int            * exponents;
     unsigned long  * aind;
@@ -1361,14 +1361,14 @@ static int mainRoutine(
     /* Now do the "sqrt" and GCD steps hopefully obtaining factors of n */
     mpz_set(farray[0], n);
     nfactors = 1;  /* We have one result -- n */
-    New( 0, primecount, numPrimes, unsigned short);
+    New( 0, primecount, numPrimes, unsigned long);
     if (primecount == 0) croak("SIMPQS: Unable to allocate memory!\n");
     for (l = (int)relSought-64; l < (int)relSought; l++)
     {
         unsigned int mat2offset = rightMatrixOffset(numPrimes);
         mpz_set_ui(temp,1);
         mpz_set_ui(temp2,1);
-        memset(primecount,0,numPrimes*sizeof(unsigned short));
+        memset(primecount,0,numPrimes*sizeof(unsigned long));
         for (i = 0; i< (int)numPrimes; i++)
         {
            if (getEntry(m,l,mat2offset+i))

@hvds
Copy link
Author

hvds commented Apr 4, 2024

I spent some time playing with simpqs-2.0, and modifying it to avoid the need for temp files.

The results of this now at #48, including fix for the issue in this ticket as 955bb25.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants