Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Division causes assert failure #123

Open
guidovranken opened this issue Jan 12, 2023 · 18 comments
Open

Division causes assert failure #123

guidovranken opened this issue Jan 12, 2023 · 18 comments

Comments

@guidovranken
Copy link

import bigints

var a = initBigint("6277101735386680763835789123314955362437298222279840143829")
var b = initBigint("1461501637330902918203684832716283019655932313743")
var r = a.div(b)

Output:

Error: unhandled exception: /home/jhg/.nimble/pkgs/bigints-1.0.0/bigints.nim(652, 14) `q1 <= uint32.high`

Using latest repository version of bigints, Nim 1.6.10 on Linux x64.

@pietroppeter
Copy link
Contributor

Wow that's bad, thanks for reporting! I can reproduce on playground: https://play.nim-lang.org/#ix=4kYW

@pietroppeter
Copy link
Contributor

I checked going back to older commits and it seems to be a very old issue: already present in (Aug 2020, after merging a PR by Araq to make it work with nim devel) 8cb412f

going further in the past is hard since I could not find a nim version that would make bigints compile without additional changes. probably not worth it to go that way back.

@pietroppeter
Copy link
Contributor

doing a bit of debugging but need to stop now, will try to look into this later (if someone wants to pick this up, most welcome)

for ref this is how I annotated unsignedDivRem (where the bug occurs)

import sugar

# From Knuth and Python
func unsignedDivRem(q, r: var BigInt, n, d: BigInt) =
  var
    nn = n.limbs.len
    dn = d.limbs.len
  dump nn
  dump dn

  if n.isZero:
    q = zero
    r = zero
  elif nn < dn:
    # n < d
    q = zero
    r = n
  elif dn == 1:
    var x: uint32
    unsignedDivRem(q, x, n, d.limbs[0])
    r.limbs = @[x]
    r.isNegative = false
  else:
    assert nn >= dn and dn >= 2

    # normalize
    let ls = 32 - bits(d.limbs[d.limbs.high])
    r = d shl ls
    q = n shl ls
    if q.limbs.len > n.limbs.len or q.limbs[q.limbs.high] >= r.limbs[r.limbs.high]:
      q.limbs.add(0'u32)
      inc(nn)

    let k = nn - dn
    assert k >= 0
    var a: BigInt
    a.limbs.setLen(k)
    let wm1 = r.limbs[r.limbs.high]
    let wm2 = r.limbs[r.limbs.high-1]
    var ak = k

    var zhi = zero
    var z = zero
    var qib = zero
    var q1b = zero

    dump q
    dump r
    dump nn
    dump dn
    dump k

    for v in countdown(k-1, 0):
      dump v
      # estimate quotient digit, may rarely overestimate by 1
      let vtop = q.limbs[v + dn]
      assert vtop <= wm1
      let vv = (uint64(vtop) shl 32) or q.limbs[v+dn-1]
      var q1 = vv div wm1
      var r1 = vv mod wm1

      while (wm2 * q1) > ((r1 shl 32) or q.limbs[v+dn-2]):
        debugEcho "overestimating"
        dump (wm2 * q1)
        dump ((r1 shl 32) or q.limbs[v+dn-2])
        dec q1
        r1 += wm1
        if r1 > uint32.high:
          debugEcho "breaking on r1"
          break
        dump (wm2 * q1)
        dump ((r1 shl 32) or q.limbs[v+dn-2])

      dump q1
      dump uint32.high
      assert q1 <= uint32.high  # error occurs here

      ...

and this debug output with above example:

nn = 6
dn = 5
q = (limbs: @[3572406741, 1867334985, 3529740808, 4294963510, 4294967295, 4294967295, 0], isNegative: false)
r = (limbs: @[4294738063, 4294967295, 4294967295, 4294967295, 4294967295], isNegative: false)
nn = 7
dn = 5
k = 2
v = 1
q1 = 1
uint32.high = 4294967295
v = 0
overestimating
(wm2 * q1) = 18446744073709551615
((r1 shl 32) or q.limbs[v + dn - 2]) = 4294963510
(wm2 * q1) = 18446744069414584320
((r1 shl 32) or q.limbs[v + dn - 2]) = 18446744073709547830
q1 = 4294967296
uint32.high = 4294967295

@dlesnoff
Copy link
Contributor

@guidovranken Thanks for the report.
@pietroppeter Thanks for the debugging and insightful stacktrace. No need to go further down the commit history, I guess the error is here from the very first time this algorithm was committed.
« From Knuth and Python » is a very obscure reference. It will be hard to find where the algorithm and the notations come from.
I will check in TAOCP vol.2 tonight.

@guidovranken
Copy link
Author

You're welcome. Is this library currently used in production anywhere?

@dlesnoff
Copy link
Contributor

Short answer: no.
It was planned to be used in Nim's compiler : nim-lang/Nim#14696

@guidovranken
Copy link
Author

Ok. I found this with my math&crypto fuzzer: https://github.com/guidovranken/cryptofuzz
If this library ends up getting used in important software we can fuzz it on OSS-Fuzz: https://github.com/google/oss-fuzz

@pietroppeter
Copy link
Contributor

pietroppeter commented Jan 12, 2023

ah, nice to know it was found with a fuzzer. After your example I actually thought, we should fuzz this library... I was thinking bout using a recently implemented library from planetis https://github.com/status-im/nim-drchaos (there is also a video from nimconf 2022), but the list of bugs found by your library is definitely impressive!

regarding the "knuth and python" notes, maybe @def- know a bit more but I imagine python implements a division algorithm by knuth and googling around my guess would be this is algo Algorithm 4.3.1D" from page 2 of Volume 1 of TAOCP, linking this HN conversation of an article about this since it seems interesting https://news.ycombinator.com/item?id=26562819
edit: actually the article references it as "Volume 2: Seminumerical Algorithms, Chapter 4.3: Multiple-Precision Arithmetic of The Art of Computer Programming, Donald Ervin Knuth presents Algorithm D (Division of nonnegative integers):"

@pietroppeter
Copy link
Contributor

for the reference in python I think it might be here: https://github.com/python/cpython/blob/2161bbf243983c625e8f24cdf43b757f2a21463b/Modules/_decimal/libmpdec/basearith.c#L293

@dlesnoff
Copy link
Contributor

regarding the "knuth and python" notes, maybe @def- know a bit more but I imagine python implements a division algorithm by knuth and googling around my guess would be this is algo Algorithm 4.3.1D" from page 2 of Volume 1 of TAOCP, linking this HN conversation of an article about this since it seems interesting https://news.ycombinator.com/item?id=26562819 edit: actually the article references it as "Volume 2: Seminumerical Algorithms, Chapter 4.3: Multiple-Precision Arithmetic of The Art of Computer Programming, Donald Ervin Knuth presents Algorithm D (Division of nonnegative integers):"

I had a doubt whether it was in volume 1 or volume 2. Glad it is in volume 2.

@def-
Copy link
Member

def- commented Jan 12, 2023

for the reference in python I think it might be here: https://github.com/python/cpython/blob/2161bbf243983c625e8f24cdf43b757f2a21463b/Modules/_decimal/libmpdec/basearith.c#L293

That seems correct, also references the correct section of TAOCP.

@dlesnoff
Copy link
Contributor

From what I have read so far, @def- changed a little bit the normalization constant. I don't think this impact the mathematical properties that much, but we must be extra careful to the changes it occurs on the algorithm correctness.
Instead of :

    d = MPD_RADIX / (vconst[n-1] + 1);

like in the python code or TAOCP, he chose to multiply by this power of two:

let ls = 32 - bits(d.limbs[d.limbs.high])

I guess theorem B (in the same 4.3.1 section of TAOCP) saying that our guessed quotient is between q and q+2, does not hold anymore.
There is another normalization algorithm depicted in exercise 28, but it seems different.
There is the exercise 37 that proposes to avoid completely the normalization and unnormalization steps, when the radix is a power of two (which is our case). I don't know if it is what dennis has tried to do here. It still looks like a normalization step.

I can't debug it for now.

@pietroppeter
Copy link
Contributor

pietroppeter commented Jan 13, 2023

Mmh, at a first glance the quantity q1 seems to me computed the same as qHat in step D3 of the algorithm as stated in TAOCP (vv plays the role of u_j b + u_j+1 and wm1 is v1). Note that in the algorithm it does not use the same partial quotient as in Theorem B. Note also that in TAOCP low indices are used for most significant digits.

For reference here is a folder with relevant pages from (an old edition of) TAOCP : https://drive.google.com/drive/folders/1--9CWffNwg1fmA26RUXfcSkUPzF6gZLE

@pietroppeter
Copy link
Contributor

I would assume the bug happens (occasionally? always?) when we fall in the case where the partial quotient needs to be decreased by 1, which (as per Ex 21 of TAOCP) should happen with probability 3/b where b in our case is 2^32 (less than 1 in a billion chance). the debugging of the example above is consistent with this. Have not had time though to look at this more closely.

@dlesnoff
Copy link
Contributor

dlesnoff commented Jan 13, 2023

Note also that in TAOCP low indices are used for most significant digits.

They inverted that in my revised edition of TAOCP. This is now exactly the inverse. The high indices are used for most significant digits. That's wonderful.

@pietroppeter
Copy link
Contributor

pietroppeter commented Jan 16, 2023

for the reference in python I think it might be here: https://github.com/python/cpython/blob/2161bbf243983c625e8f24cdf43b757f2a21463b/Modules/_decimal/libmpdec/basearith.c#L293

after looking better at python, I think the correct implementation of long division used by python is here:
https://github.com/python/cpython/blob/5ef90eebfd90147c7e774cd5335ad4fe69a7227c/Objects/longobject.c#L2768

this is what is linked to in this article about integers in python: https://tenthousandmeters.com/blog/python-behind-the-scenes-8-how-python-integers-work/

the previous reference was to the same algorithm implementation in the self-contained c library mpdecimal which is wrapped and used in python's decimal module, see also https://www.bytereef.org/mpdecimal/index.html and https://github.com/python/cpython/tree/2161bbf243983c625e8f24cdf43b757f2a21463b/Modules/_decimal

@dlesnoff
Copy link
Contributor

I think it is important to write that the division implementation is flawed in the README.md. It would prevent people using the library while we seek a fix.

@mratsim
Copy link

mratsim commented May 30, 2023

While running my own fuzzing I also get this error pretty consistently.

Input

import bigints

let base = initBigInt(
  "7fffffffffffffffe0000000000001f8000000000001ffff00000007ffff",
  base = 16
)
let exp = initBigInt(
  "f0ff0f00e0ff07000000800300000000e0ffffffffffff01000000000000",
  base = 16
)
let M = initBigInt(
  "e00f0000000000000000ffff03004000feffffffffffff07000000000000c0",
  base = 16
)

let r = powmod(base, exp, M)

With a modified bigints for nice reporting:

[Failure] unsignedDivRem at `q1 <= uint32.high` with inputs:
[Failure]  n: 0x3fffffffffffffffe0000000000001f8040000000001ff8100000008000260403ffffffe07e03c1000001f83fc0c0001001fffec0002003ffff00001
[Failure]  d: 0xe00f0000000000000000ffff03004000feffffffffffff07000000000000c0
[Failure] called from powmod with inputs:
[Failure]  base:     0x7fffffffffffffffe0000000000001f8000000000001ffff00000007ffff
[Failure]  exponent: 0xf0ff0f00e0ff07000000800300000000e0ffffffffffff01000000000000
[Failure]  modulus:  0xe00f0000000000000000ffff03004000feffffffffffff07000000000000c0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants