Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use of bad TTData in search() when ttHit is false #5503

Open
mstembera opened this issue Jul 20, 2024 · 12 comments
Open

Use of bad TTData in search() when ttHit is false #5503

mstembera opened this issue Jul 20, 2024 · 12 comments
Labels
needs-analysis Needs further analysis

Comments

@mstembera
Copy link
Contributor

Describe the issue

As pointed out by @dubslow here #5416 (comment) and @peregrineshahin #5416 (comment) in some places search() uses values from TTData even when ttHit is false.

Expected behavior

Don't use any values from TTData when ttHit is false.

Steps to reproduce

Inspect Search::Worker::search() code.

Anything else?

No response

Operating system

All

Stockfish version

a8401e8

@peregrineshahin
Copy link
Contributor

Regarding this issue, the LTC fix passed https://tests.stockfishchess.org/tests/view/66a243b14ff211be9d4ecd4a
but I'm not able to reproduce any trigger for these two places left "unguarded" with any testing condition/long benches (multithreaded included), I'm not sure that we are using any 'bad TTData in search() when ttHit is false' as of the moment.
It probably can be explained logically, but not too sure..

@mstembera
Copy link
Contributor Author

Hmm I'm even less sure. Maybe open a PR for the fix as an RFC and see what the maintainers and others say?

@Disservin Disservin added the needs-analysis Needs further analysis label Oct 20, 2024
@mstembera
Copy link
Contributor Author

Update:
I just checked the latest SF version cf10644 and changing the return of TranspositionTable::probe() on line 241 in tt.cpp from

    return {false, TTData(), TTWriter(replace)};

to

constexpr TTData emptyData = {
    Move::none(),
    VALUE_NONE,
    VALUE_NONE,
    DEPTH_ENTRY_OFFSET,
    BOUND_NONE,
    false
};

return {false, emptyData, TTWriter(replace)};

now changes the bench from 959947 to 934447. This proves that we have a not just theoretical bug somewhere in search that is making use of random TTData data even when ttHit is false. I will be submitting a test to return these safe defaults from TranspositionTable::probe(). I'm not an expert on search but that part should be fixed as well.

@peregrineshahin Do you plan on opening a PR for your fix above?

mstembera added a commit to mstembera/Stockfish that referenced this issue Dec 9, 2024
@mstembera
Copy link
Contributor Author

mstembera commented Dec 9, 2024

@mstembera
Copy link
Contributor Author

Oops. I was accidentally comparing benches between different versions of SF. The bug still exists, it's just that bench doesn't change.

mstembera added a commit to mstembera/Stockfish that referenced this issue Dec 9, 2024
mstembera added a commit to mstembera/Stockfish that referenced this issue Dec 9, 2024
@peregrineshahin
Copy link
Contributor

peregrineshahin commented Dec 10, 2024

@peregrineshahin Do you plan on opening a PR for your fix above?

Well we recently added is_valid method as an rfc, and as I said for the two places that are left I'm not sure I can provide proof that they do anything functional under any test conditions.
So I have two concerns:
1- I wouldn't have a good argument against a PR that would simplify them...
2- might not be relevant, but for the previous exact reason I propose for the newly introduced is_valid method to never make VALUE_NONE constant available to outsiders to exclude the fact that someone might check against it instead of the new method.

@mstembera
Copy link
Contributor Author

@peregrineshahin
Thanks. I agree w/ 1- and don't really follow 2 -.
I was looking at your patch and in there you guard against
ttData.value != VALUE_NONE.
I thought the guard would be ttHit != false.

Separately I tried returning variations of things like

return {false,
        TTData{Move::none(), 5000, 4000, 50, BOUND_EXACT, true},
        TTWriter(replace)};

from probe and I can't get it to affect bench no matter what I try.

@peregrineshahin
Copy link
Contributor

peregrineshahin commented Dec 10, 2024

ttData.value != VALUE_NONE. I thought the guard would be ttHit != false.

I think I answered this somewhere, there was a recent patch that removed all checks for ttHit != false, because they were redundent and we still needed to check for != VALUE_NONE anyways, so that patch simplified all the checks against ttHit and kept the ttValue checks as under all conditions implies a ttHit existing but a ttHit existing alone still wasn't enough.

@peregrineshahin
Copy link
Contributor

peregrineshahin commented Dec 10, 2024

Well not under all conditions but we have this explicitly in search
ttData.value = ttHit ? value_from_tt(ttData.value, ss->ply, pos.rule50_count()) : VALUE_NONE;
now for the probe one I think this because the ttHit is false thus this is only a replacement entry

    ss->ttHit    = ttHit;
    ttData.move  = rootNode ? thisThread->rootMoves[thisThread->pvIdx].pv[0]
                 : ttHit    ? ttData.move
                            : Move::none();
    ttData.value = ttHit ? value_from_tt(ttData.value, ss->ply, pos.rule50_count()) : VALUE_NONE;
    ss->ttPv     = excludedMove ? ss->ttPv : PvNode || (ttHit && ttData.is_pv);
    ttCapture    = ttData.move && pos.capture_stage(ttData.move);

Now because the search checks for validity of ttValue this means it must be ttHit, and bounds is not used unless we know it's a ttHit, then the remaining thing is actually the ttDepth

        // Decrease reduction if position is or has been on the PV (~7 Elo)
        if (ss->ttPv)
            r -= 1024 + (ttData.value > alpha) * 1024 + (ttData.depth >= depth) * 1024;

        // Decrease reduction for PvNodes (~0 Elo on STC, ~2 Elo on LTC)
        if (PvNode)
            r -= 1024;

        // These reduction adjustments have no proven non-linear scaling

        // Increase reduction for cut nodes (~4 Elo)
        if (cutNode)
            r += 2518 - (ttData.depth >= depth && ss->ttPv) * 991;

here your depth might have an effect on PvNodes that doen't have ttHits Idk why it doesn't affect it.
It's a result of the after proccessing of ttData in search.

@peregrineshahin
Copy link
Contributor

are you sure your numbers, don't even affect this reduction if it's a PvNode and not a ttHit?

        // Decrease reduction if position is or has been on the PV (~7 Elo)
        if (ss->ttPv)
            r -= 1024 + (ttData.value > alpha) * 1024 + (ttData.depth >= depth) * 1024;

I would have guessed that the only reason this is guarded is because we are in search and ttData.depth is 0 so 0 < than any search depth.

@mstembera
Copy link
Contributor Author

Thanks for the explanations. Yes I have not been able to return anything from probe to make the bench change no matter what I tried. I am no longer convinced we have a bug.

@mstembera
Copy link
Contributor Author

Hmm so it turns out the compiler generated constructor for TTData() is initializing everything to 0. When we skip this initialization some compile checks are failing with bad bench checksums. #5766

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-analysis Needs further analysis
Projects
None yet
Development

No branches or pull requests

3 participants