Include cases with passed pawns in rook scale factor #2792
Personally, I've watched quite a few games where Stockfish had a high eval for a rook endgame with a passed pawn that turned out to be a draw. So I hope this will help Stockfish avoid being lured into such endgames.
I don't think non-regression bounds are appropriate; the actual non-regression confidence is too low. While this does remove a condition, it is conceptually similar to a parameter tweak.
Seems like a very typical simplification with the usual confidence on non-regression.
Very typical "trivial simplification" that gives very little benefit in code clarity and maintainability. It's not the first to be proposed and merged, but the previous ones were not issue-free. It's wrong to use the same bounds for a trivial simplification, for which even a 0.2 Elo cost is not justifiable, and for a massive simplification that opens new space for something else to do better; especially considering that the simplification bounds are laxer than the old Elo-gaining bounds, which ended up letting regressions pass through. Maybe this patch actually is a small gain that wasn't very lucky, or is really near neutral, I don't know. But in the trade-off between the confidence we have in the Elo impact and the impact on maintainability and future improvements, I'd rate this as too low, even though the current rules, taken strictly, say it's good enough. The rationale behind the patch attempt, according to the fishtest comment, was to try to make Stockfish stronger. The only reason these bounds aren't too big of an issue is that most SF devs are not actively trying to simplify Stockfish by removing any small condition they can find. But if you think I'm wrong and are okay with me pushing a lot of minor simplification tests on fishtest, I can probably find a few dozen to try. Then we'll see how many I can get to pass STC+LTC, and we can run a fixed-games test to measure the Elo impact of a combo of all those that managed to pass. That would settle the question.
@Alayan-stk-2, of course, any reasonable set of rules can be misused... and I don't think this PR is an example. It will improve the evaluation of certain drawn rook endgames, without an overall performance regression. Certainly, there have been tests in the past that measured the Elo impact of a whole set of simplifications, and the Elo impact was very small. The results are somewhere in fishcooking; we should find them again and put them on the wiki. Feel free to test a few simplifications that make sense, without misusing the infrastructure. In my experience, there is little that can be removed without regression. Some things can be removed (e.g. certain endgames such as KBNvK), but we like to keep them for other reasons.
Thanks!
This will help scale down relatively high eval in drawish rook endgames with passed pawn like in TCEC S18 Superfinal Game 90.
Passed STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 50456 W: 9644 L: 9540 D: 31272
Ptnml(0-2): 760, 5637, 12332, 5737, 762
https://tests.stockfishchess.org/tests/view/5efcb76e59f6f035328940ed
Passed LTC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 77264 W: 9518 L: 9518 D: 58228
Ptnml(0-2): 402, 6766, 24321, 6716, 427
https://tests.stockfishchess.org/tests/view/5efd2ad759f6f03532894143
closes official-stockfish#2792
Bench: 4431626
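
For anyone wanting to sanity-check the numbers in this thread, the W/L/D totals can be turned into a rough Elo point estimate with the standard logistic formula. This is only a sketch: the helper name `elo_from_wld` is made up for illustration, and it ignores the pentanomial (Ptnml) counts and gives no error bars.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Point estimate of the Elo difference from game totals (logistic model)."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # average score per game, in [0, 1]
    # Undefined when score is exactly 0 or 1 (all games lost or all won).
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # Totals copied from the STC and LTC results quoted in this PR.
    print(f"STC: {elo_from_wld(9644, 9540, 31272):+.2f} Elo")
    print(f"LTC: {elo_from_wld(9518, 9518, 58228):+.2f} Elo")
```

With these totals the STC run comes out at roughly +0.7 Elo and the LTC run at exactly 0, since wins equal losses there. Both are point estimates only and say nothing about the uncertainty discussed above.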