Include cases with passed pawns in rook scale factor #2792

Closed
wants to merge 1 commit into from

Conversation

SFisGOD
Contributor

@SFisGOD SFisGOD commented Jul 3, 2020

This will help scale down the relatively high evals in drawish rook endgames with a passed pawn, as in TCEC S18 Superfinal Game 90.

Passed STC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 50456 W: 9644 L: 9540 D: 31272
Ptnml(0-2): 760, 5637, 12332, 5737, 762
https://tests.stockfishchess.org/tests/view/5efcb76e59f6f035328940ed

Passed LTC
LLR: 2.94 (-2.94,2.94) {-1.50,0.50}
Total: 77264 W: 9518 L: 9518 D: 58228
Ptnml(0-2): 402, 6766, 24321, 6716, 427
https://tests.stockfishchess.org/tests/view/5efd2ad759f6f03532894143

Bench: 4431626
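
For readers who want to sanity-check the numbers above, here is a minimal sketch that turns the reported W/L/D and Ptnml counts into a score and a logistic Elo estimate. The function names are illustrative and are not fishtest's own code; the only assumptions are the standard logistic Elo formula and that `Ptnml(0-2)` counts game pairs scoring 0, 0.5, 1, 1.5 and 2 points.

```python
import math

def score_from_wdl(wins, losses, draws):
    """Average score per game, counting a draw as half a point."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games

def score_from_pentanomial(ptnml):
    """Average score per game from pair counts; ptnml[i] pairs scored i/2 points."""
    pairs = sum(ptnml)
    points = sum(0.5 * i * n for i, n in enumerate(ptnml))
    return points / (2 * pairs)

def elo_from_score(score):
    """Logistic Elo difference implied by an average score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

# Counts copied from the STC and LTC results in this PR.
TESTS = {
    "STC": {"wdl": (9644, 9540, 31272), "ptnml": (760, 5637, 12332, 5737, 762)},
    "LTC": {"wdl": (9518, 9518, 58228), "ptnml": (402, 6766, 24321, 6716, 427)},
}

for name, data in TESTS.items():
    s_wdl = score_from_wdl(*data["wdl"])
    s_ptnml = score_from_pentanomial(data["ptnml"])
    # Both views of the same games must give the same average score.
    assert math.isclose(s_wdl, s_ptnml)
    print(f"{name}: score={s_wdl:.5f} elo={elo_from_score(s_wdl):+.2f}")
```

With these inputs the STC point estimate is under one Elo and the LTC result is exactly even (W equals L), which is the outcome a non-regression test is designed to accept.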

@SFisGOD
Contributor Author

SFisGOD commented Jul 3, 2020

Personally, I've watched quite a few games where Stockfish had high evals for rook endgames with a passed pawn that turned out to be draws. So I hope this will help Stockfish avoid being lured into such endgames.

@Alayan-stk-2

I don't think non-regression bounds are appropriate; the actual non-regression confidence is too low.

While this does remove a condition, it is conceptually similar to a parameter tweak.
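
As background for the bounds being discussed: the `(-2.94,2.94)` pair in the test output is the stopping interval for the log-likelihood ratio, while `{-1.50,0.50}` holds the two Elo hypotheses being compared. A minimal sketch of where ±2.94 comes from, assuming Wald's classic SPRT thresholds with 5% error rates on both sides (an assumption inferred from the printed value, not taken from this PR):

```python
import math

def sprt_bounds(alpha, beta):
    """Wald SPRT stopping thresholds for the log-likelihood ratio.

    alpha: accepted probability of passing a patch that is really at H0.
    beta:  accepted probability of failing a patch that is really at H1.
    """
    lower = math.log(beta / (1.0 - alpha))   # stop and accept H0 at or below this
    upper = math.log((1.0 - beta) / alpha)   # stop and accept H1 at or above this
    return lower, upper

lower, upper = sprt_bounds(alpha=0.05, beta=0.05)
print(f"LLR bounds: ({lower:.2f}, {upper:.2f})")  # prints "LLR bounds: (-2.94, 2.94)"
```

The error rates only hold at the two hypothesised Elo values; for a true Elo between them the pass probability varies smoothly, which is the crux of the disagreement about how much confidence a passed non-regression test provides.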

@vondele
Member

vondele commented Jul 3, 2020

seems like a very typical simplification with the usual confidence on non-regression.

@Alayan-stk-2

Very typical "trivial simplification" that gives very little benefit in code clarity and maintainability. It's not the first that has been proposed and merged, but the previous ones were not issue-free.

It's wrong to use the same bounds for a trivial simplification, for which even a 0.2 Elo cost is not justifiable, and for a massive simplification that opens up space for something else to do better, especially considering that the simplification bounds are laxer than the old Elo-gaining bounds that ended up letting regressions pass through.

Maybe this patch actually is a small gain that wasn't very lucky, or is really near neutral, I don't know. But in the trade-off between the confidence we have on the elo impact and the impact on maintainability/future improvements, I'd rate this as too low even though current rules taken strictly say it's good enough. The rationale behind the patch attempt, according to the fishtest comment, was to try and make Stockfish stronger.

The only reason these bounds aren't too big of an issue is that most SF devs are not actively trying to simplify Stockfish by removing any small condition they can find.

But if you think I'm wrong and are okay with me pushing a lot of minor simplifications tests on fishtest, I can probably find a few dozen to try. Then we'll see how many I can get to pass STC+LTC and then we can run a fixed games test to measure elo impact of a combo of all those that managed to pass. That would settle the question.
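
A fixed-games test of the kind proposed here would report an Elo estimate with an error bar. The following is a rough sketch of how such an interval could be computed from pentanomial counts, assuming a normal approximation over game pairs and the logistic Elo formula. The function names are hypothetical and this is not fishtest's implementation; the STC counts from this PR are used purely as sample input.

```python
import math

def pentanomial_mean_and_stderr(ptnml):
    """Mean per-game score and its standard error, treating each game pair as one sample.

    ptnml[i] is the number of pairs that scored i/2 points out of 2,
    i.e. a per-game score of i/4.
    """
    pairs = sum(ptnml)
    scores = [i / 4.0 for i in range(5)]
    mean = sum(s * n for s, n in zip(scores, ptnml)) / pairs
    variance = sum(n * (s - mean) ** 2 for s, n in zip(scores, ptnml)) / pairs
    return mean, math.sqrt(variance / pairs)

def elo_from_score(score):
    """Logistic Elo difference implied by an average score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_interval(ptnml, z=1.96):
    """Return (low, estimate, high) Elo for a two-sided normal interval; z=1.96 is about 95%."""
    mean, stderr = pentanomial_mean_and_stderr(ptnml)
    return (
        elo_from_score(mean - z * stderr),
        elo_from_score(mean),
        elo_from_score(mean + z * stderr),
    )

low, estimate, high = elo_interval((760, 5637, 12332, 5737, 762))
print(f"Elo {estimate:+.2f} [{low:+.2f}, {high:+.2f}]")
```

On the STC sample the interval spans zero by well over an Elo in each direction, which shows why effects on the order of 0.2 Elo per patch can only be resolved by pooling many patches or running far more games.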

@vondele
Member

vondele commented Jul 3, 2020

@Alayan-stk-2, of course, any reasonable set of rules can be misused... and I don't think this current PR is an example. It will improve the evaluation of certain drawn rook endgames, without an overall performance regression.

Certainly, there have been tests in the past that measured the Elo impact of a whole set of simplifications, and the Elo impact was very small. The results are somewhere in fishcooking; we should dig them up and put them somewhere on the wiki.

Feel free to test a few simplifications that make sense, without misusing the infrastructure. In my experience, there is little that can be removed without regression. Some things can be removed (e.g. certain endgames such as KBNvK), but we like to keep them for other reasons.

@vondele vondele added the "to be merged" (Will be merged shortly) label Jul 3, 2020
@vondele vondele closed this in 67818ee Jul 3, 2020
@vondele
Member

vondele commented Jul 3, 2020

Thanks!

noobpwnftw pushed a commit to noobpwnftw/Stockfish that referenced this pull request Aug 15, 2020
closes official-stockfish#2792

Bench: 4431626