Equal quotas of CPU power for each user (description of the problem and proposal) #1427
Comments
No thanks.
This comment was marked as abuse.
The current scheme works fine; no need to change it.
This comment was marked as abuse.
Just submit whatever you want.
This comment was marked as abuse.
This is just nonsense. Tests take time; that's normal. Engine development takes time, and at this level it takes multiple tries to get a change right. You might also be the only one who considers this a bug, and you stated yourself that it only makes analysis a bit worse. Stockfish isn't really developed for pure analysis. There was a recent patch by me that increases the singular extension check depth when the previous depth was really high; by your logic, was this also a bug? This issue reads like you are annoyed that in total you have fewer cores than other people who submit many patches, but no one cares about that. We have no time pressure, and viz has made many successful patches in the past and will in the future.
You have shown that there is disparity in test distribution by users, not that there is an issue. Currently there is no issue, as tests complete and don't backlog.
This comment was marked as spam.
There is an important distinction between someone having 6 different ideas and submitting 6 unique patches, versus someone having one idea and submitting 6 different versions simultaneously, differing only by constants. Yes, tests still complete, but if one user has 50% of fishtest usage to themselves, everyone else's tests take twice as long as they otherwise would. As a result, I quietly stopped contributing CPU time to fishtest, and while I still participate in development, this is a big reason I do so less than in the past. This issue recurring shows it's discouraging to others as well. I therefore wonder how good this situation is for SF development in the long run. There were some constructive thoughts by @vondele here: official-stockfish/Stockfish#3234 (comment), and a nice proposal to improve things here: #869. I wish they would get acted on.
This comment was marked as abuse.
This comment was marked as abuse.
As mentioned in the relevant Discord #ideas channel post, there are a few issues with the suggestion. I would suggest you make an effort not to dodge these issues and actually give reasonable responses to them:

The main goal of Stockfish is to be the strongest chess engine in the world. Providing world-class analysis is something that comes along with it. To put it bluntly, Magnus Carlsen's analysis of a game would be world-class compared to a 1200-rated player's on Lichess. Likewise, Stockfish, being the strongest engine in the world, provides world-class analysis in comparison. This, by all means, doesn't make Stockfish's main goal being a position solver or analyzer; it is just an added benefit that to be the top engine, Stockfish has to analyze the vast majority of positions accurately.

The way I see it, gaining elo is currently the main goal of Stockfish. Fishtest exists to provide SF contributors with actionable statistical data. Thus, it makes sense to allow some tests that sway from the main goal but help collect such data. However, the main goal should always be prioritized, since it's the main goal. Thus, there are guidelines in place which outline how much computational resources each patch/test gets. This makes the most efficient use of the computational resources available to Fishtest and benefits the path chosen by SF contributors the most, which is exactly what Fishtest should be doing.

The current way patches are tested for SF is using SPRT. What this means is that if a patch is either really good or really bad, the test finishes relatively quickly. Hence, even if one is talking about quality over quantity, it is never an issue to begin with: write good patches, and they'll pass quickly anyway; write bad patches, and the same applies. Neutral patches are where the issue is. And if your patch is neutral, it doesn't matter whether you think it was a high-quality or well-thought-out change. Statistically, it doesn't work out, and hence it is the same as all the other 10 patches submitted by a single user using up 50% of Fishtest's computational resources. Why should your patch be any different?

In the Discord post and here, you are specifically targeting a Stockfish contributor for using a vast majority of Fishtest resources with a multitude of patches: the same contributor who has helped Stockfish gain a lot of elo in the past, furthering Stockfish on its main path. Why shouldn't more computational resources be invested into his tests? I think what's missing is an understanding of investment. Just as a company wouldn't invest equally into everything it's doing, especially not into projects that sway from its main goal, why should Fishtest invest into stuff that sways from Stockfish's main goal?
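To make the SPRT point above concrete, here is a minimal sketch of Wald's sequential probability ratio test under a deliberately simplified win/loss Bernoulli model with no draws; fishtest's real implementation uses pentanomial game-pair statistics, so the function names and elo hypotheses below are illustrative assumptions only:

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    # Wald's approximate decision thresholds for the log-likelihood ratio:
    # stop and reject H1 below `lower`, stop and accept H1 above `upper`.
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    return lower, upper

def llr_update(wins, losses, elo0=0.0, elo1=2.0):
    # Log-likelihood ratio of H1 (patch worth elo1) vs H0 (patch worth elo0)
    # under a simplified no-draw Bernoulli model of game results.
    def win_prob(elo):
        return 1.0 / (1.0 + 10 ** (-elo / 400.0))
    p0, p1 = win_prob(elo0), win_prob(elo1)
    return wins * math.log(p1 / p0) + losses * math.log((1 - p1) / (1 - p0))
```

A clearly good patch's LLR drifts toward the upper bound after relatively few games, a clearly bad one hits the lower bound just as fast, while a neutral patch wanders between the two bounds and consumes many games before resolving, which is exactly the scheduling cost being debated here.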
So far, I think the people with the most tests have made Stockfish gain the most elo. So…
This comment was marked as spam.
This comment was marked as abuse.
Quite frankly, it doesn't matter whether you agree with it or not. You're always welcome to fork Stockfish and start your own analysis version (as long as you adhere to the license). You don't decide the primary goal of Stockfish; the active contributors of Stockfish over the past years do, and this is the goal they've set. It is also the goal Stockfish started with.
Again, if analysis is what you seek and you're fine with a weaker engine, feel free to fork Stockfish and do your thing. Official Stockfish refuses to support it; therefore, Fishtest (unless you fork and start your own instance) also refuses to support it with full power. I repeat: it doesn't matter whether you agree with this or not.
False statement. There are many techniques implemented in Stockfish's search that complement one another. Techniques like the History Heuristic greatly assist Late Move Reductions in being as effective as they are.
This is what we call a very small sample size. How many tests like these of yours have passed on the first try? Are you saying that if I flip a coin a certain way and it comes up heads once, it will always be heads? See how your statements are made with no statistical evidence?
This is not about the people; each good idea matters. If a developer can come up with 6 valid ideas (and yes, this does include variations of the same idea; they are essential to development), each one of those ideas deserves as much processing power as the others.
@dsekercioglu Yes and (yes) :) All I'm saying is don't be selfish and schedule all variations "at the same time" and take 50% |
Doing so actually allows Stockfish to get a gainer a lot of the time. This is a feature, not a bug.
@TheBlackPlague How is scheduling many variants simultaneously any more likely to produce an elo gainer than scheduling those same variants, say, 2 or 3 at a time? In fact, if one variant from the first batch passes, the others may not need to be scheduled at all. In the meantime, the other developers can be more productive because their turnarounds are faster.
Do you know when tests are going to pass or fail? No. Hence, the test may pass or fail while I'm asleep, and then the other one would have to be tested tomorrow (making it take longer for the whole idea and its variations to be tested). Fishtest isn't poor. It isn't in need of this kind of worrying; it has a lot of cores.
Yes, waiting about the same amount of time as everyone else is fair. Those cores are not going idle; they work on everyone's patches. One person over-scheduling comes at the expense of everyone else.
Yeah I'm all in for this to be implemented - I will finally have a serious reason to stop trying to contribute to sf. |
And yeah, you seem to be really keen on spinning out long discussions where you repeat the same "reasoning" almost no one buys, as if it's some new point. Repeating the same shit 50 times in a row doesn't make it 50 times more viable.
I do not understand why the OP objects to running a statistics gathering test at 50 percent throughput. It is standard Fishtest practice that research should be carried out at lower throughput. |
He believes that Stockfish contributors should change the primary goal of Stockfish from being the strongest engine in the world to being a better analysis tool (based on what criteria, I don't know). This is evident from him saying the following:
Therefore, with his thinking in mind, one can see why he believes that this test should run at full throughput. Anyone rational enough to realize that Stockfish isn't a project that revolves around him would know that such is not the standard, and will likely never be, since Stockfish's primary goal was, is, and will remain being the strongest engine in the world.
This comment was marked as abuse.
No, that's the only one. Everything else is just another use of Stockfish. Not a primary goal.
Mind my language, but Stockfish doesn't give two shits what they both use. They use Stockfish because it's the strongest engine. That's it, period. If Stockfish continues to be the strongest engine, they'll continue to use it. It's that simple.
Well, that seems like a YOU problem. Because most people who understand Stockfish's search see how History Heuristic assists Late Move Reductions.
That has nothing to do with what I said. @Vizvezdenec literally just said you ramble on pointlessly, and you're doing just that. For example, your argument here was that Stockfish contributors actually care about what Chess.com or Lichess use. No, they do not. Quite frankly, I understand your urge to have a pointless discussion, but could you stop? The majority of people on this issue have already stated that they're against what you proposed. Hence, it won't be happening. Start your own instance of Fishtest if you really want it.
This comment was marked as spam.
https://github.com/official-stockfish/Stockfish/graphs/contributors?from=2021-09-23&to=2022-09-23&type=c
Can you at least not try to blatantly LIE?
This comment was marked as abuse.
Can we stop this discussion? It is pointless and a waste of time.
I had already started the topic on the Discord channel, but decided to also write my thoughts here, so that they are more visible.
The problem
Fishtest is now dominated by a few developers who submit tests at a high rate, running 5-10 of them simultaneously, each at normal throughput. Those who are more parsimonious have a hard time testing their ideas.
Some statistics to demonstrate the extent of the inequality. The table shows the distribution of tests and CPU cores among users (at 15:00 yesterday).
Moreover, the leader in this table decided that 100% throughput was too much for my only test and lowered it to 50%.
The current practice is unfair to developers who employ a more parsimonious approach to testing and run tests one after another. It is also detrimental to the development of Stockfish. Aggressive developers take CPU power away from others who work on different areas of improvement or simply use the power more efficiently.
To list the negative effects:
My proposal
The proposal is simple: rewrite the server so that it schedules tasks in such a way that every user gets an equal amount of CPU power across all the tests they are running together. The power is then further distributed within each user's quota proportionally to the throughput value set for each test run. A user can run one test at a time or many of them, as they choose, but it shouldn't change the overall amount of power the user utilizes.
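A minimal sketch of what such per-user fair-share scheduling could look like; the function name, the data layout, and the two-stage split (an equal slice per active user, then proportional to throughput within each user's slice) are illustrative assumptions, not fishtest's actual scheduler:

```python
def allocate_cores(total_cores, users):
    """Split `total_cores` equally among users with active runs,
    then split each user's slice across their runs in proportion
    to the throughput value set on each run.

    `users` maps a username to {run_id: throughput} for active runs.
    Hypothetical structure; fishtest's real scheduler differs.
    """
    active = {user: runs for user, runs in users.items() if runs}
    if not active:
        return {}
    per_user = total_cores / len(active)  # equal quota per user
    allocation = {}
    for user, runs in active.items():
        total_throughput = sum(runs.values())
        for run_id, throughput in runs.items():
            # a user's runs only compete within that user's own quota
            allocation[run_id] = per_user * throughput / total_throughput
    return allocation
```

For example, with 1000 cores, one user running a single test and another running two tests at equal throughput would get 500 cores each as users; the second user's two runs would receive 250 cores apiece. A user launching six simultaneous variants would thus dilute only their own slice, not everyone else's tests.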
The benefits of this approach are: