[RFC 0077] Stale Issues Amendement #77

blaggacao · 2020-10-14T02:45:11Z

Based on: NixOS/nixpkgs#100460 (comment)

Based on the first reactions and feedback, unfortunately, I have to clarify this:

Please read carefully: this RFC does not significantly increase the rate of notifications.
This RFC does not seek to impose anything on anyone (quite a few comments seem to interpret this RFC as an imposition to their customs or freedoms) — this is over-interpretation and is not the case. It seeks transparency about the facts. If an increase in transparency does prompt (some) people to actions, then this is a positive spill-over effect.
There seems to be an implicit proxy discussion about different labels going on here (notably the 2.status: * category). This RFC does not address the expressiveness of the current label structure. It assumes that structure needs improvement, but a general overhaul might be something for a different RFC.
Unfortunately, authors — the primary target of the stale-bot's reminder — are likely to be systemically underrepresented in this RFC's discussion. So please read it with their eyes and needs, too. They are a diverse group — not only core or regular contributors.

nixos-discourse · 2020-10-14T03:01:34Z

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-many-people-are-paid-to-work-on-nix-nixpkgs/8307/85

rfcs/0077-stale-issues-amendment.md

nh2 · 2020-10-14T03:41:58Z

I'm not in favour.

There is no practical benefit of marking things as stale at a higher rate. The bot's main purpose is to, infrequently, ask humans whether an issue is still relevant, to prevent years-old issues lying around that are resolved or obsolete. Its purpose is not to determine whether the "conversation is stale".

We don't need a bot to tell us that a "conversation is stale". That is already observable from the silence and dates on the discussions. It is not actionable. The bot's current purpose is actionable and pragmatic. It prompts a human to triage the issue, not to get a conversation going.

There is a very concrete drawback of increasing the rate: spamming people with irrelevant messages, increasing overhead. Many of us are subscribed to thousands of issues; you can calculate how many bot emails per day this means. (This should be added to the drawbacks section.) So they had better be actionable. It is unreasonable to expect that an issue has resolved itself within the proposed 90 days; that time is shorter than NixOS's 180-day release cadence, and lots of upstream software won't even make a point release within that time.
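The "calculate how many bot emails per day" remark can be made concrete with a rough sketch. Only the 180-day and 90-day periods come from this discussion; the subscription count of 2000 and the worst-case assumption that every subscribed issue goes stale once per period are illustrative assumptions, not measured figures.

```python
def stale_emails_per_day(subscribed_issues: int, stale_period_days: int) -> float:
    """Upper-bound estimate of bot emails per day for one subscriber.

    Assumes (worst case) that every subscribed issue is inactive and
    receives exactly one stale-bot comment per stale period.
    """
    if stale_period_days <= 0:
        raise ValueError("stale_period_days must be positive")
    return subscribed_issues / stale_period_days


SUBSCRIBED = 2000  # hypothetical subscription count, not from the thread

current = stale_emails_per_day(SUBSCRIBED, 180)   # current stale period
proposed = stale_emails_per_day(SUBSCRIBED, 90)   # period named in the comment above

print(f"{current:.1f} -> {proposed:.1f} emails/day ({proposed / current:.0f}x)")
```

Under these assumptions the estimate scales inversely with the stale period, so halving the period doubles the upper bound regardless of the subscription count chosen.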

Data: As of now there are 1750 open issues marked as stale, and 450 issues marked as stale have been closed. It might make sense to accelerate the stale time if we were running out of stale issues to triage, but that isn't the case. For those nixpkgs contributors who have extra time and energy available, I recommend triaging some of the 1750, or working on one of the 2500 non-stale issues. For the contributors who are already packed with work, it would be beneficial not to drive up bot notification spam by 2x.
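As a quick sanity check on these figures, the shares can be worked out directly. This is only a sketch: it assumes the 1750 open and 450 closed issues together represent everything the bot has marked stale (issues that were un-staled again are not counted), which the thread does not state.

```python
# Figures quoted in the comment above.
open_stale = 1750      # open issues currently marked stale
closed_stale = 450     # issues marked stale that have been closed
open_non_stale = 2500  # open issues not marked stale

# Assumption: open + closed stale issues approximate all stale-marked issues.
marked_stale = open_stale + closed_stale
closed_share = closed_stale / marked_stale

open_total = open_stale + open_non_stale
stale_share_of_open = open_stale / open_total

print(f"closed after stale mark: {closed_share:.1%}")        # 20.5%
print(f"stale share of open issues: {stale_share_of_open:.1%}")  # 41.2%
```

The first ratio is roughly the "20%" response rate that comes up later in the thread.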

nh2 · 2020-10-14T04:37:36Z

Did you take into consideration NixOS/nixpkgs#100462?

I did see it linked as related in the RFC, and I just replied to that proposed change in NixOS/nixpkgs#100462 (comment), but I don't think it would change one's opinion on the RFC proposed here (which can be understood independent of the current / alternative bot text).

Could you please revise your opinion under the additional arguments of db779b9?

I don't follow the newly added conclusions:

+One might think that this increases the load of "spam" notifications on
+subscribers. However, the bot first and foremost prompts and helps the author.
+Subscribers, most of the time, do not have the highest stakes in this interaction.

I don't think this is the case. Most people are subscribers to an issue because they suffer from the same issue, and want to be notified when it's fixed. They would likely have filed it themselves, if somebody else hadn't done it yet.

+the stalebot has not prompted any action on the
+vast majority of interactions. That means, the stalebot is pretty ineffective
+(since ignored).

No, it doesn't have to mean that.

It can mean nobody is interested in the issue. Then the bot fulfilled its purpose here of pointing that out, so it's perfectly effective.
It can mean that determining whether some issue still exists is nontrivial work, and nobody has had the time yet to reconfirm it.
For example, if the issue is "NixOS freezes every 30th suspend", that can be quite some work to triage.
It can mean that the people subscribed to the bug ignored the stale bot emails because they do not have time to post on GitHub all day, and instead hope for somebody with more time to triage the issue properly.

+The most plausible root cause is that the stale bot prompted
+after an inhumanely long period of time in which the interest of the proponent
+might have shifted

That seems a lot less likely than any of the 3 reasons I gave above.

Let's keep in mind:

The key thing that moves issues forward is people writing code.

That's how stuff gets done; hard work with significant time investment. Talking and pinging rarely speeds up or resolves issues.

(Also, I should make clear, because the writing above reads harsh: I highly value any effort to make nixpkgs and its processes better. I don't think the proposed RFC is very effective at it though; I believe it would increase overheads and not increase the rate of progress.)

blaggacao · 2020-10-14T04:45:54Z

f647a35

resolved in the above commit

I don't think this is the case. Most people are subscribers to an issue because they suffer from the same issue, and want to be notified when it's fixed. They would likely have filed it themselves, if somebody else hadn't done it yet.

If a bot prompts the author, subscribers are not direct addressees of the bot. Can you propose a formulation that makes this clear? (I'll also try, since it seems to be misunderstood).

It can mean nobody is interested in the issue. Then the bot fulfilled its purpose here of pointing that out, so it's perfectly effective.

That would pretty much mean the bot is not actionable (it didn't trigger action), which is what we are trying to address in here.

It can mean that determining whether some issue still exists is nontrivial work, and nobody has had the time yet to reconfirm it.
For example, if the issue is "NixOS freezes every 30th suspend", that can be quite some work to triage.

Actionably addressed by 887785b. Forgive me the pointer. 😜

It can mean that the people subscribed to the bug ignored the stale bot emails because they do not have time to post on GitHub all day, and instead hope for somebody with more time to triage the issue properly.

If so, due respect (in my stance) requires that group to ask for it (anew) instead of silently hoping for the best. Subscribers have constituting interests. The audience is never innocent. I guess you see the point.

Under this light, I'd kindly ask you anew for #77 (comment).

Talking and pinging rarely speeds up or resolves issues.

With due respect, I consider that you are misleading and framing people here, and preempting their chance of a genuine and unbiased interpretation of this RFC.

suhr · 2020-10-14T06:35:51Z

I'm strongly against this proposal.

People with vested interest should see the stale label and be prompted to think: "Oh, I need this, too. Damn, it's stale. Let's have a look and do something to help out."

Abusing a sense of urgency might be fine in marketing, but is highly inappropriate for an open source project. Also, people who are working on an issue are not obligated to report their status weekly. This is not their daily job after all.

And of course, we don't need a bot to tell us that a "conversation is stale". The bot was introduced to help people to close issues that are no longer relevant. And in this regard, ~450 closed issues is not bad at all.

7c6f434c · 2020-10-14T07:54:26Z

If a bot prompts the author, subscribers are not direct addressees of the bot. Can you propose a formulation that makes this clear? (I'll also try, since it seems to be misunderstood).

You get notifications for new comments in threads where you were mentioned, you have commented, your review has been requested, or you have subscribed. The former three are considered «updates in threads you participate».

rfcs/0077-stale-issues-amendment.md

7c6f434c · 2020-10-14T19:54:19Z

But you are implicitly supporting the argument of this RFC: no one goes to the central library for everyday practical life.

No one goes to a library for everyday practical needs, full stop. Curated collections are people's private ones. I would not bet on people preferring their closest library to have fewer things available; it's a cost issue, not the goal. (Of course, many people don't care what their library has because they mostly use it as an interface to inter-library loans, but that makes them users of meta-libraries, often spread enough to be effectively uncurated.)

Maybe the root of the opposition, despite the pointed arguments for a change here, is still that people are too dearly attached to their overloaded interpretation of either closed or stale?

What is «too»?

People who have done and continue to do a lot of work for Nix* have debated in finest detail what workflows related to issues they prefer to use, and found a compromise that did not go too far from the preferences of the people involved.

blaggacao · 2020-10-14T20:02:37Z

The library issue is an excellent analogy to reason about this. It circles back to better issue labelling, of which the nix project could use a fair bit more through triaging (by a human). Since that is not available, this RFC seeks to improve auto-triage. It is as if you put somebody at the entrance of the library who hands a book to every visitor, saying: Please judge if this should be in the first row, second row, third row, etc. Is it for children or is it scientific? etc.

Applied to this RFC, we could transpose: "Here is a book. Have a look, and then come back in 60 days instead of in 180 days and put it in the row." This is a bad comparison in most aspects but one: the mental cost of context switching. And because of very incomplete expressiveness, common folks only have three states they can directly or indirectly influence:

closed
open
not-stale (once stale)
for PR: draft

(what a poor language) 😉 — it's like sitting in a car: you are limited to either flash light or honk or just feel go-cart-y

What is «too»?

"Too" is when a third party cannot come to the same conclusion in balancing the tradeoffs, even when trying really hard to (teleologically) take into account all the arguments. It means that a third party perceives an outcome to negatively suffer from a bias. Thanks for asking! It could have been interpreted in so many less constructive ways! Such a third party then labels the perceived imbalance with "too". The third party might have cross-checked their judgment against previous experiences in similar circumstances (which I for one have).

7c6f434c · 2020-10-14T20:11:39Z

data section interpretation. I think it really qualifies (at least to a certain extent).

You need novel and heavy. As you noticed, this argument can also prompt a reaction «thanks for collecting the data, works as designed».

Weight → if RFCs were designed to bias against agility and cement conservatism, then I agree.

RFCs are intended to provide as much agility as possible… to the inherently failure-prone process of building a consensus of non-objection to some decision.

And you are promoting overturning a hard-earned decision on an issue discussed to exhaustion. Yes, you need an argument heavy enough to plausibly shift the balance of a long negotiation.

And also, by the way, using a dictionary as an argument has negative weight in my opinion. Basically, you are getting things the wrong way round. We agreed on some behaviour, and slapped some random label on it. You can use a dictionary to argue that another label is much better. If you use a dictionary to argue from a pretty carelessly chosen label towards changing a pretty painfully negotiated behaviour, well, if it works, we have a larger problem.

blaggacao · 2020-10-14T20:23:33Z

Yes, you need an argument heavy enough to plausibly shift the balance of a long negotiation.

Agreed. So we can basically say:

This is actually a good idea.
Or it isn't.
Anyway, do we want to discuss this again?
Maybe not.
Let's not burn our fingers.

If this is the prevailing stance (in my words), how can we bring about coordinated change, where some parts are seemingly unweighty but in combination unfold their power towards greater goals? We cannot bring about changes in small steps, since they might not be heavy enough, and we cannot prove the overarching goal is heavy enough, since we cannot make the small steps. So how do we escape this evolutionary deadlock? (If I was advocating for it, which I am.)

And also, by the way, using a dictionary as an argument has negative weight in my opinion. Basically, you are getting things the wrong way round. We agreed on some behaviour, and slapped some random label on it. You can use a dictionary to argue that another label is much better. If you use a dictionary to argue from a pretty carelessly chosen label towards changing a pretty painfully negotiated behaviour, well, if it works, we have a larger problem.

I think this is another input for amendments to this RFC. Thank you! But I would add:

We agreed on some random behaviour, and slapped some random label on it.

I'm repeating myself, but this RFC argues for a change in behaviour on the basis of a (perceived) novelty. Clearly framing the semantics is a way to ease people into the arguments. As you see, most people are opposed because of contextual information (or rather interpretation) that is private to the nix ecosystem. This very interpretation has its own issue. We have a real problem when:

stale means invalid
closed means shut up
What does 2: status: invalid mean again? → stale?

You see the point. I'm not sure what this is called, but it is codified language (the opposite of an open and welcoming environment).

7c6f434c · 2020-10-14T20:24:28Z

Offtopic

> Please judge if this should be in the first row, second row, third row, etc.

The absolute majority of entries in a work-oriented library are sorted by very very rough subdivision, then alphabetically. I guess if you only have fiction you can afford more extravagant approaches, but I hope no issues in our tracker are pure works of fiction.

It means that a third party perceives an outcome to negatively suffer from a bias.

See, trade-offs are trade-offs between preferences of different people. The arguments serve to explain the preferences and be able to negotiate about them. If some change makes my workflow inconvenient or annoying or whatever, it is both subjective (specific to me), but also absolutely impossible to fix by an argument that an alternative workflow I have tried and rejected is better. Everyone is an absolute authority on their preference among things they have already tried. And all of them are right, even though they do not agree. It is not bias, it is the actual hard work of negotiating trade-offs between values of different people.

blaggacao · 2020-10-14T20:27:52Z

Spot on! Thanks!

Except for:

workflow I have tried

A shorter time to first interaction has never been tried on the nix repositories, as far as I know. So that raises the question: are all arguments about an increase in unwanted notifications hypothetical? (I'm genuinely wondering and want to find out.)

It would be very unsatisfying if such thin grounds were an argument supporting the lack of heaviness of this proposal.

off-topic

I'm a new breed. I get it. And if this is perceived basis enough for opposition, I can swallow it, too. 😉 As most have seen, this is not my first RFC, and I think I'm of best service if I don't keep it my last, every time I spot a noteworthy opportunity for improvement. 👍

7c6f434c · 2020-10-14T20:39:51Z

So how do we escape this evolutionary deadlock?

Step one: find someone who has significant experience and also thinks it is a deadlock, and not a stable consensus with active users of whatever is being discussed mostly content. Failure might mean that this «deadlock» serves a purpose and fulfills it.

I'm repeating myself, but this RFC argues for a change in behaviour on the basis of a (perceived) novelty.

Nope, no novel arguments you provide are convincing.

This very interpretation has its own issue.

Connotations are a thing. Yes, there are people who prefer «closed» to mean «no action useful». Yes, it is a valid and reasonable preference. No, it is not the only one.

We agreed on some random behaviour, and slapped some random label on it.

No, you do not understand what was happening, which is why your proposal will probably fail.

The behaviour was discussed, negotiated, carefully debated. Then the name… well, nobody cared enough to discuss it. This is the crucial difference.

A shorter time to first interaction has never been tried, as far as I know on the nix repositories.

Well, if stale-bot is the first interaction, then the issue is doomed for the time being. Might make sense to check if it got fixed by some refactoring in the next release, 180 days later.

But stale-bot is not a complete workflow. It is an influence on various workflows people have.

are all arguments about an increase in unwanted notifications hypothetical

We know what the results of using stale-bot are. Changing a parameter with pretty linear effects will lead to pretty predictable outcomes.

blaggacao · 2020-10-14T20:47:49Z

Why do other communities come to different conclusions? And why do they have actionable response rates to stale bots superior to our 20%? (OK, I just made that up. But from my interactions with Kubernetes, it is at least my perception.) Thin grounds, sorry.

If this claim proved to be true, does it mean that other communities make more efficient use of their resources?

Does it then mean we have a surplus of resources so we can waste them by not fine tuning our workflows to the purported data?

7c6f434c · 2020-10-14T21:04:00Z

Why do other communities come to different conclusions? And why do they have actionable response rates to stale bots superior to our 20%?

Having a well-defined scope distinct from «package like half the global open source ecosystem and make it work together on a new technical foundation» helps.

Really, Nixpkgs has an atypical pattern of relations between parts, and also has a relatively rare problem of «no, it is not a cheap action just to check if everything at least builds», and our domain experts are experts in areas too far apart… A lot of best practices are hard to translate.

Does it then mean we have a surplus of resources so we can waste them by not fine tuning our workflows to the purported data?

Can we afford wasting the quality issues? Yes, in the sense that our bottlenecks are elsewhere.

blaggacao · 2020-10-14T21:15:11Z

To conclude, the project might want to improve its strategies and tactics to harness the resources (and wisdom) of the crowd in purposeful ways. This is an overarching goal, which in itself is very significant.

This RFC is trying to fine tune workflows towards this goal, but in itself is controversial (thanks to RFC51) and by extension lacks the criterion of isolated significance.

Since things are starting to move constructively in the direction of the above overarching goal, the best conclusion would probably be to leave this RFC around to allow people to reshape their opinions (or reinforce them either way).

Example motions of better harnessing the resources of the multitude:

The concept of flakes
Some of this year's nixcon talks
Adopting nix for dev shell environments (eg numtide/devshell) — ¿Entice the developer!
nixpkgs-review
In general: improved tooling
Hopefully: better labelling (maybe subject of another RFC)
nix 3.0
Ideas of extraction of ./lib
nur / flakes registry
...

@7c6f434c Thank you for the discussion! 👍

7c6f434c · 2020-10-14T21:20:36Z

It might be that in a couple of days we find some opinions different in an interesting way (I mean, so far it is roughly ten opposing people, there are way more people interested in tooling)

…l contribs

blaggacao · 2020-10-14T22:49:37Z

@suhr

Abusing a sense of urgency might be fine in marketing, but is highly inappropriate for an open source project. Also, people who are working on an issue are not obligated to report their status weekly. This is not their daily job after all.

This RFC does not intend to abuse a sense of urgency. On the contrary, it intends to counteract the evil force of fading memories and interests: a highly relevant force in open source projects. You can also look at it by appreciating the social (and also evolutionary) value of forgetting, if you are into sociology or psychology.
This RFC does not propose to prompt anybody into reporting their status. It is not obvious how you arrived at this interpretation. For clarity on why it is not obvious: this RFC is very specific about the meaning of stale (referring to Merriam-Webster). So if somebody does not want to report their status, then the definition of stale is still valid. If an issue is effectively impaired in vigor and effectiveness, responding with the sole goal to un-stale can be considered a bad practice and a fight against an illusion that non-staleness increases any odds (as people have pointed out, the label is not relevant; people writing code is).

And of course, we don't need a bot to tell us that a "conversation is stale". The bot was introduced to help people to close issues that are no longer relevant. And in this regard, ~450 closed issues is not bad at all.

I might need to correct you: the bot was introduced by the following motivation...

We have a large number of open issues that have accumulated over the years. Not all of them are still valid and need our attention.
By marking stale issues, we can more easily filter issues for ones that have at least one person interested in them.
There is no clear notion of relevance (as you claim) to be found within the accepted version of the RFC. The most specific part contents itself with the technical definition of a stale issue / PR. Interpretation is left to the reader.
450 is an absolute number; it does little to advance the discussion when not put into perspective. I investigated the missing data, and we now have a more complete picture as a basis for appreciation: 8d2e818

suhr · 2020-10-15T06:55:24Z

On the contrary, it intends to counteract the evil force of fading memories and interests: a highly relevant force in open source projects.

But interests are largely driven by need. And when you need something, you usually don't forget about it.

I'm quite skeptical about encouraging people's actions with bots. It works for answering “yeah, I don't have this problem anymore”. Beyond that, this is not working, it just annoys people.

this RFC is very specific about the meaning of stale (referring to Merriam Webster)

I must say, this is a less common definition of “stale”. The more common are:

tasteless or unpalatable from age
impaired in legal force or effect by reason of being allowed to rest without timely use, action, or demand

There is no clear notion of relevance (as you claim) to be found within the accepted version of the RFC.

https://dictionary.cambridge.org/dictionary/english/relevant

connected with what is happening or being discussed
correct or suitable for a particular purpose

Anyway, let's avoid dictionary debates.

blaggacao · 2020-10-16T14:30:00Z

But interests are largely driven by need. And when you need something, you usually don't forget about it.

Exactly! Hence, if you forget about something and leave it around stale, you have just created a negative marginal external effect on somebody else that compounds quite heavily with a database of 4.4k open issues. Some of the arguments expressed against this RFC do actually reflect quite pointedly this compound negative effect ("spam", "annoying", "message burst").

It's safe to say that some people use the negative effects of not doing something as arguments against trying to do something about those negative effects. That's partly a logical fallacy, if one agrees that attention is a scarce resource, and a runaway self-perpetuating inefficiency (which the nixos ecosystem should not afford!).

Remember, by the data there is still a 58% (63%) potential for spring cleanup to be lifted. Sure, we wouldn't get this down to 0%, but getting it down to only 20% would mean more than 1k more issues properly archived (as with our librarian) instead of jamming the aisles.

I'm quite skeptical about encouraging people's actions with bots. It works for answering “yeah, I don't have this problem anymore”. Beyond that, this is not working, it just annoys people.

Well, I think the evidence may speak against this: by my analysis in this RFC it "worked" for roughly 42% of interactions after a loooong period (problem of fading interest). The whole point of this RFC is to lower time to first interaction so that it actually gets less annoying & more relevant (as in "connected with what is happening or being discussed" — since concurrent with interests).

Apart from leaving a non-incremental comment with the sole purpose of un-staling an issue, which is a bad habit, as we established:

Closing an issue is relevant (as in "correct or suitable for a particular purpose"; note that authors and maintainers can close): if I'm no longer interested in old issues, I regularly close them, even if they still persist, since otherwise I effectively block others from ownership.
In my own experience, an incremental comment at a crucial moment, when the context is still somewhat in memory with involved parties (not 180 days later), triggers in a fair share of cases a chain of actions that ultimately results in resolution with reasonable probability. (People have argued that open source cannot expect professional commitment from everybody, and that's exactly the point here.)

We also need to appreciate this RFC in the context of revising our label system with the goal of making it more effective. But that's a thing for another RFC.

blaggacao · 2020-10-26T18:03:24Z

As a first collateral of this motion, there now is:

NixOS/nixpkgs#101320

It makes mentally parsing the stalebot's action a matter of milliseconds.

In a few months, we can look at the statistics again to see whether actionability has improved, and consider whether the further action proposed in this RFC can be regarded as a non-zero-sum game (and even less a negative-sum game, as some people even claimed).

blaggacao · 2020-10-26T18:06:58Z

@Mic92 Is it possible to label this RFC dormant due to "further data gathering"?

lheckemann · 2020-11-05T14:19:45Z

Like #74, would you like to close this until you're ready to proceed?

7c6f434c · 2020-11-06T08:55:37Z

Re: original topic: just have seen a comment today that stale bot existing is silly…

blaggacao · 2020-11-06T15:29:12Z

@lheckemann Fully agree, if that's the process (which it seems to be).

blaggacao changed the title ~~[RFC 0000] Stale Issues Amendement~~ [RFC 0077] Stale Issues Amendement Oct 14, 2020

blaggacao force-pushed the da-stale-period-amendemend branch 3 times, most recently from c49e0da to ba63579 Compare October 14, 2020 02:49

[RFC 0077] Stale Issues Amendement

0f0b8c9

blaggacao force-pushed the da-stale-period-amendemend branch from ba63579 to 0f0b8c9 Compare October 14, 2020 02:53

Emantor reviewed Oct 14, 2020

rfcs/0077-stale-issues-amendment.md (outdated)

rfcs/0077-stale-issues-amendment.md (outdated)

David Arnold added 2 commits October 13, 2020 22:24

typose thanks @Emantor

ba0c1f6

Declutter my notoriously bad writing style

2c0731d

Typos

3d873d4