Tomeulnv edited this page Sep 16, 2021 · 5 revisions

We manage our work using Github Issues (think of them as the tasks defined in your Scrum sprints). This allows us to organize and track what is happening on a project, and also provides a durable, replicable record of the work for future reference.

Every issue has a reporter (the person who created it) and should have one or more assignees (the person(s) who will execute it). The reporter chooses the assignee when the issue is created.

Scope

An issue is a discrete, well-defined unit of work on a project. Normally this means that an issue should not be more than a couple of weeks' worth of work, and should not be open for more than a month or two.

Issues are too broad if they become open-ended and end up mixing multiple work threads. "Write the follow up paper" or "Do the analysis" are usually not good issues (unless the project is very small). Sometimes an issue that started with manageable scope will grow as the project expands or new questions arise. At this point it is a good idea to carve off subparts into separate issues and close the original issue with an interim summary.

Issues should not be opened until the assignee is ready to work on them actively (or will be soon). To-do items that we plan to work on in the future should be placed on a project outline in the repository's Github wiki.

If work stops on an issue for any reason and is not expected to resume soon, the issue should be closed with an interim summary. If we plan to continue work later this can be noted on the project outline in the repository's Github wiki along with a link to the original issue.

Title

The issue title should be descriptive enough that somebody looking back at it later can understand the purpose of the issue and how it fits into the larger context. It should use the imperative mood and should not end in a period (e.g., "Revise main figure", not "main figure."). This post by Chris Beams has an excellent discussion of what makes a good Git commit message; the same principles apply to issue titles.

Good titles:

  • Revise abstract
  • Add 2016 data to main robustness figure
  • Run bootstrap for IV regressions

Bad titles:

  • abstract
  • Robustness
  • Incorrect inputs causing error.
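
The mechanical parts of these rules can be checked automatically. The sketch below is only a heuristic and not part of our tooling; the function name `title_problems` is hypothetical, and the capitalization and single-word checks are assumptions inferred from the examples above. It cannot verify imperative mood, which still needs a human reader.

```python
def title_problems(title: str) -> list[str]:
    """Return heuristic problems with an issue title (empty list if none found)."""
    stripped = title.strip()
    if not stripped:
        return ["title is empty"]
    problems = []
    if stripped.endswith("."):
        problems.append("ends in a period")
    # Assumption: good titles start with a capital letter, as in the examples.
    if not stripped[0].isupper():
        problems.append("does not start with a capital letter")
    # Assumption: a one-word title cannot name both an action and its object.
    if len(stripped.split()) < 2:
        problems.append("is a single word")
    return problems


for title in ["Revise abstract", "abstract", "Incorrect inputs causing error."]:
    print(title, "->", title_problems(title) or "no problems found")
```

A title that passes these checks can still be a bad title; the checks only catch the most common slips.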

Description

The issue description should state the goals of the issue clearly. Like the title, it should usually be written in the imperative mood. The description should be precise enough that a third party can judge whether the issue was completed or not. It should include enough explanation and context that someone who is not intimately familiar with the other work going on at that moment can understand it clearly -- remember that we will often be returning to these issues many months or even years later and trying to understand what was going on. If an issue relates directly to one or more other issues, this should be stated in the description with a link to the other issue(s) (e.g., "Follow-up to #5").

The same principles regarding hyperlinks, @-references, etc. in the discussion of comments below apply to issue descriptions as well.

Good description:

Following #22, re-run the analysis on the Sherlock cluster to see if that improves performance.
* Run a minimal version of the base model on the Sherlock node `alpha`
* Test the subsampling procedure on the `alpha` node
* Run a minimal version of the base model on Sherlock's actual computing nodes
* Test the subsampling procedure on Sherlock's actual computing nodes

Document necessary code changes to implement politext code on Sherlock and potential bottlenecks.
In the long term we want to migrate politext computing to Sherlock and this issue is a first step toward that.

Bad description:

Redoing everything on Sherlock including the subsampling. Remember we want alpha not only the regular nodes.

Comments

Comments in Github issue threads are the main way we communicate about our work.

You can add comments to the thread in a browser or by replying to a notification email about the issue. When commenting by email reply, remember to delete the quoted text of the email thread below your actual reply. Otherwise, this will add duplicate text to the comment thread and make it hard to read.

You (the assignee) should post comments regularly summarizing progress. The comment threads are your real estate and you are free to include updates as often as you find useful. Preliminary output, "notes to self," etc. are fine. No issue should be left for more than two weeks without a comment updating the status, even if the comment only says: "Have not done any work on this issue in the last week".
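
The two-week rule is simple to check once you have the timestamp of the most recent comment. This is a minimal sketch with hypothetical names (`MAX_SILENCE`, `needs_status_update`); how the timestamp is obtained is left out on purpose.

```python
from datetime import datetime, timedelta, timezone

MAX_SILENCE = timedelta(weeks=2)  # the two-week limit stated above


def needs_status_update(last_comment_at: datetime, now: datetime) -> bool:
    """Return True if more than two weeks have passed since the last comment."""
    return now - last_comment_at > MAX_SILENCE


now = datetime(2021, 9, 16, tzinfo=timezone.utc)
print(needs_status_update(datetime(2021, 9, 10, tzinfo=timezone.utc), now))  # False
print(needs_status_update(datetime(2021, 8, 20, tzinfo=timezone.utc), now))  # True
```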

If you have a question that requires input or attention from another lab member, you should write a comment, including an '@' reference, that makes clear exactly what input is needed. E.g., '@hannesdatta, Where would you like me to store the data files?' Users should keep email notifications for '@' references turned on. Anyone who is not the assignee of an issue will assume by default that comments not @-referencing them do not require their attention.

It is up to you to judge the optimal time to request feedback from the reporter (or PIs on the project, etc.). You should usually not send results until you have spent some time making sure they are correct and make sense. When you do request feedback, you should provide a clear and concise summary of the situation and state exactly what input you need. At the same time, you should not feel shy about requesting feedback when you are confident it will be efficient and valuable.

If you have an important interaction about an issue outside of GitHub -- in person, over video chat, etc. -- add a comment to briefly summarize the content of that interaction and the conclusions reached.

Issues are referenced by their Github issue number (e.g., "#5") when it is clear from context what repository the issue is in, or by the name of the repository plus the issue number (e.g., "news_trends #5") when it is not. Any reference to a Github issue in a comment thread, email, etc. should be hyperlinked to the issue itself. Note that Github does this automatically if you type "#" followed by a number in a Github issue thread.
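
Outside of GitHub issue threads (e.g., in email), the hyperlink has to be built by hand. The sketch below turns both reference styles into an issue URL. The function name `issue_url` and the owner and default repository values are hypothetical placeholders; only the `https://github.com/OWNER/REPO/issues/NUMBER` URL layout comes from GitHub itself.

```python
import re

# Matches "#5" or "news_trends #5": an optional repository name, then "#" and a number.
ISSUE_REF = re.compile(r"^(?:(?P<repo>[\w.-]+)\s+)?#(?P<number>\d+)$")


def issue_url(reference: str, owner: str, default_repo: str) -> str:
    """Build the GitHub URL for a reference such as "#5" or "news_trends #5"."""
    match = ISSUE_REF.match(reference.strip())
    if match is None:
        raise ValueError(f"not an issue reference: {reference!r}")
    # Fall back to the current repository when the reference names none.
    repo = match.group("repo") or default_repo
    number = match.group("number")
    return f"https://github.com/{owner}/{repo}/issues/{number}"


# "example-lab" and "my_project" are placeholders, not real names.
print(issue_url("#5", owner="example-lab", default_repo="my_project"))
print(issue_url("news_trends #5", owner="example-lab", default_repo="my_project"))
```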

Any reference to a file, directory, paper, or webpage should be hyperlinked to a permanent URL. This page has instructions for getting permanent URLs for files in Github repositories. Links to Dropbox files and directories can be copied from the web or desktop client.

Deliverable

Every issue must conclude with a reproducible deliverable.

It is up to you (the assignee) to judge when the objectives in the task description plus any issues that have come up in the comment stream have been resolved. As a rule, you do not need to request confirmation of this from the reporter (or PIs on the project).

Each task should have a final deliverable containing all relevant results. The form of the deliverable may be any combination of:

  • Content added to the draft, slides, etc. in the repository
  • A PDF or markdown file
  • A summary in the final comment in the issue thread

The deliverable must be self-contained. It should usually begin with a concise summary of the task goal and the conclusions (e.g., answer to an empirical question), followed by supporting detail. A user should be able to learn all relevant results from the deliverable without looking back at the comment thread or task description.

By default, we produce PDF / markdown deliverables inside the repository following the same rules we use to produce papers, slides, etc. Code and documents that are specific to the issue and that we will not want to carry forward in the repository after the issue is complete can be created in the issue branch and then deleted before merging back to master. Such files are normally placed in a separate subdirectory called /issue/ at the top level of the repository. Permanent links to deliverables in the /issue/ subdirectory will continue to work even after the directory is deleted.

The deliverable must contain enough information that another user could replicate its results. For figures, tables, or other results produced by code, a user should be able to identify the relevant code and reproduce the output. This will usually be automatic when the output is produced inside the repository. For output produced by hand (e.g., literature reviews, manual calculations) the deliverable should include enough information about the steps performed that a user could have a decent shot at repeating them.

Closing an Issue

When an issue is complete, you should post a final summary comment and then close it. All closed issues must have one and only one summary comment. If work done after the issue is closed (e.g., during peer review) requires changes to the summary, you should edit the summary comment in place rather than creating a new one.

The final comment should begin with "Summary" on the first line (usually in bold or title font). It must also include a brief (usually at most a couple of paragraphs) recap of what was accomplished in the issue. It must include a revision-stable pointer to the deliverable -- usually a link along with additional information if needed (e.g., relevant page / table / figure numbers in the draft).
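
The "one and only one summary comment" rule and the "Summary" first line can be checked before closing. This is a sketch under assumptions: the names `is_summary_comment` and `has_exactly_one_summary` are hypothetical, and "bold or title font" is taken to mean Markdown bold (`**Summary**`) or a Markdown heading (`## Summary`). It does not check the recap or the pointer to the deliverable.

```python
import re

# First line is "Summary", optionally wrapped in Markdown bold or heading markers.
SUMMARY_FIRST_LINE = re.compile(r"^\s*(?:#+\s*)?(?:\*\*|__)?Summary(?:\*\*|__)?:?\s*$")


def is_summary_comment(body: str) -> bool:
    """Return True if the first line of the comment is a "Summary" header."""
    lines = body.strip().splitlines()
    return bool(lines) and SUMMARY_FIRST_LINE.match(lines[0]) is not None


def has_exactly_one_summary(comment_bodies: list[str]) -> bool:
    """Return True if exactly one comment in the thread is a summary comment."""
    return sum(is_summary_comment(body) for body in comment_bodies) == 1


thread = [
    "Preliminary output attached.",
    "**Summary**\n\nRe-ran the base model; see the linked deliverable.",
]
print(has_exactly_one_summary(thread))  # True
```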

At this point you will also normally open a pull request to peer review the issue and merge the issue branch back to master.

Prioritizing work

By default, peer review takes priority over all open tasks, and open tasks created earlier take priority over tasks created later. In some cases we may give explicit instructions that override these defaults. If you are ever unsure about prioritization you should ask.

All such rules are only guidelines. Using your time productively takes precedence over the priority ordering of tasks.
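
The default ordering can be written as a two-part sort key: peer review first, then creation date. The `Task` class and `default_order` function below are hypothetical and only illustrate the default; explicit instructions and your own judgment still override it.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Task:
    title: str
    created: date
    is_peer_review: bool = False


def default_order(tasks: list[Task]) -> list[Task]:
    """Sort peer review first, then older tasks before newer ones."""
    # False sorts before True, so "not is_peer_review" puts reviews at the front.
    return sorted(tasks, key=lambda t: (not t.is_peer_review, t.created))


tasks = [
    Task("Run bootstrap for IV regressions", date(2021, 9, 1)),
    Task("Revise abstract", date(2021, 8, 1)),
    Task("Review pull request", date(2021, 9, 10), is_peer_review=True),
]
for task in default_order(tasks):
    print(task.title)
```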
