Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

All exercise test files should reference canonical data version #784

Closed
N-Parsons opened this issue Oct 9, 2017 · 14 comments
Closed

All exercise test files should reference canonical data version #784

N-Parsons opened this issue Oct 9, 2017 · 14 comments

Comments

@N-Parsons
Copy link
Contributor

N-Parsons commented Oct 9, 2017

Currently some of the test files reference x-common//canonical-data.json with a version number, and a few test files lack any such a reference. Additionally, x-common was recently renamed to problem-specifications, so references to x-common are inherently outdated and potentially confusing.

Having a consistent version string should help with maintenance, since a script can fairly easily be written to compare the version numbers in the test file and the canonical data, and manual checks for how up to date it is are similarly easy. It also makes it easier for exercism users to find and check the canonical data, and thus to be able to contribute any changes they think are necessary if they identify problems or additional tests that would be useful.

Suggested wording

The currently adopted form for the version string seems to be:
# test cases adapted from `x-common//canonical-data.json` @ version: 1.0.1

To conform to PEP8, we should aim to limit the string to 79 characters, so I propose the following wording (75 chars):
# Tests adapted from `problem-specifications//canonical-data.json` @ v2.0.0

Does anyone have any alternatives or suggestions for improvements to this wording?

How to proceed

There are a couple of options for how to go about updating the test files:

  1. Only update version strings when the tests are next updated, and update tests when the canonical data change.
  2. Systematically update each test file to match the most recent canonical data version (or verify that it's already up-to-date), and update the version strings in the process.

Option 1 is obviously the less work-intensive one, but it would leave some inconsistency for quite a while. Option 2 is more work-intensive and should get everything up-to-date quicker, but it might burn people out a bit.

@ilya-khadykin
Copy link
Contributor

ilya-khadykin commented Oct 9, 2017

Thanks for raising the issue, it's an important one.
I think the second option is a way to go and it should be automated. Unfortunately, I have a lot of going on right now, so I can't do it right now

The other thing is that some tests can be trak specific, so it can be also stated in the comment giving the reader a sense of the reason for such decision

@N-Parsons
Copy link
Contributor Author

I had some free time this evening, so I put together a fairly rough script for checking the state of version numbering. I put the results into a gist to save making a very long post here, and I have summarised the results below.

Up-to-date: 33
Out-of-date: 8
Unknown: 31 (no version tag)
Unimplemented: 27
No canonical data: 14
No data needed: 5 (maybe more - requires manual checks)

@ilya-khadykin
Copy link
Contributor

@N-Parsons wow, that's awesome!
The next step will be a creation of issues for required exercises and linking them with this issue

@N-Parsons
Copy link
Contributor Author

I realised that some deprecated exercises were included, so I added a check for that, as well as checking whether the version string references x-common.

Script: gist (rev 2)
Results: gist (rev 4)

Summary:

Up-to-date: 0
Up-to-date, but x-common: 33
Out-of-date: 8
Unknown: 26 (no version tag)
Unimplemented: 28
No canonical data: 8 (could be submitted to problem-specifications)
No data needed: 5
Deprecated: 11

@ilya-khadykin
Copy link
Contributor

@N-Parsons how do you think we should proceed, for which exercises we should open issues?

@N-Parsons
Copy link
Contributor Author

I've written a script to automate the update for the 33 exercises that reference x-common, so I'm ready to submit those PRs whenever you want - do you want issues to exist for them first? The script is set-up to create separate PRs for each, since I presume that this is preferable?

The "Out-of-date" and "Unknown" categories will need a more manual approach, so I think issues should be created for these. I'm happy to create the issues if you want, but I don't think that I can label them? (I'm not sure what labels you would want, either)

@ilya-khadykin
Copy link
Contributor

@N-Parsons that is awesome.
I think that you can create a single PR for all 33 exercises that reference x-common.
But for "Out-of-date" and "Unknown" seperate issues should be created for people to check them. I think I'll add hacktoberfest and beginner-friendly labels

Will it work for you?

@N-Parsons
Copy link
Contributor Author

N-Parsons commented Oct 14, 2017

@m-a-ge, that's fine - I'll tweak my script and do the pull request now. Do you want separate commits within the PR? Or just one big commit?

@ilya-khadykin
Copy link
Contributor

@N-Parsons I guess one commit message would be enough, since it solves one issue

@N-Parsons
Copy link
Contributor Author

@m-a-ge, all of the issues should now be created and correctly labelled :)

@ilya-khadykin
Copy link
Contributor

@N-Parsons, thanks!

@sjwarner-bp
Copy link
Contributor

sjwarner-bp commented Nov 13, 2017

I implemented the canonical data in problem-specifications for the protein-translation problem recently, and as such the Python test suite is out of date.

I'm about to open an issue that should be consistent with @N-Parsons similar issues. #1103

@cmccandless
Copy link
Contributor

cmccandless commented Feb 15, 2018

Updated lists:

Do not reference existing canonical-data:

No canonical data:

  • bank-account
  • dot-dsl
  • error-handling
  • ledger
  • linked-list
  • pythagorean-triplet
  • robot-name
  • simple-linked-list
  • tree-building

This list will no longer be kept up to date. Script bin/check_test_version.py can be used to determine which exercises are out-of-date.

@cmccandless
Copy link
Contributor

Closing this issue as all issues for which canonical data exists contain a reference, and a script has been put in use for detecting new canonical data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants