Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect handling of non-numeric prerelease identifiers with leading zeroes #45

Closed
Nemo157 opened this issue Apr 27, 2020 · 15 comments
Closed
Assignees

Comments

@Nemo157
Copy link

Nemo157 commented Apr 27, 2020

# SELECT '1.0.0-alpha.0a'::semver;
ERROR:  bad semver value '1.0.0-alpha.0a': semver version numbers can't start with 0

Only numeric identifiers of the prerelease version are forbidden from including leading zeroes. This should parse correctly as a version with base 1.0.0 and prerelease data [Alphanumeric("alpha"), Alphanumeric("0a")].

@Nemo157
Copy link
Author

Nemo157 commented Apr 27, 2020

Relatedly I'm not sure whether 1.0.0-alpha.01 is valid or not, I've opened semver/semver#563 to try and clarify.

@theory
Copy link
Owner

theory commented May 2, 2020

Reminds me of #38. What say you, @maspalio?

@jwdonahue
Copy link

jwdonahue commented May 3, 2020

Please see my comment on this in #38.

In addition:

Valid:
-alpha.0a
-0AEF

Not valid:
-alpha.010
-02799

Note that there is no provision in the spec for hex values in prerelease numeric fields. Because metadata is not included in the precedence order, their are no restrictions other than the allowed subset of ASCII characters.

@theory
Copy link
Owner

theory commented May 5, 2020

Thanks @jwdonahue. Confirm that these tests fail on 1.0.0-alpha.0a, which should be valid, and 1.0.0-02799, which should not be valid:

--- a/test/sql/base.sql
+++ b/test/sql/base.sql
@@ -4,7 +4,7 @@ BEGIN;
 \i test/pgtap-core.sql
 \i sql/semver.sql
 
-SELECT plan(299);
+SELECT plan(303);
 --SELECT * FROM no_plan();
 
 SELECT has_type('semver');
@@ -25,7 +25,9 @@ SELECT lives_ok(
     '1.0.0-1',
     '1.0.0-alpha+d34dm34t',
     '1.0.0+d34dm34t',
-    '20110204.0.0'
+    '20110204.0.0',
+    '1.0.0-alpha.0a',
+    '1.0.0-0AEF'
 ]) AS v;
 
 SELECT throws_ok(
@@ -43,7 +45,9 @@ SELECT throws_ok(
     '1.4b.0',
     '1v',
     '1v.2.2v',
-   '1.2.4b.5'
+    '1.2.4b.5',
+    '1.0.0-alpha.010',
+    '1.0.0-02799'
 ]) AS v;
 
 -- Test =, <=, and >=.

@jwdonahue
Copy link

Just be careful not to create orphaned versions with your corrected code. I made that mistake once with an internal tool that got adopted about about 10K users in the first few months it was out. I would flag them with a warning when used and maybe provide a configuration flag to determine whether the user needs to fall-back to the broken behavior for existing data, but disallow creation of additional invalid strings.

@theory
Copy link
Owner

theory commented May 6, 2020

Yeah, I'll look through the PGXN database, but don't know what to do for other users of the extension other than call it out loudly in the changes on release.

@theory
Copy link
Owner

theory commented May 6, 2020

Fortunately, all the prerelease versions on PGXN have boring SemVer 1.0-style strings:

   name    │    version     
───────────┼────────────────
 madlib    │ 0.3.0-alpha1
 madlib    │ 0.5.0-release1
 madlib    │ 0.6.0-release1
 multicorn │ 1.0.0-beta1
 pgmp      │ 1.0.0-b2
 pgmp      │ 1.0.0-b3
 pg_repack │ 1.1.8-alpha1
 pg_repack │ 1.1.8-beta1
 pg_repack │ 1.1.8-beta2
 pg_repack │ 1.2.0-beta1
 pg_repack │ 1.3.0-beta1
 pgvihash  │ 1.0.0-dirty
 plv8      │ 1.1.0-beta1
 plv8      │ 1.5.0-dev1
 tds_fdw   │ 2.0.0-alpha1
 variant   │ 1.0.0-beta
 variant   │ 1.0.0-beta2
 variant   │ 1.0.0-beta3

@maspalio
Copy link
Collaborator

maspalio commented May 6, 2020

I second @jwdonahue. Been there, done that.
My postgres-fu is not what it used to be but is there any way of having some kind of parameter to trigger the old-yet-wrong behavior?

@theory
Copy link
Owner

theory commented May 10, 2020

Okay, I've pushed a fix. The issue where it previously allowed 1.0.0-02799 was because the code was failing to check the very first part of a prerelease or metadata part. 1397e49 fixes both issues, but we need to think about how to handle the break in support for numeric-only prereleases with leading 0s. Do we need to provide a conversion function for folks to run? Release it and see if anyone complains? Thoughts?

@jwdonahue
Copy link

@theory, to be clear, the leading zero prohibition in numeric fields only applies to prerelease tags. Because the meta tag is not used for sorting, it does not have the distinction between numeric and alphanumeric fields.

@jwdonahue
Copy link

Also, there are many points in this documentation that are wrong. I can file bugs if you'd like, or wait for you take another pass at updating it with your current knowledge.

One thing I would point out is that SemVer(X.Y.Z) != NonSemVer(X.Y.Z), so 1.2.0 != 1.2.00, because the later is not a semver string. You may want to have strict and nonstrict modes for this sort of thing, but a non-conforming string is less likely to carry the same semantics as a conforming string. I will be pushing hard for any SemVer 3.0.0 to also include a VersionMeta tag, to help clear up these ambiguities.

@theory
Copy link
Owner

theory commented May 10, 2020

Verified that leading 0s are allowed for numeric metadata values in 1993013.

@theory
Copy link
Owner

theory commented May 10, 2020

As of right now:

david=# select '1.2.3'::semver = '1.2.3';
 ?column? 
----------
 t
(1 row)

Time: 0.660 ms
david=# select '1.2.3'::semver = '1.2.30';
 ?column? 
----------
 f
(1 row)

If we wanted to eliminate the first case, we'd have to remove an implicit cast. Which I can see both ways, frankly. What happens is it implicitly casts the lhs value to a SemVer before making the comparison, so it will only work for a valid SemVer.

I'll make a pass through the docs; I don't think we updated them when we added support for SemVer 2.0.

@theory
Copy link
Owner

theory commented May 10, 2020

I've made docs changes in #51. Let me know what I missed!

@theory theory closed this as completed in 1397e49 May 14, 2020
theory added a commit that referenced this issue May 14, 2020
In response to discussion on #45, properly prevent leading `0`s in prerelease parts only when there are no alphanumeric characters. Also fix an issue where it would ignore the first leading `0` in a prerelease no matter what else was in the part.
@theory
Copy link
Owner

theory commented May 15, 2020

Thoughts on the documentation section I added warning folks about upgrading?

pg-semver/doc/semver.mmd

Lines 72 to 98 in 97a7dfb

🚨 v0.23.0 Upgrade Compatibility Warning 🚨
-------------------------------------------
Prior to v0.23.0, the semver extension incorrectly allowed some invalid
prerelease and build metadata values. Details below, but *BEFORE YOU UPGRADE*
from an earlier version, we strongly recommend that you check for and repair any
invalid semvers. You can find them using the official
[SemVer regular expression](https://regex101.com/r/vkijKf/1/) like so
(replace `name`, `version`, and `packages` as appropriate for your database):
``` sql
SELECT name, version FROM packages
WHERE version::text !~ '^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$';
```
If no rows are returned, you should be good to go. If there are results, here are
Examples of invalid semantic versions and how they should be repaired.
```
Invalid Valid SemVer
----------- ----------------------------
1.0.0-02799 -> 1.0.0-2799
1.0.0-0.02 -> 1.0.0-0.2
1.0.0-.20 -> 1.0.0-0.20
1.0.0+0+20 -> 1.0.0+0-20 or 1.0.0+0.20
1.0.0+.af -> 1.0.0+0.af or 1.0.0+af
```

There's a variant of it in the Change file, too. I think for release I'll increment to v0.30.0 to denote the significance of the change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants