SWIFT-1126 / SWIFT-1137 Add legacy extended JSON parsing, fix Date-related crashes #64

Merged
merged 6 commits into from
Mar 23, 2021

patrickfreed
Contributor

SWIFT-1126 / SWIFT-1137

This PR adds support for parsing legacy extended JSON (v1), which the old libbson-based BSON library supported. It also fixes potential crashes that may occur when using Date values extremely far in the future (> Int64.max ms from epoch) or in the past (< Int64.min ms before epoch).
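
To illustrate the Date part of the fix, here is a minimal sketch, not the code merged in this PR. Converting a `Double` to `Int64` with `Int64(_:)` traps when the value is out of range, so the conversion can clamp first. The property name `clampedMsSinceEpoch` is hypothetical.

```swift
import Foundation

extension Date {
    /// Hypothetical helper: milliseconds since the Unix epoch, clamped to the
    /// Int64 range instead of trapping on out-of-range values.
    var clampedMsSinceEpoch: Int64 {
        let ms = (self.timeIntervalSince1970 * 1000.0).rounded()
        // Double(Int64.max) rounds up to 2^63, which Int64 cannot represent,
        // so anything at or above that bound is clamped to Int64.max.
        guard ms < Double(Int64.max) else { return Int64.max }
        // Double(Int64.min) is exactly -2^63; at or below it, clamp to Int64.min.
        guard ms > Double(Int64.min) else { return Int64.min }
        return Int64(ms)
    }
}
```

A date built with `Date(timeIntervalSince1970: 1e300)` would then yield `Int64.max` rather than crashing.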


base64Str = b64Str
subtype = s
case let .string(base64):
Contributor Author

this parses the following format:

{
    "$binary": <base64 string>,
    "$type": <hex string or number>
}
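
A rough sketch of how this shape could be recognized, assuming a small stand-in JSON enum. `JSONValue`, `LegacyBinary`, and `parseLegacyBinary` are hypothetical names for illustration, not the library's types. Returning `nil` rather than throwing lets a caller fall back to treating the value as an ordinary document.

```swift
import Foundation

// Minimal stand-in for a parsed JSON value (hypothetical; not the library's type).
enum JSONValue {
    case string(String)
    case number(Int)
    case object([String: JSONValue])
}

// Hypothetical result type: decoded bytes plus the binary subtype byte.
struct LegacyBinary: Equatable {
    let data: Data
    let subtype: UInt8
}

// Accepts {"$binary": <base64 string>, "$type": <hex string or number>}.
func parseLegacyBinary(_ json: JSONValue) -> LegacyBinary? {
    guard case let .object(obj) = json, obj.count == 2,
          case let .string(base64)? = obj["$binary"],
          let data = Data(base64Encoded: base64),
          let rawType = obj["$type"]
    else { return nil }

    let subtype: UInt8?
    switch rawType {
    case let .string(hex): subtype = UInt8(hex, radix: 16) // hex string, e.g. "80"
    case let .number(n): subtype = UInt8(exactly: n)       // plain number, e.g. 128
    case .object: subtype = nil
    }
    guard let s = subtype else { return nil }
    return LegacyBinary(data: data, subtype: s)
}
```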

@@ -17,8 +17,15 @@ public class ExtendedJSONDecoder {
}()

/// A set of all the possible extendedJSON wrapper keys.
/// This does not include the legacy extended JSON wrapper keys.
Contributor Author

Since we allow things like "$type" to be in a regular document (e.g. as part of a query operator), the legacy wrapper keys can't be used to determine errors like we do with the reserved ones, so we need to separate them out.
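
As a sketch of that separation (the key lists below are illustrative subsets and the function name is hypothetical, not the decoder's actual code): a reserved key left over in a document that did not parse as a wrapper can be flagged, while legacy-only keys are never flagged because they may be legitimate query operators.

```swift
// Illustrative subset of reserved (non-legacy) wrapper keys; assumed for this sketch.
let reservedWrapperKeys: Set<String> = ["$oid", "$numberInt", "$numberLong", "$numberDouble", "$regularExpression"]

// Legacy-only keys: these also appear in ordinary documents, e.g. {"field": {"$type": "string"}}.
let legacyWrapperKeys: Set<String> = ["$type", "$regex", "$options"]

/// Hypothetical check: returns the reserved keys found in a document that did not
/// parse as any wrapper. Legacy-only keys are deliberately ignored.
func strayReservedKeys(in keys: [String]) -> [String] {
    keys.filter { reservedWrapperKeys.contains($0) }
}
```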

@@ -62,6 +62,25 @@ public func sortedEqual(_ expectedValue: BSONDocument?) -> Predicate<BSONDocumen
}
}

public func sortedEqual(_ expectedValue: BSON?) -> Predicate<BSON> {
Contributor Author

This is just a copy/paste that extends this matcher to BSON

Self.jsonTest(json: invalidStringNumberSubtype, expectation: .error())
}

func testLegacyExtendedJSONDate() throws {
Contributor Author

these test cases are taken from libbson

)

// don't invalidate a "$regex" query operator stored in JSON
Self.jsonTest(
Contributor Author

the first of these 3 were taken from libbson too, except libbson would return an error. That seemed problematic to me, since these could easily be valid $regex query operators, just stored in JSON. I figured no longer throwing an error in these cases would not really constitute a breaking change, and that the legacy parsing should defer to the non-legacy path if it ever is ambiguous.

self = BSONRegularExpression(pattern: patternStr, options: optionsStr)
return
} else {
// legacy / v1 extended JSON
Contributor Author

this parses the following format:

{
    "$regex": <string>,
    "$options": <string>
}

which is, unfortunately, the exact same format as a query operator.
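
A minimal sketch of matching this shape, using a stand-in JSON enum. `JSONValue`, `LegacyRegex`, and `parseLegacyRegex` are hypothetical names, not the library's API. Anything that does not match exactly returns `nil` instead of an error, so an ambiguous `$regex` document can still be handled as a plain document.

```swift
// Minimal stand-in for a parsed JSON value (hypothetical; not the library's type).
enum JSONValue {
    case string(String)
    case number(Int)
    case object([String: JSONValue])
}

// Hypothetical result type holding the two legacy fields.
struct LegacyRegex: Equatable {
    let pattern: String
    let options: String
}

// Accepts exactly {"$regex": <string>, "$options": <string>}.
func parseLegacyRegex(_ json: JSONValue) -> LegacyRegex? {
    guard case let .object(obj) = json, obj.count == 2,
          case let .string(pattern)? = obj["$regex"],
          case let .string(options)? = obj["$options"]
    else { return nil } // not the legacy shape: defer to ordinary document parsing
    return LegacyRegex(pattern: pattern, options: options)
}
```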

@@ -48,6 +51,15 @@ extension Date: BSONValue {
)
}
self = date
case let .number(ms):
Contributor Author

legacy allows for a regular number instead of a $numberLong wrapper.
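
For illustration, a sketch that accepts both spellings of the millisecond value. It uses a stand-in JSON enum; `JSONValue` and `parseDateMillis` are hypothetical names, and date strings are left out of this sketch.

```swift
// Minimal stand-in for a parsed JSON value (hypothetical; not the library's type).
enum JSONValue {
    case string(String)
    case number(Int64)
    case object([String: JSONValue])
}

// Returns milliseconds since the epoch for {"$date": <number>} (legacy) or
// {"$date": {"$numberLong": "<digits>"}}; nil for anything else.
func parseDateMillis(_ json: JSONValue) -> Int64? {
    guard case let .object(obj) = json, obj.count == 1,
          let inner = obj["$date"]
    else { return nil }

    switch inner {
    case let .number(ms):
        return ms // legacy: bare number
    case let .object(wrapper):
        guard wrapper.count == 1,
              case let .string(digits)? = wrapper["$numberLong"]
        else { return nil }
        return Int64(digits) // nil if the string is not a valid Int64
    case .string:
        return nil // date strings are not handled in this sketch
    }
}
```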

patrickfreed requested a review from kmahar on March 5, 2021, 21:49
patrickfreed marked this pull request as ready for review on March 5, 2021, 21:49
internal var msSinceEpoch: Int64 { Int64((self.timeIntervalSince1970 * 1000.0).rounded()) }
/// If the date is further in the future than Int64.max milliseconds from the epoch,
/// Int64.max is returned to prevent a crash.
internal var msSinceEpoch: Int64 {
Contributor

perhaps we should mention this behavior somewhere more user-facing, maybe on the BSON.datetime case?

also, given that msSinceEpoch is only used internally, did you consider an alternative approach where we throw here instead? Edit: ah, I see from the tests why we can't do that.

Contributor Author

good idea, done

let regex = BSONRegularExpression(pattern: "abc", options: "ix")

Self.jsonTest(
json: ["val": ["$regex": "abc", "$options": "ix"]],
Contributor

is it problematic that this query operator now gets parsed as a regex?

Contributor Author

It's not ideal, but this is the existing behavior in libbson, so I think we need to preserve it.

Contributor

kmahar left a comment

lgtm

patrickfreed merged commit 558c7ff into mongodb:main on Mar 23, 2021
2 participants