Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add reference docs about the expected behaviour of login record fields. #2158

Merged
merged 1 commit into from
Dec 3, 2019

Conversation

rfk
Copy link
Contributor

@rfk rfk commented Nov 13, 2019

Fixes #2155.

In trying to get my head around the "valid login records" bug (#695, #2156, #2157 and friends), my first instinct was to go to the logins component README in order to find docs on the details of what a "login record" should be. It didn't point me in an obvious direction, but I eventually found three sources:

But none of them capture the validity questions we're discussing in those bugs. So, in this PR I have:

  • Updated the logins component README to (IMHO) make it a bit easier to get a sense of "what are the main concepts I need to understand to use this module?", one of which is the format of a login record.
  • Pulled together information from the above three docs and our IRL discussions, to try to make a fourth doc describing the login record format in more detail.
  • Left a while bunch of "XXX TODO" comments in places where it wasn't obvious to me what expected behaviour should be.

I'm not really sure how to feel about the fourth doc, and the maintenance overhead of trying to keep all the different docs in sync. But I do think we need to capture this all somewhere. Suggestions welcome!

@mnoorenberghe @thomcc could you please take a look at my understanding of all these fields and see if it's accurate, and whether you can answer any of the "TODO" questions?

@lougeniaC64, could you please take a look at the general changes to the logins README (rendered view here based on our earlier discussion about adding more contextual pointers to help with onboarding?

Copy link
Contributor

@mnoorenberghe mnoorenberghe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this up!

The Logins component ***does not*** offer, and we have no plans to offer:

1. Any form-autofill of other UI-level functionality.
1. Storage of other secret data, such as credit card numbers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think this may change in the future if we want to store other secure data for Lockwise like secure notes, account recovery questions/answers, etc. I don't think a separate store for that would be great; many password managers including Keychain, show the records in one view and we would probably do the same. I agree that credit card records are a bit different as they are more structured.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's reasonable. I'll add a note that this might be considered in the future, and direct to an appropriate place for discussion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://bugzilla.mozilla.org/show_bug.cgi?id=1119558 is a related bug though it currently says "with a saved password".

You can read about the fields on a login record in the code [here](./src/login.rs).
* A **logins store** is a syncable encrypted database containing login records. In order to use the logins store,
the application must first *unlock* it by providing a secret key (preferably obtained from an OS-level keystore
mechanism). It can then create, read, update and delete login records from th database.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: s/ th / the /

When records are added, the logins component performs a three-way merge between the local record, the remote record and the shared parent (last update on the server). Details on the merging algorithm are contained in the [generic sync rfc](https://github.com/mozilla/application-services/blob/1e2ba102ee1709f51d200a2dd5e96155581a81b2/docs/design/remerge/rfc.md#three-way-merge-algorithm).

### Record de-duplication
#### Record de-duplication

De-duplication compares the records for same the username and same url, but with different passwords. De-duplication compares the records for same the username and same url, but with different passwords. Deduplication logic is based on age, the username and hostname.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you're here would you mind fixing these sentences?

//! matches this origin. This field must be a valid origin.
//!
//! YES, THIS FIELD IS CONFUSINGLY NAMED. IT SHOULD BE AN ORIGIN, NOT A FULL URL.
//! WE INTEND TO RENAME THIS TO `formSubmitOrigin` IN A FUTURE RELEASE.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/formSubmitOrigin/formActionOrigin/ ?

@@ -36,7 +36,7 @@ data class ServerPassword(
val password: String,

/**
* The HTTP realm, which is the challenge string for HTTP Basic Auth). May be null in the case
* The HTTP realm, which is the challenge string for HTTP Basic Auth. May be null in the case
* that this login has a formSubmitURL instead.
*/
val httpRealm: String? = null,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should formSubmitURL docs also say that it can be "" to mean that it's a wildcard match. It can also be javascript: (we strip the body and leave just the 'javascript:' portion) if you have a form like <form action="javascript:validateForm()">

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, yes, these are exactly the sort of details I want to iron out!

Comment on lines 107 to 108
//! This field is managed internally by the logins store and any attempt by the
//! application to specify a value will be ignored.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again this is problematic for importing from other desktop browsers or from Fennec so I hope we can change this.

//! This field is managed internally by the logins store and any attempt by the
//! application to specify a value will be ignored.
//!
//! XXX TODO: are these local machine timestamps, and thus possibly wildly inaccurate?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and yes

//! application to specify a value will be ignored.
//!
//! XXX TODO: are these local machine timestamps, and thus possibly wildly inaccurate?
//! XXX TODO: should we talk about merge strategies here?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? Ideally we should take the min of the non-zero values to merge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does a zero in this field indicate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately GtiHub is dumb and I no longer can tell which property this was commenting on… I believe this was timeCreated… There should never be a 0 here (I guess that should be documented).

//! application to specify a value will be ignored.
//!
//! XXX TODO: are these local machine timestamps, and thus possibly wildly inaccurate?
//! XXX TODO: should we talk about merge strategies here?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be the max of non-zero here

//! XXX TODO: are these local machine timestamps, and thus possibly wildly inaccurate?
//! XXX TODO: should we talk about merge strategies here?
//!
//! - `timePasswordChanged`: A lower bound on the time that the `password` field was last changed, in integer milliseconds from the unix epoch.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should be explicit here and say that username (or other field) changes shouldn't be recorded here as that's what we want and it's been discussed a few times.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For more context, this property is useful for breach alerts of passwords since changing the username wouldn't have addressed the breach.

@rfk rfk force-pushed the logins-record-docs branch 5 times, most recently from e3f6073 to 06a3621 Compare November 22, 2019 10:26
@rfk
Copy link
Contributor Author

rfk commented Nov 22, 2019

Marking this ready for review. When viewing the changes, please be sure to expand the large diff that is not shown by default for logins.rs.

We have a lot of open bugs about validating logins data, normalizing it in various ways, and fixing up invalid data if found in existing records. In this essay I will seek to reach agreement on the rules that we expect logins data to follow, and propose a code structure in which we can conveniently start enforcing them one rule at a time.

The key ideas here are:

  • Provide a big written description of the behaviours we want to enforce, based on current desktop behaviour. There are surprisingly many of these, and they have important implications for how consumers should use the returned logins records.
  • Give the Login struct a new method called fixup() which can repair invalid login data as long as it's obvious what is The Right Thing To Do ™️.
    • For example in this PR, I've added logic to clear the usernameField and passwordField properties if it is not a form-based login record.
  • Attempt to fixup login records an ingress and egress from the store, without outright discarding records that are irreparable. The idea here is to prevent consumers of the store from writing irreparably bad data, but not prevent them from reading it if it's already in the system.

I'm not sure it's the right approach, but I think it gives a nice framework to pursue other pending cleanups, such as the punycode normalization.

@thomcc, I'd really appreciate a gut-check from you on the approach here :-)

@lougeniaC64, how would this fit with the additional validation work you've been taking on?

@eliserichards, could you please take a look at the description of the login record fields in logins.rs in this PR and see if they all seem right to you?

@rfk
Copy link
Contributor Author

rfk commented Nov 22, 2019

(I should add, the fixup logic is partly inspired by @thomcc's older work in #708, although structured a little differently).

@rfk rfk marked this pull request as ready for review November 22, 2019 10:43
@@ -45,6 +45,7 @@ pub mod error_codes {
pub const INVALID_LOGIN_DUPLICATE_LOGIN: i32 = 64 + 2;
pub const INVALID_LOGIN_BOTH_TARGETS: i32 = 64 + 3;
pub const INVALID_LOGIN_NO_TARGET: i32 = 64 + 4;
pub const INVALID_LOGIN_UNSPECIFIED: i32 = 64 + 5;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fear we may be about to proliferate error codes here, are we sure we want to individually expose each of these over the FFI?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done intentionally, as the lockwise team indicated that they'd like to know a precise reason the records were invalid.

// * The android-components docs at
// https://github.com/mozilla-mobile/android-components/tree/master/components/service/sync-logins
//
// We'll figure out a more scalable approach to maintaining all those docs at some point...
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize I have not followed my own advice here, but I can once we're happy with the descriptions, if it seems valuable. Or, we could consider moving this out into a markdown document and just putting pointers at it in the various API docs.

};
// For now, we want to apply fixups but still return the record if
// there is unfixably invalid data in the db.
Ok(login.maybe_fixup().unwrap_or(None).unwrap_or(login))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'd eventually like to be confident our db only holds validated data, and not have to do this on every read. But that seems like a much later step in the process.

Some(record)
// If we can fixup incoming records from sync, do so.
// But if we can't then keep the invalid data.
record.maybe_fixup().unwrap_or(None).or(Some(record))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is a good idea, just putting it out here for consideration...

* The way we [generate ffi bindings](../../docs/howtos/building-a-rust-component.md) and expose them to
[Kotlin](../../docs/howtos/exposing-rust-components-to-kotlin.md) and
[Swift](../../docs/howtos/exposing-rust-components-to-swift.md).
* The key ideas behind [how Firefox Sync works](../../docs/synconomicon/) and the [sync15 crate](../sync15/README.md).
Copy link
Contributor

@lougeniaC64 lougeniaC64 Nov 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The sync15 README link was broken for me as there doesn't appear to be a README.md file in the sync15 directory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good catch. I've added a very very small stub README, but we should flesh it out further in followup work.

editing logins using the code in `src`.
- `ffi`: The Rust public FFI bindings. This is a (memory-unsafe, by necessity)
API that is exposed to Kotlin and Swift. It leverages the `ffi_support` crate
- [`example`](./example): This contains example rust code that implements a command-line app
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This link is also broken. I think this should be [examples](./examples).

@lougeniaC64
Copy link
Contributor

lougeniaC64 commented Nov 25, 2019

@lougeniaC64, how would this fit with the additional validation work you've been taking on?

I think it makes a lot of sense to have functionality that fixes up logins before the validity check occurs in order to reduce the number of errors thrown for issues we could resolve internally. Particularly because the additional checks I've added are all for invalid characters, which could easily be removed before the check takes place.

Copy link
Contributor

@lougeniaC64 lougeniaC64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@eliserichards eliserichards left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks great. Fantastic job with the documentation and examples. It was extremely easy to absorb all of the context despite being such a complicated and dense subject. Also A+ for having tests and leaving todos to cover later 💯 👍

Comment on lines 79 to 80
#[fail(display = "Some unspecified problem; no error details sorry :-(")]
Unspecified,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guessing we'll be able to add more errors here as we iron out more of the details about what failures we need, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. @lougeniaC64 is adding a better error type over in #2262 and we can expand it as necessary from there.

@@ -45,6 +45,7 @@ pub mod error_codes {
pub const INVALID_LOGIN_DUPLICATE_LOGIN: i32 = 64 + 2;
pub const INVALID_LOGIN_BOTH_TARGETS: i32 = 64 + 3;
pub const INVALID_LOGIN_NO_TARGET: i32 = 64 + 4;
pub const INVALID_LOGIN_UNSPECIFIED: i32 = 64 + 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm personally fine with a generic "this is breaking because of something" error for now.

Comment on lines 41 to 42
//! **YES, THIS FIELD IS CONFUSINGLY NAMED. IT SHOULD BE A FULL ORIGIN, NOT A HOSTNAME.
//! WE INTEND TO RENAME THIS TO `origin` IN A FUTURE RELEASE.**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

Comment on lines 352 to 354
// Like `fixup()` above, but takes `self` by reference and returns
// an Option for the fixed-up version, allowing the caller to make
// more choices about what to do next.
Copy link
Contributor

@eliserichards eliserichards Nov 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allowing the caller to make more choices about what to do next

Love it. I think a certain amount of autonomy will be useful (for now).

//! - "ftp://ftp.site.com"
//! - "moz-proxy://127.0.0.1:8888"
//! - "chrome://MyLegacyExtension"
//! - "file:///"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if github ate thom's comment or what, but he had a good point here that we need to clarify.

@mnoorenberghe what should the behaviour be for values that have an opaque origin per the WhatWG URL Spec?

Concrete example: if one record has origin="file:///" and another has origin="file:///hello/world", what happens?

My take is that we should treat them as valid and leave them alone, and that calling code should only consider them a match based on exact string matching.

I had a rummage around in the Desktop passwordmgr directory but couldn't find any testcases covering such cases.

Copy link
Contributor

@thomcc thomcc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Some concerns I had with earlier vesions that seem to be addressed now is that the distinction between empty strings, null, and missing seems very important (in terms of how other clients behave) and subtle. It's worth being certain that we get it right everywhere (I think an earlier version didn't, but I didn't notice anywhere that this had it wrong).

It's worth checking that the behavior documented around insertion/updates matches what we'll actually do. For example, we manage the metadata fields entirely, and don't allow the application to manage them. I suspect the main reason that logins does on desktop is for sync's benefit, (and perhaps for testing use cases) otherwise it's hard to see why the application would need to.

Ok(())
}

// Return either the existing login, a fixed-up verion, or an error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: /// is for item-level doc comments in rust. Here and elsewhere (except for the module-level comment, which is correctly //!).

// A little helper to magic a Some(self.clone()) into existence when needed.
macro_rules! get_fixed_or_throw {
($err:expr) => {
if ! fixup { throw!($err) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: space between ! and fixup. And general formatting. Rustfmt doesn't really touch macro bodies,unfortunately.

($err:expr) => {
if ! fixup { throw!($err) }
else { Ok(fixed.get_or_insert_with(|| self.clone())) }
as Result<&mut Login>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this as does anything (note that it's not type ascription -- it doesn't act as a type constraint, it's a type cast)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I added it to stop rustc yelling at me about:

error[E0282]: type annotations needed
   --> components/logins/src/login.rs:392:17
    |
392 |                 get_fixed_or_throw!(InvalidLogin::Unspecified)?
    |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot infer type
error: aborting due to previous error                                                   

But I'll see if I can replace it with e.g. a local variable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, surprised that worked. It's fine then, usually that doesn't work (and my understanding of rust-lang/rfcs#803 is that the syntax for ascribing a type in this way is different and unstable)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a local variable with an explicit type seems to have worked, so I'll switch to that to avoid confusing future readers

//! may be wildly inaccurate if the client does not have an accurate clock.
//!
//! This field is managed internally by the logins store by default and does not need to
//! be set explicitly, although any application-provided value will be stored.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's true that we allow application provided values, unless that changed recently. We could in the future, but the only use case I can think of is test code.

@@ -45,6 +45,7 @@ pub mod error_codes {
pub const INVALID_LOGIN_DUPLICATE_LOGIN: i32 = 64 + 2;
pub const INVALID_LOGIN_BOTH_TARGETS: i32 = 64 + 3;
pub const INVALID_LOGIN_NO_TARGET: i32 = 64 + 4;
pub const INVALID_LOGIN_UNSPECIFIED: i32 = 64 + 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done intentionally, as the lockwise team indicated that they'd like to know a precise reason the records were invalid.

//! When merging duplicate records, the largest non-zero value is taken.
//!
//! This field is managed internally by the logins store by default and does not need to
//! be set explicitly, although any application-provided value will be stored.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we will do this either.

@rfk
Copy link
Contributor Author

rfk commented Nov 27, 2019

For example, we manage the metadata fields entirely, and don't allow the application to manage them.
I suspect the main reason that logins does on desktop is for sync's benefit, (and perhaps for testing
use cases) otherwise it's hard to see why the application would need to.

IIRC Matt mentioned one other user-case, which was importing from other browsers. If we aspire to land this component in Desktop at some point we should probably support this behavior (although we don't need to add it in this PR).

@thomcc
Copy link
Contributor

thomcc commented Nov 27, 2019

importing from other browsers

IMO we might want a dedicated API for this case, but sure, barring that it's a valid case for the metadata.

It is worth noting that invalid metadata imported from other browsers has caused substantial problems with sync, causing us to have to even patch the firefox-for-android JSON parser implementation to handle it in the past. So I'd prefer we discouraged handling it externally, although I guess it's plausible we could have the same bug.

@rfk rfk force-pushed the logins-record-docs branch 3 times, most recently from 91806dc to aad4350 Compare November 28, 2019 05:29
@rfk
Copy link
Contributor Author

rfk commented Nov 28, 2019

I'm pretty happy with where this is at, but I'm going to wait for #2262 to land and then rebase on top of it.

@rfk rfk force-pushed the logins-record-docs branch 2 times, most recently from b6c3c5a to 98a1f45 Compare December 3, 2019 02:56
@rfk
Copy link
Contributor Author

rfk commented Dec 3, 2019

I'm going to wait for #2262 to land and then rebase on top of it.

To my great relief, this looks to have been a very very trivial rebase :-)

…t enforcing it.

Along with a whole lotta docs, this commit adds a `fixup` method to the Login
struct, which can be used to repair records that fail validation in ways
that it's obvious how to fix.
@rfk rfk merged commit 819f57b into master Dec 3, 2019
@eoger eoger deleted the logins-record-docs branch April 21, 2020 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Document the intended data model and restrictions for login records
5 participants