-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AUT-228 - Fixing monitor errors in stage #981
Conversation
tools/autograph-monitor/constants.go
Outdated
@@ -8,100 +8,83 @@ const autographDevRootHash = `5E:36:F2:14:DE:82:3F:8B:29:96:89:23:5F:03:41:AC:AF | |||
|
|||
// firefoxPkiStageRootHash is the SHA2 hash of the firefoxPkiStageRoot | |||
// cert raw bytes | |||
const firefoxPkiStageRootHash = `DB:74:CE:58:E4:F9:D0:9E:E0:42:36:BE:6C:C5:C4:F6:6A:E7:74:7D:C0:21:42:7A:03:BC:2F:57:0C:8B:9B:90` | |||
const firefoxPkiStageRootHash = `C0:F0:5D:59:B1:FD:E2:57:80:85:4C:32:FA:E8:FA:BA:84:81:C2:33:B4:C1:D3:90:CC:A5:F2:CE:A8:19:30:EE` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only one of the two XPI roots that are used in staging. You'll need to include both. The two were used to do initial "backwards compat" testing during cert succession because the prior staging root key had been lost to time.
Both are on https://bugzilla.mozilla.org/show_bug.cgi?id=1882192 (and soon https://github.com/mozilla-services/hsm/pull/31)
The cas_new is the "long-term" staging root we're using and is used for all of the cas_new_
signers in staging of types xpi
and contentsignaturepki
and will be in all new signers of those types.
cas_cur was the root that'll expire in a few months and is used for all of the cas_cur_
signers in staging of both xpi
and contentsignaturepki
types.
The easiest thing to do would be to add both as constants here and then add both to the truststore and do the sha checking for both. Key agility, baby.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, also, I don't think we've removed the old staging signer ids, yet, either, so you probably ALSO need to keep the current cert included in the truststore, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wellll, actually, that would be a good forcing function for us to remove the signers using that old staging root.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've converted this to an array for now and have the 3 root certs in there.
Maybe I'm thinking about this backwards. But we should remove the certs from the store before removing them from monitor, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct, imo. We should see this working and then see it continue to work after we remove them from the configs.
It could go the other way, but this lets us scream test at our leisure instead of gating the GCP migration on the scream test having without screams.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep forgetting if I had reviewed the latest changes here, so let me drop this that we discussed offline:
the rootHash var also needs this change.
160146f
to
7c64af2
Compare
switch conf.env { | ||
case "stage": | ||
conf.rootHash = firefoxPkiStageRootHash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactoring to just have truststore
and depTruststore
.
Also using arrays for multiple roots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a helper function to populate and no longer need root hashes for stage and prod.
if rhash != certHash { | ||
return fmt.Errorf("hash does not match expected root: expected=%s; got=%s", rhash, certHash) | ||
var matchFound = false | ||
for _, rootHash := range rootHashes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored to check each root hash.
Argued with myself about removing colons ahead of time or in here. This seems like less code, the other way seems like less compute time. Minimal difference either way I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use slices.ContainsFunc or slices.Contains (if you move the strings.Replace
to earlier) here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, ContainsFunc
seems like the cleanest way.
tools/autograph-monitor/constants.go
Outdated
// autograph dev config: | ||
// https://github.com/mozilla-services/autograph/blob/b3081068f4a9c0c1de02150432f2d02887dd6722/autograph.yaml#L113-L126 | ||
// used on the normandy and remote settings development servers | ||
const autographDevRootHash = `5E:36:F2:14:DE:82:3F:8B:29:96:89:23:5F:03:41:AC:AF:A0:75:AF:82:CB:4C:D4:30:7C:3D:B3:43:39:2A:FE` | ||
var autographDevRootHashes = []string{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a bit weird because arrays can't be constants in golang. But these are scoped to monitor so seems fine?
@@ -109,8 +109,8 @@ func signTestCert(options signOptions) *x509.Certificate { | |||
return certX509 | |||
} | |||
|
|||
func sha2Fingerprint(cert *x509.Certificate) string { | |||
return strings.ToUpper(fmt.Sprintf("%x", sha256.Sum256(cert.Raw))) | |||
func sha2Fingerprint(cert *x509.Certificate) []string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand turning this into a []string
return type is convenient, but this is not what I would expect to receive from a func taking a single cert
and would spend time investigating why. Could we do this outside of this func (and its similarly named friend elsewhere) or change this (and its similarly named friend) to take multiple certs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that makes sense. Adjusting this and fixed the property names too.
tools/autograph-monitor/constants.go
Outdated
xNCR3VBsgIgAhyO11hXiBo5nX5OHn2PG1MPV9+H0RQb6g5zOuKrWoh9CzB8mBl7d | ||
0JMkzy6TFAcAo3KL5ZNvRqqB8xRmIYLK | ||
-----END CERTIFICATE-----`, | ||
// current root cert |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"current" isn't accurate. I know we called it "cas_cur" but that was a temporary thing. It's now the one that isn't used by staging services. cas_new
is. I know, I know, this is confusing now, but we had to optimize for not confusing the folks doing backwards compat testing during cert succession.
Future signers won't be prefixed with cas_new_
but will use that "cas_new" intermediate cert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, updating comments to make this (hopefully) a little less confusing.
And additional note that isn't strongly required: You know, you also could use a func to calculate the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is dope and very close! Thank you
tools/autograph-monitor/monitor.go
Outdated
@@ -113,6 +95,15 @@ func main() { | |||
} | |||
} | |||
|
|||
// Helper function to load a series of certificates and their hashes to a given truststore and hash list | |||
func LoadCertsToTruststore(hashArr *[]string, truststore *x509.CertPool, certificates []string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could create the CertPool
inside itself and return the hashArr
and the CertPool
, couldn't it? Passing a pointer to a slice like this is not typical in Go, and hints that we may be too side-effect-y. (Even if we did take the slice as param, typical style is to return a slice -- that might be unchanged -- from the func instead to avoid nil
pointer panics from being possible)
So the new type I'm proposing would be loadCertsToTrustStore(certificates []string) ([]string, *x509.CertPool)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
omg duh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chaa, been there. Go (and Rust) are interesting because you get pointers (unlike most languages us web-y folks work in these days), but you also don't have to weird C stuff with them for simple things like output params etc etc!
72598c8
to
088cbfa
Compare
if rhash != certHash { | ||
return fmt.Errorf("hash does not match expected root: expected=%s; got=%s", rhash, certHash) | ||
var matchFound = slices.ContainsFunc(rootHashes, func(rootHash string) bool { | ||
return certHash == strings.Replace(rootHash, ":", "", -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we still need this Replace now that we have the LoadCertsToTruststore
func? I think no (or, if so, could we do this there?)
tools/autograph-monitor/monitor.go
Outdated
if err != nil { | ||
break | ||
} | ||
conf.depTruststore, conf.depRootHashes, err = LoadCertsToTruststore(firefoxPkiProdRoots) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
firefoxPkiStageRoots
I believe?
tools/autograph-monitor/monitor.go
Outdated
conf.depTruststore.AppendCertsFromPEM([]byte(firefoxPkiStageRoot)) | ||
conf.truststore, conf.rootHashes, err = LoadCertsToTruststore(firefoxPkiProdRoots) | ||
if err != nil { | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll admit I'm not stoked by having to reuse err
across the switch (yes, i didn't mention this on the zoom; it's not a huge deal), but since we do, we probalby want to make this err and the stage
one after it more specific with a
err = fmt.Errorf("unable to load production roots: %w", err)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, that could get confusing. I think I can make this a bit more explicit since we have 2 trust stores, why not have 2 errors?
7724c24
to
35e678b
Compare
depErr = fmt.Errorf("failed to load depTruststore root certificates: %w", depErr) | ||
} | ||
if err != nil || depErr != nil { | ||
log.Fatalf("%s", errors.Join(err, depErr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should allow us to know if either truststore is failing. Which could be useful in the future if we start using both for stage or dev for whatever reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
AUT-228
Working on resolving monitor errors in stage. This is part 1 of N.
verifyXPISignature
that allowed it to verify with anil
truststoreNeed to: