whoami performance #2402
If the ORM allows it, this could arguably be reduced to a single network roundtrip to the DB via LEFT JOINs, which might even get latency down to 1/3 of its current level.
Actually, I think this is #2338
I have few enough rows (hundreds) that execution time is <1 ms on the Postgres instance. I believe the 4 sequential network roundtrips to the DB (plus maybe schema parsing & marshalling) are the dominating factors.
I see; the DB roundtrips will add to the overall latency for sure (at least 4x it, as you said), and I agree that loading it in one go will most likely make it much faster. One option to be 100% certain would be to use the tracing adapter of Ory Kratos and see in Jaeger what takes the most time when calling this endpoint: https://www.ory.sh/docs/kratos/guides/tracing. To run it, do the following in the project root:
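(A plausible invocation based on the Kratos quickstart tracing setup; the `quickstart-jaeger.yml` overlay name is an assumption:)

```
docker-compose -f quickstart.yml -f quickstart-jaeger.yml up --build --force-recreate
```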
And then open http://localhost:16686 in your browser - but make sure to do some API requests to Kratos first :)
I was working on missing indices today and you're right: it appears that all indices are there and can be used. What we thus need to resolve is indeed fetching the identity in one larger transaction rather than iteratively.
In general, the whole analysis is on point!
@aeneasr how about the schema parsing/deserialization piece? It has been a while, but I think I saw that the schemas get read out and parsed on every request too. Do you think it's reasonable to parse them at startup and cache them? I saw some for loops in the codebase doing a temporary local cache on the stack, like I mentioned in the "additional context" section:

> Temporary schema cache in the local func scope (presumably because it actually is expensive?)
Thanks a ton for the eager load change, btw. It will have a big impact on the latency of all deployments :)
We currently try to avoid caching as much as possible due to security concerns; if we set up caching, it will be in a much more powerful place. I've made some changes so it no longer encodes and decodes the whole config but instead just the needed keys, which should significantly reduce the amount of time needed. Compared to two DB queries, I think the amount of time we spend on this is currently negligible. IMO the next improvement would be to have only one large JOIN, or alternatively to not expand the identity by default but only when requested.
That would make this really fast IMO and only a bit slower when you need the user's full data.
Stripe does this a lot: https://stripe.com/docs/api/expanding_objects
This would be a breaking change in the API 🤷

The change was a no-op from my understanding of gobuffalo/pop. It would make a difference only if we were to retrieve multiple identities at once. gobuffalo/pop doesn't have a way of retrieving all the data in one query besides writing a raw query.
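For what it's worth, a raw-query version could look roughly like the sketch below. The table and column names (`identity_verifiable_addresses`, `identity_recovery_addresses`) and the row type are assumptions for illustration, not Kratos's actual schema:

```go
import (
	"database/sql"

	"github.com/gobuffalo/pop/v6"
	"github.com/gofrs/uuid"
)

// identityJoinRow flattens one identity plus the (possibly NULL) joined
// address columns from a single LEFT JOIN roundtrip.
type identityJoinRow struct {
	ID                uuid.UUID      `db:"id"`
	Traits            []byte         `db:"traits"`
	VerifiableAddress sql.NullString `db:"verifiable_address"`
	RecoveryAddress   sql.NullString `db:"recovery_address"`
}

// fetchIdentityOneShot loads the identity and both address lists in one DB
// roundtrip via pop's RawQuery escape hatch. Note that the two LEFT JOINs
// return the cross product of the address lists, so the caller has to
// de-duplicate when reassembling the identity.
func fetchIdentityOneShot(c *pop.Connection, id, nid uuid.UUID) ([]identityJoinRow, error) {
	var rows []identityJoinRow
	return rows, c.RawQuery(`
		SELECT i.id, i.traits,
		       va.value AS verifiable_address,
		       ra.value AS recovery_address
		FROM identities i
		LEFT JOIN identity_verifiable_addresses va
		       ON va.identity_id = i.id AND va.nid = i.nid
		LEFT JOIN identity_recovery_addresses ra
		       ON ra.identity_id = i.id AND ra.nid = i.nid
		WHERE i.id = ? AND i.nid = ?`, id, nid).All(&rows)
}
```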
Possible that you're right here :/
True, we should come up with a concept that isn't necessarily breaking (defaulting to expand, for example; see the sketch below) and add it to the PR I will send you via PN.
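For illustration, a non-breaking shape could look roughly like this; every name below (Expandable, ExpandEverything, the GetIdentity signature) is a hypothetical sketch, not the API the PR actually introduced:

```go
// Expandable names an association that can be preloaded with an identity.
type Expandable string

const (
	ExpandVerifiableAddresses Expandable = "VerifiableAddresses"
	ExpandRecoveryAddresses   Expandable = "RecoveryAddresses"
)

// ExpandEverything preserves today's behavior and is the default,
// so existing callers keep working.
var ExpandEverything = []Expandable{ExpandVerifiableAddresses, ExpandRecoveryAddresses}

// GetIdentity preloads only the requested associations via pop's Eager;
// latency-sensitive callers (like whoami) could pass a narrower list.
func (p *Persister) GetIdentity(ctx context.Context, id uuid.UUID, expand ...Expandable) (*identity.Identity, error) {
	if len(expand) == 0 {
		expand = ExpandEverything
	}
	fields := make([]string, len(expand))
	for k, e := range expand {
		fields[k] = string(e)
	}

	var i identity.Identity
	if err := p.GetConnection(ctx).Eager(fields...).Where("nid = ?", p.NetworkID(ctx)).Find(&i, id); err != nil {
		return nil, err
	}
	return &i, nil
}
```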
Introduces an expand API to the identity persister which greatly improves whoami performance. Closes #2402
Preflight checklist
Describe your problem
I was doing some whoami load testing on oryd/kratos:v0.8.0-alpha.2 against a 2 CPU / 8 GB Postgres instance (same Google Cloud datacenter, but not a localhost network) with hundreds of sessions and tens of identities.
The minimum was 15 ms, the 50th percentile 20 ms, and the 99th percentile 200 ms under a regular traffic load of a variable 0-10 qps.
The whoami handler calls GetSessionByToken.
GetSessionByToken does a
select * from sessions where token = ? and nid = ? limit 1
and then, if a session was found, calls GetIdentity. GetIdentity sequentially issues:
select * from identities where id = ? and nid = ?
then
p.findVerifiableAddresses:
select ... [verifiable addresses] ...
then
p.findRecoveryAddresses:
select ... [recovery addresses] ...
then
p.injectTraitsSchemaURL
which is
ms, err := m.Config(ctx).IdentityTraitsSchemas()
marshal + get bytes + decode, a quick loop to find the matching schema, and then a read out of a schema.

Describe your ideal solution
I'm a JavaScript guy, but I assume there is something akin to async/await in Go.
There are currently 4 sequential DB roundtrips in the code path: read session + read identity + read verifiable addresses + read recovery addresses, followed by what appears to be some schema parsing.
I'd be curious whether this could be turned into (read session) + parallel (read identity, read verifiable addresses, read recovery addresses, do schema lookup). I think this would roughly halve the median whoami latency. The session has the identity id + nid, which I think are good enough to parallelize all 3 of those other calls; a sketch follows below.
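A minimal sketch of that fan-out using golang.org/x/sync/errgroup, assuming the queries described above; the connection `c`, the session value `sess`, and the model types are stand-ins, not Kratos's actual code:

```go
// Fan out the three follow-up reads once the session row is loaded.
g, ctx := errgroup.WithContext(ctx)

var i identity.Identity
g.Go(func() error {
	return c.WithContext(ctx).Where("id = ? AND nid = ?", sess.IdentityID, nid).First(&i)
})

var verifiable []identity.VerifiableAddress
g.Go(func() error {
	return c.WithContext(ctx).Where("identity_id = ? AND nid = ?", sess.IdentityID, nid).All(&verifiable)
})

var recovery []identity.RecoveryAddress
g.Go(func() error {
	return c.WithContext(ctx).Where("identity_id = ? AND nid = ?", sess.IdentityID, nid).All(&recovery)
})

// database/sql pools connections and is safe for concurrent use, so the
// three queries genuinely run in parallel.
if err := g.Wait(); err != nil {
	return nil, err
}
```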
I could be wrong, but the schema parsing step sounds expensive; I'm curious whether it could be cached in some kind of global scope or registry at startup. Maybe it is already cached and I just can't read Go; it's hard for me to follow Golang with stuff like
p.p.Marshal(kjson.Parser())
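To illustrate the suggestion (bearing in mind the maintainers' caching reservations above), a startup cache could be as small as this sketch; parseAllSchemas is a hypothetical helper, and the jsonschema package is only an assumption about what Kratos uses:

```go
import (
	"sync"

	"github.com/santhosh-tekuri/jsonschema/v5"
)

var (
	schemaOnce  sync.Once
	schemaCache map[string]*jsonschema.Schema
)

// traitsSchemas parses every configured identity schema exactly once, on
// first use, instead of re-reading and re-parsing them on each request.
// Note this ignores config hot-reloading, which is one reason to be wary.
func traitsSchemas() map[string]*jsonschema.Schema {
	schemaOnce.Do(func() {
		schemaCache = parseAllSchemas() // hypothetical: parse & index all configured schemas
	})
	return schemaCache
}
```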
Workarounds or alternatives
Version
oryd/kratos:v0.8.0-alpha.2 + postgres
Additional Context
I saw what I think is a similar set of changes in the identities list call: 14762d2
Temporary schema cache in the local func scope (presumably because it actually is expensive?)
Eager("VerifiableAddresses", "RecoveryAddresses")