Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

whoami performance #2402

Closed
4 of 6 tasks
fmmoret opened this issue Apr 15, 2022 · 13 comments · Fixed by #2995
Closed
4 of 6 tasks

whoami performance #2402

fmmoret opened this issue Apr 15, 2022 · 13 comments · Fixed by #2995
Assignees
Labels
feat New feature or request.

Comments

@fmmoret
Copy link

fmmoret commented Apr 15, 2022

Preflight checklist

Describe your problem

I was doing some whoami load testing on oryd/kratos:v0.8.0-alpha.2 with (same datacenter in google cloud but not a localhost network) 2cpu 8gb postgres with 100s of sessions, 10s of identities.

min was 15ms, 50-percentile was 20ms, 99-percentile was 200ms under a regular traffic load -- variable 0-10qps.


whoami handler calls GetSessionByToken

GetSessionByToken does a select * from sessions where token = ? and nid = ? limit 1 and then calls GetIdentity contingent on session being found.

GetIdentity sequentially issues:
select * from identities where id = ? and nid = ?
then
p.findVerifiableAddresses : select ... [verifiiable addresses] ...
then
p.findRecoveryAddresses : select ... [recovery addresses] ...
then
p.injectTraitsSchemaURL
which is
ms, err := m.Config(ctx).IdentityTraitsSchemas() marshal + get bytes + decode, quick loop to find matching schema, and then a read out of a schema.

Describe your ideal solution

I'm a javascript guy but I assume there is something akin to async / await in go.

There are 4 sequential db round trips in the code path currently. Read session + read identities + read verifiable addresses + read recovery addresses + followed by what appears to be some schema parsing.

I'd be curious if this could be turned into (read session) + parallel (read identities, read addresses, read recovery addresses, do schema lookup). I think this would roughly half the median whoami latency. Session has identity id + nid which I think are good enough to parallelize all 3 of those other calls.

I could be wrong -- but the schema parsing step sounds expensive; I'm curious if that could be cached in some kind of global scope or registry on start-up. Maybe it is already cached and I can't read go -- hard for me to follow golang with stuff like p.p.Marshal(kjson.Parser())

Workarounds or alternatives

Version

oryd/kratos:v0.8.0-alpha.2 + postgres

Additional Context

I saw what I think are a similar set of changes in the identities list call: 14762d2

Temporary schema cache in the local func scope (presumably because it actually is expensive?)

  • Eager("VerifiableAddresses", "RecoveryAddresses")
@fmmoret fmmoret added the feat New feature or request. label Apr 15, 2022
@fmmoret
Copy link
Author

fmmoret commented Apr 15, 2022

if the orm allows it, could arguably reduce to a single network roundtrip to db via left joins which might even get latency down to 1/3 its current levels.

@aeneasr
Copy link
Member

aeneasr commented Apr 15, 2022

Thank you for the great report!

IIRC there was a missing index which has since been resolved - could you please check with version branch 0.9.0-alpha.3? Alternatively, this might be a dupe of: #2338, #2262

@aeneasr
Copy link
Member

aeneasr commented Apr 15, 2022

Actually, I think this is #2338

@fmmoret
Copy link
Author

fmmoret commented Apr 15, 2022

I have few enough rows (hundreds) that execution time is <1ms on the postgres instance. I believe the 4 sequential network roundtrips to the db (+ maybe schema parsing & marshalling) to be the dominating factors.

@aeneasr
Copy link
Member

aeneasr commented Apr 15, 2022

I see, the DB roundtrips will add to the overall latency for sure (at least 4x it as you said) and I agree that loading it in one go will most likely make it much faster.

One option to be 100% certain would be to use the tracing adapter of Ory Kratos and see in Jaeger what takes the most time when calling this endpoint.

https://www.ory.sh/docs/kratos/guides/tracing

To run it, do in project root:

$ docker-compose -f quickstart.yml -f quickstart-postgres.yml -f quickstart-tracing.yml up

And then open in your browser http://localhost:16686 - but make sure to do some API requests to kratos first :)

@aeneasr
Copy link
Member

aeneasr commented May 12, 2022

I was working on missing indices today and you're right, it appears that all indices are there and can be used. What we thus need to resolve indeed is fetching the identity in one larger transaction vs. iteratively.

@aeneasr
Copy link
Member

aeneasr commented May 12, 2022

In general, the whole analysis is on point!

aeneasr added a commit that referenced this issue May 12, 2022
@fmmoret
Copy link
Author

fmmoret commented May 12, 2022

@aeneasr how about the schema parsing/deserialization piece? It has been a while but I think I saw that the schemas get read out & parsed on every request too. Do you think it's reasonable to parse that at startup and cache it?

I saw some for loops in the codebase doing a temporary local cache on the stack.

Like I mentioned in the "additional context" section:

I saw what I think are a similar set of changes in the identities list call: https://github.com/ory/kratos/commit/14762d24a3844be982f3b4208dcf5239239a4475

Temporary schema cache in the local func scope (presumably because it actually is expensive?)

@fmmoret
Copy link
Author

fmmoret commented May 12, 2022

Ty a ton for the eager load change btw. It will have a big impact on the latency of all deployments :)

@aeneasr
Copy link
Member

aeneasr commented May 12, 2022

We currently try to avoid caching as much as possible due to security concerns. If we set up caching, it will be in a much more powerful place, such as making whoami extremely performant by not having any type of request roundtrip :)

I've made some changes so it no longer en and decodes the whole config but instead just the needed keys, which should significantly reduce the amount of time needed. Compared to two DB queries I think the amount of time we spend on this is currently negligible. IMO the next improvement would be to have only one large JOIN or alternatively not expand the identity on default but instead only when requested:

GET /sessions/whoami

{
  "id": "...",
  ...
  "identity": {
    "id": "..."
   }
}
GET /sessions/whoami?expand=identity

{
  "id": "...",
  ...
  "identity": {
    "id": "..."
     "traits": "..."
     ...
   }
}

That would make this really fast IMO and only a bit slower when you need the user's full data

@aeneasr
Copy link
Member

aeneasr commented May 12, 2022

Stripe does this a lot: https://stripe.com/docs/api/expanding_objects

aeneasr added a commit that referenced this issue May 13, 2022
@alnr
Copy link
Contributor

alnr commented Oct 6, 2022

Stripe does this a lot: https://stripe.com/docs/api/expanding_objects

This would be a breaking change in the API 🤷

This change was a no-op from my understanding of gobuffalo/pop. It would make a difference only if we were to retrieve multiple identities at once.

gobuffalo/pop doesn't have way of retrieving all the data in one query beside writing a raw query.

@aeneasr
Copy link
Member

aeneasr commented Oct 6, 2022

This change was a no-op from my understanding of gobuffalo/pop. It would make a difference only if we were to retrieve multiple identities at once.

Possible that you're right here :/

This would be a breaking change in the API 🤷

True, we should come up with a concept that isn't neccessarily breaking (defaults to expand for example) and add it to the PR I will send you via PN.

aeneasr added a commit that referenced this issue Jan 2, 2023
aeneasr added a commit that referenced this issue Jan 2, 2023
aeneasr added a commit that referenced this issue Jan 3, 2023
@aeneasr aeneasr self-assigned this Jan 3, 2023
aeneasr added a commit that referenced this issue Jan 10, 2023
Introduces an expand API to the identity persister which greatly improves whoami performance.

Closes #2402
CNLHC pushed a commit to seekthought/kratos that referenced this issue May 16, 2023
Introduces an expand API to the identity persister which greatly improves whoami performance.

Closes ory#2402
peturgeorgievv pushed a commit to senteca/kratos-fork that referenced this issue Jun 30, 2023
peturgeorgievv pushed a commit to senteca/kratos-fork that referenced this issue Jun 30, 2023
Introduces an expand API to the identity persister which greatly improves whoami performance.

Closes ory#2402
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat New feature or request.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants