Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add server-side Redis caching layer to Relay store backend #5684

Merged
merged 17 commits into from
Jun 4, 2020

Conversation

damassi
Copy link
Member

@damassi damassi commented May 30, 2020

With the new AppShell we run into a fundamental incompatibility around full-page caching. A number of the issues have been outlined here. (See original PR: #5047.)

A better approach is to utilize the isomorphic relay caching middleware infra we already have in place (currently only used on the client), and do response caching backed by Redis on the server.

The flow:

  1. User hits /artist/id (or whatever route ATM, I haven't scoped things in this POC)
  2. Data-wise, this corresponds to a GraphQL query and variables (artistID)
  3. This is our lookup key for the data store, both for relay's cache and redis
  4. On the server, we store the response from metaphysics by said lookup key
  5. On next request, we look in redis to see if that key exists, and return the data
  6. Things then flow naturally through our relay network infrastructure with no further changes

I achieved this by taking existing cacheMiddleware and writing a light wrapper around it that communicates with Redis. Might be something to think about PR'ing upstream once the idea is fleshed out, some kind of callback hook that taps into the middleware's get / set lifecycle for a secondary backing datastore.

Stats

Some preliminary stats running from my local machine (I imagine it will be faster in prod):

Artist

No cache

GET /artist/pablo-picasso 200 1144.073 ms
GET /artist/pablo-picasso 200 1137.413 ms
GET /artist/pablo-picasso 200 1376.499 ms
GET /artist/pablo-picasso 200 1427.872 ms

Cache

GET /artist/pablo-picasso 200 746.666 ms
GET /artist/pablo-picasso 200 688.605 ms
GET /artist/pablo-picasso 200 645.010 ms
GET /artist/pablo-picasso 200 771.975 ms

Comparison with full page caching

See example implementation here: d69d4a0

GET /artist/pablo-picasso 304 461.376 ms - -
GET /artist/pablo-picasso 304 570.946 ms - -
GET /artist/pablo-picasso 304 463.629 ms - -
GET /artist/pablo-picasso 304 738.698 ms - -
GET /artist/pablo-picasso 304 670.906 ms - -
GET /artist/pablo-picasso 304 632.285 ms - -
GET /artist/pablo-picasso 304 690.733 ms - -
GET /artist/pablo-picasso 304 548.512 ms - -
GET /artist/pablo-picasso 304 791.889 ms - -

Artwork

No cache

GET /artwork/pablo-picasso-guernica 200 1441.899 ms
GET /artwork/pablo-picasso-guernica 200 1879.929 ms
GET /artwork/pablo-picasso-guernica 200 1980.859 ms
GET /artwork/pablo-picasso-guernica 200 1434.589 ms

Cache

GET /artwork/pablo-picasso-guernica 200 780.627 ms
GET /artwork/pablo-picasso-guernica 200 803.653 ms
GET /artwork/pablo-picasso-guernica 200 850.341 ms
GET /artwork/pablo-picasso-guernica 200 837.793 ms

Collection

No cache

GET /collection/painting 200 2598.225 ms
GET /collection/painting 200 1753.363 ms
GET /collection/painting 200 2432.924 ms
GET /collection/painting 200 1980.165 ms

Cache

GET /collection/painting 200 1058.667 ms
GET /collection/painting 200 1054.286 ms
GET /collection/painting 200 1080.394 ms
GET /collection/painting 200 1123.331 ms

@damassi damassi force-pushed the caching-sketch branch 3 times, most recently from 27c6ec9 to 92d5379 Compare May 31, 2020 19:31
export interface CacheConfig {
size: number
ttl: number
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@damassi damassi force-pushed the caching-sketch branch 3 times, most recently from c897224 to add2d48 Compare May 31, 2020 19:47
@damassi damassi requested a review from dzucconi May 31, 2020 20:08
const client = redis.createClient()

this.redisCache = {
del: promisify(client.del).bind(client),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@mzikherman mzikherman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sick! We should probably roll with this (and have it work for all our V2 apps), and then also disable and remove the page cache stuff.

src/v2/Apps/Artist/routes.tsx Outdated Show resolved Hide resolved
src/lib/environment.js Show resolved Hide resolved
src/v2/Artsy/Relay/middleware/cache/Cache.ts Show resolved Hide resolved
Copy link
Contributor

@joeyAghion joeyAghion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two other notes:

src/v2/Artsy/Relay/middleware/cache/Cache.ts Show resolved Hide resolved
src/v2/Artsy/Relay/middleware/cache/Cache.ts Outdated Show resolved Hide resolved
@damassi damassi force-pushed the caching-sketch branch 2 times, most recently from 8cbd06d to ceff36d Compare June 3, 2020 01:38
@damassi damassi force-pushed the caching-sketch branch 2 times, most recently from c6dda7f to 0fcdccf Compare June 3, 2020 01:48
Copy link
Contributor

@joeyAghion joeyAghion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • We shouldn't merge this until the datadog tracing is restored since that will be necessary to evaluate its impact.
  • I can't tell if this is scoped to any particular requests (like for the artist page). I think it should be, at least initially. There are too many state- or time-dependent pages that would need special care otherwise.

src/v2/Artsy/Relay/middleware/cache/Cache.ts Outdated Show resolved Hide resolved
src/v2/Artsy/Relay/middleware/cache/Cache.ts Outdated Show resolved Hide resolved
@damassi
Copy link
Member Author

damassi commented Jun 3, 2020

@joeyAghion

We shouldn't merge this until the datadog tracing is restored since that will be necessary to evaluate its impact.

@icirellik and I discussed some possibilities here but not quite sure what next steps are or even if there's a straightforward path forward. We currently are tracing all requests, but they happen to be forwarded to /* as described in this slack convo. I opened this PR which should hopefully address: #5700.

I can't tell if this is scoped to any particular requests (like for the artist page). I think it should be, at least initially. There are too many state- or time-dependent pages that would need special care otherwise.

Right now it applies to all routes that do not have cacheConfig.force = true set -- if and only if the user is logged out. When the user is logged in we don't cache on the server. Routes that have cacheConfig.force = true turned on (such as /artwork/id) don't cache on the server. Artwork page contains all the state, whether logged in or logged out.

Can you think of some other state or time dependent pages where we would want this turned off when logged out? The only apps that this covers are those located here. Most stateful behavior requires a logged-in state. But either way, we can configure this however.

@joeyAghion
Copy link
Contributor

Regarding which apps might not be OK to cache, I'm thinking of things like viewing rooms, shows, features, auctions, and fairs, which all have time dependent states and treatments. I agree that this approach might work in most cases but I'd rather scope it more narrowly to start. Not a blocker if you feel strongly that this is a better default though.

If an app does want to opt out, would it just set force: true somewhere, or would there be some logic in the middleware for that route?

@damassi
Copy link
Member Author

damassi commented Jun 3, 2020

viewing rooms, shows, features, auctions, and fairs

Aside from viewing rooms and our new features, the other apps are all in our old system and don't apply. Features don't have any real state, and viewing rooms have an expiration (but the exipiration +- 15m seems like an ok tradeoff). With our cache expiry currently set to 15 minutes I'm not too worried about things when logged out, but should return to this once it's out and we get a better look at the numbers. Lets see how it goes for a minute, we don't have much state in the new AppShell other than the Artwork page.

If an app does want to opt out, would it just set force: true somewhere, or would there be some logic in the middleware for that route?

All config is co-located on the route so we can tune this however we see fit, eg:

// Simple version 
path: '/artist/:id',
Component: ArtistApp,
cacheConfig: { 
  force: true 
}

// or 
path: '/artist/:id',
Component: ArtistApp,
getCacheConfig: (route, { context }) => {
  if (route.pathname === 'foo' && context.user) {
    return {
      force: true 
    } 
  }
}

@damassi
Copy link
Member Author

damassi commented Jun 3, 2020

(Handing this off to @mzikherman - ty!)

})

describe("server", () => {
// it("does not set cache if enableServerSideCaching=false", async () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are good specs, just need to figure out how to mock the new cacheClient properly (and keep these specs/assertions).

Wanted to comment out to see if there was other test issues/flakiness in the global suite on CI before revisiting this known one.

@mzikherman
Copy link
Contributor

Ok, i think this is good to go (🤞 🍏 ).

Since the Datadog monitoring has been fixed, I think we can merge this in, and check on staging + merge to production? We can monitor latency by endpoint (we might expect artist page latency to increase, since we'll be disabling the page cache, but the other non-artwork V2 apps should have their latency improve). We can also check to see how the cache is working by doing Papertrail searches (can see the writes/reads, similar to page caching. The log lines are prefixed with "[RelayCache]", etc.).

@damassi damassi merged commit f0019a0 into master Jun 4, 2020
@damassi damassi deleted the caching-sketch branch June 4, 2020 19:18
@artsy-peril artsy-peril bot mentioned this pull request Jun 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants