Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/garbage collect cache #4681

Conversation

thomassuckow
Copy link

This is an attempt at implementing a garbage collection capability to the inmemory cache.

A request for GC: #3965
Discussion about an eviction mechanism: apollographql/apollo-feature-requests#4

This PR includes both a gc() method that removes entries not being watched as well as an attempt at evict(...). I suspect evict will need more work as I haven't really tested it. I also suspect much critique could be made about code style. Though I did this effectively in a "spike" so I wanted to get what I can out here so others can guide making this a mergeable product.

Personally, I am most interested in the gc() method as it would resolve my immediate needs for relieving memory pressure. My initial experiments were very promising and I was able to run it in an interval timer without ill effect.

Copy link
Member

@benjamn benjamn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this proof-of-concept @thomassuckow! I agree there would be a lot of value in an explicit optional gc() method. In fact, there are a couple of other caches we could empty out at the same time, reclaiming quite a bit of memory.

Just to be totally transparent with you, we probably will not merge this PR until we're ready to begin the cache invalidation work in Apollo Client 3.0, but we will be sure to use as much of this code as possible. I know that's not the best news, but I didn't want you to think there was something secretly wrong with this PR that kept us from merging it.

@@ -516,7 +529,7 @@ export class StoreReader {
Object.freeze(result);
}

return { result, missing };
return { result, missing, involvedFields };
Copy link
Member

@benjamn benjamn Apr 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really glad you figured out this information should be returned from the execute* functions, since that allows us to cache it along with the result and missing fields.

return { success: true };
}

//Garbage Collect Cache entries that are not being watched
Copy link
Contributor

@wtrocki wtrocki Jun 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering about reason why this is actually connected to cache.watch.
Would this be expected to remove items that are not watched currently.

Would it make sense to gc elements that are simply not used in queries - means that they have no ability to be fetched anymore?
It is common to watch for certain queries within view but after navigating to different components moving to another set of elements to watch.

Does this mean that doing gc will mean that at the moment we will need to
watch all queries that should stay in cache?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be expected to remove items that are not watched currently.

Yes

I'm not sure I fully understand what point you are trying to make, so I am going to answer the rest of your questions backwards.

Does this mean that doing gc will mean that at the moment we will need to
watch all queries that should stay in cache?

Yes, if desired it should be possible to allow this method to take a list of queries + variables and not remove them as well.

Would it make sense to gc elements that are simply not used in queries - means that they have no ability to be fetched anymore?

Nothing in the cache has "no ability to be fetched anymore", if you were to do a one off query then do the same query again it would pull it from the cache. So as a compromise, I assume anything cared about is being watched. I would be interested to hear use cases where things need to remain in the cache and are accessed via non-watched queries.

It is common to watch for certain queries within view but after navigating to different components moving to another set of elements to watch.

That is the use case this is targeting, we have queries returning large amounts of data but once they leave the view it is unlikely the user will access the same data again. Or if they do, it is acceptable to incur the cost of hitting the server again.

This is also why i think gc should have to be called explicitly, then the app creator can decide the most ideal time to perform this operation, it could be after 2min or 20min of user inactivity in a requestidlecallback.

Copy link
Contributor

@wtrocki wtrocki Jun 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My use case will be strongly coming from projects that offer offline support where all cache is persisted in apollo-cache-persist.
All data that lands in the cache will stay there even after the restart which obviously brings the requirement to remove items from the cache once they were removed from the query as this will mean that they are no longer needed. For example, when executing deleteItem mutation we know that item should no longer be there and we can remove it from query, but sadly item will still stay on the root - workarounds we have used in many projects is to write empty object with just ID.

When making a query to the server we sometimes need to handle the situation when objects are removed and simply there is no way to do that now in apollo-client. While this PR is amazing it will be nice to extend it to handle rootId on evict as well so when performing cache updates we can remove that item. I have intentionally skipped code from this PR and added just idea. More info here:

#4917 and in apollo-feature-requests/issues/4

Copy link
Author

@thomassuckow thomassuckow Jun 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are totally right, I spaced the case where the query is refreshed and may leave dangling things in the cache.

I believe a modified version of cleanupCacheReferences() in my proof of concept may cover your case. Where cleanupCacheReferences is looking for missing keys and nuking the parent item, you would want to traverse the cache and find things that are not referenced at all.

Edit: Your use case may be a more appropriate to the name "gc" than mine.

@tatchi
Copy link

tatchi commented Jul 10, 2019

Don't sure that it could be useful in this scenario but just putting this here:

https://v8.dev/features/weak-references

benjamn added a commit that referenced this pull request Sep 11, 2019
Eviction always succeeds if the given options.rootId is contained by the
cache, but it does not automatically trigger garbage collection, since the
developer might want to perform serveral evictions before triggering a
single garbage collection.

Resolves apollographql/apollo-feature-requests#4.
Supersedes #4681.
benjamn added a commit that referenced this pull request Sep 12, 2019
Eviction always succeeds if the given options.rootId is contained by the
cache, but it does not automatically trigger garbage collection, since the
developer might want to perform serveral evictions before triggering a
single garbage collection.

Resolves apollographql/apollo-feature-requests#4.
Supersedes #4681.
@benjamn
Copy link
Member

benjamn commented Sep 12, 2019

Closing in favor of #5310. Feedback welcome!

@benjamn benjamn closed this Sep 12, 2019
benjamn added a commit that referenced this pull request Sep 12, 2019
Eviction always succeeds if the given options.rootId is contained by the
cache, but it does not automatically trigger garbage collection, since the
developer might want to perform serveral evictions before triggering a
single garbage collection.

Resolves apollographql/apollo-feature-requests#4.
Supersedes #4681.
StephenBarlow pushed a commit that referenced this pull request Oct 1, 2019
Eviction always succeeds if the given options.rootId is contained by the
cache, but it does not automatically trigger garbage collection, since the
developer might want to perform serveral evictions before triggering a
single garbage collection.

Resolves apollographql/apollo-feature-requests#4.
Supersedes #4681.
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants