Eager / proactive sitemap generation #149

Closed
rudolfbyker opened this issue Oct 12, 2023 · 7 comments
Labels
enhancement New feature or request

Comments

@rudolfbyker

Clear and concise description of the problem

Currently, the sitemap is generated on-demand / lazily, i.e. when a request for /sitemap_index.xml comes in. My sitemap takes a few minutes to generate, since it is HUGE. What I would like to see is a proactive (re)generation of the sitemap, so that it's always fast when the request comes in.

Suggested solution

Config in nuxt.config.ts:

export default defineNuxtConfig({
  // ...other options
  sitemap: {
    eager: true, // proposed option
    cacheTtl: 1000 * 60 * 60 * 24, // 1 day
  },
});

Expected behavior for eager=true:

  • Start generating the sitemap as soon as the server starts
  • Start re-generating the sitemap as soon as the cache expires.
  • Always serve the latest available sitemap – never delay!
  • If the server has started recently, and the sitemap is not available yet, await it – never give 404! This is the worst case scenario, but is still better than starting to generate the sitemap when the first request comes in.
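
The four expectations above can be sketched independently of the module. The following is a minimal illustration, not code from nuxt-simple-sitemap: `EagerCache`, `generate`, and `ttlMs` are hypothetical names, and a real implementation would also need persistence and error reporting.

```typescript
// Hypothetical helper illustrating the proposed "eager" behavior.
class EagerCache<T> {
  private readonly generate: () => Promise<T>;
  private latest: T | undefined;
  private hasValue = false;
  private inFlight!: Promise<T>;
  private readonly timer: ReturnType<typeof setInterval>;

  constructor(generate: () => Promise<T>, ttlMs: number) {
    this.generate = generate;
    // Start generating immediately, i.e. as soon as the server starts.
    this.kick();
    // Regenerate every time the TTL elapses.
    this.timer = setInterval(() => this.kick(), ttlMs);
  }

  private kick(): void {
    this.inFlight = this.refresh();
    // A failed refresh keeps the previous value; mark the rejection as handled.
    this.inFlight.catch(() => {});
  }

  private async refresh(): Promise<T> {
    const value = await this.generate();
    this.latest = value;
    this.hasValue = true;
    return value;
  }

  // Serve the latest value without delay; only wait while no value exists yet.
  async get(): Promise<T> {
    return this.hasValue ? (this.latest as T) : this.inFlight;
  }

  stop(): void {
    clearInterval(this.timer);
  }
}
```

A request handler would call `get()`: after the first generation it resolves immediately with the most recent result, and only during the initial warm-up does it wait for the run that is already in progress.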

Alternative

The only alternative I can think of is writing my own nitro plugin that does this. But I would rather contribute something here.

Additional context

No response

rudolfbyker added the enhancement label on Oct 12, 2023
@harlan-zw
Collaborator

Hi @rudolfbyker, thanks for the suggestion.

This makes perfect sense. Ideally, the cache would be built at build time and then recreated with an SWR caching policy.

I believe there are some upstream blockers for build-time caching, so unfortunately it's not a quick fix. You could experiment with SWR cache route rules in the meantime.
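
For readers unfamiliar with the term, SWR (stale-while-revalidate) means that a cached response is served immediately even when it is stale, while a refresh runs in the background. Below is a minimal sketch of that policy, assuming a single cached value; `staleWhileRevalidate`, `load`, and `maxAgeMs` are illustrative names, not Nitro APIs.

```typescript
interface Entry<T> {
  value: T;
  storedAt: number;
}

// Hypothetical helper: wraps `load` with a stale-while-revalidate policy.
function staleWhileRevalidate<T>(
  load: () => Promise<T>,
  maxAgeMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let entry: Entry<T> | undefined;
  let pending: Promise<T> | undefined;

  // Run at most one refresh at a time.
  const revalidate = (): Promise<T> => {
    if (!pending) {
      pending = load().then(
        (value) => {
          entry = { value, storedAt: now() };
          pending = undefined;
          return value;
        },
        (error) => {
          pending = undefined;
          throw error;
        },
      );
    }
    return pending;
  };

  return async function get(): Promise<T> {
    const cached = entry;
    // Cold cache: the caller has to wait for the first load.
    if (!cached) return revalidate();
    // Stale: refresh in the background, ignoring failures here.
    if (now() - cached.storedAt > maxAgeMs) {
      revalidate().catch(() => {});
    }
    // Serve the cached copy immediately.
    return cached.value;
  };
}
```

Note that the cold-cache branch still makes the first caller wait, which is the gap the original request is about.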

@rudolfbyker
Author

Thanks for your time.

I did not even think about build-time caching. My suggestion would be to make build-time caching a separate feature. On startup, if the sitemap was not generated at build time, we simply generate it when the server starts.

I tried this as a workaround (using very short cache times just for testing):

routeRules: {
  "/sitemap_index.xml": {
    swr: 60, // Y: SWR max age in seconds
  },
  "/*-sitemap.xml": {
    swr: 60, // Y: SWR max age in seconds
  },
},
sitemap: {
  cacheTtl: 10, // X: the module's own cache TTL
}

This totally works, except for two things:

  • I always get x-sitemap-cache: MISS in the response, which is understandable given our workaround, but weird for others. But maybe this is harmless?
  • No warmup on server start, so the first request takes a while.

The warmup can be implemented easily using a server plugin, e.g. /server/plugins/cacheWarmer.ts:

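export default defineNitroPlugin((nitroApp) => {
  process.stdout.write("Warming up sitemap …\n");
  // Fire and forget: request the index so that its cache is filled early.
  $fetch("/sitemap_index.xml")
    .then(() => {
      process.stdout.write("Sitemap is ready.\n");
    })
    .catch((error) => {
      // Report failures instead of leaving an unhandled rejection.
      process.stderr.write(`Sitemap warm-up failed: ${error}\n`);
    });
});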

Regarding build-time caching: Do you have links handy to the upstream issues that might be blocking this?

@rudolfbyker
Author

Update:

  • It seems that /api/_sitemap-urls.ts is called for every page of the sitemap, not just the index. Therefore I had to use defineCachedEventHandler instead of defineEventHandler there.
  • If I warm up /api/_sitemap-urls.ts (as soon as the server starts, and hopefully before the first request to any sitemap page comes in), it's actually fast enough for me, so maybe I don't need the eager cache on the level of this module like I initially thought.
  • I need to wait for the nitroApp to be ready before calling $fetch from the cacheWarmup nitro plugin, just like you did here harlan-zw/nuxt-link-checker@646e6ff . I'm wondering if there is a better way to wait for it. 200ms may be too much or too little, depending on the infrastructure.
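
One alternative to a fixed 200ms delay is to retry the warm-up request with an increasing delay until it succeeds. The sketch below is only an illustration of that idea, not what nuxt-link-checker or this module does; `warmUp`, `maxAttempts`, and `initialDelayMs` are hypothetical names.

```typescript
// Hypothetical helper: retry `request` with exponential backoff.
// Resolves with the number of attempts that were needed.
async function warmUp(
  request: () => Promise<unknown>,
  maxAttempts = 5,
  initialDelayMs = 100,
): Promise<number> {
  let delayMs = initialDelayMs;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      await request();
      return attempt;
    } catch (error) {
      // Give up after the last attempt and surface the error.
      if (attempt === maxAttempts) throw error;
      // Wait, then double the delay before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      delayMs *= 2;
    }
  }
  throw new Error("maxAttempts must be at least 1");
}
```

Inside a Nitro plugin, `request` could be a closure around the same `$fetch` call used for the warm-up earlier in this thread. This trades a guessed delay for a bounded number of attempts, so it adapts to slow infrastructure without waiting longer than needed on fast machines.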

@harlan-zw
Collaborator

harlan-zw commented Nov 11, 2023

In v4, when multiple sitemaps are used, requesting the index sitemap no longer loads the data for all of the child sitemaps.

Theoretically, that should have been the main blocker? You should be able to split your URLs across several sitemaps to reduce the loading time. Enabling the SWR cache should also solve some issues.
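
As a rough illustration of splitting URLs across several sitemaps, the sketch below chunks a flat URL list. It is not the module's configuration API: `splitIntoSitemaps` and the `N-sitemap.xml` naming are assumptions made for the example. The default of 50,000 is the per-file URL limit of the sitemaps.org protocol.

```typescript
// Per-file URL limit defined by the sitemaps.org protocol.
const MAX_URLS_PER_SITEMAP = 50000;

interface SitemapChunk {
  name: string;
  urls: string[];
}

// Hypothetical helper: split a flat URL list into named child sitemaps.
function splitIntoSitemaps(
  urls: string[],
  chunkSize: number = MAX_URLS_PER_SITEMAP,
): SitemapChunk[] {
  if (chunkSize < 1) throw new Error("chunkSize must be at least 1");
  const chunks: SitemapChunk[] = [];
  for (let start = 0; start < urls.length; start += chunkSize) {
    chunks.push({
      // Illustrative naming scheme, chosen for this example only.
      name: `${chunks.length}-sitemap.xml`,
      urls: urls.slice(start, start + chunkSize),
    });
  }
  return chunks;
}
```

Smaller chunks mean each child sitemap is cheaper to generate on its own, at the cost of more entries in the index.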

Happy to re-open if I missed something.

@rudolfbyker
Author

I think this is fine, although it will be a few days before I can look into this again. You just missed my question above:

I need to wait for the nitroApp to be ready before calling $fetch from the cacheWarmup nitro plugin, just like you did here harlan-zw/nuxt-link-checker@646e6ff . I'm wondering if there is a better way to wait for it. 200ms may be too much or too little, depending on the infrastructure.

Any ideas? :)

@harlan-zw
Collaborator

Made an upstream PR to hopefully resolve unjs/nitro#1906

@harlan-zw
Collaborator

harlan-zw commented Nov 13, 2023

I've pushed experimental native support for this, and I may enable it by default in the future.

You can try it out with experimentalWarmUp. The code is here: https://github.com/harlan-zw/nuxt-simple-sitemap/blob/main/src/runtime/plugins/warm-up.ts


2 participants