This repository has been archived by the owner on Feb 9, 2022. It is now read-only.

Update Docusaurus search configs for sitemap and start urls #438

Merged


JoelMarcey
Contributor

Use db60b72 for the actual Docusaurus site

@s-pace s-pace merged commit 5f8fff9 into algolia:master Jun 6, 2018
s-pace pushed a commit that referenced this pull request Jun 6, 2018
@s-pace
Contributor

s-pace commented Jun 6, 2018

👋 @JoelMarcey

I had to add stop_urls for the numbered version pages since they introduce a lot of duplicates. These pages are really similar.

"stop_urls": [
    "help",
    "users",
    "https://docusaurus.io/docs/en/[0-9].*"
  ],

Could it be possible to only index major versions?
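As a rough illustration of the stop_urls entry above, the sketch below checks a few URLs against the listed patterns. It assumes the crawler treats each stop_urls entry as a regular expression searched against the page URL; the `isStopped` helper and the sample URLs are hypothetical and only serve the example.

```javascript
// Patterns copied from the stop_urls snippet above.
const stopUrls = [
  "help",
  "users",
  "https://docusaurus.io/docs/en/[0-9].*",
];

// Hypothetical helper: a URL is skipped if any pattern matches somewhere in it.
function isStopped(url) {
  return stopUrls.some((pattern) => new RegExp(pattern).test(url));
}

// Hypothetical sample URLs: one numbered version, one "next", one default page.
const sampleUrls = [
  "https://docusaurus.io/docs/en/1.1.5/installation.html",
  "https://docusaurus.io/docs/en/next/installation.html",
  "https://docusaurus.io/docs/en/installation.html",
];

for (const url of sampleUrls) {
  console.log(isStopped(url) ? "skip " : "crawl", url);
}
```

Under that assumption, only the path segment starting with a digit is excluded, so the default and next pages remain crawlable.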


The integration will change:

<!-- at the end of the HEAD -->
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.css" />

<!-- at the end of the BODY -->
<script type="text/javascript" src="https://cdn.jsdelivr.net/npm/docsearch.js@2/dist/cdn/docsearch.min.js"></script>
<script type="text/javascript">
docsearch({
  apiKey: '3eb9507824b8be89e7a199ecaa1a9d2c',
  indexName: 'docusaurus',
  inputSelector: '### REPLACE ME ####',
  algoliaOptions: { 'facetFilters': ["lang:$LANG", "version:$VERSION", "tags:$TAGS"] },
  debug: false // Set debug to true if you want to inspect the dropdown
});
</script>
  • Add a search input to your page if you don't have one yet. Then update the inputSelector value in the JS snippet to a CSS selector that targets your search input field.

  • Replace $LANG with the lang you want to search on.
    The list of possible lang values is hardcoded in the config.
    So as of today you have: en

  • Replace $VERSION with the version you want to search on.
    The list of possible version values is hardcoded in the config.
    So as of today you have: latest, next

  • Replace $TAGS with the tags you want to search on.
    The list of possible tags is hardcoded in the config.
    So as of today you have: blog
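
The placeholder substitution described in the bullets above could be sketched as follows. `buildFacetFilters` is a hypothetical helper, not part of docsearch.js; the values `en`, `latest`, and `blog` are the ones listed as available above.

```javascript
// Hypothetical helper: builds the facetFilters array for the docsearch() call
// by filling in the $LANG, $VERSION and $TAGS placeholders.
function buildFacetFilters({ lang, version, tags }) {
  const filters = [];
  if (lang) filters.push(`lang:${lang}`);
  if (version) filters.push(`version:${version}`);
  if (tags) filters.push(`tags:${tags}`);
  return filters;
}

// This object would be passed as the algoliaOptions field of docsearch({...}).
const algoliaOptions = {
  facetFilters: buildFacetFilters({ lang: "en", version: "latest", tags: "blog" }),
};

console.log(JSON.stringify(algoliaOptions));
```

Omitting a field simply leaves that facet unfiltered, which is one way to search across all tags while still pinning the language and version.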

@s-pace
Contributor

s-pace commented Jun 6, 2018

(Changes are live, you can check it now)

@JoelMarcey
Contributor Author

Hi @s-pace 👋

Thanks for your help here.

Currently for Docusaurus, the default documentation corresponds to the latest version of Docusaurus. In our current case, that is 1.1.5. Updates to the docs made before we publish a new version are shown in next. As it stands, with the current Algolia search config, next is what is searched. latest (which would correspond to 1.1.5 now) can be searched.

In https://docusaurus.io/en/versions.html, that corresponds to:

<h3 id="latest">Current version (Stable)</h3>

> Could it be possible to only index major versions?

So to answer your question: maybe we only search latest, since that is the default, and we don't even search next?

> Replace $TAGS with the tags you want to search on. The list of possible tags is hardcoded in the config. So as of today you have: blog

Should we also have docs as an option here too?

s-pace pushed a commit that referenced this pull request Jun 7, 2018
s-pace pushed a commit that referenced this pull request Jun 7, 2018
…without .html (redirection should be needed) #438
@s-pace
Contributor

s-pace commented Jun 7, 2018

👋 @JoelMarcey

Thank you for your feedback.

I have checked with the team and we can index every version of the website https://docusaurus.io if you need it. Otherwise we can index only the latest one. It is up to you; let me know.

As for the other forked projects, we will initially crawl only the latest version.

> Should we also have docs as an option here too?

Done. It is live.

In order to avoid duplicates, please leverage:

'facetFilters': ["lang:$LANG", "version:$VERSION", "tags:$TAGS"]

We have also noticed that pages from the latest version are available under URLs both with and without the trailing .html. Could it be possible to add redirection on these pages? I have prevented our tool from crawling pages without the trailing .html thanks to the regex ^((?!\\.html).)*$
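
To show what the regex above matches, here is a small sketch. It assumes the doubled backslash in the comment is JSON string escaping, so the JavaScript literal uses a single one; the `isExtensionless` helper and the sample URLs are hypothetical.

```javascript
// The pattern from the comment above, written as a JavaScript regex literal.
// The negative lookahead is checked at every position, so the whole string
// matches only if ".html" appears nowhere in it.
const noHtmlExtension = /^((?!\.html).)*$/;

// Hypothetical helper: true means the URL contains no ".html" at all,
// which is the kind of URL the crawler was told to skip.
function isExtensionless(url) {
  return noHtmlExtension.test(url);
}

console.log(isExtensionless("https://docusaurus.io/docs/en/installation"));
console.log(isExtensionless("https://docusaurus.io/docs/en/installation.html"));
```

Note that the pattern rejects ".html" anywhere in the URL, not only at the end, so a path such as `/docs.html/page` would also count as having the extension.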

@JoelMarcey
Contributor Author

Hi @s-pace

Thanks for updating the config...

A couple of questions:

Right now, on https://docusaurus.io, we are seeing duplicates (from the current version and next). Are you saying, if I add 'facetFilters': ["lang:$LANG", "version:$VERSION", "tags:$TAGS"] exactly as is to our Algolia config, that would stop? I don't need to replace $LANG with anything?

> Could it be possible to add redirection on these pages?

Redirection within the Algolia config?

Or redirection in the core Docusaurus code/site?

By default, all Docusaurus sites have .html. However, we provide an option where you can have your URLs be clean, without the .html. If that option is set, both URLs work. Are you saying we should update the Docusaurus core code so that, if the clean URL option is set, we redirect to the clean URL instead of allowing both?

Thanks!

@s-pace
Contributor

s-pace commented Jun 7, 2018

👋 @JoelMarcey

1

To filter by version, you only need to use the right version facet:

'facetFilters': ["version:latest"]

This is the classic usage of the Algolia parameter facetFilters.
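
The effect of that filter on duplicates can be sketched locally. This is a stand-in for the filtering idea, not the Algolia client: it assumes each string in a flat facetFilters array must match (AND), and the records and the `applyFacetFilters` helper are hypothetical.

```javascript
// Hypothetical records: the same page indexed once per version.
const records = [
  { url: "https://docusaurus.io/docs/en/installation.html", version: "latest", lang: "en" },
  { url: "https://docusaurus.io/docs/en/next/installation.html", version: "next", lang: "en" },
];

// Local stand-in for facet filtering: keep a record only if every
// "facet:value" entry matches the corresponding field on the record.
function applyFacetFilters(items, facetFilters) {
  return items.filter((item) =>
    facetFilters.every((filter) => {
      const separator = filter.indexOf(":");
      const facet = filter.slice(0, separator);
      const value = filter.slice(separator + 1);
      return item[facet] === value;
    })
  );
}

const hits = applyFacetFilters(records, ["version:latest"]);
console.log(hits.map((hit) => hit.url));
```

With no filter both copies of the page come back, which mirrors the duplicates reported earlier in the thread.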

2

The redirection will be within your core website. Since two different URLs are available under HTTP 200, we scrape both of them (<url> and <url>.html).

If you are fine with the DocSearch Search UI redirecting to .html pages, you don't have to bother.

> By default, all Docusaurus sites have .html. However, we provide an option where you can have your URLs be clean, without the .html. If that option is set, both URLs work. Are you saying we should update the Docusaurus core code so that, if the clean URL option is set, we redirect to the clean URL instead of allowing both?

👍 You got it. Since we pick up the whole URL, we are redirecting to this one. Thank you for the insight.
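
The redirect rule being discussed could look roughly like the sketch below. This is only an illustration of the idea, not Docusaurus code and not the content of any linked pull request; `redirectTarget` and its `cleanUrl` flag are hypothetical names.

```javascript
// Hypothetical helper: returns the path a request should be redirected to,
// or null when the requested path is already the canonical form.
// cleanUrl mirrors the "clean URLs without .html" option described above.
function redirectTarget(pathname, cleanUrl) {
  const suffix = ".html";
  const hasHtml = pathname.endsWith(suffix);
  if (cleanUrl && hasHtml) {
    // Clean URLs enabled: strip the extension.
    return pathname.slice(0, -suffix.length);
  }
  if (!cleanUrl && !hasHtml && !pathname.endsWith("/")) {
    // Clean URLs disabled: send extensionless paths to the .html page.
    return pathname + suffix;
  }
  return null;
}

console.log(redirectTarget("/docs/en/installation.html", true));
console.log(redirectTarget("/docs/en/installation", true));
```

With a rule like this only one of the two URL forms answers with HTTP 200, so a crawler following redirects would see each page once.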

Feel free to ask us anything

@JoelMarcey
Contributor Author

👋 @s-pace

Does facebook/docusaurus#744 seem reasonable and like the right way to go?

@s-pace
Contributor

s-pace commented Jun 11, 2018

@JoelMarcey Yep, you will only need to replace VERSION and LANGUAGE with the right values.

I am merging it and will comment on facebook/docusaurus#744


2 participants