Revise the lang attribute problem #1542

Closed
alrra opened this issue May 10, 2014 · 14 comments

@alrra
Member

alrra commented May 10, 2014

Problem Overview

Developers often forgot to change the value of the lang attribute, so we decided to remove it from <html>.

The change isn't quite good enough as some users:

  • won't read the docs, and thus will find the absence of the lang attribute confusing, or won't even know that they should add it
  • won't even notice the absence of the lang attribute, and thus will forget to add it anyway (maybe expecting it to be there)

Specification and Browsers

From http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#attr-lang:

To determine the language of a node, user agents must look at the nearest ancestor element (including the element itself if the node is an element) that has a lang attribute in the XML namespace set or is an HTML element and has a lang in no namespace attribute set. That attribute specifies the language of the node (regardless of its value).

If both the lang attribute in no namespace and the lang attribute in the XML namespace are set on an element, user agents must use the lang attribute in the XML namespace, and the lang attribute in no namespace must be ignored for the purposes of determining the element's language.

If neither the node nor any of the node's ancestors, including the root element, have either attribute set, but there is a pragma-set default language set, then that is the language of the node. If there is no pragma-set default language set, then language information from a higher-level protocol (such as HTTP), if any, must be used as the final fallback language instead. In the absence of any such language information, and in cases where the higher-level protocol reports multiple languages, the language of the node is unknown, and the corresponding language tag is the empty string.

If the resulting value is not a recognized language tag, then it must be treated as an unknown language having the given language tag, distinct from all other languages.

So, it turns out that not specifying the lang attribute is the same as specifying it with the value of "": in both cases the language is treated as unknown.

From what I've tested, browsers respect the empty string rule (if you find one that doesn't, please let us know).
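
The lookup order in the quoted text can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how any browser implements it: `Node` and `determine_language` are hypothetical names, every node is assumed to be an HTML element, the XML-namespace attribute is modelled as an `"xml:lang"` key, and `pragma_default` and `http_languages` stand in for the pragma-set default and the higher-level protocol information.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence


@dataclass
class Node:
    """Hypothetical stand-in for a DOM element; not a real DOM API."""
    tag: str
    attrs: Dict[str, str] = field(default_factory=dict)
    parent: Optional["Node"] = None


def determine_language(
    node: Node,
    pragma_default: Optional[str] = None,
    http_languages: Sequence[str] = (),
) -> str:
    """Return the language tag of `node`; "" means the language is unknown."""
    current: Optional[Node] = node
    while current is not None:
        # The XML-namespace attribute wins if both are present.
        if "xml:lang" in current.attrs:
            return current.attrs["xml:lang"]
        # A present attribute decides the language regardless of its value,
        # so lang="" stops the search and yields "unknown".
        if "lang" in current.attrs:
            return current.attrs["lang"]
        current = current.parent
    # No attribute anywhere: fall back to the pragma-set default, if any.
    if pragma_default is not None:
        return pragma_default
    # Then to the higher-level protocol, but only if it reports one language.
    if len(http_languages) == 1:
        return http_languages[0]
    return ""


html = Node("html", {"lang": ""})
paragraph = Node("p", parent=html)
print(repr(determine_language(paragraph)))  # ''
```

With no pragma-set default and no HTTP language information, `<html lang="">` and `<html>` both yield the empty string, which is the case described above. Under this reading of the quoted text, the two can differ when such fallback information exists, because an attribute that is present (even if empty) ends the lookup.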

Solution?

Having the language treated as unknown isn't really much better than lang="en" (e.g.: VoiceOver on iOS seems to default to English when the language is unknown, no matter what the default language is).

So, I think we should make it more obvious to the user that they need to specify the lang attribute, by maybe:

    1. Adding lang=""

           <!doctype html>
           <html class="no-js" lang="">

    2. Adding a comment

           <!doctype html>
           <!-- Specify the language of your content by adding the `lang` attribute to <html> -->
           <html class="no-js">

    3. Adding lang="" and a comment

           <!doctype html>
           <!-- Specify the language of your content by providing a value for the `lang` attribute -->
           <html class="no-js" lang="">

    4. Other (Which?)

Thoughts?

@reynaert1250

I think number 3 is the best solution.

@ugogo

ugogo commented May 10, 2014

The third option seems the best to me.

@mathiasbynens
Member

I like number 3 too, but it would mean we’d partially revert our previous removal of all comments from the example HTML in favor of the docs.

If people don’t remove the comment, they add bloat to the <head> which is terrible for performance as described in the docs.

@TheDutchCoder

I'm in favor of either 1 or 3. Not the biggest fan of putting in comments everywhere, but it might be justified in this case.

@alrra
Member Author

alrra commented May 10, 2014

> but it would mean we’d partially revert our previous removal of all comments from the example HTML in favor of the docs.
>
> If people don’t remove the comment, they add bloat to the <head> which is terrible for performance as described in the docs.

@mathiasbynens yeah, but like with the GA tracking code, we should also make this somehow obvious. However, if I think more about it, providing just lang="" could also be OK, as the empty string can be a good indication that something is missing.

@peterblazejewicz

+1 for option 3
+1 for option 1 if there is a checklist before going to production (I think there is none yet for this project)

  • set language attribute of your page
  • change UA-XXXXX-X to be your site's ID.
  • ...
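
A checklist like this could also be checked mechanically. The following Python sketch is hypothetical and not part of the project: `preflight_issues` is an invented name, it only inspects the `<html>` start tag with a simple regular expression (quoted attribute values only), and it looks for the literal `UA-XXXXX-X` placeholder mentioned above.

```python
import re
from typing import List

# Matches a quoted lang="..." attribute, but not xml:lang or data-lang.
LANG_ATTR = re.compile(r"""(?<![\w:-])lang\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)
HTML_TAG = re.compile(r"<html\b[^>]*>", re.IGNORECASE)


def preflight_issues(html: str) -> List[str]:
    """Return the checklist items that still need attention."""
    issues = []
    tag = HTML_TAG.search(html)
    if tag is None:
        issues.append("no <html> start tag found")
    else:
        lang = LANG_ATTR.search(tag.group(0))
        # A missing attribute and an empty value are both treated as "not set".
        if lang is None or lang.group(2).strip() == "":
            issues.append("set the lang attribute on <html>")
    if "UA-XXXXX-X" in html:
        issues.append("replace UA-XXXXX-X with your site's ID")
    return issues


template = '<!doctype html>\n<html class="no-js" lang="">\n'
for item in preflight_issues(template):
    print("TODO:", item)
```

Treating `lang=""` the same as a missing attribute is a design choice of this sketch: either way the page still has no declared language.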

@roblarsen
Member

Option 3, for sure. The empty lang attribute value would be glossed over by a lot of people, I think. This is a tricky situation in which we're providing something that isn't quite ready for prime time so adding a comment is more than justified.

@roblarsen
Member

As an aside, I just looked back through a couple of projects I've spun up over the past few months and I had no lang attribute at all in them. This wasn't production code (unless you count book code samples), so I might have caught it eventually, but removing the attribute tripped me up, and I know better.

@mathiasbynens
Member

> Option 3, for sure. The empty lang attribute value would be glossed over by a lot of people, I think. This is a tricky situation in which we're providing something that isn't quite ready for prime time so adding a comment is more than justified.

Well, the empty attribute value is not really problematic, even if you were to use this in production – you just wouldn’t have the benefits of having defined the language.

I’m starting to lean towards option 1, if only because it doesn’t cause any harm if it’s ignored (i.e. if people leave the attribute value empty).

@QWp6t
Contributor

QWp6t commented May 12, 2014

👍 for (1) or (3)

@JoshuaJones

Leaning more toward option 1. My preference is to have as few comments as possible in the boilerplate from the get-go. We used to explain the meta charset="utf-8" with a comment, but that got removed, I believe.

@alrra
Member Author

alrra commented May 13, 2014

For now, we'll be going with just adding lang="" (same as it is with <title></title>).

Thanks for the feedback everyone!

@alrra alrra closed this as completed in d916a72 May 13, 2014
alrra added a commit that referenced this issue May 13, 2014
In the past we decided¹ to remove the `lang` attribute due to the fact
that developers were often forgetting to update its value. This change
turned out not to be the best solution because some of the users:

 * didn't read the documentation, and thus, they found the absence
   of the `lang` attribute confusing, or didn't even know they had
   to add it

 * didn't notice the absence of the `lang` attribute, and thus, forgot
   to include it (expecting it to be there)

To make things clearer and to remind users that they need to
specify the primary language of the document, this commit reintroduces
the `lang` attribute, while leaving its value to `""`.

Using `lang=""` has the exact same effect as not specifying the `lang`
attribute at all, in both cases the language being treated as unknown².

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

¹ #1110
² From WHATWG (http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#attr-lang):

  "To determine the language of a node, user agents must look at the
   nearest ancestor element (including the element itself if the node
   is an element) that has a lang attribute in the XML namespace set
   or is an HTML element and has a lang in no namespace attribute set.
   That attribute specifies the language of the node (regardless of
   its value).

   If both the lang attribute in no namespace and the lang attribute
   in the XML namespace are set on an element, user agents must use
   the lang attribute in the XML namespace, and the lang attribute in
   no namespace must be ignored for the purposes of determining the
   element's language.

   If neither the node nor any of the node's ancestors, including the
   root element, have either attribute set, but there is a pragma-set
   default language set, then that is the language of the node. If there
   is no pragma-set default language set, then language information from
   a higher-level protocol (such as HTTP), if any, must be used as the
   final fallback language instead. In the absence of any such language
   information, and in cases where the higher-level protocol reports
   multiple languages, the language of the node is unknown, and the
   corresponding language tag is the empty string.

   If the resulting value is not a recognized language tag, then it
   must be treated as an unknown language having the given language tag,
   distinct from all other languages."

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Close #1542.
@drublic
Member

drublic commented May 14, 2014

Option 1 is a decent solution, and I believe we should not add more documentation with HTML comments in index.html. Devs who look into changing lang="" will figure out that they should look into the docs.

alrra added a commit to use-init/init that referenced this issue May 17, 2014
Ref: h5bp/html5-boilerplate#1542.
appleboy added a commit to appleboy/html5-template-engine that referenced this issue Jun 7, 2014
@peterblazejewicz

active bug topic elsewhere:
Bug 26942 - why do these examples of lack the lang attribute?

mrmartineau added a commit to TryKickoff/kickoff that referenced this issue Sep 29, 2015
eleanor-byhook pushed a commit to eleanor-byhook/html5-boilerplate that referenced this issue Feb 29, 2016
Close h5bp/html5-boilerplate#1542.
haoqunjiang pushed a commit to vuejs/vue-cli that referenced this issue Oct 26, 2020
Fixes #5945

`lang="en"` may be wrong for some users; use an empty string instead. See: h5bp/html5-boilerplate#1542
haoqunjiang pushed a commit to vuejs/vue-cli that referenced this issue Jan 6, 2021
ZanderOlidan pushed a commit to ZanderOlidan/vue-cli-service-chalkfix that referenced this issue Feb 5, 2024
10 participants