Visible label is part of accessible name (2ee8b8): Expectation seems to have unintended consequences #1458

Open
kasperisager opened this issue Sep 23, 2020 · 12 comments · May be fixed by #2075

@kasperisager
Contributor

The expectation currently reads:

For each target element, all text nodes in the visible text content either match or are contained within the accessible name of this target element, except for characters in the text nodes used to express non-text content. Leading and trailing whitespace and difference in case sensitivity should be ignored.

According to this expectation, something like the following passes the rule:

<button aria-label="How are you">
  <span>you</span>
  <span>How</span>
  <span>are</span>
</button>

That seems a little odd 🤔 Shouldn't the rule instead be looking at the concatenation of the data of the relevant text nodes?
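To make the concern concrete, here is a minimal Python sketch contrasting the two readings. The function names and the list of strings standing in for the button's text nodes are assumptions for illustration; they are not part of the rule or of any tool.

```python
def passes_per_text_node(text_nodes, accessible_name):
    """Current wording: every text node must be contained in the accessible name."""
    name = accessible_name.strip().lower()
    return all(node.strip().lower() in name for node in text_nodes)


def passes_concatenated(text_nodes, accessible_name):
    """Alternative reading: the concatenated text node data must be contained in the name."""
    name = accessible_name.strip().lower()
    label = "".join(text_nodes).strip().lower()
    return label in name


# Text nodes of the three <span> elements in document order
# (whitespace-only nodes between the spans are left out for brevity).
nodes = ["you", "How", "are"]

print(passes_per_text_node(nodes, "How are you"))  # True
print(passes_concatenated(nodes, "How are you"))   # False
```

Under the per-node reading the out-of-order button passes; under the concatenated reading it fails.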

Summoning @WilcoFiers as you authored #1419.

@kasperisager
Contributor Author

Alfa, aXe, and QualWeb all concatenate the text nodes, so I'm guessing we'll want to reflect that in the rule 🙈 That does bring up an interesting case, though, for code such as this:

<div role="button" aria-label="Hello world">
  <p>Hello</p><p>world</p>
</div>

That button visually renders as two separate words, Hello and world, but the concatenated text node data is Helloworld. We're currently seeing a handful of cases like this across customer sites in Siteimprove and I'm leaning towards considering them false positives. A more realistic case, which causes issues when minified, is this:

<a href="#" aria-label="Some article by John Doe">
  <h6>Some article</h6>
  <p>by John Doe</p>
</a>

When minified, the concatenated text will be Some articleby John Doe.
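A small sketch of why minification matters, assuming the label is built by plain concatenation of text node data followed by whitespace collapsing (the collapsing step and the helper name `concatenated_label` are assumptions, not something the tools above are confirmed to do in exactly this way):

```python
import re


def concatenated_label(text_nodes):
    """Concatenate text node data, collapse whitespace runs, and trim the ends."""
    return re.sub(r"\s+", " ", "".join(text_nodes)).strip()


# Pretty-printed markup: whitespace-only text nodes sit between the elements.
pretty = ["\n  ", "Some article", "\n  ", "by John Doe", "\n"]
# Minified markup: the whitespace-only text nodes are gone.
minified = ["Some article", "by John Doe"]

name = "Some article by John Doe"

print(concatenated_label(pretty))            # Some article by John Doe
print(concatenated_label(minified))          # Some articleby John Doe
print(concatenated_label(pretty) in name)    # True
print(concatenated_label(minified) in name)  # False
```

The same link passes or fails depending only on the whitespace between its child elements.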

@WilcoFiers
Member

I tried to dig up the use case for this, but I can't find it. I remember why we did it though. If we're concatenating, we need to make the assumption that the text will be part of the same piece of text, and that it isn't rearranged with CSS in some way to appear in a different order.

@Jym77
Collaborator

Jym77 commented Nov 25, 2021

Example 2 of Failure technique F96 seems to close the case of adding text in the middle of a label to create an accessible name (which, as far as I understand, is the shopping cart example @WilcoFiers mentioned during the call):

A download link reads "Download specification" but there is invisible link text so that the accessible name of that link is "Download gizmo specification". While the visible label text is contained in the accessible name, there is no string match which may prevent the link from being activated by speech input.
<a href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>

@dan-tripp-siteimprove
Copy link
Collaborator

I have some ideas on this. Maybe I could volunteer. It seems to me that this rule needs a normalization algorithm that is run on both the label and the name, followed by a substring check. Something like this:

To normalize a label or name:

  • Concatenate all text nodes
    • (Do this only for the label, not the name.)
    • For each HTML element start/stop, insert a space.
  • Replace each non-text character (eg. punctuation, emoji) with a space
    • Judgement of "non-text" can't be fully automated eg. "X" for "close", "+" for "zoom in"
  • Insert a space before and after each digit
  • Replace each run of multiple spaces with a single space
    • i.e. do a regex replacement like s/ +/ /g

Then do the check "is the normalized label a substring of the normalized name"?

I'll have to check how the above algorithm behaves on all the cases in #1615. I think that this algorithm will do okay and err on the side of 'no false positives'. There are cases there which will fail the rule according to this algorithm, and which speech-to-text accepts. That might be unavoidable.
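A rough Python sketch of the normalization described above. Several things here are assumptions rather than part of the proposal: "non-text" is approximated as any character that is neither a word character nor whitespace, each text node is assumed to sit in its own element (so a space is inserted between every pair of nodes), and the result is trimmed and lower-cased because the rule's expectation ignores leading/trailing whitespace and case.

```python
import re


def normalize(text):
    """Normalize a label or name (approximation of the steps listed above)."""
    # Assumption: treat anything that is not a letter, digit, or whitespace as
    # "non-text". Cases like "X" for "close" are not detected.
    text = re.sub(r"[^\w\s]|_", " ", text)
    # Insert a space before and after each digit.
    text = re.sub(r"(\d)", r" \1 ", text)
    # Replace each run of spaces with a single space.
    text = re.sub(r" +", " ", text)
    # Assumption: trim and lower-case, as the rule's expectation ignores both.
    return text.strip().lower()


def label_from_text_nodes(text_nodes):
    """Concatenate text nodes, inserting a space at each element boundary."""
    return " ".join(text_nodes)


def label_in_name(text_nodes, accessible_name):
    """Is the normalized label a substring of the normalized name?"""
    return normalize(label_from_text_nodes(text_nodes)) in normalize(accessible_name)


print(normalize("123.456.7890"))                                      # 1 2 3 4 5 6 7 8 9 0
print(label_in_name(["Hello", "world"], "Hello world"))               # True
print(label_in_name(["Some article", "by John Doe"],
                    "Some article by John Doe"))                      # True
print(label_in_name(["123.456.7890"], "Call 1 2 3. 4 5 6. 7 8 9 0."))  # True
```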

@Jym77
Collaborator

Jym77 commented Mar 23, 2023

I think that this should globally work.

  • For each HTML element start/stop, insert a space.

This might actually depend on the element's display or something like that. <span>He</span><span>llo</span> should not have extra spaces. Which ends up being a tricky problem, also for accessible name computation: w3c/accname#15 🙈

@dan-tripp-siteimprove
Collaborator

That's an interesting discussion over there at the W3C. Based on that, how about something like this:

Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.)
Let 'name' be the accessible name. ('name' is also a string.)

To normalize a string:

  • Replace each non-text character (eg. punctuation, emoji) with a space
    • Judgement of "non-text" can't be fully automated eg. "X" for "close", "+" for "zoom in"
  • Insert a space before and after each digit
  • Replace each run of whitespace (of all kinds) with a single space character
    • i.e. do a regex replacement like s/\s+/ /g

Then do the check "is the normalized 'label' a substring of the normalized 'name'"?

@dan-tripp-siteimprove
Collaborator

Here's an updated idea for an algorithm (to deal with this case: <a href="#" aria-label="Discover Italy">Discover it</a>):

Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.)
Let 'name' be the accessible name. ('name' is also a string.)

Algorithm to tokenize a string:

  • Replace each non-text character (eg. punctuation, emoji) with a space
    • Judgement of "non-text" can't be fully automated eg. "X" for "close", "+" for "zoom in"
  • Insert a space before and after each digit
  • Split the string into a list of strings, using a whitespace regex as the separator.

Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?
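The difference between the substring check and the token-based check can be sketched as follows. This is a minimal illustration: the regex-based "non-text" approximation, the helper names, and the lower-casing done by the caller are all assumptions.

```python
import re


def tokenize(text):
    """Split a string into word tokens, roughly following the steps above."""
    # Assumption: approximate "non-text" as anything that is not a letter,
    # digit, or whitespace.
    text = re.sub(r"[^\w\s]|_", " ", text)
    # Insert a space before and after each digit.
    text = re.sub(r"(\d)", r" \1 ", text)
    # str.split() with no argument splits on whitespace runs and drops empty
    # leading/trailing pieces, so an all-whitespace string gives [].
    return text.split()


def is_sublist(needle, haystack):
    """True if `needle` occurs in `haystack` as a run of consecutive elements."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


# Lower-cased by hand, since the rule's expectation ignores case.
label, name = "discover it", "discover italy"

print(label in name)                                # True  (substring check)
print(is_sublist(tokenize(label), tokenize(name)))  # False (token check)
print(is_sublist(tokenize("next page"),
                 tokenize("next page in the list")))  # True
```

Comparing whole tokens is what stops "it" from matching the start of "Italy".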

@dan-tripp-siteimprove
Collaborator

Updating my idea for an algorithm again. And adding some test cases, from this issue and others.

Test cases:

  • <a href="#" aria-label="Discover Italy">Discover it</a>
    • Desired behaviour: fail this rule
  • <a href="#" aria-label="non-standard">nonstandard</a>
  • <div role="button" aria-label="Hello world"><p>Hello</p><p>world</p></div>
    • Desired behaviour: pass this rule
  • <a href="#" aria-label="Some article by John Doe"><h6>Some article</h6><p>by John Doe</p></a>
    • Desired behaviour: pass this rule
  • <a aria-label="Download specification" href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>
    • Desired behaviour: fail this rule
    • The "accessibly-hidden" class should:
  • <a aria-label="Download specification" href="#">Download <span style="display: none">gizmo</span> specification</a>
    • Desired behaviour: pass this rule
  • <button aria-label="anything">X</button>
    • Desired behaviour: pass this rule
  • <a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
    • Desired behaviour: pass this rule
  • <a aria-label="just ice" href="#">justice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="justice" href="#">just ice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="WAVE" href="#">W A V E</a>
    • Desired behaviour: fail this rule
  • <button aria-label="Next Page in the list">Next Page</button>
    • Desired behaviour: pass this rule
  • <a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
    • Desired behaviour: fail this rule
  • <a href="#2021" aria-label="20 21">2021</a>
    • Desired behaviour: fail this rule
  • <a href="#2021" aria-label="twenty twenty-one">two thousand twenty-one</a>
    • Desired behaviour: fail this rule
  • <a aria-label="1a" href="#">1</a>
    • Desired behaviour: pass this rule
  • <a aria-label="compose email" href="#">Compose &nbsp;&nbsp;<br> email</a>
    • Desired behaviour: pass this rule
  • <a aria-label="two zero two three" href="#">2 0 2 3</a>
    • Desired behaviour: fail this rule

Algorithm:

Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.)
Let 'name' be the accessible name. ('name' is also a string.)

Algorithm to tokenize a string:

  • For each character that either a) represents non-text content, or b) isn't a letter or a digit: replace that character with a space character.
    • For a) Judgement of "non-text" probably can't be fully automated. eg. "X" for "close" probably can be, but presumably there are more cases than this.
    • For b) Use the Unicode classes Letter, Mark, and "Number, Decimal Digit [Nd]".
  • Insert a space character before and after each digit
    • As per the Unicode class "Number, Decimal Digit [Nd]".
  • Split the string into a list of strings, using a whitespace regex as the separator.
    • This 'split' operation should:
      • Effectively remove leading and trailing whitespace as a pre-processing step.
      • If the string was all whitespace before this operation: result in an empty list.

Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?

  • This 'sublist' check has these properties:
    • It checks whether elements are consecutive or not. i.e. it checks for a substring, in the computer science sense of the term. Not a subsequence.
    • An empty list is a sublist of any list.
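The character classification in part b) and the digit-spacing step could look roughly like this in Python, using the standard `unicodedata` module. Part a), the judgement of non-text content, is not attempted, and the function name is an assumption.

```python
import unicodedata


def tokenize(text):
    """Tokenize per part b) and the digit rule; part a) is not attempted."""
    pieces = []
    for ch in text:
        category = unicodedata.category(ch)  # e.g. "Lu", "Mn", "Nd", "Po"
        if category == "Nd":
            # Decimal digit: keep it, with a space before and after.
            pieces.append(f" {ch} ")
        elif category[0] in "LM":
            # Letter or Mark: keep as-is.
            pieces.append(ch)
        else:
            # Everything else (punctuation, symbols, whitespace, emoji, ...)
            # becomes a space.
            pieces.append(" ")
    # Splitting on whitespace drops leading/trailing whitespace and yields an
    # empty list for an all-whitespace string.
    return "".join(pieces).split()


print(tokenize("123.456.7890"))       # ['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
print(tokenize("twenty twenty-one"))  # ['twenty', 'twenty', 'one']
print(tokenize("1a"))                 # ['1', 'a']
print(tokenize("   "))                # []
```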

@dan-tripp-siteimprove
Collaborator

dan-tripp-siteimprove commented Apr 26, 2023

Another draft:

Test cases:

  • <a href="#" aria-label="Discover Italy">Discover it</a>
    • Desired behaviour: fail this rule
  • <a href="#" aria-label="non-standard">nonstandard</a>
  • <button aria-label="how are you"><span>you</span><span>how</span><span>are</span></button>
    • Desired behaviour: fail this rule
  • <button aria-label="AbCdE">aBcDe</button>
    • Desired behaviour: pass this rule
  • <div role="button" aria-label="Hello world"><p>Hello</p><p>world</p></div>
    • Desired behaviour: pass this rule
  • <a href="#" aria-label="Some article by John Doe"><h6>Some article</h6><p>by John Doe</p></a>
    • Desired behaviour: pass this rule
  • <a aria-label="Download specification" href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>
    • Desired behaviour: fail this rule
    • The "accessibly-hidden" class should:
  • <a aria-label="Download specification" href="#">Download <span style="visibility: hidden">the</span> <span style="display: none">gizmo</span> specification</a>
    • Desired behaviour: pass this rule
  • <button aria-label="anything">X</button>
    • Desired behaviour: pass this rule
  • <a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
    • Desired behaviour: pass this rule
  • <a aria-label="just ice" href="#">justice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="justice" href="#">just ice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="WAVE" href="#">W A V E</a>
    • Desired behaviour: fail this rule
  • <button aria-label="Next Page in the list">Next Page</button>
    • Desired behaviour: pass this rule
  • <a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
    • Desired behaviour: pass this rule
  • <a href="#2021" aria-label="20 21">2021</a>
    • Desired behaviour: pass this rule
  • <a href="#2021" aria-label="twenty twenty-one">two thousand twenty-one</a>
    • Desired behaviour: fail this rule
  • <a aria-label="1a" href="#">1</a>
    • Desired behaviour: pass this rule
  • <a aria-label="compose email" href="#">Compose &nbsp;&nbsp;<br> email</a>
    • Desired behaviour: pass this rule
  • <a aria-label="two zero two three" href="#">2 0 2 3</a>
    • Desired behaviour: fail this rule

Algorithm:

Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.)
Let 'name' be the accessible name. ('name' is also a string.)

Algorithm to tokenize a string:

  • Convert the string to lower case.
  • For each character that either a) represents non-text content, or b) isn't a letter or a digit: replace that character with a space character.
    • For a) Judgement of "non-text" probably can't be fully automated. eg. "X" for "close" probably can be, but presumably there are more cases than this.
    • For b) Use the Unicode classes Letter, Mark, and "Number, Decimal Digit [Nd]". (This will exclude hyphens, punctuation, emoji, and more.)
  • Insert a space character before and after each digit
    • As per the Unicode class "Number, Decimal Digit [Nd]".
  • Split the string into a list of strings, using a whitespace regex as the separator.
    • This 'split' operation should:
      • Effectively remove leading and trailing whitespace as a pre-processing step.
      • If the string was all whitespace before this operation: result in an empty list.

Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?

  • This 'sublist' check has these properties:
    • It checks whether elements are consecutive or not. i.e. it checks for a substring, in the computer science sense of the term. Not a subsequence.
    • An empty list is a sublist of any list.
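As a sanity check, this draft can be sketched in Python and run against some of the test cases listed above. The inner text of each element is typed in by hand rather than computed by a browser, part a) (non-text judgement, e.g. the "X" button) is not implemented, and the names `tokenize`, `is_sublist`, and `CASES` are assumptions for illustration.

```python
import unicodedata


def tokenize(text):
    """Lower-case, keep letters/marks, space out digits, split on whitespace."""
    pieces = []
    for ch in text.lower():
        category = unicodedata.category(ch)
        if category == "Nd":
            pieces.append(f" {ch} ")   # digit: surround with spaces
        elif category[0] in "LM":
            pieces.append(ch)          # letter or mark: keep
        else:
            pieces.append(" ")         # anything else: replace with a space
    return "".join(pieces).split()


def is_sublist(needle, haystack):
    """Consecutive-run check; an empty list is a sublist of any list."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


# (label, accessible name, desired behaviour: True = pass, False = fail)
CASES = [
    ("Discover it", "Discover Italy", False),
    ("aBcDe", "AbCdE", True),
    ("123.456.7890", "Call 1 2 3. 4 5 6. 7 8 9 0.", True),
    ("justice", "just ice", False),
    ("just ice", "justice", False),
    ("W A V E", "WAVE", False),
    ("Next Page", "Next Page in the list", True),
    ("fibonacci: 0112358132134", "fibonacci: 0 1 1 2 3 5 8 13 21 34", True),
    ("2021", "20 21", True),
    ("two thousand twenty-one", "twenty twenty-one", False),
    ("1", "1a", True),
    ("2 0 2 3", "two zero two three", False),
]

for label, name, desired in CASES:
    actual = is_sublist(tokenize(label), tokenize(name))
    status = "ok" if actual == desired else "MISMATCH"
    print(f"{status}: {label!r} vs {name!r} -> {'pass' if actual else 'fail'}")
```

For the cases included here, the sketch agrees with the desired behaviours in this draft.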

@carlosapaduarte
Member

carlosapaduarte commented May 11, 2023

@dan-tripp-siteimprove in the last CG meeting we agreed to update some of those examples. These are the ones:

  • <a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
    • Desired behaviour: fail this rule
  • <a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
    • Desired behaviour: fail this rule
  • <a href="#2021" aria-label="20 21">2021</a>
    • Desired behaviour: fail this rule
  • <a aria-label="1a" href="#">1</a>
    • Desired behaviour: fail this rule

The reasoning is summarised in Jean-Yves' comment on issue 1615.

@dan-tripp-siteimprove
Collaborator

Okay, I think I'm starting to get it. Thank you. I'll try to follow up soon.

@dan-tripp-siteimprove
Collaborator

dan-tripp-siteimprove commented May 19, 2023

Here's another draft, in light of these recent discussions:
#1458 (comment)
https://github.com/w3c/wcag/pull/2725/files

Test cases:

  • <a href="#" aria-label="Discover Italy">Discover it</a>
    • Desired behaviour: fail this rule
  • <a href="#" aria-label="non-standard">nonstandard</a>
  • <button aria-label="how are you"><span>you</span><span>how</span><span>are</span></button>
    • Desired behaviour: fail this rule
  • <button aria-label="AbCdE">aBcDe</button>
    • Desired behaviour: pass this rule
  • <div role="button" aria-label="Hello world"><p>Hello</p><p>world</p></div>
    • Desired behaviour: pass this rule
  • <a href="#" aria-label="Some article by John Doe"><h6>Some article</h6><p>by John Doe</p></a>
    • Desired behaviour: pass this rule
  • <a aria-label="Download specification" href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>
    • Desired behaviour: fail this rule
    • The "accessibly-hidden" class should:
  • <a aria-label="Download specification" href="#">Download <span style="visibility: hidden">the</span> <span style="display: none">gizmo</span> specification</a>
    • Desired behaviour: pass this rule
  • <button aria-label="anything">X</button>
    • Desired behaviour: pass this rule
  • <a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
    • Desired behaviour: fail this rule
  • <a aria-label="just ice" href="#">justice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="justice" href="#">just ice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="WAVE" href="#">W A V E</a>
    • Desired behaviour: fail this rule
  • <button aria-label="Next Page in the list">Next Page</button>
    • Desired behaviour: pass this rule
  • <a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
    • Desired behaviour: fail this rule
  • <a href="#2021" aria-label="20 21">2021</a>
    • Desired behaviour: fail this rule
  • <a href="#2021" aria-label="twenty twenty-one">two thousand twenty-one</a>
    • Desired behaviour: fail this rule
  • <a aria-label="1a" href="#">1</a>
    • Desired behaviour: fail this rule
  • <a aria-label="compose email" href="#">Compose &nbsp;&nbsp;<br> email</a>
    • Desired behaviour: pass this rule
  • <a aria-label="two zero two three" href="#">2 0 2 3</a>
    • Desired behaviour: fail this rule
  • <button aria-label="Search by date">Search by date (YYYY-MM-DD)</button>
  • <button aria-label="Next">Next…</button>
    • Desired behaviour: pass this rule
  • <button aria-label="11 times 3 equals 33">11×3=33</button>
    • Desired behaviour: fail this rule

The algorithm below implements all of the "desired behaviours" above correctly, I think.

Algorithm:

Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.)
Let 'name' be the accessible name. ('name' is also a string.)

To tokenize a string:

  • Convert the string to lower case.
  • For each character that either a) represents non-text content, or b) isn't a letter or a digit: replace that character with a space character.
    • For a) Judgement of "non-text" probably can't be fully automated. eg. "X" for "close" probably can be, but presumably there are more cases than this.
    • For b) Use the Unicode classes Letter, Mark, and "Number, Decimal Digit [Nd]". (This will exclude hyphens, punctuation, emoji, and more.)
  • Remove all characters that are within parentheses (AKA round brackets).
    • Ignore square brackets and braces.
  • Split the string into a list of strings, using a whitespace regex as the separator.
    • This 'split' operation should:
      • Effectively remove leading and trailing whitespace as a pre-processing step.
      • If the string was all whitespace before this operation: result in an empty list.

Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?

  • This 'sublist' check has these properties:
    • It checks whether elements are consecutive or not. i.e. it checks for a substring, in the computer science sense of the term. Not a subsequence.
    • An empty list is a sublist of any list.
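A Python sketch of this draft, with two caveats. First, it strips parenthesised text before replacing non-letter characters, because running the steps in the listed order would turn the parentheses into spaces before they could be matched; that reordering is an assumption. Second, part a) (non-text judgement) is not implemented, and inner text is typed in by hand. All names are hypothetical.

```python
import re
import unicodedata


def tokenize(text):
    """Lower-case, drop parenthesised text, keep letters/marks/digits, split."""
    text = text.lower()
    # Remove round-bracketed runs (non-nested); square brackets and braces are
    # left to the generic replacement below.
    text = re.sub(r"\([^()]*\)", " ", text)
    kept = []
    for ch in text:
        category = unicodedata.category(ch)
        kept.append(ch if category[0] in "LM" or category == "Nd" else " ")
    return "".join(kept).split()


def is_sublist(needle, haystack):
    """Consecutive-run check; an empty list is a sublist of any list."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


# (label, accessible name, desired behaviour: True = pass, False = fail)
CASES = [
    ("123.456.7890", "Call 1 2 3. 4 5 6. 7 8 9 0.", False),
    ("fibonacci: 0112358132134", "fibonacci: 0 1 1 2 3 5 8 13 21 34", False),
    ("2021", "20 21", False),
    ("1", "1a", False),
    ("Next Page", "Next Page in the list", True),
    ("Next…", "Next", True),
    ("11×3=33", "11 times 3 equals 33", False),
    ("2 0 2 3", "two zero two three", False),
]

for label, name, desired in CASES:
    actual = is_sublist(tokenize(label), tokenize(name))
    status = "ok" if actual == desired else "MISMATCH"
    print(f"{status}: {label!r} vs {name!r} -> {'pass' if actual else 'fail'}")

print(tokenize("Search by date (YYYY-MM-DD)"))  # ['search', 'by', 'date']
```

Dropping the digit-spacing step is what flips the phone number, "2021", and "1a" cases to failures, in line with the CG discussion above.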
