Requirements for a WP title #24

Closed
deborahgu opened this issue Aug 9, 2017 · 24 comments

@deborahgu

deborahgu commented Aug 9, 2017

Issue #20 addresses the question of whether a minimal viable manifest requires a title. This issue separates that out from questions about the manifest. Regardless of whether the title is encoded in a manifest or not, what are the minimum requirements for a title in a WP?

This is not necessarily a blocker for first public working draft.

Proposal

  • A WP requires a title.
  • Sufficient: a title is defined somewhere in the WP's metadata.
  • Fallback: a WP contains a single primary resource with a title element appropriate for its type (e.g. SVG, HTML titles), and that becomes the WP's title.
  • Fallback: a WP contains multiple primary resources with a title element appropriate for their type (e.g. SVG, HTML titles). The first one becomes the WP's title.
    • This one is imperfect from a usability and accessibility standpoint, but is an adequate fallback.
  • A WP contains no title attribute and no primary resources with a title element appropriate for their type (e.g. SVG, HTML titles). This is non-conformant.
    • A URL is not a fallback title
    • A filename is not a fallback title
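The fallback chain proposed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `WebPublication`, `PrimaryResource`, and `resolve_title` are hypothetical names, no WP data model is defined in this issue, and "the first one" is read here as the first primary resource that actually carries a title.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PrimaryResource:
    """Hypothetical stand-in for a primary resource (e.g. an HTML or SVG file)."""
    url: str
    title: Optional[str] = None  # content of the resource's own title element, if any


@dataclass
class WebPublication:
    """Hypothetical stand-in for a WP; not a model defined by any spec."""
    metadata_title: Optional[str] = None
    primary_resources: List[PrimaryResource] = field(default_factory=list)


def resolve_title(wp: WebPublication) -> Optional[str]:
    """Return the WP title per the proposed chain, or None if no title exists."""
    # 1. Sufficient: a title defined somewhere in the WP's metadata.
    if wp.metadata_title and wp.metadata_title.strip():
        return wp.metadata_title.strip()
    # 2. Fallback: the first primary resource that carries a title
    #    (this also covers the single-resource case).
    for resource in wp.primary_resources:
        if resource.title and resource.title.strip():
            return resource.title.strip()
    # 3. No title anywhere: non-conformant under the proposal.
    #    URLs and filenames are deliberately not used as fallbacks.
    return None
```

A caller would treat a `None` result as the non-conformant case and decide separately how to handle it.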

Rationale

  1. Discovery / Organization
  2. Accessibility: EPUB Accessibility uses title in its example of how accessibility needs to apply to a publication as a whole, not merely the component documents:

Consequently, when evaluating the accessibility of an EPUB Publication, individual pages — or Content Documents, as they are known in EPUB nomenclature — cannot be reviewed in isolation. Rather, their overall accessibility as parts of a larger work also has to be evaluated.

For example, it is not sufficient for individual Content Documents to have a logical reading order if the publication presents them in the wrong order. Likewise, including a title for every Content Document is complementary to providing a title for the publication: the overall accessibility is affected if either is missing.

Authoring

An authoring tool can create a title, when one isn't provided, however it chooses. Just as Word used to make the first string of a text document into the doc title, and some blogging platforms used to make the filename into the alt text, an authoring tool can choose to enforce useful titles or put in less useful fallbacks.

Therefore what @dauwhe called 'documents' made in WP tools, and what @lrosenthol called 'ad hoc publications', impose no onerous work on the user, provided the authoring tool handles titling for them.

@dauwhe
Contributor

dauwhe commented Aug 9, 2017

This sounds good to me. There have been proposals for web publications that consist only of images (like various comic formats). I think we do need to have a human-readable title somewhere, and I agree that a filename fallback is entirely inadequate.

@rdeltour
Member

rdeltour commented Aug 9, 2017

👍 on the proposal.

A WP contains no title attribute and no primary resources with a title element appropriate for their type (e.g. SVG, HTML titles). This is non-conformant.

I'm just not sure about the meaning of "non-conformant".
In any case, the spec could say that the UA would then have to give a title to the publication, whether generating one (e.g. from the content, the first h1, or whatever other option), or by prompting a user for a title.

@danielweck
Member

I am unsure about "non-conformant" either. Does this mean that a user agent (let's say, a web browser) would implement some kind of user experience / interface to emit an error message when attempting to load the empty-title Web Publication? Or are you alluding to conformance in terms of validating processors? (e.g. content validation as per a specific profile, e.g. EPUB4+, or perhaps even PWP)

@BillKasdorf

I suggest rewording that bullet point to say:

A WP that contains no title attribute and no primary resources with a title element appropriate for their type (e.g. SVG, HTML titles) is non-conformant.

@dauwhe
Contributor

dauwhe commented Aug 9, 2017

Rather than nonconformance, perhaps the final fallback title should be "The publication author thought so little of their audience that they didn't provide a title." This would be localized, of course.

@lrosenthol

lrosenthol commented Aug 9, 2017 via email

@baldurbjarnason
Contributor

Would it be helpful if we added something like this?:

"The behaviour of non-conformant publications is intentionally left unspecified. It is the User Agent's responsibility to handle non-conformant publications in whichever manner provides the best user experience and accessibility as is possible in each scenario."

@mattgarrish
Member

"The behaviour of non-conformant publications is intentionally left unspecified. It is the User Agent's responsibility to handle non-conformant publications in whichever manner provides the best user experience and accessibility as is possible in each scenario."

I tend to prefer this to defining our own heuristics that may or may not net anything useful for the reader. We're not doing any better than letting the UA figure it out.

If we can't figure out a solid case for a title in the manifest, either it isn't important or including one will become a de facto standard thing to do as people start using the standard.

I tend to prefer the consistency of just saying all manifests need a title, if only because the content is not guaranteed to carry one, or to make it easily machine-discoverable.

I understand Leonard's argument that they may be meaningless, but in those cases I don't really care if a meaningless title is inserted, and neither will the person receiving the document. The author can decide when a meaningful one is necessary, like for search engine optimization.

It feels like we're spending way too much energy on this.

@baldurbjarnason
Contributor

@mattgarrish

I intended that note to be a suggestion for an addition to what @deborahgu wrote in the issue itself, not as a replacement. I think having a clear heuristic for common cases is really valuable especially as it potentially simplifies the authorship of single primary resource publications.

The only question in my mind is what to do when the outlined fallbacks fail and the publication has no discernible title. And my comment was a suggestion that we leave the exact response to that scenario up to the user agent as there is no solution in that scenario that is universally acceptable.

I'm fine with labelling no-title-whatsoever publications as non-conformant, as authors should be aware that the behaviour of such publications is not covered by any of these (hypothetical) specifications and is therefore unpredictable.

@rdeltour
Member

rdeltour commented Aug 9, 2017

Would it be helpful if we added something like this?:

"The behaviour of non-conformant publications is intentionally left unspecified. It is the User Agent's responsibility to handle non-conformant publications in whichever manner provides the best user experience and accessibility as is possible in each scenario."

@baldurbjarnason do you mean as a general statement for the whole spec? or specific to this section?
The latter is preferable IMO, and I would be more specific and say something along the lines of "must provide one of their own choosing" (as @lrosenthol suggests, but with a must).

@baldurbjarnason
Contributor

@rdeltour

I meant specific to this section, yes. I'd also be fine with just adding "if a User Agent needs to provide a user with the title for a WP, and one is not provided, the UA MUST provide one of its own choosing" to the process outlined in the issue itself, as per your and @lrosenthol's suggestion.

@mattgarrish
Member

as it potentially simplifies the authorship of single primary resource publications

If that's really what we're trying to do, we should allow embedding of the manifest with a clear rule that the title can be omitted only when the carrying document provides it.

@baldurbjarnason
Contributor

@mattgarrish

If that's really what we're trying to do, we should allow embedding of the manifest with a clear rule that the title can be omitted only when the carrying document provides it.

You can solve multiple things at the same time. The proposed solution increases the odds of any given publication having a useful, accessible title (by providing a couple of fallbacks), increases consistency across User Agents in handling edge cases, and makes authorship of a single resource easier. Each of these things is individually a positive.

Also, embedding is a completely different topic which has numerous consequences on its own for processing, authorship, and general rendering. And it presupposes a non-HTML format which isn't something we have consensus on at the moment. Bringing embedding in makes things much more complicated again for this issue.

It would be helpful if you could outline specifically why you object to the proposed solution and why we can't just accept it, with the suggested alterations, and move on.

It isn't clear from your comments why you think this discussion is a waste of our energies.

@mattgarrish
Member

I don't object to guidance if a required title is missing.

What I don't find palatable is making the user agent have to solve a piece of metadata that the author should have provided in the first place, so that the author doesn't have to specify it.

How many authors are going to read the specification closely enough to figure out that the reason they don't have to specify a title is because of these other rules? How many are going to realize they're supposed to be taking care to get the title right in their first primary resource? What kind of consistency does it really lead to if different user agents rate the likelihood of finding a title in a resource differently?

And what use cases does this computational hoop-jumping make more difficult? What does a spider do with a manifest with no name? What if you want to share a manifest? Why does title skipping rate as such an important objective?

I'm under no illusion that requiring the title is going to make everyone put one in the manifest, or make it meaningful, but I'm not clear why that is a compelling argument not to mandate one. It hasn't stopped HTML from mandating one.

A manifest without a title should be invalid, and if you choose to ignore the error so be it. If you can go to the effort of making a manifest, including a title is not exactly a deal breaker in my book.
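A minimal sketch of that position, assuming a manifest that has already been parsed into a dictionary with a `title` key. The manifest format, the key name, and both function names are hypothetical, since no format has been agreed on; the "Untitled" placeholder is likewise just one way a user agent might ignore the error.

```python
from typing import Any, Dict, List


def validate_manifest(manifest: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors: List[str] = []
    title = manifest.get("title")  # "title" is an assumed key name
    if not isinstance(title, str) or not title.strip():
        errors.append("manifest must contain a non-empty 'title'")
    return errors


def display_title(manifest: Dict[str, Any], placeholder: str = "Untitled") -> str:
    """Title a user agent shows; it may ignore the error and use a placeholder."""
    if validate_manifest(manifest):
        return placeholder
    return manifest["title"].strip()
```

Keeping validation and display separate means a validator can report the missing title as an error while a user agent remains free to carry on.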

@baldurbjarnason
Contributor

baldurbjarnason commented Aug 9, 2017

@mattgarrish

I don't object to guidance if a required title is missing.

What I don't find palatable is making the user agent have to solve a piece of metadata that the author should have provided in the first place, so that the author doesn't have to specify it.

So your objection is a moral one, not a practical one? (ETA: apologies, this remark was uncalled for)

How many authors are going to read the specification closely enough to figure out that the reason they don't have to specify a title is because of these other rules? How many are going to realize they're supposed to be taking care to get the title right in their first primary resource? What kind of consistency does it really lead to if different user agents rate the likelihood of finding a title in a resource differently?

If they follow the guidance as proposed in this issue (which is specific about the HTML file's title as specified in <title> and not some sort of grab-bag heuristic) the variation between UAs should be limited to cases where there is no discernible title.
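As a rough illustration of reading the title from `<title>` only, rather than from a grab-bag heuristic, here is a sketch using Python's standard-library `html.parser`. The names `_TitleParser` and `html_title` are hypothetical, and a real user agent would rely on its own HTML parser.

```python
from html.parser import HTMLParser
from typing import List, Optional


class _TitleParser(HTMLParser):
    """Collects the text of the first <title> element and nothing else."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._in_title = False
        self._done = False
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self._chunks.append(data)

    @property
    def title(self) -> Optional[str]:
        # Collapse runs of whitespace; an empty title counts as no title.
        text = " ".join("".join(self._chunks).split())
        return text or None


def html_title(markup: str) -> Optional[str]:
    """Return the document's <title> text, or None if absent or empty."""
    parser = _TitleParser()
    parser.feed(markup)
    parser.close()
    return parser.title
```

Note that the sketch never looks at headings such as `h1`; a document without a `<title>` simply yields `None`.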

Authors shouldn't have to know why something works, just that it works and is consistent.

And what use cases does this computational hoop-jumping make more difficult? What does a spider do with a manifest with no name? What if you want to share a manifest? Why does title skipping rate as such an important objective?

People are going to omit titles from the manifest, no matter what. Look how unusable atom files are without titles and yet it happens all the time. The crawler can follow the spec to get the same results as the UAs. And most crawlers are going to be fetching at least the text-based primary resources if they are doing anything interesting.

Having a title fallback heuristic is important because the title is important, but this is also a question of a general guiding design principle. My objective is to have a robust format that has realistic expectations of its authors based on how events have panned out in the web community in the past and provides a consistent and useful behaviour even under adverse circumstances. Like I've said elsewhere, this is a general design principle: don't rely on author conformance; but promote conformance as being the best, most featured, and reliable method of authorship; make as many features as possible optional; and tell UAs how to handle variances in predictable ways.

I'm under no illusion that requiring the title is going to make everyone put one in the manifest, or make it meaningful, but I'm not clear why that is a compelling argument not to mandate one. It hasn't stopped HTML from mandating one.

HTML mandates a title. But it lets files omit it under some circumstances when the title can be recreated from other sources:

The title element is a required child in most situations, but when a higher-level protocol provides title information, e.g. in the Subject line of an e-mail when HTML is used as an e-mail authoring format, the title element can be omitted. https://html.spec.whatwg.org/multipage/semantics.html#the-head-element

and

If the document is an iframe srcdoc document or if title information is available from a higher-level protocol: Zero or more elements of metadata content, of which no more than one is a title element and no more than one is a base element. https://html.spec.whatwg.org/multipage/semantics.html#the-head-element

It also defines in extreme detail how exactly to parse invalid documents to recreate HTML's required structure. And it does not expect authors to understand why their invalid documents work, just that valid documents are better and more reliable. It's an approach that's completely different to ePub's and XML's and it has worked quite well for the web as a whole.

We can scold all we want in our specs but we have to author them to be robust in predictable ways.

We could just say "don't do that" in the spec but that doesn't help anybody figure out what's supposed to happen when somebody does do that, which will happen frequently. Leaving it entirely up to chance leads to big messes. Epub's assumed rigidity and lack of clarity as to what to do in edge cases is a large part of what makes it a nightmare to work with.

A manifest without a title should be invalid, and if you choose to ignore the error so be it. If you can go to the effort of making a manifest, including a title is not exactly a deal breaker in my book.

This isn't a question of just the title but of a general principle. If a feature is important, as title is for accessibility, do we try to maximise the odds of the end user having that feature or do we try to force authors to include it with moralising and lectures?

@iherman
Member

iherman commented Aug 10, 2017

@baldurbjarnason, I am not sure where you are going with your #24 (comment). What is your proposal in connection with this issue? I have the impression that @deborahgu gave a clear set of fallbacks when she opened the issue, but there is a point where those fallbacks do not really work (filenames, for example, are false solutions as far as accessibility goes). What do you propose we do at that point? Or do you agree with her proposal? I was not sure after reading your comment.

I agree with you that, whenever it is reasonable and feasible, we should provide a well documented set of fallbacks in our spec; @deborahgu does just that. But even the HTML spec does not do that everywhere. You quote the section on head, but, in fact, what it says about the title element is fairly vague:

NOTE:
The title element is a required child in most situations, but when a higher-level protocol provides title information, e.g., in the Subject line of an e-mail when HTML is used as an e-mail authoring format, the title element can be omitted.

(see https://www.w3.org/TR/html51/document-metadata.html#the-head-element)

I did not find anything more on that subject; maybe I missed something.

@iherman
Member

iherman commented Aug 10, 2017

On the conformance issue: just as a reminder, the W3C spec rules require a conformance clause to be included. What that clause will include is up to the Working Group, and I find it perfectly fine if we separate the conformance of the WP as a publication from the conformance of a UA. We have to keep in mind, though, that conformance clauses should be, in theory, checkable, and we should be able to demonstrate that, in some way, during our Candidate Recommendation phase.

However, this is way down the line. I would propose that we not aim to clarify all these issues in full detail right now. Our next milestone is a First Public Working Draft, i.e., our set of stakes in the ground that tell the world what we want to do, asking/hoping for public comments as well as for other institutions to join the group and add their knowledge and experience once they have a clearer idea of what we want to do.

B.t.w., @baldurbjarnason proposed, in #24 (comment):

Would it be helpful if we added something like this?:

"The behaviour of non-conformant publications is intentionally left unspecified. It is the User Agent's responsibility to handle non-conformant publications in whichever manner provides the best user experience and accessibility as is possible in each scenario."

Although he stated (in #24 (comment)) that he meant this for the original issue, I find this statement perfectly fine for the document at large, provided it is clear what is and what is not conformant.

@mattgarrish
Member

I think you're reading too much into my posts. I didn't say this is a "waste of energy". I said we're spending a lot of time finding a solution to something that I personally think is a necessary piece of information, and that I don't think should be dependent on digging into the publication resources.

The question of the issue is what are the titling requirements, and my response is a title is required in the manifest. Not for accessibility reasons, but because it's an important piece of information, or important enough that the reasons for omitting it don't outweigh the costs of it being considered optional.

Whatever wants to learn about the publication shouldn't have to dig to get the title, except as a last resort because the creator has intentionally or accidentally omitted it. And if a user agent just wants to call the publication "untitled" and move on when it doesn't find the title in the manifest, it should be a conforming UA. It shouldn't have to do any additional steps.

This kind of guidance is useful for those UAs that do want to try and compile a title, as I agreed with you in another thread the other day. I'm not objecting to there being a way to compile a title in cases where it isn't present. But I think this sort of title generation belongs on the authoring tool side. If it comes up with a meaningless title, at least the author knows what meaningless title it came up with.

Having a small set of requirements is not moralizing or lecturing authors. It's establishing a specific design principle for publications, in this case that they have a name and that it is easily found.

And more as an aside, I'm aware of the HTML prose about a higher-level protocol, but it doesn't apply to this situation, IMO. A corollary here would be that you don't need to specify a title in the publication content because it's available in the manifest (although I don't agree with this because of the vanilla browser scenario). The manifest is the higher-level protocol because it's clearly identifiable and would be known to have a title. It could also be stretched that the title can be inferred from a document that contains the manifest. I still think this latter case would be one scenario for which we could have an equivalent rule for omitting titles, but that's a problem we haven't gotten to yet.

@mattgarrish
Member

But I think this sort of title generation belongs on the authoring tool side.

Sorry, meant to add a "too" here. Not exclusive.

@GarthConboy
Contributor

GarthConboy commented Aug 10, 2017

A belated "+1" to the original @deborahgu proposal.

@mattgarrish
Member

Not to rehash this, but looking at these:

A URL is not a fallback title
A filename is not a fallback title

WCAG in fact states:

Examples of text that are not titles include:
...
Filenames that are not descriptive in their own right, such as "report.html" or "spk12.html"

https://www.w3.org/TR/2016/NOTE-WCAG20-TECHS-20161007/F25

Any fallback heuristics should avoid making judgments about the quality of what is found, as the user agent cannot be responsible for what it creates if the author abrogates responsibility.

A user agent also cannot be told to use its own heuristics and yet be barred from defaulting to a URL or filename, which is what the current wording seems to suggest:

otherwise, the UA uses its own heuristics. Note that a URL or a filename is not considered to be a valid title.

What is it supposed to do if it can't find a title? If you review the common failures, authoring tool default titles are failures, so what does that leave the user agent to use? Or are we saying it must use a clear failure in place of a potential one?

What I would like to do here is replace the note with one that authors need to ensure that a meaningful title is provided (with reference to WCAG for meaningful), or easily found in the fallback chain to avoid a user agent having to generate one that is not meaningful.
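For the authoring side, a tool that wants to warn an author about a title that looks like a URL or a filename could use a rough check like the one below. This is a sketch only: `looks_like_url_or_filename`, the regular expression, and the list of URL schemes are assumptions made for illustration, and the check cannot tell whether a title is "descriptive in its own right" in the WCAG sense.

```python
import re
from urllib.parse import urlparse

# Assumed pattern: no whitespace or path separators, then a short extension.
_FILENAME = re.compile(r"^[^\s/\\]+\.[A-Za-z0-9]{1,5}$")

# Assumed set of schemes treated as "this is a URL".
_URL_SCHEMES = {"http", "https", "ftp", "file"}


def looks_like_url_or_filename(candidate: str) -> bool:
    """Heuristically flag titles that are probably a URL or a bare filename."""
    text = candidate.strip()
    parsed = urlparse(text)
    if parsed.scheme in _URL_SCHEMES and (parsed.netloc or parsed.path):
        return True
    return bool(_FILENAME.match(text))
```

Because a legitimate title such as "Node.js" would also be flagged, a result of `True` is better treated as a prompt for the author than as a hard failure.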

@lrosenthol

lrosenthol commented Aug 20, 2017 via email

@TzviyaSiegman
Contributor

This issue is resolved with #20

@iherman
Member

iherman commented Aug 29, 2017

See telco discussion on closure.
