Requirements for a WP title #24

deborahgu · 2017-08-09T14:34:01Z

Issue #20 addresses the question of whether a minimal viable manifest requires a title. This issue separates that question from questions about the manifest. What are the minimum requirements for a title in a WP? Regardless of whether that title is encoded in a manifest or not, what are a WP's title requirements?

This is not necessarily a blocker for first public working draft.

Proposal

A WP requires a title.
Sufficient: a title is defined somewhere in the WP's metadata.
Fallback: a WP contains a single primary resource with a title element appropriate for its type (e.g. SVG, HTML titles), and that becomes the WP's title.
Fallback: a WP contains multiple primary resources, each with a title element appropriate for its type (e.g. SVG, HTML titles). The first one becomes the WP's title.
- This one is imperfect from a usability and accessibility standpoint, but is an adequate fallback.
A WP contains no title attribute and no primary resources with a title element appropriate for its type (e.g. SVG, HTML titles). This is non-conformant.
- A URL is not a fallback title
- A filename is not a fallback title
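
The fallback order above can be sketched in Python. This is only an illustration, not proposed spec text: `Publication`, `Resource`, and `resolve_title` are hypothetical names, and the sketch assumes that "the first one" means the first primary resource, in order, that carries a non-empty title.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Resource:
    url: str
    # Text of the resource's own title element (HTML or SVG <title>), if any.
    title: Optional[str] = None


@dataclass
class Publication:
    # Title defined somewhere in the WP's metadata, if any.
    metadata_title: Optional[str] = None
    primary_resources: List[Resource] = field(default_factory=list)


def resolve_title(pub: Publication) -> Optional[str]:
    """Return the WP title using the proposed fallback order, or None."""
    # Sufficient: a title defined in the WP's metadata.
    if pub.metadata_title and pub.metadata_title.strip():
        return pub.metadata_title.strip()
    # Fallbacks: the first primary resource with a usable title element.
    for resource in pub.primary_resources:
        if resource.title and resource.title.strip():
            return resource.title.strip()
    # No title anywhere. URLs and filenames are deliberately not consulted.
    return None
```

A `None` result corresponds to the last case in the list: the proposal calls such a WP non-conformant rather than falling back to a URL or filename.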

Rationale

Discovery / Organization
Accessibility: EPUB Accessibility uses title in its example of how accessibility needs to apply to a publication as a whole, not merely the component documents:

Consequently, when evaluating the accessibility of an EPUB Publication, individual pages — or Content Documents, as they are known in EPUB nomenclature — cannot be reviewed in isolation. Rather, their overall accessibility as parts of a larger work also has to be evaluated.

For example, it is not sufficient for individual Content Documents to have a logical reading order if the publication presents them in the wrong order. Likewise, including a title for every Content Document is complementary to providing a title for the publication: the overall accessibility is affected if either is missing.

Authoring

An authoring tool can create a title, when one isn't provided, however it chooses. Just as Word used to make the first string of a text document into the doc title, and some blogging platforms used to make the filename into the alt text, an authoring tool can choose to enforce useful titles or put in less useful fallbacks.

Therefore what @dauwhe called 'documents' made in WP tools, and what @lrosenthol called 'ad hoc publications', need not impose onerous work on the user, as long as the authoring tool takes care of titling.

dauwhe · 2017-08-09T15:55:58Z

This sounds good to me. There have been proposals for web publications that consist only of images (like various comic formats). I think we do need to have a human-readable title somewhere, and I agree that a filename fallback is entirely inadequate.

rdeltour · 2017-08-09T17:36:19Z

👍 on the proposal.

A WP contains no title attribute and no primary resources with a title element appropriate for its type (eg. SVG, HTML titles). This is non-conformant.

I'm just not sure about the meaning of "non-conformant".
In any case, the spec could say that the UA would then have to give a title to the publication, whether generating one (e.g. from the content, the first h1, or whatever other option), or by prompting a user for a title.

danielweck · 2017-08-09T17:43:25Z

I am unsure about "non-conformant" either. Does this mean that a user agent (let's say, a web browser) would implement some kind of user experience / interface to emit an error message when attempting to load the empty-title Web Publication? Or are you alluding to conformance in terms of validating processors? (e.g. content validation as per a specific profile, e.g. EPUB4+, or perhaps even PWP)

BillKasdorf · 2017-08-09T17:43:29Z

I suggest rewording that bullet point to say:

A WP that contains no title attribute and no primary resources with a title element appropriate for its type (eg. SVG, HTML titles) is non-conformant.

dauwhe · 2017-08-09T17:46:08Z

Rather than nonconformance, perhaps the final fallback title should be "The publication author thought so little of their audience that they didn't provide a title." This would be localized, of course.

lrosenthol · 2017-08-09T18:19:19Z

I was all on board with the proposal up until this whole "non-conforming" stuff.

As already mentioned in a separate thread on this topic, and partially evidenced here, there are two types of requirements: those on the format (or in this case, the manifest) and those on the processor (UA, in this case).

So if I were to redo the bullets with those two in mind, I would approach it as follows:

A WP MUST have a title either directly in the manifest (*details TBD*) or in one of the primary resources that are listed in the manifest.
If a User Agent needs to provide a user with the title for a WP, and one is not provided, the UA may provide one of its own choosing.

I don't see a need to say anything else, because missing title handling for WP by a UA should not be (IMO) any different than missing title handling for a "content document" (aka HTML/SVG).
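
To make the split between the two kinds of requirement concrete, here is a minimal Python sketch. The function names and the `"Untitled publication"` default are assumptions for illustration only; nothing in the thread defines them.

```python
from typing import Optional, Sequence


def has_title(manifest_title: Optional[str],
              resource_titles: Sequence[Optional[str]]) -> bool:
    """Format-side requirement: a title exists in the manifest or in a primary resource."""
    candidates = [manifest_title, *resource_titles]
    return any(t is not None and t.strip() for t in candidates)


def display_title(manifest_title: Optional[str],
                  resource_titles: Sequence[Optional[str]],
                  ua_default: str = "Untitled publication") -> str:
    """Processor-side behaviour: the UA always has something to show the user."""
    for t in [manifest_title, *resource_titles]:
        if t is not None and t.strip():
            return t.strip()
    # No title was provided, so the UA supplies one of its own choosing.
    return ua_default
```

A validator would only call `has_title`, while a user agent would only call `display_title` and never needs to reject the publication.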

baldurbjarnason · 2017-08-09T18:29:47Z

Would it be helpful if we added something like this?:

"The behaviour of non-conformant publications is intentionally left unspecified. It is the User Agent's responsibility to handle non-conformant publications in whichever manner provides the best user experience and accessibility as is possible in each scenario."

mattgarrish · 2017-08-09T18:38:55Z

"The behaviour of non-conformant publications is intentionally left unspecified. It is the User Agent's responsibility to handle non-conformant publications in whichever manner provides the best user experience and accessibility as is possible in each scenario."

I tend to prefer this to defining our own heuristics that may or may not net anything useful for the reader. We're not doing any better than letting the UA figure it out.

If we can't figure out a solid case for a title in the manifest, either it isn't important or including one will become a de facto standard thing to do as people start using the standard.

I tend to prefer the consistency of just saying all manifests need a title, if only because the content is not guaranteed to carry one, or to make it easily machine-discoverable.

I understand Leonard's argument that they may be meaningless, but in those cases I don't really care if a meaningless title is inserted, and neither will the person receiving the document. The author can decide when a meaningful one is necessary, like for search engine optimization.

It feels like we're spending way too much energy on this.

baldurbjarnason · 2017-08-09T18:57:16Z

@mattgarrish

I intended that note to be a suggestion for an addition to what @deborahgu wrote in the issue itself, not as a replacement. I think having a clear heuristic for common cases is really valuable especially as it potentially simplifies the authorship of single primary resource publications.

The only question in my mind is what to do when the outlined fallbacks fail and the publication has no discernible title. And my comment was a suggestion that we leave the exact response to that scenario up to the user agent as there is no solution in that scenario that is universally acceptable.

I'm fine with labelling no-title-whatsoever publications as non-conformant, as authors should be aware that the behaviour of such publications is not covered by any of these (hypothetical) specifications and is therefore unpredictable.

rdeltour · 2017-08-09T18:59:25Z

Would it be helpful if we added something like this?:

"The behaviour of non-conformant publications is intentionally left unspecified. It is the User Agent's responsibility to handle non-conformant publications in whichever manner provides the best user experience and accessibility as is possible in each scenario."

@baldurbjarnason do you mean as a general statement for the whole spec? or specific to this section?
The latter is preferable IMO, and I would be more specific and say something along the lines of "must provide one of their own choosing" (as @lrosenthol suggests, but with a must).

baldurbjarnason · 2017-08-09T19:06:26Z

@rdeltour

I meant specific to this section, yes. I'd also be fine with just adding "if a User Agent needs to provide a user with the title for a WP, and one is not provided, the UA MUST provide one of its own choosing" to the process outlined in the issue itself, as per your and @lrosenthol's suggestion.

mattgarrish · 2017-08-09T19:59:17Z

as it potentially simplifies the authorship of single primary resource publications

If that's really what we're trying to do, we should allow embedding of the manifest with a clear rule that the title can be omitted only when the carrying document provides it.

baldurbjarnason · 2017-08-09T20:08:15Z

@mattgarrish

If that's really what we're trying to do, we should allow embedding of the manifest with a clear rule that the title can be omitted only when the carrying document provides it.

You can solve multiple things at the same time. The proposed solution increases the odds of any given publication having a useful, accessible title (by providing a couple of fallbacks), increases consistency across User Agents in handling edge cases, and makes authorship of a single resource easier. Each of these things is individually a positive.

Also, embedding is a completely different topic which has numerous consequences on its own for processing, authorship, and general rendering. And it presupposes a non-HTML format which isn't something we have consensus on at the moment. Bringing embedding in makes things much more complicated again for this issue.

It would be helpful if you could outline specifically why you object to the proposed solution and why we can't just accept it, with the suggested alterations, and move on.

It isn't clear from your comments why you think this discussion is a waste of our energies.

mattgarrish · 2017-08-09T20:30:43Z

I don't object to guidance if a required title is missing.

What I don't find palatable is making the user agent have to solve a piece of metadata that the author should have provided in the first place, so that the author doesn't have to specify it.

How many authors are going to read the specification closely enough to figure out that the reason they don't have to specify a title is because of these other rules? How many are going to realize they're supposed to be taking care to get the title right in their first primary resource? What kind of consistency does it really lead to if different user agents rate the likelihood of finding a title in a resource differently?

And what use cases does this computational hoop-jumping make more difficult? What does a spider do with a manifest with no name? What if you want to share a manifest? Why does title skipping rate as such an important objective?

I'm under no illusion that requiring a title is going to make everyone put one in the manifest, or make it meaningful, but I'm not clear why that is a compelling argument not to mandate one. It hasn't stopped HTML from mandating one.

A manifest without a title should be invalid, and if you choose to ignore the error so be it. If you can go to the effort of making a manifest, including a title is not exactly a deal breaker in my book.

baldurbjarnason · 2017-08-09T21:08:16Z

@mattgarrish

I don't object to guidance if a required title is missing.

What I don't find palatable is making the user agent have to solve a piece of metadata that the author should have provided in the first place, so that the author doesn't have to specify it.

~~So your objection is a moral one, not practical one?~~ (ETA: apologies, this remark was uncalled for)

How many authors are going to read the specification closely enough to figure out that the reason they don't have to specify a title is because of these other rules? How many are going to realize they're supposed to be taking care to get the title right in their first primary resource? What kind of consistency does it really lead to if different user agents rate the likelihood of finding a title in a resource differently?

If they follow the guidance as proposed in this issue (which is specific about the HTML file's title as specified in <title> and not some sort of grab-bag heuristic) the variation between UAs should be limited to cases where there is no discernible title.
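
As a rough sketch of how narrow that lookup is, the following uses only Python's standard `html.parser` to read the first `<title>` element of an HTML resource. `TitleParser` and `html_title` are hypothetical names, and this is a simplification: it does not tell a `<title>` in `<head>` apart from one nested in inline SVG.

```python
from html.parser import HTMLParser
from typing import List, Optional


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element encountered."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self) -> Optional[str]:
        # Collapse runs of whitespace; an empty title counts as no title.
        text = " ".join("".join(self._parts).split())
        return text or None


def html_title(markup: str) -> Optional[str]:
    parser = TitleParser()
    parser.feed(markup)
    parser.close()
    return parser.title
```

Because the lookup reads one well-defined element and nothing else, two user agents following it should only diverge when it returns `None`.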

Authors shouldn't have to know why something works, just that it works and is consistent.

And what use cases does this computational hoop-jumping make more difficult? What does a spider do with a manifest with no name? What if you want to share a manifest? Why does title skipping rate as such an important objective?

People are going to omit titles from the manifest, no matter what. Look how unusable Atom files are without titles, and yet it happens all the time. The crawler can follow the spec to get the same results as the UAs. And most crawlers are going to be fetching at least the text-based primary resources if they are doing anything interesting.

Having a title fallback heuristic is important because the title is important, but this is also a question of a general guiding design principle. My objective is to have a robust format that has realistic expectations of its authors based on how events have panned out in the web community in the past and provides a consistent and useful behaviour even under adverse circumstances. Like I've said elsewhere, this is a general design principle: don't rely on author conformance; but promote conformance as being the best, most featured, and reliable method of authorship; make as many features as possible optional; and tell UAs how to handle variances in predictable ways.

I'm under no illusion that requiring a title is going to make everyone put one in the manifest, or make it meaningful, but I'm not clear why that is a compelling argument not to mandate one. It hasn't stopped HTML from mandating one.

HTML mandates a title. But it lets files omit it under some circumstances when the title can be recreated from other sources:

The title element is a required child in most situations, but when a higher-level protocol provides title information, e.g. in the Subject line of an e-mail when HTML is used as an e-mail authoring format, the title element can be omitted. https://html.spec.whatwg.org/multipage/semantics.html#the-head-element

and

If the document is an iframe srcdoc document or if title information is available from a higher-level protocol: Zero or more elements of metadata content, of which no more than one is a title element and no more than one is a base element. https://html.spec.whatwg.org/multipage/semantics.html#the-head-element

It also defines in extreme detail how exactly to parse invalid documents to recreate HTML's required structure. And it does not expect authors to understand why their invalid documents work, just that valid documents are better and more reliable. It's an approach that's completely different from EPUB's and XML's, and it has worked quite well for the web as a whole.

We can scold all we want in our specs but we have to author them to be robust in predictable ways.

We could just say "don't do that" in the spec, but that doesn't help anybody figure out what's supposed to happen when somebody does do that, which will happen frequently. Leaving it entirely up to chance leads to big messes. EPUB's assumed rigidity and lack of clarity as to what to do in edge cases is a large part of what makes it a nightmare to work with.

A manifest without a title should be invalid, and if you choose to ignore the error so be it. If you can go to the effort of making a manifest, including a title is not exactly a deal breaker in my book.

This isn't a question of just the title but of a general principle. If a feature is important, as title is for accessibility, do we try to maximise the odds of the end user having that feature or do we try to force authors to include it with moralising and lectures?

iherman · 2017-08-10T12:56:01Z

@baldurbjarnason, I am not sure where you are going with your #24 (comment). What is your proposal in connection with this issue? I have the impression that @deborahgu gave a clear set of fallbacks when she opened the issue, but there is a point where those fallbacks do not really work (filenames, for example, are false solutions as far as accessibility goes). What do you propose we do at that point? Or do you agree with her proposal? I was not sure after reading your comment.

I agree with you that, whenever it is reasonable and feasible, we should provide a well documented set of fallbacks in our spec; @deborahgu does just that. But even the HTML spec does not do that everywhere. You quote the section on head, but, in fact, what it says about the title element is fairly vague:

NOTE:
The title element is a required child in most situations, but when a higher-level protocol provides title information, e.g., in the Subject line of an e-mail when HTML is used as an e-mail authoring format, the title element can be omitted.

(see https://www.w3.org/TR/html51/document-metadata.html#the-head-element)

I did not find anything more on that subject; maybe I missed something.

iherman · 2017-08-10T13:04:09Z

On the conformance issue: just as a reminder, the W3C spec rules require a conformance clause to be included. What that clause will include is up to the Working Group, and I find it perfectly fine if we separate the conformance of the WP as a publication from the conformance of a UA. We have to keep in mind, though, that conformance clauses should be, in theory, checkable, and we should be able to demonstrate that, in some way, during our Candidate Recommendation phase.

However. This is way down the line. I would propose that we not aim to clarify all these issues in full detail right now. Our next milestone is a First Public Working Draft, i.e., our set of stakes in the ground that tell the world what we want to do, asking/hoping for public comments as well as for other institutions to join the group and add their knowledge and experience once they have a clearer idea of what we want to do.

B.t.w., @baldurbjarnason proposed, in #24 (comment):

Would it be helpful if we added something like this?:

"The behaviour of non-conformant publications is intentionally left unspecified. It is the User Agent's responsibility to handle non-conformant publications in whichever manner provides the best user experience and accessibility as is possible in each scenario."

Although he stated (in #24 (comment)) that he meant this for the original issue, I find this statement perfectly fine for the document at large, provided it is clear what is and what is not conformant.

mattgarrish · 2017-08-10T13:16:10Z

I think you're reading too much into my posts. I didn't say this is a "waste of energy". I said we're spending a lot of time finding a solution to something that I personally think is a necessary piece of information, and that I don't think should be dependent on digging into the publication resources.

The question of the issue is what are the titling requirements, and my response is a title is required in the manifest. Not for accessibility reasons, but because it's an important piece of information, or important enough that the reasons for omitting it don't outweigh the costs of it being considered optional.

Whatever wants to learn about the publication shouldn't have to dig to get the title, except as a last resort because the creator has intentionally or accidentally omitted it. And if a user agent just wants to call the publication "untitled" and move on when it doesn't find the title in the manifest, it should be a conforming UA. It shouldn't have to do any additional steps.

This kind of guidance is useful for those UAs that do want to try and compile a title, as I agreed with you in another thread the other day. I'm not objecting to there being a way to compile a title in cases where it isn't present. But I think this sort of title generation belongs on the authoring tool side. If it comes up with a meaningless title, at least the author knows what meaningless title it came up with.

Having a small set of requirements is not moralizing or lecturing authors. It's establishing a specific design principle for publications, in this case that they have a name and that it is easily found.

And more as an aside, I'm aware of the HTML prose about a higher-level protocol, but it doesn't apply to this situation, IMO. A corollary here would be that you don't need to specify a title in the publication content because it's available in the manifest (although I don't agree with this because of the vanilla browser scenario). The manifest is the higher-level protocol because it's clearly identifiable and would be known to have a title. It could also be stretched to say that the title can be inferred from a document that contains the manifest. I still think this latter case is one scenario where we could have an equivalent rule for omitting titles, but that's a problem we haven't gotten to yet.

mattgarrish · 2017-08-10T13:19:14Z

But I think this sort of title generation belongs on the authoring tool side.

Sorry, meant to add a "too" here. Not exclusive.

GarthConboy · 2017-08-10T15:44:48Z

A belated "+1" to the original @deborahgu proposal.

mattgarrish · 2017-08-19T01:21:07Z

Not to rehash this, but looking at these:

A URL is not a fallback title
A filename is not a fallback title

WCAG in fact states:

Examples of text that are not titles include:
...
Filenames that are not descriptive in their own right, such as "report.html" or "spk12.html"

https://www.w3.org/TR/2016/NOTE-WCAG20-TECHS-20161007/F25

Any fallback heuristics should avoid making judgments about the quality of what is found, as the user agent cannot be responsible for what it creates if the author abrogates responsibility.

A user agent also cannot use its own heuristics and not be allowed to default to a URL or filename, which is what the current wording seems to suggest:

otherwise, the UA uses its own heuristics. Note that a URL or a filename is not considered to be a valid title.

What is it supposed to do if it can't find a title? If you review the common failures, authoring tool default titles are failures, so what does that leave the user agent to use? Or are we saying it must use a clear failure in place of a potential one?

What I would like to do here is replace the note with one that authors need to ensure that a meaningful title is provided (with reference to WCAG for meaningful), or easily found in the fallback chain to avoid a user agent having to generate one that is not meaningful.

lrosenthol · 2017-08-20T16:38:42Z

On Fri, Aug 18, 2017 at 9:21 PM, Matt Garrish ***@***.***> wrote:

Not to rehash this, but looking at these:

A URL is not a fallback title
A filename is not a fallback title

Those are not *good* titles, but they are perfectly valid titles.

Any fallback heuristics should avoid making judgments about the quality of what is found, as the user agent cannot be responsible for what it creates if the author abrogates responsibility.

Amen!

A user agent also cannot use its own heuristics and not be allowed to default to a url or filename, which is what the current wording seems to suggest:

otherwise, the UA uses its own heuristics. Note that a URL or a filename is not considered to be a valid title.

What is it supposed to do if it can't find a title? If you review the common failures, authoring tool default titles are failures, so what does that leave the user agent to use? Or are we saying it must use a clear failure in place of a potential one?

UA should do *exactly the same thing* that it would do for a missing title on a web page. No more - no less.

TzviyaSiegman · 2017-08-28T17:18:42Z

This issue is resolved with #20

iherman · 2017-08-29T04:59:23Z

See telco discussion on closure.

TzviyaSiegman added the topic:metadata label Aug 9, 2017

TzviyaSiegman added this to the Define publication-level metadata milestone Aug 21, 2017

TzviyaSiegman closed this as completed Aug 28, 2017
