Proposal: an HTML-first Table of Contents approach to Web Publication #35

Closed
BigBlueHat opened this issue Aug 16, 2017 · 32 comments

@BigBlueHat
Member

@dauwhe and I have been working on a proposal to use HTML's <nav> element as a web publication manifest: https://github.com/dauwhe/html-first

TL;DR: define the primary resources of a WP to be the files referenced in the first <nav> element of an "index" file. This file would also host WP metadata.
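To make the TL;DR concrete, here is a minimal sketch of how a user agent might read the primary resources out of the first nav of an index file. It is not taken from the demo books: the class and function names are made up for illustration, and dropping fragments and de-duplicating repeated links are assumptions of this sketch, not something the proposal specifies.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class FirstNavLinks(HTMLParser):
    """Collect href values of <a> elements inside the first <nav> only."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <nav> elements we are currently inside
        self.done = False   # True once the first <nav> has been closed
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "nav":
            self.depth += 1
        elif tag == "a" and self.depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "nav" and self.depth > 0 and not self.done:
            self.depth -= 1
            if self.depth == 0:
                self.done = True


def primary_resources(index_html):
    """Return resource URLs in first-seen order (assumption: fragments dropped)."""
    parser = FirstNavLinks()
    parser.feed(index_html)
    seen, ordered = set(), []
    for href in parser.hrefs:
        url, _fragment = urldefrag(href)
        if url and url not in seen:
            seen.add(url)
            ordered.append(url)
    return ordered


INDEX = """
<nav><ol>
  <li><a href="c001.html">One</a></li>
  <li><a href="c001.html#s2">One, part two</a></li>
  <li><a href="c002.html">Two</a></li>
</ol></nav>
<nav><a href="elsewhere.html">A second nav, ignored here</a></nav>
"""
print(primary_resources(INDEX))  # ['c001.html', 'c002.html']
```

Only links in the first nav count, so a site-wide navigation block later in the page would not leak into the reading order.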

We feel this approach has many benefits:

  1. Human-focused. User agents need a list of primary resources and their default ordering, but so do actual users. Most web publications would benefit from a human-readable table of contents. TOCs are crucial for accessibility.

  2. Simplicity. Given the broad need for a TOC, using that as manifest is a straightforward way to avoid duplication (as in EPUB's nav/manifest/spine/ncx). And we've discovered a huge benefit, as we don't need a list of secondary resources to facilitate offline caching via service workers (see the demo books)!

  3. Ubiquity. Everyone in the web space is already familiar with HTML, and there is a large and mature ecosystem around authoring, rendering, and validating HTML.

  4. Expressiveness. HTML's language and styling support allows for a richer experience for humans.

  5. Progressive enhancement. Existing web user agents know what to do with HTML.

  6. A Path to the future. Every EPUB3 has a nav document. Many "web books" already use such a design pattern.

Note we've created a couple of demo books that work offline, based on the HTML manifest.

Thanks,
🎩 and 🐳

@lrosenthol

lrosenthol commented Aug 16, 2017 via email

@BigBlueHat
Member Author

@lrosenthol great question! HTML saves the day again. 😄
https://www.w3.org/TR/html52/dom.html#the-lang-and-xmllang-attributes

I'd expect that the lang attribute on the ToC would be considered the "primary language" of the ToC document as well as the publication. Additionally, lang is a global attribute, so it can be set on (nearly) any element.

So...

<html 📖 lang="en">
...
<nav>
  <ol>
    <li lang="ja"><a href="hello.html">今日は <span lang="en">Konnichiwa</span></a></li>
    <li><a href="author.html">About the Author</a></li>
  </ol>
</nav>
...
</html>

Additionally, authors would also have the power of <ruby> and decades of I18N work available throughout their publication.

Hope that helps!

@GarthConboy
Contributor

Echoing Leonard, thanks for an insightful proposal, replete with demos!

My main issue (and one that could be resolved with additional metadata or links) is around secondary resources -- we need to be able to use whatever we come up with to take a WP offline, and to transmogrify it into a PWP. Yes, the demo does some caching, but that can't be done deterministically without an explicit list of secondary resources (e.g., to avoid pulling in the entire web by following a link to wikipedia). We have (I think) agreed that such a list is a WP requirement -- whether that list is a MAY, SHOULD or MUST has been purposely left TBD.

So, we would need to augment this proposal with a way to specify secondary resources -- either metadata, or additional (non-displayed) nav sub-elements, or header links, or some such.

@BigBlueHat
Member Author

@GarthConboy the whole publication is available offline--including secondary resources. 😄 Also, CORS, CSP, the Same-Origin Policy, and many other such things set a boundary on ServiceWorkers (on which the demo is based). Additional constraints--beyond those being defined elsewhere by browsers--could certainly be specified.

That said, there's no additional need to re-state the secondary resources. The browser knows how to get them, and does.

Packaging's a separate topic, but one we don't prevent with this approach afaict.

@dauwhe
Contributor

dauwhe commented Aug 16, 2017

That said, I have a big concern with it - that it doesn't actually provide
a way to define "publication wide" information. Let's consider the
discussion around Language definition (#29). If I have a publication that
is multilingual, where some primary resources are in English and others are
in French (such as in parts of Canada) - I can't do that in your method
since the lang needs to be that of the resource in question.

Several things:

  1. The language on the "index" file would describe essentially what language you want any UI to be in. This is like #29 (For manifest in FPWD: Should Natural Language be Required per WCAG 2), where there's a "manifest language".
  2. The nav could indicate the language of a linked resource via the hreflang attribute (unknown to me until minutes ago): <li lang="fr"><a href="c002.html" hreflang="fr">Entre Terre et Ciel</a></li>
  3. Each resource can use the techniques of HTML to describe their language, which may vary from element to element.
  4. If you want to create metadata that describes the language of the intended audience of a page, rather than the language of a specific range of text, do so by getting the server to send the information in the HTTP Content-Language header. If your intended audience speaks more than one language, the HTTP header allows you to use a comma-separated list of languages. source

  5. Any general-purpose metadata vocabulary (RDFa, etc.) could provide more granular or detailed information about language.
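As a rough illustration of points 1 and 2, the sketch below reads the index file's own lang (the "UI" language) separately from the hreflang declared on each nav link. The class name and sample markup are made up for this example; a link without hreflang is reported as undeclared rather than assumed to inherit the index language.

```python
from html.parser import HTMLParser


class LinkLanguages(HTMLParser):
    """Record the index file's own lang and each link's hreflang, if any."""

    def __init__(self):
        super().__init__()
        self.index_lang = None
        self.links = []  # list of (href, hreflang or None)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            self.index_lang = attrs.get("lang")
        elif tag == "a" and "href" in attrs:
            self.links.append((attrs["href"], attrs.get("hreflang")))


SAMPLE = """<html lang="en"><nav><ol>
<li><a href="c001.html">Chapter One</a></li>
<li lang="fr"><a href="c002.html" hreflang="fr">Entre Terre et Ciel</a></li>
</ol></nav></html>"""

parser = LinkLanguages()
parser.feed(SAMPLE)
print("index language:", parser.index_lang)
for href, lang in parser.links:
    print(href, lang or "(not declared in the nav)")
```

Note that the lang on the li describes the link text, while hreflang describes the target; the sketch only looks at the latter.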

@lrosenthol

lrosenthol commented Aug 16, 2017 via email

@dauwhe
Contributor

dauwhe commented Aug 16, 2017

So that's a great example. What language is "hello.html" going to be? No
idea, because you can't put a language on the link element (IIRC).

hreflang!

@lrosenthol

lrosenthol commented Aug 16, 2017 via email

@baldurbjarnason
Contributor

Quick thoughts:

I keep changing my mind back and forth on the ‘book’ attribute. I’m told that browser vendors are not fond of mode switching in general (i.e. some sort of toggle that changes how the page works as a whole) or HTML profiles specifically so relying on that sort of functionality might be risky. It works well when you know you aren’t going to rely on browsers implementing the toggled features (like AMP) but creating a ‘fork’ of sorts of HTML as a format might be a hindrance to browser support, long term.

I wonder, since Digital Publishing WAI-ARIA seems to be a fait accompli, whether the presence of role='doc-toc' on the nav element might not be enough to indicate that the nav should be parsed as a publication nav?

rel values

I also wonder if we wouldn’t benefit from minting publication-specific rel values? rel=doc-contents, rel=doc-previous, rel=doc-next. That way you don’t need mode switching because their behaviour can be unambiguously specified and you aren’t resurrecting existing values with pre-existing and ambiguous usage. You could even add something like rel=doc-text to indicate the main text element that should be extracted from the page in case the author wants to provide baked in navigation for regular Web UAs but let publication UAs strip it away.

The Big Issue Number One: Omitting Secondary Resources

While I think that it’s a good idea for reading systems to try to automatically discover secondary resources for caching if none are listed, making that the only method seems a bit onerous.

It mitigates the author’s ability to be explicit about which resources the publication needs and should always store and which resources should always be fetched from the network. E.g. some JS libraries are likelier than others to get exploited in some way. When secondary resources are explicitly listed, you could deliberately put the risky library ‘outside’ of the publication, so to speak, and make sure that it is always fetched and always has the latest security updates.

Having only implicit secondary resources leaves out in the cold (i.e. uncached) publications who have HTML resources that aren’t in the reading order: secondary material that can be navigated to, should be stored with the publication, but isn’t a part of ToC. Which happens quite often with reference publications and documentation.

And this pre-emptively excludes hypertext as a publication genre entirely where other proposals don’t. Hypertext (i.e. publications with no reading order, only entry points and links) is kind of a big thing on the web and it seems counter-productive to exclude it as a genre. In other proposals I’ve seen hypertext would be supported by providing a single primary resource and then listing the remaining HTML entries as secondary resources—benefitting from the additional publication metadata and offline features without having to provide a fake and misleading reading order.

With the Readium Web Publication JSON proposal there wouldn't be anything preventing a company like Eastgate, for example, from re-releasing classic hypertexts (like afternoon, a story or Patchwork Girl) as web publications. This proposal would make that impossible. There is a lot of academic, archival, and research value in making it easier to port historically important hypertexts over to an open publication format.

That's without getting into the issue of current hypertexts which are common in technical documentation, fandom websites, and more. I don't see how the value of a marginal increase in the ease of authorship justifies completely ruling out entire forms and genres of publications.

The Big Issue Number Two: Not Supporting Purely Navigational ToCs

There’s also the problem where the most accessible contents navigation for a hypertext often looks very different from a hierarchical table of contents. In many cases the best ToC for these has been a semantically marked up SVG map of the nodes involved. Supporting purely navigational ToCs would let more dynamic and complex publications include navigational maps that serve their purpose better than an ordered and hierarchical list of links.

And what do people do who need the human-readable ToC and the machine-readable ToC to differ for whatever reason? This was a thing that came up regularly when I was still making ebooks. So often that I ended up defaulting to keeping machine-readable nav.html and the in-book ToC entirely separate. This is going to be an issue, especially if the ToC has to serve a dual purpose of providing both spine (reading order) and navigation. Modifying a single ToC with styles and scripts to create a separate ‘people’ version is a hassle, especially for complex publications.

Again, Readium's proposal—by keeping ToC, reading order, and secondary resources separate—had the benefit of not having to care at all about any of this stuff in any way. By conflating these three concepts, you are making a bunch of forms impossible to express as publications. These forms may not be important to the publishing industry, but they are important to interactive media and the web.
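For readers who haven't seen that kind of manifest, here is a hypothetical sketch of what keeping the three concepts separate could look like for a hypertext with a single entry point. The key names (`reading_order`, `resources`, `toc`) and file names are invented for illustration and are not quoted from the Readium proposal.

```python
import json

# Hypothetical manifest: one primary resource, the remaining nodes listed as
# secondary resources, and a navigation list that is independent of both.
manifest = {
    "reading_order": ["start.html"],
    "resources": ["node-a.html", "node-b.html", "css/story.css"],
    "toc": [
        {"title": "Begin", "href": "start.html"},
        {"title": "A side path", "href": "node-b.html#middle"},
    ],
}


def offline_set(m):
    """Everything the publication claims as its own: primary plus secondary."""
    return list(dict.fromkeys(m["reading_order"] + m["resources"]))


print(json.dumps(manifest, indent=2))
print(offline_set(manifest))
```

The point of the sketch is only that the offline set comes from the two resource lists, so the navigation can be as sparse or as unusual as the work needs without affecting what gets stored.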

Sidebar

(I have a concern with the implementation here in that it caches everything and always serves from the cache first. This is actually worse than how the appcache works in that once you’ve stored the book offline, you’re stuck with that version unless you force an update somehow. Which means that insecure code stays insecure for virtually forever, bugs are eternal, etc. etc. But since the implementation isn’t the proposal I’ll be satisfied with just leaving this parenthetical remark. 😊)
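The difference can be modelled without any browser APIs. The sketch below is a plain-Python stand-in, not the demo's Service Worker code: `cache` is a dict, `fetch` is any callable, and both strategy names are used in their usual caching sense.

```python
def cache_first(url, cache, fetch):
    """Serve the stored copy if present; touch the network only on a miss."""
    if url not in cache:
        cache[url] = fetch(url)
    return cache[url]


def network_first(url, cache, fetch):
    """Prefer a fresh copy; fall back to the stored one if the fetch fails."""
    try:
        cache[url] = fetch(url)
    except OSError:
        if url not in cache:
            raise
    return cache[url]


# Hypothetical stand-in for the network: the "server" content can change.
server = {"lib.js": "v1"}


def fetch(url):
    return server[url]


stale_cache, fresh_cache = {}, {}
cache_first("lib.js", stale_cache, fetch)
network_first("lib.js", fresh_cache, fetch)

server["lib.js"] = "v2 (security fix)"
print(cache_first("lib.js", stale_cache, fetch))    # still the first stored copy
print(network_first("lib.js", fresh_cache, fetch))  # picks up the update
```

With cache-first, the updated library never reaches the reader unless something else invalidates the stored copy, which is the concern raised in the sidebar.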

@dauwhe
Contributor

dauwhe commented Aug 16, 2017

Having only implicit secondary resources leaves out in the cold (i.e. uncached) publications who have HTML resources that aren’t in the reading order: secondary material that can be navigated to, should be stored with the publication, but isn’t a part of ToC. Which happens quite often with reference publications and documentation.

I just did a little experiment with one of our demo books, and removed several chapters from the nav, but those chapters were referenced from some chapters that were in the nav. And they were cached by the Service Worker. It does seem to cache same-origin stuff if it finds a link somewhere.

My real hope is that it would be fine to enumerate secondary resources in complex situations, but that an author would not be required to in simple situations.

@HadrienGardeur

Since we've moved from an email thread to Github, I'll repost my initial comment here.

  1. Human-focused. User agents need a list of primary resources and their default ordering, but so do actual users. Most web publications would benefit from a human-readable table of contents. TOCs are crucial for accessibility.

  2. Simplicity. Given the broad need for a TOC, using that as manifest is a straightforward way to avoid duplication (as in EPUB's nav/manifest/spine/ncx). And we've discovered a huge benefit, as we don't need a list of secondary resources to facilitate offline caching via service workers (see the demo books)!

You're conflating two different things here:

  • list of primary resources in reading order (spine in EPUB)
  • table of contents (which is navigation)

They can be the same thing for a novel like Moby Dick but they can also be vastly different, for instance a ToC could:

  • only point to some, not all primary resources
  • point to the same resource multiple times (for example using fragments to specific locations in a resource)
  • point to resources in an order that's not the reading order

Saying that a list of primary resources duplicates a ToC (or any other navigation) is therefore incorrect in the general case.
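A small worked example of the mismatch, with made-up file names: the ToC below skips two resources, repeats one via a fragment, and lists chapters out of reading order, so the resource list derived from it does not match the spine.

```python
from urllib.parse import urldefrag

# Hypothetical publication: the spine and the ToC are authored separately.
reading_order = ["cover.html", "c001.html", "c002.html", "notes.html"]
toc = ["c002.html#climax", "c001.html", "c001.html#s2"]


def resources_in(toc_links):
    """Resources a ToC points at, in ToC order, ignoring fragments."""
    ordered = []
    for href in toc_links:
        url = urldefrag(href).url
        if url not in ordered:
            ordered.append(url)
    return ordered


derived = resources_in(toc)
missing = [r for r in reading_order if r not in derived]

print(derived)   # ['c002.html', 'c001.html']
print(missing)   # ['cover.html', 'notes.html']
print(derived == reading_order)  # False
```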

The "huge benefit" for caching secondary resources comes at a very expensive cost:

  • the whole book is rendered in the background, which can be slow (CPU and network intensive) and takes a lot of resources (CPU, RAM and storage)
  • if you leave the first page in the middle of the process, this could potentially interrupt caching
  • this strategy won't work with platforms that do not support Service Workers (for instance on iOS)
  • without a list of secondary resources, there's no easy way to know what's part of the publication or not and you can't optimize preloading/prefetching (by only requesting the most important resources, such as fonts/CSS/JS)
  • there are many issues with serving video/audio from a SW Cache Storage (byte ranges, the cache could be way too small) which can't be easily solved since you don't have a list of such resources

@GarthConboy
Contributor

GarthConboy commented Aug 16, 2017

@BigBlueHat responding to:

@GarthConboy the whole publication is available offline--including secondary resources. 😄 Also, CORS, CSP, the Same-Origin Policy, and many other such things set a boundary on ServiceWorkers (on which the demo is based). Additional constraints--beyond those being defined elsewhere by browsers--could certainly be specified.

That said, there's no additional need to re-state the secondary resources. The browser knows how to get them, and does.

I don't think that's the case: the specification of secondary resources is needed to constrain what's taken offline or coalesced into a PWP. It's not an issue of getting too little; it's about preventing getting too much.
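To illustrate the "too much" side, here is a sketch that contrasts a same-origin heuristic with an explicit list. The function names, the `secondary` parameter, and the example URLs are hypothetical, and the origin comparison is simplified (it does not normalise default ports).

```python
from urllib.parse import urljoin, urlsplit


def origin(url):
    """Scheme and host of a URL (simplified: default ports are not normalised)."""
    parts = urlsplit(url)
    return (parts.scheme, parts.netloc.lower())


def may_cache(url, index_url, secondary=None):
    """Decide whether a discovered URL belongs in the offline copy.

    With an explicit list (the hypothetical `secondary` argument), only listed
    URLs pass. Without one, fall back to the same-origin heuristic.
    """
    absolute = urljoin(index_url, url)
    if secondary is not None:
        listed = {urljoin(index_url, s) for s in secondary}
        return absolute in listed
    return origin(absolute) == origin(index_url)


INDEX_URL = "https://example.org/book/index.html"

print(may_cache("css/book.css", INDEX_URL))                             # True
print(may_cache("https://en.wikipedia.org/wiki/Whale", INDEX_URL))      # False
print(may_cache("https://example.org/other-site/huge.mp4", INDEX_URL))  # True
print(may_cache("https://example.org/other-site/huge.mp4", INDEX_URL,
                secondary=["css/book.css"]))                            # False
```

The third call shows the gap: same-origin stops the crawl at Wikipedia, but it still admits anything else hosted on the same origin.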

But, happy to defer this discussion until the call on Monday.

Perhaps @dauwhe 's comment of:

My real hope is that it would be fine to enumerate secondary resources in complex situations, but that an author would not be required to in simple situations.

can save the day, but I'm not sure that "complex situations" won't be most/all situations. :-)

@HadrienGardeur

HadrienGardeur commented Aug 16, 2017

Also after carefully reviewing the proposal, there's absolutely nothing that I haven't seen proposed before (during the BFF work for example), aside from the fancy 📖 .

I still think that this is a bad idea and that HTML is poorly suited for our use case:

  • it's easier for a UA to work with JSON than HTML in any language
  • you absolutely need to separate the list of primary resources, secondary resources and navigation: they're three completely separate concepts
  • <ul>, <ol> and <a> can't compete in terms of expressiveness with what we could do with a syntax inspired from hypermedia APIs
  • the proposal is misusing a lot of elements/attributes to achieve its goal which can be very confusing (as @lrosenthol has already pointed out here)
  • RDFa for metadata is far from author or UA friendly, JSON-LD would be a vastly superior option (which means that we'll need JSON anyway)
  • rel="next" and rel="previous" only work if you don't re-use resources across publications and can be a pain to deal with whenever you update a publication (need to update three files instead of a single manifest)
  • multiple people in this group (including @laudrain, @baldurbjarnason and several content producers that @llemeurfr talked with) have said before that mixing up a machine-readable ToC with a user-styled one is problematic and doesn't work in practice for their production

@rdeltour
Member

Bravo 🎩 and 🐳 for the very interesting concrete proposal! 🎉

I'm ambivalent about the conflation of ToC and resource lists, for the same reasons @HadrienGardeur and @baldurbjarnason already exposed. I like the approach's simplicity, but I'm not (yet?) convinced that it scales well to the diversity of non-trade publications, and the complexity of ToCs.

Another use case to consider –and maybe it's a stupid idea– is one where the ToC itself is dynamic, and is updated dynamically following the users' reading (for instance, the reader discovers new chapters when reading, or there's some in-book purchase options, or some content in educational material is unlocked by a student, etc). HTML's inherent dynamicity could be a deal-breaker for what is essentially a static bit of info.
Having a static JSON list of resources would be more helpful there.

In any case, I believe the two approaches can probably be combined: @dauwhe’s and @BigBlueHat’s approach doesn't prevent the linking to a static JSON manifest, which could include a list of resources.
Of course, we'd need to define what's the authoritative list of resources in that case... but baby steps 👶

@dauwhe
Contributor

dauwhe commented Aug 17, 2017

Perhaps @dauwhe 's comment of:

My real hope is that it would be fine to enumerate secondary resources in complex situations, but that an author would not be required to in simple situations.

can save the day, but I'm not sure that "complex situations" won't be most/all situations. :-)

Warning: I'm in the mood for a rant before I head out on holiday and leave the internet behind :)

  1. Let's try to keep the simple things simple. "Hello, World" in EPUB is insane. "Hello, World" in HTML is 42 (!) characters. Don't require media types on every secondary resource. Don't put the same information in three different places.

  2. Don't forget that humans will need to troubleshoot whatever we specify. Remember how fun this was with the ncx? Something with easily-available validators and dev tools is better than something without. Something that doesn't break when you forget a comma is nice. There are advantages to formats with defined error handling. Commenting code is a best practice. Don't forbid it.

  3. Progressive enhancement matters. Tomorrow’s web publications should be at least readable in some fashion by today's browsers.

  4. Accessibility matters. Let's not back away from EPUB's commitment to accessibility. I'm hearing so many reasons why we don't need titles or languages or navigation or any of the things that so many people depend on.

  5. Readers and Authors >> Implementors and Specifiers

  6. We have lots of prior art, good and bad. Let's try to learn from it.

@HadrienGardeur

I'm always in a mood to reply to a rant too.

Let's try to keep the simple things simple. "Hello, World" in EPUB is insane. "Hello, World" in HTML is 42 (!) characters. Don't require media types on every secondary resource. Don't put the same information in three different places.

I don't think anyone ever suggested 3 different XML files as the foundation for a WP. But at the same time, our goal is to create a format that can handle various types of publications, not just novels.

Progressive enhancement matters. Tomorrow’s web publications should be at least readable in some fashion by today's browsers.

... and this has nothing to do with going full HTML. It's perfectly possible to have progressive enhancements with a JSON based external manifest linked from primary resources too.

Accessibility matters. Let's not back away from EPUB's commitment to accessibility. I'm hearing so many reasons why we don't need titles or languages or navigation or any of the things that so many people depend on.

Which is unrelated to our discussions in this specific issue.

I would point out that sacrificing primary and secondary resources just to achieve "HTML purity" can be quite destructive and impactful too.

Readers and Authors >> Implementors and Specifiers

Good luck to authors dealing with RDFa.

@iherman
Member

iherman commented Aug 17, 2017

Wow, people have been busy while I was asleep:-)

I try to avoid things that have been said by others. Just a few, hopefully additional remarks. I put them into separate comments, to make it easier to respond and followup.

On the problem of secondary resources: finding them can be a drag on a User Agent. @dauwhe said that in his experiments this happened automatically; that is reassuring. However, I know that in the respec2epub tool (which turns ReSpec documents into EPUB) locating those secondary resources was the main problem. We know that listing all of them is a burden for authors, and that a concrete manifest may need fallbacks, URL patterns (separate discussion on this with @HadrienGardeur notwithstanding), etc., but we should not underestimate the problem.

@iherman
Member

iherman commented Aug 17, 2017

I want to take a step back, however, because there are aspects that I do not really understand. We do have the concept of an abstract manifest and this proposal concentrates on one or two particular manifest items, and the serialization thereof. That is one part of the discussion.

The abstract manifest refers to other items like the language tag. There was a separate discussion with @lrosenthol on the language tag issue, which seems to suggest that the proposal is to serialize the whole of the abstract manifest in an HTML file, more exactly the index.html file.

What does this mean for manifest items that do not have direct counterpart as HTML elements or attributes (i.e., in contrast to language)? An example may be the canonical identifier. Let alone more complex metadata that would have to be expressed in its own syntax anyway. I presume the idea would be to use the meta or link elements of HTML. But that raises some problems.

  1. The link types are listed in the HTML spec[1]. The WG would have to, in some cases, define new link types, register them somehow, etc, and retrofit it into the HTML spec, because the link types would be valid for documents that are not meant for WP-s. The same holds for what HTML calls the metadata names for the meta element.
  2. Using the link/meta element (but also the language attribute) for the purpose of a WP is a bit semantically touchy, shall we say. At least in my view, a, say, meta element provides metadata for the enclosing HTML content. As the spec says in [2]: "The meta element can represent document-level metadata with the name attribute," (emphasis is mine). However, if used for a manifest item, it represents WP level metadata. That is very different, and it makes me very uncomfortable (even if, pragmatically, it may work).

There may be other issues as well, like the fact that the meta element does not have a content...

  1. https://www.w3.org/TR/html51/links.html#sec-link-types
  2. https://www.w3.org/TR/html51/document-metadata.html#the-meta-element

@iherman
Member

iherman commented Aug 17, 2017

However... I see a lot of merit in the proposal insofar as making it easy to author simple WP-s. We should not underestimate the power of this.

There is an interesting section in the Web App manifest that does refer to the issue in general. It also contains the following:

Lastly, this specification does not make the standardized solutions found in [HTML] redundant. When members like the name or icons is missing from the manifest, user agents can search in a manifest's owner [HTML] document for things like icons and the application name (or a user agent might even fallback to proprietary tags/metadata, if they are present in a document).

What this tells me is that this may be an avenue to expand on the fallback idea, which we discussed before (like on our last meeting). Just like we said that if the concrete manifest does not have a title, the UA would make an attempt to locate the title in one of the primary resources, can't we expand this in general: something like

  1. We define a syntax for a concrete manifest in some syntax that can be externalized in a separate file.
  2. For some manifest items we would define a counterpart that can be embedded into an HTML file using bona fide (and already specified!) HTML elements and/or attributes. It would not be the goal to be able to express all manifest items in such a manner (think of the semantic issue I raised in my earlier comment on this issue, #35).
  3. The generic fallback is that if the UA does not find a manifest item in the externalized file, it would make an attempt to locate that in a primary resource, like
    1. in a primary resource with a well specified name, like index.html
    2. the first primary resource if the order is in the manifest already
    3. something else tbd.
  4. the externalized manifest has a precedence over the items found elsewhere
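The lookup order in this list can be sketched as a small function. Everything here is hypothetical: the function name, the item names, and the idea that values embedded in HTML have already been extracted into plain dicts.

```python
def resolve(item, external_manifest, embedded_sources):
    """Look up a manifest item, giving the externalized manifest precedence.

    external_manifest: dict parsed from the separate manifest file.
    embedded_sources: dicts of values already extracted from primary
    resources, in fallback order (e.g. index.html, then the first resource).
    """
    if item in external_manifest:
        return external_manifest[item]
    for source in embedded_sources:
        if item in source:
            return source[item]
    return None


# Hypothetical inputs for a simple publication.
external = {"identifier": "urn:example:moby-dick"}
from_index = {"title": "Moby-Dick", "language": "en"}
from_first_resource = {"title": "Loomings", "language": "en"}
fallbacks = [from_index, from_first_resource]

print(resolve("identifier", external, fallbacks))  # from the external manifest
print(resolve("title", external, fallbacks))       # falls back to index.html
print(resolve("publisher", external, fallbacks))   # not found anywhere: None
```

An author of a simple publication could then leave the external manifest nearly empty, while a complex one would state everything there and never hit the fallbacks.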

I guess that would make it possible to rely on the undeniably attractive aspect of the proposal (make it easy for simple cases) but make it possible to express more complex cases. It is a bit of an additional drag on UA-s, but if it makes life easier for publishers, it may be worth it...

I am not sure this line works, but it may be worth exploring it imho.

@laudrain

FWIW, it seems that Service Workers are now "In Development" in WebKit:
https://webkit.org/status/#?search=service

@llemeurfr
Contributor

I also thank Dave and Benjamin for this concrete proposal and prototypes.
I won't repeat a set of comments about some important limitations of the proposal (the need to list essential secondary resources for packaging purposes, the inability of HTML to represent extensible and complex metadata, the difference between a list of primary resources in default reading order and a human-navigable TOC on many occasions), and just add a remark and a question:

remark: this proposal enforces the view that Web Applications and Web Publications are different levels of distribution: a "save to homescreen" feature can be added to a Web Publication, a Web Publication Manifest is not necessarily mixed with a Web Application Manifest. I personally like it.

question: How do Dave/Benjamin represent alternative navigation structures (e.g. list of illustrations)?

@llemeurfr
Contributor

Plus another question: How do Dave/Benjamin satisfy Requirement 21 of https://www.w3.org/TR/pwp-ucr/?

Req. 21: There should be a way to discover that one or more new components have been added to or deleted from a Web Publication.

@llemeurfr
Contributor

llemeurfr commented Aug 17, 2017 via email

@BigBlueHat
Member Author

First, huge thanks to everyone for taking the time to read the explainer and try the demo!

Second, there's no way on earth that I (or Dave...when he gets back) will be able to wrangle, answer, and address issues as they arise (from each of you) in a single thread--here or on email. So. I'd like to propose the following:

  • please file any broadly wpub related issues/conversations/questions on this repo (ex: "Is the ToC sufficient to provide reading order?")
  • please file any demo or explainer related issues on the html-first repo

I hope that's a sensible approach that will keep us from re-stating things and tripping each other up. 😃

Thanks again!
🎩

@JayPanoz

OK, first and foremost, sorry if I’m missing some info, it’s super difficult to keep up as an outsider since pieces are scattered all over the place (including the mailing-list) and it’s sometimes difficult to understand where discussions are going (multiple topics).

What worries me at third sight.

Human-focused. User agents need a list of primary resources and their default ordering, but so do actual users. Most web publications would benefit from a human-readable table of contents. TOCs are crucial for accessibility.

Which is “Design for humans.” in the proposal repo.

Straw man argument. Depends on the human involved.

Jodie Swagger, React.js expert, will find it difficult to read and use, while Johnny Tumbler, good old front-end chap, will prefer that over any other option. And mommy Panoz won’t understand it at all because HTML is gibberish to her.

It’s about comfort, fluency, etc., it’s not absolute.

And, in the proposal repo,

Make authoring easy. […] It’s easy to see what you’re doing, even with tricky things like nesting lists.

Another straw man argument as it completely obfuscates context. It ignores that a lot of human beings use a CMS, which does make authoring easier for them; it also ignores that a lot of devs actually use Markdown, etc.

Also, given sufficiently deep nesting, everything will suck, be it HTML, CSS or JS, even if you’re fluent in those languages. At some point, it’s just about an awful amount of delimiters you can’t process (tags, characters, etc.).

Back to toc.ncx.

Remember how fun this was with the ncx?

Actually, I never had any problem with that, because tools dealt with it for me. On the other hand, when EPUB 3 was released and tools didn’t deal with nav.xhtml yet, it was terrible. I’ve had my fair share of complex publications, and quite frankly, I would use anything other than a monolithic piece of HTML with attributes everywhere. Long story short, I ended up building a Mac applet: drop your EPUB file on its icon, let it build nav.xhtml from toc.ncx and content.opf’s guide and call it a day.

The easiest authoring is the one you don’t have to deal with (e.g. automate). I urge you to take that into account when designing a proposal.

What I find most disturbing though are the cavalier ways in which authors are regularly mentioned. I can hear a lot about them, but cannot read very much from them. The ReadMe in the proposal repo antagonized me to be honest, because it feels like it is using authors as a mere protection, not human beings asked for feedback.

I’ve discovered this proposal this morning, discussed it with other authors this afternoon and all of them didn't know it existed.

What worries me a little bit more.

And we've discovered a huge benefit, as we don't need a list of secondary resources to facilitate offline caching via service workers (see the demo books)!

This is not a huge benefit. Offline storage is complex, it’s not just about service workers, it’s also about DOM storage, persistent storage, indexed database, etc. but that’s another issue.

More importantly, service worker storage is limited:

  • hard limit (you can’t use more);
  • soft limit (user prompt, which Mommy Panoz will deny because it looks like a virus warning).

UA stands for User Agent, which implies authors have to do things responsibly.

As a user,

we don't need a list of secondary resources to facilitate offline caching via service workers!

sounds utterly terrible. I would expect you carefully listed which secondary resources should be cached offline, because I don’t want you to bloat my storage.

Since

We’ve drawn inspiration from Jeremy Keith’s Resilient Web Design.

Please also note there was criticism from the dev community because the whole book was cached offline, as opposed to being progressively cached.

Sorry if that sounds harsh, but quite frankly, some moves are super hostile to authors. If this is going to be EPUB all over again, then I have no interest in caring about this spec. It is not acceptable that authors should be presented with faits accomplis, advocated on their behalf de surcroît.

P.S.: Sorry mom.

@BigBlueHat
Member Author

@JayPanoz first, thanks for being here! Given your experience with Blitz and EPUB "wrangling" I'm certain your contributions will be valuable.

Second, please keep in mind the html-first idea is a proposal. It's nothing like a fait accompli. In fact, it's a proposal for the consideration of a very early-stage W3C Working Group that has only just begun writing the very beginnings of a spec that won't be ready for First Public Working Draft status for some time to come...let alone for it to be published as an official Technical Recommendation. Apologies if that was somehow unclear.

Also, the demo is a demo. 😃 It wasn't ever meant to be a "this is how browsers will do it." The browsers and reading systems will implement their own offline systems/plans/strategies based on whatever this group produces (regardless of its format). The ServiceWorker demo was simply to show that it could be done. No, it's not sufficient. Yes, there are limitations. It's a demo. 😁

You also pointed out some concerns about HTML authoring and/or generating. Those I'd very much like to discuss and address. However, that's probably best done on the html-first repo--with references back here (ideally).

It's early days yet. In every possible way. Hang in there!

@TzviyaSiegman
Contributor

See comments about Web App Manifest's decision to use JSON at #7 (comment)

@tcole3
Contributor

tcole3 commented Aug 21, 2017

Based on this demo / experiment, HTML seems a good serialization option for the ToC. But given that the extent of what might be included in the manifest is still open-ended (see the still-open #15, #20, #21, #22, #23, #29), and given the as yet unclear relationship with Web App Manifest, I do not believe that HTML is an optimum choice for Manifests. At the very least the Manifest will be a superset of the ToC and may contain much more than demonstrated here. As a working decision on serialization, I think JSON is the better choice for Manifests, in part for the reasons cited by Web App Manifest as explained in #7 (as mentioned above); see also the discussion in Appendix A of the Web App Manifest Living Doc itself. Note that JSON can be embedded in HTML, so there is potential to consider the option of conflating an HTML view of the ToC and the Manifest in a single file for simple use cases. And even if a working decision is made to require JSON as the only serialization for Manifests, we could still decide to allow more than one serialization for the ToC.
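As a rough illustration of the "JSON embedded in HTML" option, the Python sketch below reads a JSON block out of an index page that also carries a human-readable `<nav>`. Nothing here is specified anywhere: the `application/ld+json` script type is one common way to embed JSON in HTML, and the `readingOrder` key, the `PAGE` sample and the `embedded_manifest` helper are assumptions made up for the example.

```python
import json
from html.parser import HTMLParser


class EmbeddedJson(HTMLParser):
    """Collect the text of the first <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self.in_json = False   # currently inside the JSON script element
        self.found = False     # a JSON script element has been fully read
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and not self.found:
            if dict(attrs).get("type") == "application/ld+json":
                self.in_json = True

    def handle_data(self, data):
        if self.in_json:
            self.chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.in_json:
            self.in_json = False
            self.found = True


def embedded_manifest(html_doc):
    """Return the embedded JSON as a dict, or None if the page has none."""
    parser = EmbeddedJson()
    parser.feed(html_doc)
    parser.close()
    return json.loads("".join(parser.chunks)) if parser.found else None


# Hypothetical index page: one file holding both the manifest and the ToC.
PAGE = """<html><head>
<script type="application/ld+json">
{"name": "Demo Book", "readingOrder": ["ch1.html", "ch2.html"]}
</script>
</head><body>
<nav><ol><li><a href="ch1.html">One</a></li><li><a href="ch2.html">Two</a></li></ol></nav>
</body></html>"""

print(embedded_manifest(PAGE))
```

A client that only understands JSON can ignore the `<nav>`, while a plain browser ignores the script block and renders the ToC.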

@BigBlueHat
Member Author

At the very least the Manifest will be a superset of the ToC and may contain much more than demonstrated here.

@tcole3 thanks for the input. I would love to hear more about what you feel the "much more" would be.

Currently, this HTML-first approach covers all the MUSTs outlined in the forthcoming "infoset" PR. I've not yet seen something that couldn't be accommodated by the HTML-first approach. That doesn't mean it doesn't exist, however. 😃

@baldurbjarnason
Contributor

After thinking about this for a while I've come to the conclusion that I'm very much against the direction this proposal would take web publications as a format. It's been very valuable in the discussion and has been a useful counter-balance to other proposals. It has helped me enormously in clarifying my own thoughts on the subject. But, in the end, I think the proposal itself is absolutely not the approach we should take.


While HTML is a very powerful and useful format and in many ways underestimated in this day and age, it is very much not easy to author and I think @BigBlueHat and @dauwhe are vastly overstating its author-friendliness. It's actually a huge pain.

HTML is sort of manageable when you have several years of familiarity, but even then most of the time people author HTML to make sure they can minimise their need to author HTML:

They author templates which, when combined with simple data structures and either rich text UIs or minimal markup languages like Markdown, mean you don't have to deal with HTML on a regular basis, and your users never do.

Going from ePub's XML to HTML is only a marginal improvement as HTML's more forgiving parsing isn't going to be much of a benefit when authoring the manifest's data. Not having to deal with namespaces is a plus but, like I wrote above, only a marginal improvement.

Add all of that to the information in @TzviyaSiegman's comment #7 (comment), the issues I raised earlier in #35 (comment) (i.e. relying on implicit browser behaviours substantially disadvantages a number of really useful edge cases), and other concerns people have raised, and the case against HTML as a manifest format becomes quite strong from my perspective.


This proposal has weaknesses of its own that are not inherent to HTML as a general approach for the manifest. Its reliance on the browser engine to fill in the gaps of the manifest is going to be risky in practice, in addition to diminishing the format's overall usefulness for less linear publications.

The approach:

  • Disadvantages server-based clients who would have to reimplement a bunch of browser behaviours to get at the same data as the browser does and even then it'll often be an approximation at best. While server-based crawlers or clients are not going to be the primary consumers of the format, making things easy for them has a halo effect as that broadens the value of the format beyond strictly reading and can help enormously with discovery.

  • Makes testing harder as it relies on browser behaviours that can vary dramatically from platform to platform (i.e. what it fetched and thus cached is going to vary based on the browser and whether or not you are on a mobile OS). If testing is an ease of authorship issue, and I think it is, the direction this proposal takes us towards would make authorship harder, even when compared to other possible ways of using HTML as a serialisation format.

  • As @HadrienGardeur has mentioned it also disadvantages browser-like environments that do not have access to all of the features of a fully fledged browser. WebViews are not guaranteed to get Service Worker support, as an example, which means that clients built using them will have to implement their own caching behaviours. Those behaviours will differ from clients that use Service Workers and the end result will not be easily predictable by authors without testing.

Many of the ambiguities here are a consequence of caching in this proposal being an incidental side effect of pre-rendering the chapters, as opposed to being a pre-defined algorithm that happens to be implemented as a Service Worker. And how you are supposed to derive a set of unique and normalised URLs from the ToC is also ambiguous, since it relies on the DOM for href value normalisation (which doesn't handle fragments IIRC).
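To make the ambiguity concrete, here is a Python sketch of one possible explicit algorithm for turning the first `<nav>` of an index file into a list of primary resources. It is not what the proposal or its demos do; it assumes fragments are dropped and duplicates are removed in document order, which is exactly the kind of choice a spec would have to state. The names `FirstNavLinks` and `primary_resources` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class FirstNavLinks(HTMLParser):
    """Collect href values of <a> elements inside the first <nav> only."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth inside the first <nav>
        self.done = False   # True once the first <nav> has been closed
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "nav":
            self.depth += 1
        elif tag == "a" and self.depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "nav" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True


def primary_resources(index_html, base_url):
    """Resolve, de-fragment and de-duplicate the ToC links, keeping order."""
    parser = FirstNavLinks()
    parser.feed(index_html)
    parser.close()
    seen, ordered = set(), []
    for href in parser.hrefs:
        # Assumption: "ch1.html" and "ch1.html#s2" name the same resource.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute not in seen:
            seen.add(absolute)
            ordered.append(absolute)
    return ordered
```

Because this runs without a DOM, a server-side crawler and a browser-based client could share the same rule, provided the rule is written down somewhere.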

To resolve these ambiguities you need to normalise the data you get from the HTML to clearly defined and specified data structures (primary resources, secondary resources, reading order, basic metadata). And in the web world mapping HTML to data structures the browser is supposed to act on means specifying those structures as a DOM API. Which in turn means we'd need to define how variations in the HTML get interpreted as a consistent API across the web platform. And then we need to tackle the whole WebIDL thing and we're already well into the weeds.

Immediately, we are in for a much more complicated format (and API) to specify and implement than with any of the other approaches that have been proposed. And once you add in testing and HTML's inherent complexity when used for basic data structures I am strongly of the opinion that this makes authorship harder in too many cases for me to be comfortable.

I don't think this is simpler overall than other proposals I've seen so far. Once you add in the inherent complexities common to all SGML-style markup formats and the work we need to do to remove the built-in ambiguities here, we get to 'just as complicated' at best for all parties involved.

When you're dealing with a small number of flat lists of simple objects (like most of the bits in the manifest so far except for the ToC) JSON is, in my not so humble opinion, easier to teach, author, and consume, all else being equal.

So, based on all of the discussions so far my ideal—striking a balance between flexibility, complexity, ease of use, and ease of implementation—would be:

  • Defined using JSON-LD but with JSON-LD processing being completely optional for the client (i.e. follow the example of Activity Streams 2.0).
  • Mostly a collection of flat lists of very simple objects or simple key-value structures.
  • Individually, every feature is optional (as much as is possible without compromising accessibility).
  • Has a very, very narrow range of built-in terms and properties.
  • Complex metadata is delegated to pre-existing specialised formats external to the manifest itself.
  • The ToC is an HTML file but a simple ToC can be generated from the provided titles of the files in the reading order.
  • Primary resources and, if present, secondary resources are cached at the user's discretion. If no secondary resources are listed the consequences are left up to the UA.
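A small Python sketch of what the list above could look like in practice: a manifest made of flat lists of simple objects, and a fallback ToC generated from the titles in the reading order. The key names (`name`, `readingOrder`, `resources`, `url`, `title`) and the `simple_toc` helper are hypothetical and not taken from any draft.

```python
from html import escape

# Hypothetical manifest: flat lists of very simple objects.
manifest = {
    "name": "Demo Book",
    "readingOrder": [
        {"url": "ch1.html", "title": "Chapter 1"},
        {"url": "ch2.html", "title": "Chapter 2 & Notes"},
    ],
    "resources": ["style.css"],  # optional secondary resources
}


def simple_toc(manifest):
    """Generate a minimal HTML ToC from the titles in the reading order."""
    items = "".join(
        f'<li><a href="{escape(entry["url"], quote=True)}">'
        # Fall back to the URL when an entry has no title.
        f'{escape(entry.get("title") or entry["url"])}</a></li>'
        for entry in manifest.get("readingOrder", [])
    )
    return f"<nav><ol>{items}</ol></nav>"


print(simple_toc(manifest))
```

An author who wants a richer, nested ToC would still supply their own HTML file; the generated list only covers the simple case.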

All of the above is irrespective of whether the HTML files are used as fallbacks when data is missing from the manifest which is an idea that I firmly support. Specifically, making single HTML publications as easy to author as is possible is a worthwhile goal whose benefits would be far-reaching. But I also think that the simplest way of making that work reliably is by defining a JSON format first that has clearly defined HTML fallbacks.
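One way to picture "a JSON format first that has clearly defined HTML fallbacks" is the Python sketch below, under the assumption that the publication name falls back to the HTML `<title>` when the manifest omits it. The `name` key and the `publication_name` helper are hypothetical; a real spec would define its own fallback rules.

```python
from html.parser import HTMLParser


class TitleReader(HTMLParser):
    """Capture the text content of the document's <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def publication_name(manifest, html_doc):
    """Prefer the manifest's name; otherwise fall back to the HTML <title>."""
    name = manifest.get("name")
    if name:
        return name
    reader = TitleReader()
    reader.feed(html_doc)
    reader.close()
    return reader.title.strip() or None
```

With a rule like this, a single HTML file and an empty manifest still yield a usable name, and the result does not depend on which client reads it.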

Once you have that then it becomes almost trivial to predictably and automatically transform any given single HTML file into a useful and accessible web publication, as long as the file has no javascript or is of a strict subset like AMP. That capability on its own dramatically increases the usefulness of every single tool, library, app, script or service that is built to support web publications.

@iherman
Member

iherman commented Aug 22, 2017

Whereas I agree with most of @baldurbjarnason's comments, I think I disagree with some of the final conclusions. I made a separate set of comments in #32 (comment) which, for the time being, make me believe that we should start with the Web App Manifest, extend it for our needs (which is possible), and thereby rely on the work being done around that spec on how WP management could be incorporated into the browser world. Otherwise we will have to reinvent the wheel.

Just for good order, I believe this discussion should take place on issue #7, though, and not here.

@iherman
Member

iherman commented Mar 13, 2018

@iherman iherman closed this as completed Mar 13, 2018