Looking for IIIF 3.0 annotation example #186

Open
joesong168 opened this issue Sep 20, 2021 · 11 comments

joesong168 commented Sep 20, 2021

I've tried the following AnnotationPage with no luck:

{
    "id": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8/ap/c",
    "type": "AnnotationPage",
    "items": [
        {
            "id": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8/ap/c/a/043a7bb7-d77b-44bd-9517-71bf6f551a1a",
            "type": "Annotation",
            "motivation": "supplementing",
            "body": {
                "type": "TextualBody",
                "value": "authorA",
                "format": "text/plain",
                "language": "zh-Hants"
            },
            "target": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8#xywh=958,5101,493,493"
        },
        {
            "id": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8/ap/c/a/24453f8f-9c8b-4c82-8ae9-ffa8a779f8a6",
            "type": "Annotation",
            "motivation": "supplementing",
            "body": {
                "type": "TextualBody",
                "value": "authorB",
                "format": "text/plain",
                "language": "zh-Hants"
            },
            "target": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8#xywh=968,4544,490,462"
        }
    ],
    "@context": "http://iiif.io/api/p/3/context.json"
}

jbaiter self-assigned this on Sep 29, 2021
jbaiter added the enhancement (New feature or request) label on Sep 29, 2021

jbaiter (Member) commented Sep 29, 2021

Thanks for reporting. I think I've only really tested it with v2 annotations so far, although the code already has paths for v3 support. It should probably only be a matter of fixing some small bugs and inconsistencies.
Can you also provide a Manifest URL for your fixture so I can test it end-to-end?

@HenryH09

Hello,
Have you had time to look into this?
I am also having issues with the Presentation API v3 semantics for OCR data (supplementing annotations): nothing is displayed, not even the text overlay toolbox. You can find a complete manifest example in the official IIIF Cookbook:
https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_1-manifest.json
(https://iiif.io/api/cookbook/recipe/0068-newspaper/)

jbaiter (Member) commented Oct 18, 2021

Thanks for providing the full manifest. I'll try to find time to work on this, either this week or next.

sauterl commented Oct 28, 2021

Hi there,

In the hope of helping @jbaiter find and fix the culprits faster, here are some findings from debugging what goes wrong when using IIIF 3.0:

  • The plugin assumes the existence of a resources array on annotations, which was renamed to items in IIIF 3.0. See the corresponding definition.
  • In [saga.js:83], an @id lookup is hard-coded (IIIF v2 only); it should be replaced by seeAlso.id ?? seeAlso['@id'].
  • fetchExternalAnnotationResources is also very specific to IIIF v2, in particular its use of resources and resource instead of items and body.
  • Last but not least, as far as I understand the IIIF v3 documentation, annotations with the supplementing motivation can also link directly to external OCR resources (e.g. an hOCR file). However, processTextsFromAnnotations assumes the annotation contains the OCR content directly, i.e. the condition anno.motivation === 'supplementing' is too weak for this particular use.
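
To make the v2/v3 differences listed above concrete, here is a minimal sketch of version-agnostic accessors. The helper names (getId, getAnnotations, getBody, getText) are hypothetical and not part of mirador-textoverlay; they only illustrate the items/resources, body/resource and id/@id fallbacks.

```javascript
// Hypothetical helpers (NOT the plugin's actual code) that read
// IIIF Presentation v2 and v3 structures through one interface.

// v3 uses "id", v2 uses "@id"; a bare string is treated as the id itself.
function getId(resource) {
  if (typeof resource === 'string') return resource;
  return resource?.id ?? resource?.['@id'];
}

// v3 AnnotationPage has "items", v2 AnnotationList has "resources".
function getAnnotations(page) {
  return page?.items ?? page?.resources ?? [];
}

// v3 Annotation has "body", v2 Annotation has "resource".
function getBody(anno) {
  return anno?.body ?? anno?.resource;
}

// v3 TextualBody has "value", v2 cnt:ContentAsText has "chars".
function getText(body) {
  return body?.value ?? body?.chars;
}

// Usage: the same loop works for both versions.
function collectTexts(page) {
  return getAnnotations(page).map((anno) => ({
    id: getId(anno),
    text: getText(getBody(anno)),
  }));
}
```

The fallback order prefers the v3 property, so a document that carries both spellings is read as v3.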

sauterl commented Oct 28, 2021

I'm not entirely sure whether this is proper IIIF v3 usage, so please bear with me, but when adding a seeAlso on each canvas with an id pointing to an external OCR resource (i.e. hOCR or ALTO), the plugin works as intended with the fixes outlined in my previous comment.

In particular, what did the trick for our use case and manifest was tweaking the IIIF v3 condition.

@stone12379 For your use case, I guess the findings from the previous comment should already help a lot.
However, as far as I can tell, more robust IIIF v3 support requires a stricter condition in processTextsFromAnnotations (as written above) to filter out external OCR resources, which it currently does not do.
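
As an illustration of what such a stricter condition could look like, the sketch below accepts only supplementing annotations whose body carries the text inline. The function name hasInlineText is hypothetical, and the check (type TextualBody with a string value) is an assumption about what counts as "direct OCR content", not the plugin's actual logic.

```javascript
// Hypothetical stricter check (NOT the plugin's actual condition):
// accept an annotation only if it is "supplementing" AND at least one
// body embeds its text directly as a TextualBody.
function hasInlineText(anno) {
  // "motivation" and "body" may each be a single value or an array.
  const motivations = [].concat(anno.motivation ?? []);
  if (!motivations.includes('supplementing')) return false;

  const bodies = [].concat(anno.body ?? []);
  return bodies.some(
    (body) =>
      body !== null &&
      typeof body === 'object' &&
      body.type === 'TextualBody' &&
      typeof body.value === 'string'
  );
}

// Inline text, as in the AnnotationPage from the original report.
const inlineAnno = {
  type: 'Annotation',
  motivation: 'supplementing',
  body: { type: 'TextualBody', value: 'authorA', format: 'text/plain' },
};

// Link to an external OCR file (the URL is a made-up placeholder).
const externalAnno = {
  type: 'Annotation',
  motivation: 'supplementing',
  body: { id: 'https://example.org/page1.hocr', type: 'Text', format: 'text/vnd.hocr+html' },
};

console.log(hasInlineText(inlineAnno), hasInlineText(externalAnno));
```

Annotations rejected by this check could then be routed to the code path that fetches and parses external OCR files instead of being dropped.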

sauterl pushed a commit to sauterl/mirador-textoverlay that referenced this issue Oct 30, 2021

jbaiter (Member) commented Jun 22, 2022

So, some long overdue updates on this front. Sorry it took so long, and thanks to everybody for the feedback!

The example from the IIIF Cookbook now renders the annotations, but:

  • By default, it will use the ALTO in seeAlso for rendering ('proper' OCR is always preferred to annotations in this plugin).
  • The annotations are not line-level, so text rendering is pretty much broken by design: we rely on the text being structured at least into lines for some rendering hints that make text selection in SVG work. Additionally, the segmentation in the annotations is not even at the word level; some annotations contain parts of multiple words.
  • The annotations do not match the canvas, so the overlay does not match the underlying image. For example, the first word annotation is 84 at xywh=182,476,59,43, but it's actually at approximately xywh=143,377,51,39.
    I assume this happens because the annotations were generated 1:1 from the ALTO, which targets a 4562x6282 image, while the IIIF Canvas is 3602x5000.
    The plugin scales down the coordinates when it renders text from the ALTO XML, so that renders just fine. As per the spec, annotations are always relative to the dimensions of the canvas they target, so this adjustment is not done for annotations.
    tl;dr: The annotations in the cookbook example are broken and should be fixed.
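
The coordinate mismatch described above can be reproduced with a small sketch that parses an xywh fragment and rescales it from the OCR image's pixel space into canvas space. The helper names parseXywh and scaleBox are hypothetical, the target URL is a placeholder, and the canvas height of 5000 is taken from the sizes reported later in this thread.

```javascript
// Hypothetical helpers (NOT the plugin's actual code).

// Extract the xywh media fragment from an annotation target string.
function parseXywh(target) {
  const match = /#xywh=(?:pixel:)?([\d.]+),([\d.]+),([\d.]+),([\d.]+)/.exec(target);
  if (!match) return null;
  const [x, y, w, h] = match.slice(1).map(Number);
  return { x, y, w, h };
}

// Rescale a box from the source image's dimensions to the canvas dimensions.
function scaleBox(box, source, canvas) {
  const sx = canvas.width / source.width;
  const sy = canvas.height / source.height;
  return { x: box.x * sx, y: box.y * sy, w: box.w * sx, h: box.h * sy };
}

// The first word annotation from the cookbook example, as quoted above.
const box = parseXywh('https://example.org/canvas/p1#xywh=182,476,59,43');
const scaled = scaleBox(
  box,
  { width: 4562, height: 6282 }, // image size targeted by the ALTO
  { width: 3602, height: 5000 }  // canvas size (assumed, see lead-in)
);
console.log(scaled);
```

With these inputs the scaled x and y come out at roughly 143.7 and 378.9, close to the position observed on the canvas, which is consistent with the explanation that the annotation coordinates were copied from the ALTO without rescaling.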

For comparison, here are two screenshots, one showing the text rendering from the ALTO and one with the annotations:

[Screenshot: ALTO text]

[Screenshot: Annotation text]

I have pushed my changes to the iiifv3 branch. Could you please test this version with your manifests, @sauterl @joesong168?

@glenrobson

Hi @jbaiter, we've recently updated the Newspaper recipe with the following changes:

  • (hopefully) Fixed the annotations
  • Moved the Alto to rendering rather than seeAlso
  • Changed the target of the annotations to include a link to the Manifest

Let us know if you spot any further problems.

Also, would it be possible to add an iiif-content parameter to your demo, so that we can pass in a manifest and link from the cookbook to your plugin? For reference, Mirador uses the following:

https://projectmirador.org/embed/?iiif-content=https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_title-collection.json
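
For illustration, reading such a parameter in a demo page could look like the sketch below. The function name getManifestUrl is hypothetical, and the sketch handles only the simple case shown above, where the parameter value is a plain manifest or collection URL.

```javascript
// Hypothetical helper (NOT the demo's actual code): read the
// iiif-content query parameter from a page URL.
function getManifestUrl(href) {
  const value = new URL(href).searchParams.get('iiif-content');
  // Treat a missing or empty parameter the same way.
  return value || null;
}

const example =
  'https://projectmirador.org/embed/?iiif-content=https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_title-collection.json';
console.log(getManifestUrl(example));
```

In a browser, the page's own address (window.location.href) would be passed in, and the result handed to the viewer as the manifest to load.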

jbaiter (Member) commented Aug 8, 2023

Thanks for letting me know! I've updated the code to also look at rendering to discover referenced OCR files and fixed some other IIIF3 stuff related to annotations along the way.
The iiif-content parameter is now included in the demo as well.
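
As a rough sketch of what discovering OCR files from both seeAlso and rendering can look like: the function names and the matching heuristics below (an ALTO or hOCR hint in profile, or an hOCR media type in format) are assumptions for illustration, not the plugin's actual rules, and the URLs are placeholders.

```javascript
// Hypothetical heuristic (NOT the plugin's actual rules): does a linked
// resource look like an OCR file?
function isOcrLink(res) {
  const format = typeof res.format === 'string' ? res.format : '';
  const profile = typeof res.profile === 'string' ? res.profile : '';
  return format === 'text/vnd.hocr+html' || /alto/i.test(profile) || /hocr/i.test(profile);
}

// Collect OCR candidates from both "seeAlso" and "rendering" on a canvas.
function findOcrLinks(canvas) {
  const candidates = [].concat(canvas.seeAlso ?? [], canvas.rendering ?? []);
  return candidates
    .filter((res) => res !== null && typeof res === 'object' && isOcrLink(res))
    .map((res) => ({ id: res.id ?? res['@id'], format: res.format, profile: res.profile }));
}

// Example canvas with made-up URLs.
const canvas = {
  id: 'https://example.org/canvas/p1',
  type: 'Canvas',
  rendering: [
    {
      id: 'https://example.org/ocr/p1.xml',
      type: 'Text',
      format: 'application/xml',
      profile: 'http://www.loc.gov/standards/alto/',
    },
    { id: 'https://example.org/p1.pdf', type: 'Text', format: 'application/pdf' },
  ],
};
console.log(findOcrLinks(canvas));
```

Checking both properties through one function means a manifest that moves its OCR link from seeAlso to rendering, as the cookbook recipe did, keeps working.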

The ALTO from the Cookbook example, however, doesn't fully match up with the Canvas anymore, something's off:

[Screenshot]

https://iiifv3--mirador-textoverlay.netlify.app/?iiif-content=https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_2-manifest.json

@glenrobson

Thanks for including the iiif-content link and looking at the rendering! I'll see if I can figure out what is going on with the ALTO. It was generated using Tesseract, but maybe I used the wrong-sized image or something.

jbaiter (Member) commented Aug 9, 2023

Found the problem: There's a mismatch between the Canvas size and the Image and OCR size:

  • Canvas: 3602x5000px
  • Image: 3517x5000px
  • OCR: 3517x5000px
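
A mismatch like this can be caught with a simple aspect-ratio comparison. The sketch below is illustrative only: the function name sizesMatch and the 0.5% tolerance are assumptions. It compares ratios rather than absolute sizes because a canvas and the image painted onto it may legitimately differ in pixel dimensions as long as the proportions agree.

```javascript
// Hypothetical check (NOT part of the plugin): do two sizes share the
// same aspect ratio within a relative tolerance?
function sizesMatch(a, b, tolerance = 0.005) {
  const ratioA = a.width / a.height;
  const ratioB = b.width / b.height;
  return Math.abs(ratioA - ratioB) / ratioA <= tolerance;
}

const canvasSize = { width: 3602, height: 5000 };
const imageSize = { width: 3517, height: 5000 };
const ocrSize = { width: 3517, height: 5000 };

console.log(sizesMatch(canvasSize, imageSize)); // canvas vs. image
console.log(sizesMatch(imageSize, ocrSize));    // image vs. OCR
```

For the sizes listed above, the canvas and image ratios differ by a bit more than 2%, so the first check fails while the image and OCR sizes agree.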

@glenrobson

That is weird! But thank you, I'll look at updating the ALTO (and annotations).
