Fixing book abbreviations in <ref> content for entries beginning with… #109


@cbearden cbearden commented Sep 9, 2021

… α; I identified them with a script that compares book abbreviations against the list on p. xii of the lexicon; in each case I verified the spelling of the abbreviation in the text. Many more fixes to come. The file validates. See issue #107.

@destatez
Contributor

destatez commented Sep 9, 2021 via email

@cbearden
Member Author

cbearden commented Sep 10, 2021

Hi Dave,

I made that change because the original text doesn't specify the range, but rather the particular verses 1 John 3:6,8,9 (see attached screenshot). In the Greek texts I have, verse 7 doesn't include a form of ἁμαρτάνω, so giving the individual verses instead of the range reflects both the dictionary and the Greek text. I suppose we could give

<ref osisRef="1John.3.6">3:6</ref>,<ref osisRef="1John.3.8-9">8,9</ref>

if we thought it was important to convert comma-separated sequences of verses into ranges. That hasn't been consistently done so far, but I do see it sometimes. My preference though is to stick closely to the form of the original text. I think the markup under ἁμαρτωλός on line 4109 is wrong, since the text has "Mt 9^10,11,12". Even if you want a range in the @osisRef, the content of the ref element should be what the original has. I would go so far as to say the verse numbers should all be marked up as superscript, with no colon between chapter and verse, but it's a bit late for that.

[Attached screenshot: hamartano]
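
The idea floated in the comment above, keeping the original verse list as the element content while letting the @osisRef carry a range, can be sketched roughly as follows. This is only an illustration: the function name `osis_refs_for_verses` is hypothetical, and the short `8-9` range form simply mirrors the snippet in the comment. The OSIS specification may expect the end point of a range to be written out in full, so treat the output format as an assumption.

```python
def osis_refs_for_verses(book, chapter, verses):
    """Group verse numbers into consecutive runs and format one ref per run.

    The "start-end" shorthand mirrors the snippet in the comment above;
    it is an assumption, not a confirmed project convention.
    """
    runs = []
    for verse in sorted(set(verses)):
        if runs and verse == runs[-1][1] + 1:
            runs[-1][1] = verse          # extend the current run
        else:
            runs.append([verse, verse])  # start a new run

    refs = []
    for start, end in runs:
        if start == end:
            refs.append(f"{book}.{chapter}.{start}")
        else:
            refs.append(f"{book}.{chapter}.{start}-{end}")
    return refs


if __name__ == "__main__":
    print(osis_refs_for_verses("1John", 3, [6, 8, 9]))
    print(osis_refs_for_verses("Matt", 9, [10, 11, 12]))
```

Only the attribute value is computed here; the visible text of each ref would stay exactly as printed in the dictionary.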

@destatez
Contributor

destatez commented Sep 11, 2021 via email

@cbearden
Member Author

Dave,

I appreciate your understanding. It’s a large project, and a number of different people and groups have worked on it at various times. They have all had different priorities for the markup and different use cases. I imagine a case can be made for normalizing the text in some way to make it more useful in a particular context. I have my own preferences, but we’ve needed everybody’s contribution. I can remember when large swathes of the dictionary were just OCR full of errors and no markup. I’m grateful that we’ve gotten as far as we have!

Chuck

@destatez
Contributor

destatez commented Sep 11, 2021 via email

@jonathanrobie
Contributor

jonathanrobie commented Sep 11, 2021 via email

@destatez
Contributor

destatez commented Sep 11, 2021 via email

@cbearden
Member Author

Sounds like we are agreeing that we should not change the surface text for references.

Let me restate my understanding of what you are saying here: We should not depart from the content of the original document for references. Does "surface text" here refer to the Abbott-Smith original?

Can we agree that any changes to the surface text should be restricted to notes, contained in markup? Of course, we have free rein with attributes.

If I got "surface text" right, then I'm on board.

@jonathanrobie
Contributor

jonathanrobie commented Sep 11, 2021 via email

@destatez
Contributor

destatez commented Sep 11, 2021 via email

@cbearden
Member Author

Excellent, Dave & Jonathan!

@cbearden
Member Author

Jonathan, there are a good many changes in the ref element content to come (then I'll turn to the @osisRef attribute values and to discrepancies between element content & attribute values). Do you want me to push them to this branch/pull request in batches, so that you aren't reviewing so many at once, or do you prefer to wait until I've made all the needed changes I've identified? I can go either way.

@jonathanrobie
Contributor

jonathanrobie commented Sep 11, 2021 via email

@cbearden
Member Author

Okay, in that case hold off on this pull request. There are probably about 135 more fixes to book abbreviations in ref element content, that is, cases where the book abbreviation in the element doesn't match one of the abbreviations on p. xii of the dictionary. What I do is check the reference in the XML against the PDF and correct it if the XML doesn't match the PDF. There have been two cases where A-S misspells the abbreviation:

  • "II Tim" in ἀγαπάω|G25
  • "Lu" in ἄρτος|G740

and I left those refs unchanged and added a comment <!-- sic abbrev --> for any people working on the XML in the future. The attached file has the remaining cases where my script found an abbreviation that doesn't match the abbrevs list, and I'll check them against the PDF.
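
A check of this kind might look roughly like the sketch below. It is not the script used for this pull request. The abbreviation set is a small placeholder rather than the actual list on p. xii, the regular expression is a guess at how a leading book abbreviation appears in ref content, and `find_mismatches` is a hypothetical name. Element names are matched without regard to namespace, which is also an assumption about the file.

```python
import re
import xml.etree.ElementTree as ET

# Placeholder subset for illustration; the real list is the one in the lexicon.
KNOWN_ABBREVS = {"Mt", "Mk", "Lk", "Ro"}

# Leading book abbreviation: optional I/II/III prefix, letters, then a chapter number.
BOOK_RE = re.compile(r"^\s*((?:I{1,3}\s+)?[A-Za-z]+)\s+\d")


def find_mismatches(xml_text, known=KNOWN_ABBREVS):
    """Return (abbrev, osisRef, text) for refs whose abbreviation is not in `known`."""
    root = ET.fromstring(xml_text)
    mismatches = []
    for el in root.iter():
        # Compare the local name only, so a namespaced <ref> is also found.
        if el.tag.rsplit("}", 1)[-1] != "ref":
            continue
        text = "".join(el.itertext())
        match = BOOK_RE.match(text)
        if match is None:
            continue  # no book abbreviation, e.g. a continuation such as "8,9"
        abbrev = " ".join(match.group(1).split())  # normalize inner whitespace
        if abbrev not in known:
            mismatches.append((abbrev, el.get("osisRef"), text))
    return mismatches
```

Each reported mismatch would still need to be checked by hand against the PDF, since a flagged abbreviation may be a genuine misspelling in the printed text (the sic cases above) rather than an error in the XML.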

Fixing these abbreviations, and ensuring that all the @osisRef abbreviations match the OSIS ID standard (the next stage), will enable me to compare the two to look for discrepancies (the third stage). Does this make sense?

mismatches.txt
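
The third stage described above, comparing ref content with the @osisRef value, could be sketched along these lines. The mapping table is a hypothetical fragment (the lexicon-side abbreviations in it are assumptions, not taken from the abbreviation list), the function names are invented for illustration, and only the first chapter and verse in the content are compared with the start of the attribute value.

```python
import re

# Hypothetical partial mapping from lexicon abbreviations to OSIS book IDs.
ABBREV_TO_OSIS = {"Mt": "Matt", "Lk": "Luke"}

# Book abbreviation, chapter, and first verse at the start of the ref content.
TEXT_RE = re.compile(r"^\s*((?:I{1,3}\s+)?[A-Za-z]+)\s+(\d+):(\d+)")


def expected_osis_start(text, table=ABBREV_TO_OSIS):
    """Build the OSIS ID implied by the ref content, or None if it cannot be parsed."""
    match = TEXT_RE.match(text)
    if match is None:
        return None
    abbrev = " ".join(match.group(1).split())
    book = table.get(abbrev)
    if book is None:
        return None
    return f"{book}.{int(match.group(2))}.{int(match.group(3))}"


def is_discrepancy(text, osis_ref):
    """True when the content clearly points somewhere other than the osisRef start."""
    expected = expected_osis_start(text)
    if expected is None:
        return False  # cannot judge; leave for manual review
    parts = osis_ref.split()
    first = parts[0].split("-")[0] if parts else ""
    return first != expected
```

Refs that the sketch cannot parse are deliberately not reported, so a separate pass would be needed to list those for manual review.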

@jonathanrobie
Contributor

jonathanrobie commented Sep 11, 2021 via email

…beginning with ε. Went back & picked up some for II Jn & III Jn which my script doesn't presently detect.
…, and κ; caught a few back in α; the file is valid against the schema.
… φ, χ, ψ; these are the last text abbreviations that don't match the AS list; file validates.
@jonathanrobie jonathanrobie merged commit 2bf7cee into translatable-exegetical-tools:master Oct 18, 2021