Fixing book abbreviations in <ref> content for entries beginning with… #109

cbearden · 2021-09-09T02:11:08Z

… α; I identified them with a script that compares book abbreviations against the list on p. xii of the lexicon; in each case I verified the spelling of the abbreviation in the text. Many more fixes to come. The file validates. See issue #107.


destatez · 2021-09-09T04:21:23Z

Charles,

I took a look at the diffs on this. Everything looks good except for line 4059, which is a ‘lengthy’ one. I have both old and new versions shown below. The problem is that when you expanded the ‘range’ in the old version into the consecutive verse references, you left out 1John 3:7.

Sorry about the format. My tablet and Chrome have a few issues when pasting in from spreadsheets.

Dave

Old: <ref osisRef="1John.3.6-1John.3.9">Jn 3:6-9</ref>

New: <ref osisRef="1John.3.6">3:6</ref>,<ref osisRef="1John.3.8">8</ref><ref osisRef="1John.3.9">9</ref>


cbearden · 2021-09-10T22:13:57Z

Hi Dave,

I made that change because the original text doesn't specify the range, but rather the particular verses 1 John 3:6,8,9 (see attached screenshot: https://user-images.githubusercontent.com/427030/132922318-c65a7337-f412-463b-84a8-b68ead4650e5.jpeg). In the Greek texts I have, verse 7 doesn't include a form of ἁμαρτάνω, so giving the individual verses instead of the range reflects both the dictionary and the Greek text. I suppose we could give

<ref osisRef="1John.3.6">3:6</ref>,<ref osisRef="1John.3.8-1John.3.9">8,9</ref>

if we thought it was important to convert comma-separated sequences of verses into ranges. That hasn't been consistently done so far, but I do see it sometimes. My preference though is to stick closely to the form of the original text. I think the markup under ἁμαρτωλός on line 4109 is wrong, since the text has "Mt 9^10,11,12". Even if you want a range in the @osisRef, the content of the ref element should be what the original has. I would go so far as to say the verse numbers should all be marked up as superscript, with no colon between chapter and verse, but it's a bit late for that.
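For anyone who wants to check this kind of case mechanically, here is a minimal Python sketch, not part of the project's tooling. The helper name `expand_osis_range` is hypothetical, and it only handles ranges that stay within one chapter. It shows that the old range covers a verse that the verse list in the dictionary does not.

```python
def expand_osis_range(osis_ref: str) -> list[str]:
    """Expand a same-chapter OSIS range into single-verse references."""
    if "-" not in osis_ref:
        return [osis_ref]
    start, end = osis_ref.split("-", 1)
    book, chapter, first = start.rsplit(".", 2)
    end_book, end_chapter, last = end.rsplit(".", 2)
    if (book, chapter) != (end_book, end_chapter):
        # Assumption of this sketch: cross-chapter ranges are out of scope.
        raise ValueError("only same-chapter ranges are handled in this sketch")
    return [f"{book}.{chapter}.{v}" for v in range(int(first), int(last) + 1)]


# Old markup used a range; the new markup lists the verses the dictionary gives.
old = expand_osis_range("1John.3.6-1John.3.9")
new = ["1John.3.6", "1John.3.8", "1John.3.9"]

# Verses implied by the range but absent from the dictionary's list.
extra = sorted(set(old) - set(new))
print(extra)  # ['1John.3.7']
```

A check like this only flags the difference; whether the range or the list is right still has to be decided against the PDF.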

destatez · 2021-09-11T01:07:15Z

Charles,

Glad you went back to the source, and using its syntax with comma separation is the best solution. Glad you found this. In our Phase I work we missed the fact that the original did not include verse 7 and we had it as a range. Our lexicon file was created with that range and I will have to get an Issue opened to make sure it’s fixed in our repo.

Dave


cbearden · 2021-09-11T14:49:16Z

Dave,

I appreciate your understanding. It’s a large project, and a number of different people and groups have worked on it at various times. They have all had different priorities for the markup and different use cases. I imagine a case can be made for normalizing the text in some way to make it more useful in a particular context. I have my own preferences, but we’ve needed everybody’s contribution. I can remember when large swathes of the dictionary were just OCR full of errors and no markup. I’m grateful that we’ve gotten as far as we have!

Chuck

destatez · 2021-09-11T15:10:32Z

Charles,

I sure am glad for you guys' work on compliance with the A-S PDF, and also for your compliance checks against all of the standards used in the XML. Right now UnfoldingWord has the editors and the checker for en_UGL ignoring the references. I want to stay up on the mods you are making so that they get worked into the en_UGL. I may have to go back and review all of the TeT Issues to see if there need to be equivalent changes to our files. Are there any that stand out in your memory that resulted in change(s) to references, content and/or format?

Todd Price had it right when he was emphasizing our Phase I activity to make sure that we would be working from a stable base for Phase II. You guys need that as well. I’m glad that in this sense we are all working towards the same goal.

Dave


jonathanrobie · 2021-09-11T15:24:52Z

Sounds like we are agreeing that we should not change the surface text for references. Can we agree that any changes to the surface text should be restricted to notes, contained in markup? Of course, we have free rein with attributes.

destatez · 2021-09-11T15:46:09Z

Jonathan,

I agree with your plan of attack. You guys are much more knowledgeable about all the underlying standards on this, so it's your call on the best way to update the XML. We are no longer using the XML. All the en_UGL files were created from it quite a good while back. That’s why I want to do a little review of your Issues and resolutions to determine if/which ones need to be manually worked into our files. We have asked our editors to refer back to the PDF if they have questions on A-S content.

Dave

cbearden · 2021-09-11T16:39:53Z

> Sounds like we are agreeing that we should not change the surface text for references.

Let me restate my understanding of what you are saying here: We should not depart from the content of the original document for references. "surface text" here refers to the Abbott-Smith original?

> Can we agree that any changes to the surface text should be restricted to notes, contained in markup? Of course, we have free rein with attributes.

If I got "surface text" right, then I'm on board.

jonathanrobie · 2021-09-11T17:13:25Z

Yes, "surface text" means the Abbott-Smith original. Jonathan


destatez · 2021-09-11T17:25:08Z

Guys,

I got to thinking about this after my last response. If surface text refers to the XML that we worked on, the answer would be NO! Since I've seen your definition as the PDF, the answer is YES!

Dave


cbearden · 2021-09-11T18:52:41Z

Excellent, Dave & Jonathan!

cbearden · 2021-09-11T18:55:54Z

Jonathan, there are a good many changes in the ref element content to come (then I'll turn to the @osisRef attribute values and to discrepancies between element content & attribute values). Do you want me to push them to this branch/pull request in batches, so that you aren't reviewing so many at once, or do you prefer to wait until I've made all the needed changes I've identified? I can go either way.

jonathanrobie · 2021-09-11T18:58:39Z

I would rather look at the whole shebang, unless you want me to look at a smaller batch that you have questions about.


cbearden · 2021-09-11T21:53:59Z

Okay, in that case hold off on this pull request. There are probably about 135 more fixes to book abbreviations in ref element content, that is, cases where the book abbreviation in the element doesn't match one of the abbreviations on p. xii of the dictionary. What I do is check the reference in the XML against the PDF and correct it if the XML doesn't match the PDF. There have been two cases where A-S misspells the abbreviation:

- "II Tim" in ἀγαπάω|G25
- "Lu" in ἄρτος|G740

I left those refs unchanged and added a comment for anyone working on the XML in the future. The attached file has the remaining cases where my script found an abbreviation that doesn't match the abbrevs list, and I'll check them against the PDF.
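A check of this kind might look roughly like the Python sketch below. It is an illustration only, not the script used for this pull request. `KNOWN_ABBREVS` is a placeholder subset (the real list is the one in the dictionary, and "Lk" and "II Ti" here are assumed spellings), `unknown_abbrevs` is a hypothetical helper, and the chapter and verse numbers in `SAMPLE` are invented.

```python
import re
import xml.etree.ElementTree as ET

# Placeholder subset of allowed abbreviations; the real list is in the dictionary.
KNOWN_ABBREVS = {"Mt", "Lk", "II Ti", "II Jn", "III Jn"}

# Optional Roman-numeral prefix, the book name, an optional period, then a digit.
BOOK_RE = re.compile(r"^((?:I{1,3} )?[A-Za-z]+)\.? \d")


def unknown_abbrevs(xml_text: str, known: set[str]) -> list[tuple[str, str]]:
    """Return (abbreviation, ref text) pairs whose abbreviation is not in `known`."""
    hits = []
    for el in ET.fromstring(xml_text).iter():
        # Compare local names so the check works with or without a namespace.
        if el.tag.rsplit("}", 1)[-1] != "ref":
            continue
        text = "".join(el.itertext()).strip()
        match = BOOK_RE.match(text)
        # Refs without a book name (e.g. a bare verse number) are skipped.
        if match and match.group(1) not in known:
            hits.append((match.group(1), text))
    return hits


# Invented sample markup; the verse numbers are not taken from the dictionary.
SAMPLE = """<entry>
  <ref osisRef="Matt.9.10">Mt 9:10</ref>
  <ref osisRef="2Tim.1.7">II Tim 1:7</ref>
  <ref osisRef="Luke.11.3">Lu 11:3</ref>
  <ref osisRef="1John.3.8">8</ref>
</entry>"""

print(unknown_abbrevs(SAMPLE, KNOWN_ABBREVS))
# [('II Tim', 'II Tim 1:7'), ('Lu', 'Lu 11:3')]
```

Every hit still needs a look at the PDF, since a mismatch can be an error in the XML or a spelling that A-S itself used.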

Fixing these abbreviations, and ensuring that all the @osisRef abbreviations match the OSIS ID standard (the next stage), will enable me to compare the two to look for discrepancies (the third stage). Does this make sense?
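As a rough idea of what the third stage could look like, here is a Python sketch that compares the book named in the ref content with the book in the @osisRef value. It is a sketch under assumptions, not the actual tooling: `AS_TO_OSIS` is a hypothetical, partial mapping (the A-S spellings on the left are assumed), `book_discrepancies` is a made-up helper name, and the sample markup is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical, partial mapping from A-S text abbreviations to OSIS book IDs.
AS_TO_OSIS = {"Mt": "Matt", "Lk": "Luke", "II Jn": "2John", "III Jn": "3John"}


def book_discrepancies(xml_text, mapping=AS_TO_OSIS):
    """Return (ref text, osisRef) pairs where the two name different books."""
    problems = []
    # Try longer abbreviations first so a short one never shadows a longer one.
    abbrevs = sorted(mapping, key=len, reverse=True)
    for el in ET.fromstring(xml_text).iter():
        if el.tag.rsplit("}", 1)[-1] != "ref":
            continue
        text = "".join(el.itertext()).strip()
        osis_ref = el.get("osisRef", "")
        # The book ID is everything before the first period of the osisRef.
        osis_book = osis_ref.split(".", 1)[0]
        for abbrev in abbrevs:
            if text.startswith(abbrev + " "):
                if mapping[abbrev] != osis_book:
                    problems.append((text, osis_ref))
                break
    return problems


# Invented sample: the second ref names Matthew in the text but Mark in osisRef.
SAMPLE = """<entry>
  <ref osisRef="Matt.9.10">Mt 9:10</ref>
  <ref osisRef="Mark.9.10">Mt 9:10</ref>
  <ref osisRef="3John.1.4">III Jn 4</ref>
</entry>"""

print(book_discrepancies(SAMPLE))  # [('Mt 9:10', 'Mark.9.10')]
```

This only works once both sides use a closed vocabulary, which is why the abbreviation fixes and the OSIS ID check come first.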

mismatches.txt <https://github.com/translatable-exegetical-tools/Abbott-Smith/files/7148571/mismatches.txt>

jonathanrobie · 2021-09-11T22:12:19Z

Yes, it does make sense.



cbearden added 5 commits September 11, 2021 17:24

Fixing book abbreviations in <ref> content for entries through those …

dbfa184

…beginning with ε. Went back & picked up some for II Jn & III Jn which my script doesn't presently detect.

Checking & fixing AS book abbreviations for entries beginning η, θ, ι…

4b05388

…, and κ; caught a few back in α; the file is valid against the schema.

Checking & fixing AS book abbreviations in entries beginning λ & μ.

14549b7

Checking & fixing AS book abbreviations in entries beginning ν, ο, π;…

dee63c2

… file validates.

Checking & fixing AS book abbreviations in entries beginning σ, τ, υ,…

9c1eeba

… φ, χ, ψ; these are the last text abbreviations that don't match the AS list; file validates.

jonathanrobie merged commit 2bf7cee into translatable-exegetical-tools:master Oct 18, 2021
