<?xml version="1.0" encoding="UTF-8"?>
<office:document-content xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0" xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0" xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0" xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0" xmlns:number="urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0" xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0" xmlns:chart="urn:oasis:names:tc:opendocument:xmlns:chart:1.0" xmlns:dr3d="urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0" xmlns:math="http://www.w3.org/1998/Math/MathML" xmlns:form="urn:oasis:names:tc:opendocument:xmlns:form:1.0" xmlns:script="urn:oasis:names:tc:opendocument:xmlns:script:1.0" xmlns:ooo="http://openoffice.org/2004/office" xmlns:ooow="http://openoffice.org/2004/writer" xmlns:oooc="http://openoffice.org/2004/calc" xmlns:dom="http://www.w3.org/2001/xml-events" xmlns:xforms="http://www.w3.org/2002/xforms" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:rpt="http://openoffice.org/2005/report" xmlns:of="urn:oasis:names:tc:opendocument:xmlns:of:1.2" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:grddl="http://www.w3.org/2003/g/data-view#" xmlns:officeooo="http://openoffice.org/2009/office" xmlns:tableooo="http://openoffice.org/2009/table" xmlns:drawooo="http://openoffice.org/2010/draw" xmlns:calcext="urn:org:documentfoundation:names:experimental:calc:xmlns:calcext:1.0" xmlns:loext="urn:org:documentfoundation:names:experimental:office:xmlns:loext:1.0" xmlns:field="urn:openoffice:names:experimental:ooo-ms-interop:xmlns:field:1.0" xmlns:formx="urn:openoffice:names:experimental:ooxml-odf-interop:xmlns:form:1.0" 
xmlns:css3t="http://www.w3.org/TR/css3-text/" office:version="1.2"><office:scripts/><office:font-face-decls><style:font-face style:name="Courier New" svg:font-family="'Courier New'" style:font-family-generic="modern" style:font-pitch="fixed"/><style:font-face style:name="Cambria" svg:font-family="Cambria" style:font-family-generic="roman" style:font-pitch="variable"/><style:font-face style:name="Palatino Linotype" svg:font-family="'Palatino Linotype'" style:font-family-generic="roman" style:font-pitch="variable"/><style:font-face style:name="Times New Roman" svg:font-family="'Times New Roman'" style:font-family-generic="roman" style:font-pitch="variable"/><style:font-face style:name="Calibri" svg:font-family="Calibri" style:font-family-generic="swiss" style:font-pitch="variable"/><style:font-face style:name="Liberation Sans" svg:font-family="'Liberation Sans'" style:font-family-generic="swiss" style:font-pitch="variable"/><style:font-face style:name="Tahoma" svg:font-family="Tahoma" style:font-family-generic="swiss" style:font-pitch="variable"/><style:font-face style:name="MS Mincho" svg:font-family="'MS Mincho'" style:font-family-generic="system" style:font-pitch="variable"/><style:font-face style:name="Tahoma1" svg:font-family="Tahoma" style:font-family-generic="system" style:font-pitch="variable"/></office:font-face-decls><office:automatic-styles><style:style style:name="P1" style:family="paragraph" style:parent-style-name="Normal"><style:paragraph-properties fo:text-align="justify" style:justify-single-word="false"/><style:text-properties style:font-name="Palatino Linotype"/></style:style><style:style style:name="P2" style:family="paragraph" style:parent-style-name="articlehead" style:master-page-name="MP0"><style:paragraph-properties style:page-number="auto" fo:break-before="page"/></style:style><style:style style:name="T1" style:family="text"><style:text-properties style:text-position="super 64%"/></style:style><style:style style:name="T2" 
style:family="text"><style:text-properties style:font-name="Calibri"/></style:style><style:style style:name="T3" style:family="text"><style:text-properties fo:font-style="italic" style:font-style-asian="italic"/></style:style><style:style style:name="T4" style:family="text"><style:text-properties fo:color="#f79646" style:font-name="Courier New" fo:font-weight="bold" style:font-weight-asian="bold" style:font-name-complex="Courier New"/></style:style></office:automatic-styles><office:body><office:text text:use-soft-page-breaks="true"><text:sequence-decls><text:sequence-decl text:display-outline-level="0" text:name="Illustration"/><text:sequence-decl text:display-outline-level="0" text:name="Table"/><text:sequence-decl text:display-outline-level="0" text:name="Text"/><text:sequence-decl text:display-outline-level="0" text:name="Drawing"/></text:sequence-decls><text:p text:style-name="P2">The application of Schematron schemas to word-processing documents</text:p><text:p text:style-name="bodytext">As traditional print-based publishing has made the transition into the digital age, a convention has developed in some quarters of capturing or even typesetting content using word‑processing applications.</text:p><text:p text:style-name="bodytext">These can present a convenient route to publication in the many instances where content derives (in the form of author manuscript) from the same word‑processing package. It is also a relatively cheap and efficient one, demanding the now basic and widespread skills of styling a document to achieve the desired appearance. 
</text:p><text:p text:style-name="bodytext">As a result, typesetting workflows consuming these documents still exist, template-based workflows designed to capture structured data are still in place, and for some publishers large quantities of legacy data persist in word‑processing formats only and require migration to XML to meet modern production demands.</text:p><text:p text:style-name="bodytext">During the long period (for some) of moving to a digital-first workflow, with publication of a single source of structured data in various renditions, it has become apparent to such publishers that the quality of their content no longer only resides in the appearance of the rendered product, but also in the quality of the data capture itself. The quality question has shifted from “Does my product look right?” to “Is my source markup sufficiently rich to service the outputs I wish to produce?” When generating XML markup from a word‑processing source, the inevitable corollary is whether the document has been styled appropriately to drive good-quality data capture.</text:p><text:p text:style-name="bodytext">The requirement to apply business rules to styled documents is not new. This was often done using macros to interrogate the underlying object model before Microsoft Office (OOXML)<text:span text:style-name="bibref">[1]</text:span> and Open Office (ODF)<text:span text:style-name="bibref">[2]</text:span> began exposing their respective file formats as XML. 
With the word-processing document being edited now available as XML, other, native-XML validation approaches are viable and indeed attractive.</text:p><text:p text:style-name="bodytext">This paper will present Schematron as a portable, standards-based alternative, demonstrating how it can be integrated into a word‑processing template to alert authors and editors directly to content problems during capture.</text:p><text:p text:style-name="bodytext">It will demonstrate how business rules can be applied to a word‑processing document held in one of the standard word-processing XML file formats using an ISO Schematron schema. These rules will comprise typical Schematron validation activity. They might be unexpected or missing style(s), co-occurrence constraints, or datatype errors; in fact, anything that Schematron might normally be used to identify in any other XML content. Further, it will be shown how errors found in the document and reported as SVRL can be successfully merged back <text:span text:style-name="Default_20_Paragraph_20_Font"><text:span text:style-name="T3">in situ</text:span></text:span> into the original document, so that an editor can address the problem so located within the originating editing environment.</text:p><text:p text:style-name="bodytext"><text:soft-page-break/>Writing XPath-based Schematron rules for flat structures is reasonably tedious work: in document-based XML the structure broadly reflects the meaning of the content, whereas a word‑processing document is essentially a succession of (paragraph and character) styles, tables and other objects. 
Some work has been done to derive an XML-based schema for a document’s expected disposition of styles<text:span text:style-name="T4">[3]</text:span>.</text:p><text:p text:style-name="P1">The paper will further develop this theme by discussing the use and practicality of such a language to express the business rules governing a word‑processing document more declaratively, and how successfully this may be transformed to produce a Schematron schema which can then be used to perform the validation outlined above.</text:p><text:p text:style-name="bodytext">Finally, it will consider the feasibility and desirability of extending this approach to other kinds of office document, for instance to spreadsheets.</text:p><text:p text:style-name="P1"/><text:p text:style-name="P1"/><text:h text:style-name="Heading_20_2" text:outline-level="2">References</text:h><text:p text:style-name="bib"><text:span text:style-name="bibnum">[1]</text:span> <text:a xlink:type="simple" xlink:href="http://www.ecma-international.org/publications/standards/Ecma-376.htm" office:target-frame-name="_top" xlink:show="replace" text:style-name="Internet_20_link" text:visited-style-name="Visited_20_Internet_20_Link"><text:span text:style-name="Hyperlink">http://www.ecma-international.org/publications/standards/Ecma-376.htm</text:span></text:a>. Retrieved <text:span text:style-name="bibdate">2015-03-08</text:span>.</text:p><text:p text:style-name="bib"><text:span text:style-name="bibnum">[2]</text:span> <text:a xlink:type="simple" xlink:href="https://www.oasis-open.org/standards#opendocumentv1.2" office:target-frame-name="_top" xlink:show="replace" text:style-name="Internet_20_link" text:visited-style-name="Visited_20_Internet_20_Link"><text:span text:style-name="Hyperlink">https://www.oasis-open.org/standards#opendocumentv1.2</text:span></text:a>. 
Retrieved <text:span text:style-name="bibdate">2015-03-08</text:span>.</text:p><text:p text:style-name="bib"><text:span text:style-name="bibnum">[3]</text:span> Francis Cave, Francis Cave Digital Publishing: a style schema for word‑processing documents; personal communication, <text:span text:style-name="bibdate">February 2015</text:span>.</text:p></office:text></office:body></office:document-content>