Run properties in Paragraph properties #724

Closed
pongstylin opened this issue Dec 16, 2020 · 18 comments

@pongstylin

pongstylin commented Dec 16, 2020

This is probably a feature request.

Use Case / Problem
I want 2 tables stacked on top of each other, but with no space in between. Impossible right? Word merges stacked tables and the only way to split them is by putting a paragraph in between. So, the trick is to make the paragraph as short as possible. Why? Because I want the 2nd table to be on a new page. So the paragraph must be at the end of the first table/page or at the beginning of the 2nd table/page. If the former, you don't want the paragraph to wrap to a new page if the table takes up almost all vertical space. If the latter, you don't want extra dead space above your table on the 2nd page.

Solution
To make a paragraph take up as little space as possible, give it a font size of 0.5pts. Note that MSWord doesn't allow you to go lower than 1pt, but the XML supports 0.5pts and Word respects it and will preserve it when resaving the doc.
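For reference, the `w:val` on `w:sz` and `w:szCs` is expressed in half-points, so 0.5pt is written as `1` and 1pt as `2`. A minimal sketch of the conversion, where `pointsToSzVal` is a hypothetical helper name and not part of docx.js:

```javascript
// Hypothetical helper (not a docx.js API): convert a font size in points
// to the half-point integer used by w:sz and w:szCs.
function pointsToSzVal(points) {
  return Math.round(points * 2);
}

console.log(pointsToSzVal(0.5)); // 1
console.log(pointsToSzVal(1));   // 2
console.log(pointsToSzVal(50));  // 100
```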

docx.js Limitation
But while I can make an empty paragraph with 1pt font size in MSWord, I can't in docx.js. When I try, Word doesn't like it and falls back to the default 10pt font size. I compared the XML generated by docx.js with the XML generated by Word and discovered the difference.

This is Word:

    <w:p>
      <w:pPr>
        <w:rPr>
          <w:sz w:val="2" />
          <w:szCs w:val="2" />
        </w:rPr>
      </w:pPr>
    </w:p>

This is docx.js:

    <w:p>
      <w:r>
        <w:rPr>
          <w:sz w:val="1" />
          <w:szCs w:val="1" />
        </w:rPr>
      </w:r>
    </w:p>

The difference is subtle: in the Word example, "rPr" is nested inside "pPr", while the docx.js example nests "rPr" inside an "r". This is because docx.js forced me to create an empty TextRun object and apply the font size to it, whereas Word doesn't create a run at all. Word appears to be defining default run properties in the paragraph properties. As far as I can tell, docx.js doesn't allow me to pass the font size to paragraph properties at all.
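To make the target shape concrete, here is a rough sketch that builds the Word-style XML for an empty paragraph by hand. `emptyParagraphXml` is a hypothetical function written for illustration only. It is not how docx.js serializes paragraphs.

```javascript
// Hypothetical sketch, not docx.js code: emit an empty paragraph whose
// font size lives in pPr/rPr (the paragraph mark), with no w:r element.
function emptyParagraphXml(szVal) {
  return [
    "<w:p>",
    "  <w:pPr>",
    "    <w:rPr>",
    `      <w:sz w:val="${szVal}"/>`,
    `      <w:szCs w:val="${szVal}"/>`,
    "    </w:rPr>",
    "  </w:pPr>",
    "</w:p>",
  ].join("\n");
}

// szVal is in half-points, so 2 means 1pt.
console.log(emptyParagraphXml(2));
```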

Workaround
Until this is supported, I have resorted to adding text to the TextRun and setting the font color to "white". But, being able to add default run properties at the paragraph level would be nice for a number of reasons, so I decided to bring this feature to your attention.

@HTrayford

HTrayford commented Dec 17, 2020

@pongstylin I think you can achieve the effect you're looking for by setting the second table to float, using a vertical anchor of TEXT and setting the absoluteVerticalPosition to -220.

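import { Paragraph, Table, TableAnchorType, TextRun } from "docx";

const table1 = new Table({
  /* build table */
});

// Float the second table, anchored to the text, with a negative vertical offset.
const table2 = new Table({
  float: {
    verticalAnchor: TableAnchorType.TEXT,
    absoluteVerticalPosition: -220,
  },
  /* build rest of table */
});

// later on...

doc.addSection({
  children: [
    table1,
    // Spacer paragraph that keeps Word from merging the two tables.
    new Paragraph({
      spacing: { before: 0, after: 0 },
      children: [
        new TextRun({
          text: " ",
          size: 0.5,
        }),
      ],
    }),
    table2,
  ],
});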

close tables

@pongstylin
Author

@HTrayford That is a clever trick, but comes with side-effects that make it a no-go for me. It appears repeating headers don't work on floating tables.

@HTrayford

Ah, well that's too bad. I'm glad you've got a workaround that seems to be working for you. Perhaps @dolanmiu could look into either your original request or adding repeating headers to floating tables.

@pongstylin
Author

The limitation is in MS Word itself: repeating headers do not work on floating tables there.

@dolanmiu
Owner

I've tried adding sz and szCs to the paragraph properties, and drafted a proposal that exposes it as a size: 100 option on the paragraph:

doc.addSection({
    properties: {},
    children: [
        new Paragraph({
            size: 100,
            children: [
                new TextRun("Hello World"),
                new TextRun({
                    text: "Foo Bar",
                    bold: true,
                }),
            ],
        }),
    ],
});

However, it does not work for me, even though it creates the "correct" xml:

        <w:p>
            <w:pPr>
                <w:sz w:val="100"/>
                <w:szCs w:val="100"/>
            </w:pPr>
            <w:r>
                <w:t xml:space="preserve">Hello World</w:t>
            </w:r>
            <w:r>
                <w:rPr>
                    <w:b w:val="true"/>
                    <w:bCs w:val="true"/>
                </w:rPr>
                <w:t xml:space="preserve">Foo Bar</w:t>
            </w:r>
        </w:p>

There must be something more to this...

@pongstylin
Author

pongstylin commented Dec 19, 2020

@dolanmiu It's close, but doesn't match the Word XML I posted. The "sz" and "szCs" properties need to be nested inside of an "rPr", which is nested in the "pPr". You are missing the "rPr" level.

@dolanmiu
Owner

Oh, I see what you mean. I'm guessing every single run property could be applied at the paragraph level too?

@pongstylin
Author

@dolanmiu Yes, that is my expectation

@dolanmiu
Owner

dolanmiu commented Dec 19, 2020

I have developed the feature, and it generated this file, but it still doesn't seem to work for me:

@pongstylin could you examine what is wrong with this word doc?

paragraph-run-properties.docx

It now emits the rPr correctly, nested inside pPr:

    <w:pPr>
      <w:rPr>
        <w:sz w:val="100"/>
        <w:szCs w:val="100"/>
      </w:rPr>
    </w:pPr>

Branch is here: https://github.com/dolanmiu/docx/tree/feat/paragraph-run-properties

@pongstylin
Author

@dolanmiu First of all, thanks for your effort on this. I am more than happy to assist with my analysis. So I played with it a bit. Looks like your XML correctly conforms to the spec. But you will only notice the 50pt font size for EMPTY paragraphs. Your paragraph is not empty. MS Word will COPY the paragraph run properties to any runs it adds to the paragraph. For example, if you open a blank document, set the font size to 50pt, type some text, and save it then you will get XML that looks like this:

    <w:p>
      <w:pPr>
        <w:rPr>
          <w:sz w:val="100" />
          <w:szCs w:val="100" />
        </w:rPr>
      </w:pPr>
      <w:r>
        <w:rPr>
          <w:sz w:val="100" />
          <w:szCs w:val="100" />
        </w:rPr>
        <w:t>asdf</w:t>
      </w:r>
    </w:p>

MS Word copied the properties from the paragraph run properties to the new run's properties. docx.js can do the same thing. When creating a new paragraph, merge the paragraph run properties with the properties (if any) of each run added as a child. Allow any properties set on the run to override properties set on the paragraph.
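A minimal sketch of that merge rule, using plain objects rather than real docx.js classes. `mergeRunProperties` is a hypothetical name, and the property names (`size`, `bold`, `text`) only mirror the options used earlier in this thread.

```javascript
// Hypothetical helper, not a docx.js API: compute the effective properties
// for one run. Paragraph-level run properties act as defaults, and anything
// set on the run itself wins.
function mergeRunProperties(paragraphRunProps, runProps) {
  return { ...paragraphRunProps, ...runProps };
}

const paragraphRunProps = { size: 100 }; // 50pt, in half-points
const plainRun = { text: "Hello World" };
const smallBoldRun = { text: "Foo Bar", bold: true, size: 20 };

console.log(mergeRunProperties(paragraphRunProps, plainRun));
// { size: 100, text: 'Hello World' }
console.log(mergeRunProperties(paragraphRunProps, smallBoldRun));
// { size: 20, text: 'Foo Bar', bold: true }
```

One caveat with object spread: a run that explicitly sets a property to `undefined` would still override the paragraph value, so a real implementation might want to skip undefined keys.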

In short, your branch works as-is and will solve my problem since I'm working with an empty paragraph. But if you want your changes to work intuitively with non-empty paragraphs, then you can add the merge logic.

Make sense?

@pongstylin
Author

pongstylin commented Dec 19, 2020

P.S. I said you can merge the paragraph and child run properties when the paragraph is created. Alternatively, you could leave the run objects untouched and perform the merge while serializing the paragraph and its children to XML, if that is more attractive to you.
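Roughly what I mean by merging at serialization time, as a sketch only. `serializeParagraph` and `runPropsXml` are hypothetical names, the input is plain frozen objects rather than docx.js classes, only `size` is handled, and the text is not XML-escaped.

```javascript
// Hypothetical sketch, not docx.js code. Only `size` is serialized.
function runPropsXml(props) {
  if (props.size === undefined) return "";
  return `<w:rPr><w:sz w:val="${props.size}"/><w:szCs w:val="${props.size}"/></w:rPr>`;
}

function serializeParagraph(paragraph) {
  const markProps = paragraph.runProperties ?? {};
  const runs = paragraph.children.map((run) => {
    // Merge into a temporary object; the run itself is never modified.
    const effective = { ...markProps, ...run };
    return `<w:r>${runPropsXml(effective)}<w:t>${run.text}</w:t></w:r>`;
  });
  const markXml = runPropsXml(markProps);
  const pPr = markXml ? `<w:pPr>${markXml}</w:pPr>` : "";
  return `<w:p>${pPr}${runs.join("")}</w:p>`;
}

// Frozen inputs show that serialization does not need to mutate anything.
const paragraph = Object.freeze({
  runProperties: Object.freeze({ size: 100 }),
  children: Object.freeze([Object.freeze({ text: "asdf" })]),
});

console.log(serializeParagraph(paragraph));
```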

@dolanmiu
Owner

Yes, that makes sense

So would you say this is exclusively a Microsoft Word feature they added in? It seems like it's some sort of shorthand the people at Microsoft added.

And yes, merging is risky because of conflicts. An example of this is style, which exists on both the paragraph and the run.

@pongstylin
Author

pongstylin commented Dec 20, 2020

No, it is not specific to MS Word. It is part of the spec. When no runs are present (the paragraph is empty) then it is the only way to define the initial formatting of the paragraph.

Here is a reference to the spec:
http://officeopenxml.com/WPparagraphProperties.php

Specifies the run properties for the paragraph glyph, which is used to represent the physical location of the paragraph mark. When the mark is formatted, a rPr appears within pPr. The text is then formatted accordingly, except for possible direct text formatting. See Text - Formatting.
Reference: ECMA-376, 3rd Edition (June, 2011), Fundamentals and Markup Language Reference § 17.3.1.29.

@pongstylin
Author

pongstylin commented Dec 20, 2020

Regarding style, I see a pStyle on the paragraph properties and an rStyle on the run properties. Both may be set on the paragraph like so:

new docx.Paragraph({
  style: "no",
  runProperties: {
    size: 100,
    style: "conflict"
  },
  children: [ ... ],
});

Then, if a child run also specifies a style it will override "conflict" and leave the paragraph "no" style in place. I haven't studied styles in depth yet, but that is my gut response to your concern.
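A small sketch of that resolution rule, assuming the option shape proposed above (`style` on the paragraph, `runProperties.style` as the run default). `resolveStyles` is a hypothetical helper, not a docx.js API.

```javascript
// Hypothetical sketch, not docx.js code. The paragraph style (pStyle) and
// the run style (rStyle) are tracked separately, so they never collide.
function resolveStyles(paragraph, run) {
  return {
    pStyle: paragraph.style,
    // A style on the run wins; otherwise fall back to the paragraph-level default.
    rStyle: run.style ?? paragraph.runProperties?.style,
  };
}

const styledParagraph = {
  style: "no",
  runProperties: { size: 100, style: "conflict" },
};

console.log(resolveStyles(styledParagraph, { text: "inherits" }));
// { pStyle: 'no', rStyle: 'conflict' }
console.log(resolveStyles(styledParagraph, { text: "overrides", style: "emphasis" }));
// { pStyle: 'no', rStyle: 'emphasis' }
```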

@dolanmiu
Owner

This part confuses me:

The text is then formatted accordingly

Except when we tried it, it is not formatted? Seems like we need manual intervention to add the rPr into the respective TextRuns

@pongstylin
Author

pongstylin commented Dec 20, 2020

@dolanmiu Honestly, it confused me too. But after playing with your XML file, I realized that quote doesn't say HOW "The text is then formatted accordingly". After all, it is true in practice. Let's say you have an empty paragraph formatted with a font size of 50pt. Any text you type into that paragraph will inherit the formatting. But the mechanics behind that apparently involve copying the formatting from the paragraph to the run. That's what MS Word does, and docx.js can do the same thing. Also, in MS Word, you can then highlight the text and change the font size to 10pt (make sure you highlight the text and not the whole paragraph when changing the font size). After saving, you will see in the XML that the paragraph still has a font size of 50pt, but the run now has a font size of 10pt. This is an example of the "except for possible direct text formatting" in the spec. When it comes to docx.js, that just means the formatting defined at the run level overrides formatting defined at the paragraph level as you merge formatting in the serialized XML it produces for the runs' properties.

@BrentFarese

Here is a reference to the spec: http://officeopenxml.com/WPparagraphProperties.php

Specifies the run properties for the paragraph glyph, which is used to represent the physical location of the paragraph mark. When the mark is formatted, a rPr appears within pPr. The text is then formatted accordingly, except for possible direct text formatting. See Text - Formatting.
Reference: ECMA-376, 3rd Edition (June, 2011), Fundamentals and Markup Language Reference § 17.3.1.29.

@dolanmiu I came across this issue. We are facing a different problem, in that rPr inside pPr is not supported in the library right now. As it says above, the rPr inside pPr applies only to the paragraph glyph. It turns out that this also has an impact on list items: the list item marker (e.g., 1.1.2) inherits the format of the paragraph glyph in a number of cases.

Can/does the library support rPr inside pPr? We are really not concerned with formatting the paragraph glyph, but like I said it impacts list item marker formatting in a lot of cases. Thanks!

@dolanmiu
Owner

Run properties are now added in 8.3.0:

#2457

Thank you yuanliwei
