Problems with .odt when converting to docx with LibreOffice #43
Comments
Yes, please make a small test file. Hopefully it will be just one command or package that causes problems. Best regards, |
I've looked into it, and for article-sized files it is difficult to pinpoint the exact LaTeX code that is causing problems. It seems to me that it is the styles that are causing problems. If I unzip the odt file produced by make4ht and simply delete styles.xml, it seems to work better. This is confirmed by the odfvalidator. The following MWE converts fine with make4ht:
But the resulting odt file does not validate. If I set the validator to ODF 1.0 Strict, I get the following errors:
Is this expected, or is it something wrong with my setup? |
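For anyone who wants to repeat the experiment described above (dropping styles.xml from the odt produced by make4ht), here is a minimal Python sketch. It assumes only that an odt file is an ordinary ZIP archive; the function name `copy_odt_without` and the file names in the usage note are hypothetical. It does not update META-INF/manifest.xml, so the result is meant for diagnosis only, not as a fixed document.

```python
import zipfile


def copy_odt_without(src_path, dst_path, skip=("styles.xml",)):
    """Copy an ODT (a ZIP archive) to dst_path, leaving out the members in `skip`.

    Hypothetical helper for diagnosis only: META-INF/manifest.xml is copied
    unchanged, so it may still list the members that were left out.
    """
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        for info in src.infolist():
            if info.filename in skip:
                continue
            # Reusing the original ZipInfo keeps the member order and the
            # compression method, so "mimetype" stays first and uncompressed.
            dst.writestr(info, src.read(info.filename))
```

Usage would be something like `copy_odt_without("sample.odt", "sample-nostyles.odt")`, after which the copy can be opened in LibreOffice or uploaded to the validator for comparison.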
@hcf-n are you using an up-to-date TeX Live? I've tried to compile your example, and the report from the validator is the following:
|
When validating with 1.0 I get the same as you. But when I use 1.0 Strict I get the errors from my last post. I also get errors when validating against 1.1 Strict, 1.2, or 1.3. The problem for me is that the odt output generated by make4ht can't be parsed by Pandoc, and that LibreOffice sometimes crashes when saving the odt file as docx. That is why I looked into the validity of the files, trying to get an idea of why the odt file is problematic. What do you think? Could stricter validation help make the odt file more compatible with other tools? |
I think it is important that the ODT file conforms to the version it declares. It is unfortunate that Pandoc doesn't show a more helpful error message, such as the actual reason why it cannot parse the file. |
Agree, |
I don't know. It would be quite difficult to fix all these issues. It is mainly that lots of attributes are not valid in strict mode. I've tried to modify the ODT file by hand and succeeded in making it valid ODF 1.0. But Pandoc still cannot parse it, so there must be another issue. |
Thank you for trying out the manual edit. I can try to investigate further, on the Pandoc side, why Pandoc can't parse the odt output. |
I wouldn't expect Pandoc to fail just because of some spurious attributes. I expect that it fails because some attribute it needs is missing. It would be useful if they could investigate what the problem is. |
They are making progress on this over at pandoc. |
Thanks, so the issue seems to be caused by |
Seems so. Would you like to leave a comment yourself over at the Pandoc group, saying that this problem will disappear? |
It should be fixed in the sources by the latest commit. Pandoc can convert the ODT file now. |
I have to apologize for making a bigger problem out of the odt styles than there really was. I was having two problems: converting with LibreOffice was unstable (which I have now solved by using soffice --convert-to), and Pandoc wouldn't parse the file. I took that to be an indication of some kind of syntactical problem with the file. With a more stable environment on my part, I look forward to working on the odt conversion. I hope I can still bother you with my efforts to get a smooth transition from TeX to docx (as required by publishers). |
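For readers who land here with the same unstable GUI conversion, this is a minimal sketch of the command-line route mentioned above, wrapped in Python. Only `soffice --convert-to` comes from this thread; adding `--headless` and `--outdir`, the helper names, and the file names are my assumptions, and `soffice` is assumed to be on the PATH.

```python
import subprocess
from pathlib import Path


def build_convert_command(odt_path, out_dir=".", soffice="soffice"):
    """Build the LibreOffice command line for an ODT -> DOCX conversion.

    Assumption: --headless and --outdir are added here for batch use; the
    thread itself only mentions `soffice --convert-to`.
    """
    return [
        soffice,
        "--headless",             # run without opening the GUI
        "--convert-to", "docx",   # target format
        "--outdir", str(out_dir),
        str(odt_path),
    ]


def convert_to_docx(odt_path, out_dir="."):
    """Run the conversion and return the path where the DOCX is expected."""
    subprocess.run(build_convert_command(odt_path, out_dir), check=True)
    return Path(out_dir) / (Path(odt_path).stem + ".docx")
```

Calling `convert_to_docx("paper.odt", "out")` would run the conversion and return the expected location of `paper.docx` inside `out`; `check=True` raises an exception if LibreOffice exits with an error.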
No problem, the XML instructions were unnecessary and it is not that hard to remove them. I was worried about the syntactical issues, because I find it quite difficult to find any information about ODF. So it is really good that this is not the case. I certainly welcome any feature requests and bug reports. |
Is this committed to TeX Live? I still have files generated with |
No, I have some work in progress in Make4ht, so I want to update it when it is done. |
When I convert to .odt with make4ht I get a file that works fine in LibreOffice. But when I try to save it as .docx I run into problems and LibreOffice refuses to convert. Investigating this, I tried out the validator at https://odfvalidator.org. It seems that the .odt from make4ht has some errors.
If this is something you could look into, I would happily make a test file to identify the errors. Since the validator offers several different versions of the .odt format, I wonder which one I should aim for in the tests.
best regards
Hans