Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix saving XLSX with drawings #1462

Merged
merged 12 commits into from
May 23, 2020

Conversation

Vagir-dev
Copy link
Contributor

@Vagir-dev Vagir-dev commented May 4, 2020

This is:

- [x] a bugfix
- [ ] a new feature

Checklist:

  • Changes are covered by unit tests
  • Code style is respected
  • Commit message explains why the change is made
  • CHANGELOG.md contains a short summary of the change
  • Documentation is updated as necessary

Why this change is needed?

A problem with broken files is often happens when PhpSpreadsheet writes XLSX file format.
Opening such files in MS Office raises error messages like

We found a problem with some content in filename.xlsx.
Do you want us to try and recover as much as we can? If you trust the source of this workbook, click Yes.

And then after going through the repair process we get the message from Excel that some of images (or sometimes other elements) were deleted from the spreadsheet.

This problem happens when any XLSX file was loaded as a template and there is at least one drawing in it.

The reason is in generating incorrect rIds for worksheets\_rels. Here is example (doubled rId1):

xl\worksheets_rels\sheet1.xml.rels

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
- <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/drawing" Target="../drawings/drawing1.xml" />
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/printerSettings" Target="../printerSettings/printerSettings1.bin" />
  </Relationships>

Analysing code of PhpSpreadsheet\Writer\Xlsx\ shows that algorithm used for writing spreadsheet's content to XML files cannot generate rIds in right sequence.
It's because rIds for worksheets\_rels are generated separately for each rel Type (printerSettings, drawing, vmlDrawings etc.) but not as a continuous sequence.
The fact that XLSX files created by the PhpSpreadsheet from a template are often correct is a luck, coincidence - when rIds from the template and rIds from programmatically created elements suite each other.
I hope you understand what I mean...

By the way, it is why such problem doesn't arise if spreadsheet created totally by code, not by loading from template.

To fix this completely a big code refactoring is needed, so it is unreal for now, unfortunately.
But in many cases (not all unfortunately) we can avoid described problem by adding a smal changes in Writer/Xlsx/Rels class.

Hope this will help.

Test demonstrates one of cases: when 2nd (3rd, 4th...) sheet contains any drawing(s) but the first one doesn't.

Sorry for my pure English, it is not my native language.

Resolves #1421
Resolves #1294
Resolves #1236
Resolves #1222
Resolves #1022
Resolves #876
Resolves #861
Resolves #853
Resolves #795
Resolves #723
Resolves #676
Resolves #610
Resolves #544

@MAKS-dev
Copy link

MAKS-dev commented May 6, 2020

Would be nice if #1153 would also be taken into consideration.

@Vagir-dev
Copy link
Contributor Author

Would be nice if #1153 would also be taken into consideration.

@MAKS-dev, I encountered this problem too.
I think I resolved it locally. Maybe I'll make a PR, when current PR will be approved.

@maosong-pactera
Copy link

maosong-pactera commented May 8, 2020

@Vagir-dev I used your latest code, but there are still problems

clone worksheet

<?php
use PhpOffice\PhpSpreadsheet\Spreadsheet;
use PhpOffice\PhpSpreadsheet\Writer\Xlsx;

$outfile ='new_doc.xlsx';
$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load(__DIR__ . '/new.xlsx');
$clonedWorksheet = clone $spreadsheet->getSheetByName('Sheet1');
$clonedWorksheet->setTitle('Sheet2');
$spreadsheet->addSheet($clonedWorksheet);
$writer = new Xlsx($spreadsheet);
$writer->save($outfile);
  • xl\worksheets_rels\sheet2.xml.rels
  • Two rId1
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/drawing" Target="../drawings/drawing2.xml"/>
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/printerSettings" Target="../printerSettings/printerSettings1.bin"/>
</Relationships>

@Vagir-dev
Copy link
Contributor Author

Vagir-dev commented May 8, 2020

@maosong-pactera, I couldn't repeat cloning problem with my template files. Can you please attach your file 'new.xlsx'?

$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load(__DIR__ . '/new.xlsx');

@Vagir-dev
Copy link
Contributor Author

I used your latest code, but there are still problems

@maosong-pactera, are you shure you applied my fix to the sources?

@Howitzer
Copy link

Howitzer commented May 9, 2020

@Vagir-dev thank you so much for this! I was getting the rId errors and corrupt XLSX file only when I used cloning in my sheets, but applying your fix has made everything work now. You saved my day!

@Vagir-dev Vagir-dev changed the title Fix incorrect behaviour when saving XLSX file with drawings Fix saving XLSX with drawings May 12, 2020
@PowerKiKi PowerKiKi added the writer/xlsx Writer for MS OfficeOpenXML-format (xlsx) spreadsheet files label May 18, 2020
@Vagir-dev
Copy link
Contributor Author

@PowerKiKi, hi!
May I ask why this PR is not merged?
Am I doing something wrong?

@MarkBaker
Copy link
Member

Sorry that it's taken me a while to assess the code and confirm that it resolves the basic problem. As you say, some significant changes would be needed to provide a full resolution, but this certainly resolves the rid issues in many commonly-occurring cases
Thank you for the your time and effort

@MarkBaker MarkBaker merged commit 3446bb0 into PHPOffice:master May 23, 2020
@Vagir-dev
Copy link
Contributor Author

@MarkBaker, thank you!

@Vagir-dev Vagir-dev deleted the fix-writer-xlsx-drawing branch May 23, 2020 14:11
PowerKiKi added a commit that referenced this pull request May 31, 2020
### Added

- Support writing to streams in all writers [#1292](#1292)
- Support CSV files with data wrapping a lot of lines [#1468](#1468)
- Support protection of worksheet by a specific hash algorithm [#1485](#1485)

### Fixed

- Fix Chart samples by updating chart parameter from 0 to DataSeries::EMPTY_AS_GAP [#1448](#1448)
- Fix return type in docblock for the Cells::get() [#1398](#1398)
- Fix RATE, PRICE, XIRR, and XNPV Functions [#1456](#1456)
- Save Excel 2010+ functions properly in XLSX [#1461](#1461)
- Several improvements in HTML writer [#1464](#1464)
- Fix incorrect behaviour when saving XLSX file with drawings [#1462](#1462),
- Fix Crash while trying setting a cell the value "123456\n" [#1476](#1481)
- Improved DATEDIF() function and reduced errors for Y and YM units [#1466](#1466)
- Stricter typing for mergeCells [#1494](#1494)

### Changed

- Drop support for PHP 7.1, according to https://phpspreadsheet.readthedocs.io/en/latest/#php-version-support
- Drop partial migration tool in favor of complete migration via RectorPHP [#1445](#1445)
- Limit composer package to `src/` [#1424](#1424)
@Aristotel2003
Copy link

Aristotel2003 commented Jun 15, 2020

Hello. I have latest version PhpSpreadsheet [1.13.0] - 2020-05-31
I get the same error.
I am load excel from template
and insert image.

.

$reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();
$spreadsheet = $reader->load("../core/templates/template.xlsx");
$drawing = new \PhpOffice\PhpSpreadsheet\Worksheet\Drawing();
$drawing->setPath($logo_img); // local path to file image png.
$drawing->setCoordinates('B2');
$drawing->setHeight(50);
$drawing->setWorksheet($spreadsheet->getActiveSheet());
header("Content-type: $contetn_type");
header("Content-Disposition: attachment;filename=$filename");
header('Cache-Control: max-age=0');
$writer->save('php://output');


Excel file has an error in the same ID
file: \xl\worksheets_rels\sheet1.xml.rels

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/drawing" Target="../drawings/drawing1.xml"/>
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/printerSettings" Target="../printerSettings/printerSettings1.bin"/>
</Relationships>


I found simple fix for this, change
phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php

1962: line from:
$unparsedPrinterSettings = &$unparsedLoadedData['sheets'][$docSheet->getCodeName()]['printerSettings'];
foreach ($sheetPrinterSettings as $rId => $printerSettings) {
$rId = substr($rId, 3)'; // rIdXXX

to :
$unparsedPrinterSettings = &$unparsedLoadedData['sheets'][$docSheet->getCodeName()]['printerSettings'];
foreach ($sheetPrinterSettings as $rId => $printerSettings) {
$rId = substr($rId, 3).'b'; // rIdXXX SIMPLE ADD RANDOM ID FOR PRINTER SETTING


And all work fine now.

template.xlsx

@Vagir-dev
Copy link
Contributor Author

I get the same error.
I am load excel from template
and insert image.

Hi @Aristotel2003

This is not the same case. My PR fixes cases when the template file contains at least one drawing.
But your template doesn't.
So I think you should start a new issue for your problem.

@Aristotel2003
Copy link

Aristotel2003 commented Jun 23, 2020 via email

@Vagir-dev
Copy link
Contributor Author

I am great, but my fix will suit you too. He makes a correction in 1 line.

That's great! So why don't you want to create your own Pull Request?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment