Empty shared string breaks string references in Xlsx file #1560

Closed
bianchim opened this issue Jun 30, 2020 · 2 comments · Fixed by #4064

bianchim commented Jun 30, 2020

This is:

- [x] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

When the Xlsx reader encounters an empty shared string, the corresponding cells should contain an empty string.
An empty shared string appears as an empty `<si></si>` element in the shared strings file when the Xlsx file is unzipped.

What is the current behavior?

The empty string is skipped, which shifts the index of every later entry in the shared strings array by 1.
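To make the index shift concrete, here is a minimal Python sketch of a shared strings loader that only appends an entry when an `<si>` element has a `<t>` child. It is an illustration of the reported behavior, not PhpSpreadsheet's actual code; the function name `load_skipping_empty` and the sample XML are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by xl/sharedStrings.xml.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"

# Hypothetical sample: three shared strings, the middle one empty.
SHARED_STRINGS_XML = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<si><t>first</t></si>"
    "<si></si>"
    "<si><t>third</t></si>"
    "</sst>"
)


def load_skipping_empty(xml_text):
    """Mimic the reported behavior: an <si> without a <t> child adds nothing."""
    table = []
    for si in ET.fromstring(xml_text).findall(NS + "si"):
        t = si.find(NS + "t")
        if t is not None:
            table.append(t.text or "")
        # No else branch: the empty entry is silently dropped.
    return table


table = load_skipping_empty(SHARED_STRINGS_XML)
print(table)     # ['first', 'third']
print(table[1])  # 'third', although index 1 should be the empty string
# table[2] raises IndexError: the last reference is now out of bounds.
```

A cell that stores shared string index 2 should resolve to "third", but with the shortened table that lookup fails, and index 1 silently returns the wrong text.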

What are the steps to reproduce?

The bug is located here:
https://github.com/PHPOffice/PhpSpreadsheet/blob/1.14.0/src/PhpSpreadsheet/Reader/Xlsx.php#L434

You can either create a test file and empty one of its shared strings manually (unzip, edit "xl/sharedStrings.xml", and rezip), or force the "else" case for one string in the code linked above.
In both cases, simply opening and reading the file raises an "undefined index" error, because references to the last shared string now point past the end of the array.

Which versions of PhpSpreadsheet and PHP are affected?

Latest stable (1.14.0)

Possible resolution

The simplest fix would be to add an else case that inserts an empty string. While this worked in my case, I have neither the time nor the resources to test it extensively, and I fear it could break things for other users.


stale bot commented Aug 29, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
If this is still an issue for you, please try to help by debugging it further and sharing your results.
Thank you for your contributions.

stale bot added the stale label Aug 29, 2020
stale bot closed this as completed Sep 5, 2020
Collaborator

oleibman commented Jun 5, 2024

Reopening, will be fixed by solution for issue #4063.

oleibman reopened this Jun 5, 2024
oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Jun 6, 2024
Fix PHPOffice#4063. Fix PHPOffice#1560. Fix PHPOffice#1293. PhpSpreadsheet is not accounting for an empty string in Xlsx sharedStrings.xml. The code which parses it in Reader/Xlsx looks for a `t` or `r` tag descending from `si`, but, in this case, the tag is coded as `<si/>`, with neither `t` nor `r` tag descending. An else clause is added to set the string to empty string in this case.

I was surprised that this had not turned up before, and a search through the archives found at least 2 earlier reports from 4 years ago. Those had been marked stale; the stale indicator is removed, and the issues are re-opened, to be closed when this PR is merged.
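The three cases described in the commit message (a `t` child, one or more `r` children, or neither) can be sketched in Python as follows. This is an illustration of the idea only, not the code from the PR: the function name `load_shared_strings` is hypothetical, and rich text runs are simply flattened to plain text here, which is a simplification.

```python
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by xl/sharedStrings.xml.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def load_shared_strings(xml_text):
    """Build the shared strings table, keeping one entry per <si> element."""
    table = []
    for si in ET.fromstring(xml_text).findall(NS + "si"):
        t = si.find(NS + "t")
        runs = si.findall(NS + "r")
        if t is not None:
            # Plain string: <si><t>...</t></si>
            table.append(t.text or "")
        elif runs:
            # Rich text: concatenate the <t> of every <r> run (simplified).
            table.append("".join(r.findtext(NS + "t") or "" for r in runs))
        else:
            # Neither t nor r, e.g. <si/>: keep the slot as an empty string.
            table.append("")
    return table


# Hypothetical sample with one entry of each kind.
SAMPLE = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<si><t>plain</t></si>"
    "<si/>"
    "<si><r><t>rich </t></r><r><t>text</t></r></si>"
    "</sst>"
)

strings = load_shared_strings(SAMPLE)
print(strings)  # ['plain', '', 'rich text']
```

Because the else branch always appends something, the table keeps one slot per `si` element and cell references stay aligned with their strings.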