Undefined offset in Xlsx reader #1293

Closed
nick-lai opened this issue Dec 20, 2019 · 3 comments · Fixed by #4064

nick-lai commented Dec 20, 2019

This is:

- [x] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

Which versions of PhpSpreadsheet and PHP are affected?

php: ^7.0

PHPOffice/PhpSpreadsheet: ^1.8.2

File: Undefined offset in Xlsx reader(Cell B1 is an empty richtext).xlsx

if (isset($xmlStrings, $xmlStrings->si)) {
    foreach ($xmlStrings->si as $val) {
        if (isset($val->t)) {
            $sharedStrings[] = StringHelper::controlCharacterOOXML2PHP((string) $val->t);
        } elseif (isset($val->r)) {
            $sharedStrings[] = $this->parseRichText($val);
        }
    }
}

The code on lines 441 to 445 appends nothing to the $sharedStrings array when it meets an empty SimpleXMLElement object, so the array ends up one entry short and every later index is shifted.

Example for $xmlStrings:

SimpleXMLElement::__set_state(array(
    // '@attributes' => ...
    'si' =>
    array (
        0 => SimpleXMLElement::__set_state(array(
           't' => 'Name',
        )),
        1 => SimpleXMLElement::__set_state(array()), // will be skipped.
        // ...
    )
))
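
To illustrate the skip, here is a minimal sketch that runs the same if/elseif logic against a simplified, hypothetical shared-strings document. The XML is an assumption made for the example (namespace omitted, three entries), and the rich-text branch uses a placeholder string instead of parseRichText().

```php
<?php
// Simplified stand-in for xl/sharedStrings.xml; the second <si/> is empty.
$xmlStrings = simplexml_load_string(
    '<sst><si><t>Name</t></si><si/><si><t>Age</t></si></sst>'
);

$sharedStrings = [];
foreach ($xmlStrings->si as $val) {
    if (isset($val->t)) {
        $sharedStrings[] = (string) $val->t; // plain string entry
    } elseif (isset($val->r)) {
        $sharedStrings[] = 'rich text';      // placeholder for parseRichText()
    }
    // No else branch: the empty <si/> appends nothing.
}

echo 'si elements in XML: ', count($xmlStrings->si), "\n";
echo 'entries collected:  ', count($sharedStrings), "\n";
print_r($sharedStrings);
```

The XML declares three entries but only two are collected, so 'Age' lands at index 1 instead of index 2.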

// Read cell!
switch ($cellDataType) {
    case 's':
        if ((string) $c->v != '') {
            $value = $sharedStrings[(int) ($c->v)];
            if ($value instanceof RichText) {
                $value = clone $value;
            }
        } else {
            $value = '';
        }
        break;

In this case, an undefined offset error occurs on line 679 when a cell's index points past the end of the shortened array.
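
A rough sketch of the effect on cell lookups, assuming a shortened array of two strings and three hypothetical cells whose `<v>` values are 0, 1 and 2. The helper name lookupSharedString() is made up for this example and is not part of PhpSpreadsheet; it returns null where the reader would hit the undefined offset.

```php
<?php
// Hypothetical helper: returns the shared string, or null if the index is missing.
function lookupSharedString(array $sharedStrings, $index)
{
    return array_key_exists($index, $sharedStrings) ? $sharedStrings[$index] : null;
}

// The file declared three strings ('Name', '', 'Age'), but the empty one was skipped.
$sharedStrings = ['Name', 'Age'];

// Assumed cell-to-index mapping: B1 points at the empty entry.
foreach (['A1' => 0, 'B1' => 1, 'C1' => 2] as $cell => $index) {
    $value = lookupSharedString($sharedStrings, $index);
    echo $cell, ': ', $value === null ? "index $index is missing" : "'$value'", "\n";
}
```

Under these assumptions B1 picks up the string meant for C1, and C1 has no entry at all. PHP 7 reports the direct array read as a notice ("Undefined offset"), while PHP 8 reports it as a warning ("Undefined array key").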

Fix for empty SimpleXMLElement:

foreach ($xmlStrings->si as $val) {
    if (isset($val->t)) {
        $sharedStrings[] = StringHelper::controlCharacterOOXML2PHP((string) $val->t);
    } elseif (isset($val->r)) {
        $sharedStrings[] = $this->parseRichText($val);
    } else {
        // When the SimpleXMLElement is empty, neither the `t` nor the `r` property exists.
        $sharedStrings[] = '';
    }
}
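
For anyone who wants to check the fix outside the reader, here is a self-contained sketch of the same loop with the else branch. The function name buildSharedStrings(), the sample XML, and the run-joining stand-in for parseRichText() are all assumptions for illustration, not the library's code.

```php
<?php
// Hypothetical standalone version of the fixed loop.
function buildSharedStrings(SimpleXMLElement $xmlStrings): array
{
    $sharedStrings = [];
    foreach ($xmlStrings->si as $val) {
        if (isset($val->t)) {
            $sharedStrings[] = (string) $val->t;
        } elseif (isset($val->r)) {
            // Stand-in for parseRichText(): join the text of every run.
            $text = '';
            foreach ($val->r as $run) {
                $text .= (string) $run->t;
            }
            $sharedStrings[] = $text;
        } else {
            // Empty <si/>: keep the slot so later indexes stay aligned.
            $sharedStrings[] = '';
        }
    }

    return $sharedStrings;
}

// Simplified sample: plain string, empty entry, rich text split over two runs.
$xml = simplexml_load_string(
    '<sst><si><t>Name</t></si><si/><si><r><t>Ag</t></r><r><t>e</t></r></si></sst>'
);
print_r(buildSharedStrings($xml));
```

With the else branch the result has three entries, and the third string stays at index 2.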

nick-lai commented Jan 2, 2020

@MarkBaker

stale bot commented Mar 2, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
If this is still an issue for you, please try to help by debugging it further and sharing your results.
Thank you for your contributions.

oleibman commented Jun 5, 2024

Reopening; a (very late) fix is on the way.

@oleibman oleibman reopened this Jun 5, 2024
oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Jun 6, 2024
Fix PHPOffice#4063. Fix PHPOffice#1560. Fix PHPOffice#1293. PhpSpreadsheet is not accounting for an empty string in Xlsx sharedStrings.xml. The code which parses it in Reader/Xlsx looks for a `t` or `r` tag descending from `si`, but, in this case, the tag is coded as `<si/>`, with neither `t` nor `r` tag descending. An else clause is added to set the string to empty string in this case.

I was surprised that this had not turned up before, and a search through the archives found at least 2 earlier reports from 4 years ago. Those had been marked stale; the stale indicator is removed, and the issues are re-opened, to be closed when this PR is merged.