Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regression bug in 1.29 for getHighestColumn in ods #3721

Closed
1 of 8 tasks
oprudkyi opened this issue Sep 8, 2023 · 6 comments · Fixed by #3727
Closed
1 of 8 tasks

regression bug in 1.29 for getHighestColumn in ods #3721

oprudkyi opened this issue Sep 8, 2023 · 6 comments · Fixed by #3727

Comments

@oprudkyi
Copy link

oprudkyi commented Sep 8, 2023

This is:

- [x] a bug report
- [ ] a feature request
- [x] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

getHighestColumn() should return correct value

What is the current behavior?

1.28.0 - 'C'
1.29.0 - 'BL'

What are the steps to reproduce?

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

<?php

require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\IOFactory;

$filePath = 'test.ods';
$reader = IOFactory::createReaderForFile($filePath);
$reader->setReadDataOnly(false);

// disable warnings, because PHPSpreadsheet lib has error in own sources
try {
    $oldReporting = error_reporting(error_reporting() & ~E_NOTICE);
    $spreadsheet = $reader->load($filePath);
} finally {
    error_reporting($oldReporting);
}

$sheet = $spreadsheet->getSheetByName('sheet with data');
$highestColumn = $sheet->getHighestColumn(); // e.g 'F'
echo $highestColumn . PHP_EOL;

File used:
test.ods

What features do you think are causing the issue

  • Reader
  • Writer
  • Styles
  • Data Validations
  • Formula Calculations
  • Charts
  • AutoFilter
  • Form Elements

Does an issue affect all spreadsheet file formats? If not, which formats are affected?

ods

Which versions of PhpSpreadsheet and PHP are affected?

1.29.0
php: 8.2.8

@oleibman
Copy link
Collaborator

oleibman commented Sep 9, 2023

I can confirm the regression, but I don't receive any warning messages (or any other kinds) when I run this code without your try/finally block. Please let me know what message you are seeing.

You probably want to use getHighestDataColumn rather than getHighestColumn. That will give you your desired result.

Nevertheless, looking at your problem, the regression was probably caused by PR #3610. The ODS reader had been ignoring number-columns-repeated in the xml, leading to errors. In your file (and in many others that I've looked at), there is a declaration after the last populated cell in your rows:

<table:table-cell table:number-columns-repeated="61"/>

That is what is pushing you from 'C' to 'BL'. It seems to be an unnecessary declaration, and I am not sure what purpose it is serving for Ods. It might or might not be possible for me to identify this situation and ignore that declaration. I will look into it, but go with getHighestDataColumn anyhow.

@oprudkyi
Copy link
Author

oprudkyi commented Sep 9, 2023

Hi @oleibman

Thank you very much.
getHighestDataColumn works correctly (I didn't know about existence of it )
as for try/catch etc - this is code from relatively legacy ETL and probably some random files did produce warnings which were regarded as exceptions. it is unrelated to the issue

@oprudkyi oprudkyi closed this as completed Sep 9, 2023
oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Sep 10, 2023
Fix PHPOffice#3720. Third-party product created a spreadsheet which PhpSpreadsheet could not read because of unexpected namespacing in workbook.xml.rels.

The file which demonstrated the problem was attached to PHPOffice#3423, however I do not believe it was related to the original problem. Nevertheless, the original issue specifically called out Protection, so I put some Protection tests in the validation test for the fix. In doing so, I found that Style/Protection is particularly confusing. Its properties will often have the value `inherit`, which isn't all that helpful; and, even when the `locked` value is `protected`, the cell won't actually be locked unless the sheet is protected as well. The `hidden` property is even more obscure - it applies only to formulas, and refers to hiding the property on the formula bar, not in the cell. I have added methods `isLocked` and `isHiddenOnFormulaBar` to `Cell`. I corrected the docs to explain this. And, as long as I was looking at the docs, I corrected some examples to use `getHighestDataRow/Column` rather than `getHighestRow/Column`, a frequent problem for users (e.g. PHPOffice#3721).

As a side note, the change to Cell.php is my first use of the nullsafe operator. This is one of many new options available now that we require Php8.0+.
@oprudkyi
Copy link
Author

oprudkyi commented Sep 11, 2023

Hi @oleibman just found inconsistency for getHighestDataColumn,
if I run

<?php

require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\IOFactory;

$filePath = 'test.ods';
$reader = IOFactory::createReaderForFile($filePath);
$reader->setReadDataOnly(false);

$spreadsheet = $reader->load($filePath);

$sheet = $spreadsheet->getSheetByName('sheet with data');
foreach ($sheet->getRowIterator() as $row) {
    foreach ($row->getCellIterator() as $cell) {
    }
}
$highestColumn = $sheet->getHighestDataColumn(); // e.g 'F'
echo $highestColumn . PHP_EOL;

result is BL

but if I comment out getRowIterator block, like this

<?php

require 'vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\IOFactory;

$filePath = 'test.ods';
$reader = IOFactory::createReaderForFile($filePath);
$reader->setReadDataOnly(false);

$spreadsheet = $reader->load($filePath);

$sheet = $spreadsheet->getSheetByName('sheet with data');
/*
foreach ($sheet->getRowIterator() as $row) {
    foreach ($row->getCellIterator() as $cell) {
    }
}
 */
$highestColumn = $sheet->getHighestDataColumn(); // e.g 'F'
echo $highestColumn . PHP_EOL;

the result is 'C'

i.e. look likes iterating resets HighestDataColumn to higher value

@oleibman
Copy link
Collaborator

Okay, reopening.

@oleibman oleibman reopened this Sep 11, 2023
@oleibman
Copy link
Collaborator

foreach ($sheet->getRowIterator() as $row) {
    $iterator = $row->getCellIterator();
    $iterator->setIterateOnlyExistingCells(true);
    foreach ($iterator as $cell) {
    }
}

This should solve your problem. It seems a little awkward, and I think there's probably a good case to be made for adding a parameter for IterateOnlyExisting to the constructor. I'll look into that.

@oprudkyi
Copy link
Author

Hi
Thank you.
as workaround I just call getHighestDataColumn before iterators
So it's low priority issue.

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Sep 13, 2023
Fix PHPOffice#3721. That issue can already be handled, but requires several statements when one ought to suffice. Adding an extra parameter to the RowCellIterator and ColumnCellIterator constructors is useful, easy, and becomes especially practical now that our supported Php releases all support named parameters (see new test Issue3721Test).
oleibman added a commit that referenced this issue Sep 13, 2023
* Unexpected Namespacing in rels File

Fix #3720. Third-party product created a spreadsheet which PhpSpreadsheet could not read because of unexpected namespacing in workbook.xml.rels.

The file which demonstrated the problem was attached to #3423, however I do not believe it was related to the original problem. Nevertheless, the original issue specifically called out Protection, so I put some Protection tests in the validation test for the fix. In doing so, I found that Style/Protection is particularly confusing. Its properties will often have the value `inherit`, which isn't all that helpful; and, even when the `locked` value is `protected`, the cell won't actually be locked unless the sheet is protected as well. The `hidden` property is even more obscure - it applies only to formulas, and refers to hiding the property on the formula bar, not in the cell. I have added methods `isLocked` and `isHiddenOnFormulaBar` to `Cell`. I corrected the docs to explain this. And, as long as I was looking at the docs, I corrected some examples to use `getHighestDataRow/Column` rather than `getHighestRow/Column`, a frequent problem for users (e.g. #3721).

As a side note, the change to Cell.php is my first use of the nullsafe operator. This is one of many new options available now that we require Php8.0+.

* Minor Simplifications

* Scrutinizer

It's being silly again. In many tests, we test a variable for non-null, then use that variable later and Scrutinizer knows it's not null. Not here. Oh well.

* Add Methods

Test if protected without allocating cell if it doesn't exist.
oleibman added a commit that referenced this issue Sep 13, 2023
* Add iterateOnlyExistingCells to Constructors

Fix #3721. That issue can already be handled, but requires several statements when one ought to suffice. Adding an extra parameter to the RowCellIterator and ColumnCellIterator constructors is useful, easy, and becomes especially practical now that our supported Php releases all support named parameters (see new test Issue3721Test).

* Update Column.php

* Update .php-cs-fixer.dist.php

* Update Row.php

* Update Column.php

* Update ByColumnAndRowTest.php
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

2 participants