Create interface for support of multiple worksheets per file #3706

Closed
lkmorlan opened this issue Sep 1, 2023 · 2 comments · Fixed by #3709

lkmorlan commented Sep 1, 2023

This is a feature request.

When using `IOFactory::identify()` and `IOFactory::createReader()`, the resulting reader may or may not support multiple worksheets. It would be helpful to have an interface, perhaps called `multipleWorksheetInterface`, which contains methods like `::listWorksheetNames()` and `::setLoadSheetsOnly()`. Any reader class with these methods would implement the interface.

This would allow using code like if ($reader instanceof multipleWorksheetInterface) ... to identify readers that have this support.

One can currently do if (method_exists($reader, 'listWorksheetNames')) .... An interface would be a more convenient way of detecting this support.
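
To make the two detection styles concrete, here is a minimal, self-contained sketch. The interface and both reader classes are hypothetical stand-ins written for illustration; none of them are part of PhpSpreadsheet, and the interface name simply follows the suggestion above.

```php
<?php

declare(strict_types=1);

// Hypothetical interface named after the suggestion in this request.
// It is NOT part of PhpSpreadsheet.
interface MultipleWorksheetInterface
{
    /** @return string[] */
    public function listWorksheetNames(string $filename): array;

    /** @param string|string[]|null $sheetList */
    public function setLoadSheetsOnly($sheetList): self;
}

// Stand-in for a reader of a multi-sheet format.
final class FakeWorkbookReader implements MultipleWorksheetInterface
{
    /** @var string[]|null */
    public ?array $loadSheetsOnly = null;

    public function listWorksheetNames(string $filename): array
    {
        return ['Sheet1', 'Sheet2'];
    }

    public function setLoadSheetsOnly($sheetList): self
    {
        $this->loadSheetsOnly = $sheetList === null ? null : (array) $sheetList;

        return $this;
    }
}

// Stand-in for a reader of a single-sheet format with no such methods.
final class FakeFlatReader
{
}

// Interface-based detection: one check covers every method in the contract.
function supportsMultipleWorksheets(object $reader): bool
{
    return $reader instanceof MultipleWorksheetInterface;
}

// Method-based detection: what is possible today, one method at a time.
function hasListWorksheetNames(object $reader): bool
{
    return method_exists($reader, 'listWorksheetNames');
}
```

The `instanceof` check also lets static analysers narrow the reader's type, which `method_exists()` generally does not.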

oleibman (Collaborator) commented Sep 4, 2023

I think a better solution would be to add the missing methods to the Readers where they do not exist.
- `listWorksheetNames`: add to Csv, Html, and Slk.
- `listWorksheetInfo`: add to Html.
- `setLoadSheetsOnly`: already exists for all readers, although it is ignored for Csv, Html, and Slk.

Would that satisfy your needs?

lkmorlan (Author) commented Sep 5, 2023

If it works the same way, that would be fine. In my use case, I want to load the first sheet from the file, regardless of type. ::listWorksheetNames() could return the filename for formats like CSV that do not actually have multiple sheets.
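
The use case described here might look roughly like the sketch below. It assumes a Composer install of `phpoffice/phpspreadsheet`; the helper name `loadFirstSheet` is made up for illustration. The `method_exists()` guard is kept so the sketch also runs on versions where some readers lack `listWorksheetNames()`.

```php
<?php

declare(strict_types=1);

use PhpOffice\PhpSpreadsheet\IOFactory;
use PhpOffice\PhpSpreadsheet\Spreadsheet;

require 'vendor/autoload.php'; // assumption: Composer autoloader in the working directory

// Hypothetical helper: load only the first worksheet of a file of unknown format.
function loadFirstSheet(string $filename): Spreadsheet
{
    $reader = IOFactory::createReader(IOFactory::identify($filename));

    // Older releases do not define listWorksheetNames() on every reader.
    if (method_exists($reader, 'listWorksheetNames')) {
        $names = $reader->listWorksheetNames($filename);
        if ($names !== []) {
            // Restrict loading to the first sheet; readers for single-sheet
            // formats accept this setting but ignore it.
            $reader->setLoadSheetsOnly([$names[0]]);
        }
    }

    return $reader->load($filename);
}
```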

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Sep 6, 2023
Fix PHPOffice#3706. ListWorksheetInfo is implemented for all Readers except Html. For most (not all), ListWorksheetInfo is more efficient than reading the spreadsheet. I can't think of a way to make that so for Html, but that shouldn't be a reason to leave it unimplemented.

ListWorksheetNames is not implemented for Html, Csv, or Slk. It isn't terribly useful for those formats, but that isn't a reason to omit it. The requester's use case consists of using IOFactory to create a reader for a file of unknown format and determining the first sheet name. That seems legitimate, but it is currently not possible without extra user code if the file is Html, Csv, or Slk; this PR will make it possible.

When Excel opens a Slk or Csv file, the sheet name is based on the file name. PhpSpreadsheet does this for Slk, but it uses a default name for Csv. I am not interested in creating a break for that behavior, but I have added a new boolean property `sheetNameIsFileName` with a setter to Csv Reader. The requester actually mentioned that possibility in our discussion, although it is not essential to the request.
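
A sketch of opting in to the file-name-based sheet name for Csv follows. The setter name `setSheetNameIsFileName` is an assumption derived from the property name mentioned above, so the call is guarded; the helper `loadCsvNamedAfterFile` is hypothetical, and a Composer install of `phpoffice/phpspreadsheet` is assumed.

```php
<?php

declare(strict_types=1);

use PhpOffice\PhpSpreadsheet\Reader\Csv;
use PhpOffice\PhpSpreadsheet\Spreadsheet;

require 'vendor/autoload.php'; // assumption: Composer autoloader in the working directory

// Hypothetical helper: ask the Csv reader to name the sheet after the file.
function loadCsvNamedAfterFile(string $filename): Spreadsheet
{
    $reader = new Csv();

    // Assumed setter for the new `sheetNameIsFileName` property; skipped
    // on versions that do not provide it, which keeps the default name.
    if (method_exists($reader, 'setSheetNameIsFileName')) {
        $reader->setSheetNameIsFileName(true);
    }

    return $reader->load($filename);
}
```

Because the default is left untouched, existing callers keep the current default sheet name unless they opt in.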

As an adjunct to the issue, the requester wishes to use the worksheet name in `setLoadSheetsOnly`. That is already possible for Html, Csv, and Slk, but that particular property is ignored for those formats. I do not see a reason to change that behavior. This treatment is now explicitly noted in the documentation for property `loadSheetsOnly`.

There had been no tests for what happens when `loadSheetsOnly` is specified but no sheets match the criteria for the formats for which this makes sense (Xlsx, Xls, Ods, Gnumeric, Xml). The behavior was not consistent - some formats threw an Exception while others continued with a single empty worksheet. All cases attempt to set the active sheet, and they will now all throw identical Exceptions when they attempt to do so in this situation. Tests are added for each.
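
For callers, the consistent behavior means a single `catch` can handle a sheet name that matches nothing. The sketch below assumes a Composer install of `phpoffice/phpspreadsheet` and catches the library's base exception class rather than guessing the exact subclass or message; the helper name `tryLoadSheet` is made up for illustration.

```php
<?php

declare(strict_types=1);

use PhpOffice\PhpSpreadsheet\Exception as SpreadsheetException;
use PhpOffice\PhpSpreadsheet\IOFactory;
use PhpOffice\PhpSpreadsheet\Spreadsheet;

require 'vendor/autoload.php'; // assumption: Composer autoloader in the working directory

// Hypothetical helper: load one named sheet, or return null on failure.
function tryLoadSheet(string $filename, string $sheetName): ?Spreadsheet
{
    $reader = IOFactory::createReaderForFile($filename);
    $reader->setLoadSheetsOnly([$sheetName]);

    try {
        return $reader->load($filename);
    } catch (SpreadsheetException $e) {
        // Covers the "no sheet matched" case for multi-sheet formats.
        // Note that it also swallows other library errors, such as an
        // unreadable file, so real code may want to inspect $e.
        return null;
    }
}
```

For Csv, Html, and Slk the setting is ignored, so this helper returns the loaded spreadsheet even when `$sheetName` matches nothing.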

There also had been no tests for `loadSheetsOnly` returning more than one sheet. One is added.
oleibman added a commit that referenced this issue Sep 8, 2023
* ListWorksheetInfo/Names for Html/Csv/Slk

* Update LoadSheetsOnlyTest.php

Add strict types to this new test, consistent with work being done in PR #3718.
