Writing a CSV file with many lines noticeable slower than before #3904

Closed
1 of 8 tasks
ArtoNiittukari opened this issue Feb 16, 2024 · 1 comment · Fixed by #3906

@ArtoNiittukari

This is:

- [x] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

Writing a CSV file with many lines was generally fast in 1.29.0, as expected.

What is the current behavior?

Writing a file with many lines is noticeably slower in 2.0.0 than in 1.29.0. I have pinpointed the issue to commit 096e193, because the commit immediately before it, 9f87d7c, is still fast. The culprit seems to be \PhpOffice\PhpSpreadsheet\Worksheet\namedRangeToArray(). The write time seems to grow disproportionately as more lines are added: this can be tested by writing 10k lines vs. 20k lines, where the difference is already large.
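One way to put a number on the "10k vs. 20k" comparison is to time the same work at two sizes and estimate the exponent k in "time ~ n^k". The following is only a rough sketch: `timeAtSizes()` and `scalingExponent()` are hypothetical helpers, not part of PhpSpreadsheet, and the optional part at the bottom assumes a Composer install in `vendor/` and uses only the calls that appear in the reproduction script.

```php
<?php

declare(strict_types=1);

/**
 * Run $work($n) once for each size and return [size => seconds].
 */
function timeAtSizes(callable $work, array $sizes): array
{
    $timings = [];
    foreach ($sizes as $n) {
        $start = microtime(true);
        $work($n);
        $timings[$n] = microtime(true) - $start;
    }
    return $timings;
}

/**
 * Estimate k in "time ~ n^k" from the smallest and largest measured sizes.
 * Needs at least two distinct sizes. k near 1 suggests linear scaling,
 * k near 2 suggests quadratic scaling.
 */
function scalingExponent(array $timings): float
{
    ksort($timings);
    $sizes = array_keys($timings);
    $n1 = $sizes[0];
    $n2 = $sizes[count($sizes) - 1];
    return log($timings[$n2] / $timings[$n1]) / log($n2 / $n1);
}

// Optional part: only runs when PhpSpreadsheet is installed via Composer.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require $autoload;
}
if (class_exists(\PhpOffice\PhpSpreadsheet\Spreadsheet::class)) {
    // Note: unlike the reproduction script, this times sheet building as well
    // as the CSV save, so treat the result as a coarse indicator only.
    $writeCsv = static function (int $n): void {
        $rows = [];
        for ($i = 0; $i < $n; $i++) {
            $rows[] = ['A' . ($i + 1)];
        }
        $spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();
        $spreadsheet->getActiveSheet()->fromArray($rows);
        $writer = \PhpOffice\PhpSpreadsheet\IOFactory::createWriter(
            $spreadsheet,
            \PhpOffice\PhpSpreadsheet\IOFactory::WRITER_CSV
        );
        $writer->save(sys_get_temp_dir() . "/scaling_$n.csv");
    };
    $timings = timeAtSizes($writeCsv, [2000, 4000]);
    echo 'estimated exponent: ' . scalingExponent($timings) . "\n";
}
```

Small sizes are used deliberately so the check stays quick even on an affected version; single runs are noisy, so repeating the measurement a few times gives a more trustworthy estimate.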

What are the steps to reproduce?

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

```php
<?php

require __DIR__ . '/vendor/autoload.php';

// Create new Spreadsheet object
$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();

// Build many rows, each with a value in one column only
$rows = [];
for ($i = 0; $i < 50000; $i++) {
    // Parentheses make the intended precedence explicit: 'A1', 'A2', ...
    $rows[] = ['A' . ($i + 1)];
}

$writer = \PhpOffice\PhpSpreadsheet\IOFactory::createWriter($spreadsheet, \PhpOffice\PhpSpreadsheet\IOFactory::WRITER_CSV);

$worksheet = $spreadsheet->getActiveSheet();
$worksheet->fromArray($rows);

// Time only the CSV save
$start = \microtime(true);

$writer->save('test.csv');

echo (\microtime(true) - $start) . "\n";
// 0.71157717704773 in 1.29.0
// 350.71549415588 in 2.0.0
```

If this is an issue with reading a specific spreadsheet file, then it may be appropriate to provide a sample file that demonstrates the problem; but please keep it as small as possible, and sanitize any confidential information before uploading.

What features do you think are causing the issue

  • Reader
  • Writer
  • Styles
  • Data Validations
  • Formula Calculations
  • Charts
  • AutoFilter
  • Form Elements

Does an issue affect all spreadsheet file formats? If not, which formats are affected?

I think this only affects CSV. Writing an XLSX file with the same code is still fast.

Which versions of PhpSpreadsheet and PHP are affected?

PhpSpreadsheet 2.0.0, tested with PHP 8.2.13

@oleibman
Collaborator

Thank you for the report and the analysis. Expect a fix in a day or two.

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Feb 17, 2024
Fix PHPOffice#3904. PR PHPOffice#3839 provided a huge performance boost for sparsely populated spreadsheets. Unfortunately, it degraded performance for more densely populated spreadsheets when writing Csvs. The reason is that Csv Writer calls toArray for each row, meaning that a lot of the intermediate data used to speed things up needs to be recalculated for every row. It would be better off calling toArray just once for the entire spreadsheet; however this gives back some of the memory improvements of PR PHPOffice#3834. However, the memory effects can be substantially alleviated by supplying a Generator function to do the work. This PR does that; the result is that Csv Writer is now quite a bit faster, and with only a small memory uptick, vs. its performance in PhpSpreadsheet 1.29.
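
The trade-off described in the commit message can be illustrated without the library. The following is a minimal sketch, not PhpSpreadsheet's actual implementation: `buildCells()`, `rowPerCall()` and `rowsGenerator()` are hypothetical helpers, a flat `"row:col"` array stands in for the worksheet, and a one-time coordinate sort stands in for the "intermediate data" that was being recalculated for every row.

```php
<?php

declare(strict_types=1);

// Hypothetical flat cell store keyed by "row:col"; stands in for a worksheet.
function buildCells(int $rows): array
{
    $cells = [];
    for ($r = 1; $r <= $rows; $r++) {
        $cells["$r:1"] = 'A' . $r;
    }
    return $cells;
}

// Per-row approach: every call scans all cells again, so fetching all rows
// one call at a time costs roughly rows x cells operations.
function rowPerCall(array $cells, int $row): array
{
    $out = [];
    foreach ($cells as $key => $value) {
        [$r, $c] = array_map('intval', explode(':', $key));
        if ($r === $row) {
            $out[$c] = $value;
        }
    }
    ksort($out);
    return $out;
}

// Generator approach: do the expensive setup once, then yield one row at a
// time so the caller never holds the whole two-dimensional result in memory.
function rowsGenerator(array $cells): Generator
{
    $keys = array_keys($cells);
    // One-time setup: order coordinates by row, then by column.
    usort($keys, static function (string $a, string $b): int {
        return array_map('intval', explode(':', $a))
            <=> array_map('intval', explode(':', $b));
    });

    $currentRow = null;
    $buffer = [];
    foreach ($keys as $key) {
        [$r, $c] = array_map('intval', explode(':', $key));
        if ($currentRow !== null && $r !== $currentRow) {
            yield $currentRow => $buffer; // previous row is complete
            $buffer = [];
        }
        $currentRow = $r;
        $buffer[$c] = $cells[$key];
    }
    if ($currentRow !== null) {
        yield $currentRow => $buffer; // flush the last row
    }
}

// Small demonstration: both approaches visit the same rows.
$cells = buildCells(500);

$start = microtime(true);
for ($r = 1; $r <= 500; $r++) {
    rowPerCall($cells, $r);
}
echo 'per-row calls: ' . (microtime(true) - $start) . " s\n";

$start = microtime(true);
foreach (rowsGenerator($cells) as $rowNumber => $rowValues) {
    // A CSV writer would emit $rowValues here and then discard it.
}
echo 'generator:     ' . (microtime(true) - $start) . " s\n";
```

Because the generator hands back one row per iteration, the consumer can write each line and release it immediately, which is why the memory cost stays close to the per-row approach while the repeated setup disappears.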