The method chunkBy is not yielding chunks correctly #524

Closed
6 tasks done
verzola opened this issue May 9, 2024 · 2 comments


verzola commented May 9, 2024

Bug Report

Information    Description
Version        9.15.0
PHP version    8.1
OS Platform    Ubuntu (WSL)

Summary

The recently added chunkBy method does not chunk the CSV as described in the documentation:

If you are dealing with a large CSV and you want it to be split in smaller sizes for better handling you can use the chunkBy method which breaks the TabularDataReader into multiple, smaller instances with a given size. The last instance may contain fewer records because of the chunk size you have chosen.

Instead of creating chunks of the size passed to chunkBy, the method yields only 2 chunks: the first has the requested size, and the second (and last) contains every record in the CSV.

The issue seems to be caused by the changes in this commit:
51968b6#diff-63e150e70c3f2253f8ab94c5a8fe06190bbfee264c842f2c7fb51f9920dd0f2eR170

I tested the first version of the chunkBy method from this commit and it is working as expected:
60b0062#diff-63e150e70c3f2253f8ab94c5a8fe06190bbfee264c842f2c7fb51f9920dd0f2eR170

If I add these 2 lines after the first yield in the chunkBy method of the released 9.15.0 version, it also works as expected:

$nbRecords = 0;
$records = [];
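
To illustrate why resetting those two variables matters, here is a minimal stand-alone sketch of a chunking generator. It is not the league/csv implementation: the function name chunkRecords and the $buffer variable are hypothetical, and it works on any iterable rather than on a TabularDataReader. Without the reset after the yield, the counter never equals the chunk size again and the buffer keeps growing, which would be consistent with the 1000 / 6000 output reported below.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-alone chunker, NOT the league/csv implementation.
 * Splits any iterable into arrays of at most $size items.
 */
function chunkRecords(iterable $records, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Chunk size must be at least 1.');
    }

    $nbRecords = 0;
    $buffer = [];
    foreach ($records as $record) {
        $buffer[] = $record;
        ++$nbRecords;
        if ($nbRecords === $size) {
            yield $buffer;
            // Reset the counter and the buffer so the next chunk starts empty.
            $nbRecords = 0;
            $buffer = [];
        }
    }

    // Emit the final chunk, which may hold fewer than $size records.
    if ($buffer !== []) {
        yield $buffer;
    }
}

// 6000 dummy records stand in for the rows of data.csv.
$sizes = [];
foreach (chunkRecords(range(1, 6000), 1000) as $chunk) {
    $sizes[] = count($chunk);
}

echo implode(',', $sizes), PHP_EOL; // prints 1000,1000,1000,1000,1000,1000
```

Commenting out the two reset lines in this sketch makes it yield a first chunk of 1000 records and a final chunk of 6000.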

Standalone code, or other way to reproduce the problem

I created a repository for this example:
https://github.com/verzola/league-csv-chunkby-bug

<?php

use League\Csv\Reader;

require_once __DIR__ . '/vendor/autoload.php';

// data.csv is an example csv with 6000 lines
$reader = Reader::createFromPath(__DIR__ . '/data.csv');

$chunks = $reader->chunkBy(1000);

foreach ($chunks as $chunk) {
  echo count($chunk) . PHP_EOL;
}

Expected result

The expected output is:
1000
1000
1000
1000
1000
1000

Actual result

The actual output is:
1000
6000

Checks before submitting

  • Be sure that there isn't already an issue about this. See: Issues list
  • Be sure that there isn't already a pull request about this. See: Pull requests
  • I have added every step to reproduce the bug.
  • If possible I added relevant code examples.
  • This issue is about 1 bug and nothing more.
  • The issue has a descriptive title. For example: "JSON rendering failed on Windows for filenames with space".
nyamsprod added a commit that referenced this issue May 9, 2024
@nyamsprod
Member

Thanks for reporting the issue. It has been fixed and will be part of the next release.

nyamsprod added a commit that referenced this issue May 10, 2024

nyamsprod commented May 25, 2024

The fix is now released in version 9.16.0.
