How to force Reader to skip empty first line in windows? #393

Closed
dariusheivazi opened this issue Aug 26, 2020 · 4 comments

dariusheivazi commented Aug 26, 2020

(Fill in the relevant information below to help triage your issue.)

| Q | A |
|---|---|
| Version | 9.6 |

Question

How to force Reader to skip empty first line in windows?

(Please explain in plain english your question)

When reading files with Windows line endings (CR LF), the Reader class produces a first record that gives the var_dump result shown below.
empty.csv is an otherwise empty file that contains a single EOL (CR LF).
FILE CONTENTS:
1: CR LF
2:

CODE:
use League\Csv\Reader;

$reader = Reader::createFromPath('i:\empty.csv');
$reader->skipEmptyRecords();
$records = iterator_to_array($reader);
var_dump($records);

OUTPUT:
array(1) {
[0] =>
array(1) {
[0] =>
string(0) ""
}
}

However, subsequent empty lines are skipped as expected. For example, if I change the file contents to:
FILE CONTENTS:
1: CR LF
2: foo,bar CR LF
3: CR LF
4:

the output will be:
OUTPUT:
array(2) {
[0] =>
array(1) {
[0] =>
string(0) ""
}
[1] =>
array(2) {
[0] =>
string(3) "foo"
[1] =>
string(3) "bar"
}
}

As you can see, the third row is skipped, but the empty first row is still there.
I have to deal with a lot of small CSV files, all built on Windows, and empty.csv is typical of them. For now I work around the problem by checking the file size, but for files with several lines I have to check the first line every time I read the CSV. I'm sure there is a better way.
Please help.
Thanks.

@nyamsprod
Member

@dariusheivazi thanks for using the library. Did you try to use the Statement object as explained in the documentation? I'm afraid I can only point you in that direction, as this does not seem to be a bug or an undocumented feature.

@dariusheivazi
Author

dariusheivazi commented Aug 27, 2020

> @dariusheivazi thanks for using the library. Did you try to use the Statement object as explained in the documentation? I'm afraid I can only point you in that direction, as this does not seem to be a bug or an undocumented feature.

@nyamsprod, thanks for your response. I actually use Statement objects in other parts of the application. But for this particular problem (the first empty line not being skipped), I thought there might be a more direct solution than using Statement objects.

Part of documentation:
https://csv.thephpleague.com/9.0/reader/

Controlling the presence of empty records
New since version 9.4.0

By default the CSV document normalization removes empty records. But you can control the presence of such records using the following methods:

Reader::skipEmptyRecords(): self;
Reader::includeEmptyRecords(): self;
Reader::isEmptyRecordsIncluded(): bool;

Calling Reader::includeEmptyRecords will ensure empty records are left in the Iterator returned by Reader::getRecords, conversely Reader::skipEmptyRecords will ensure empty records are skipped.
At any given time you can ask your Reader instance if empty records will be stripped or included using the Reader::isEmptyRecordsIncluded method.

@nyamsprod
Member

@dariusheivazi I'm guilty of naming stuff poorly 😉 Reader::skipEmptyRecords is the default. So records deemed empty are already removed.

In your case I think the first line is technically not empty, so the only way to remove it is to filter it out using the Statement object. At least that's what I understood from your explanation.
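
For illustration only, here is a minimal sketch of the kind of filter such a Statement could use. The helper name isBlankRecord() is hypothetical and not part of league/csv, and the sample rows simply mirror the second var_dump in this issue. The predicate itself is plain PHP, so it can be tried without the library.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of league/csv): reports whether a parsed
 * record carries no data, i.e. it is [], [null] or [''].
 */
function isBlankRecord(array $record): bool
{
    return $record === [] || $record === [null] || $record === [''];
}

// Rows shaped like the second var_dump in this issue.
$records = [[''], ['foo', 'bar']];

// Keep only the rows that carry data; array_values() reindexes the result.
$kept = array_values(array_filter(
    $records,
    static function (array $record): bool {
        return !isBlankRecord($record);
    }
));

var_dump($kept);
```

With league/csv 9.6, a closure wrapping the same predicate could presumably be passed to Statement::create()->where(...) and applied with ->process($reader). That wiring is an assumption here and is not tested against the files from this issue.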

@dariusheivazi
Author

dariusheivazi commented Aug 27, 2020

> @dariusheivazi I'm guilty of naming stuff poorly 😉 Reader::skipEmptyRecords is the default. So records deemed empty are already removed.
>
> In your case I think the first line is technically not empty, so the only way to remove it is to filter it out using the Statement object. At least that's what I understood from your explanation.

Hi, I really don't know which part of my explanation was confusing. The library does not skip empty records as promised in the documentation: "By default the CSV document normalization removes empty records."

I investigated the flow, and the problem is in the normalizing function: if the first record is empty, it is not skipped. Yes, you are right that technically the first line may contain a BOM, so the normalizing function does not treat it as null. The fix is to make the check a little smarter. Just change line 255 of Reader.php:
FROM
return is_array($record) && ($this->is_empty_records_included || $record != [null]);
TO:
return is_array($record) && ($this->is_empty_records_included || ($record != [null] && $record != [$this->input_bom]));

It should then work, provided this does not cause a BC break. Anyway, thanks for this great library. I hope you see this comment.
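
To make the proposed condition easier to try in isolation, here is a minimal sketch that lifts it into a free function. The name keepRecord() and its parameters are hypothetical stand-ins for the closure and properties in Reader.php, and the UTF-8 BOM bytes are only sample input. It is not the library's actual code.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the normalizing check proposed above.
 * $includeEmpty plays the role of $this->is_empty_records_included and
 * $inputBom the role of $this->input_bom.
 */
function keepRecord($record, bool $includeEmpty, string $inputBom): bool
{
    // Loose comparison (!=) is kept on purpose, as in the proposed line.
    return is_array($record)
        && ($includeEmpty || ($record != [null] && $record != [$inputBom]));
}

$bom = "\xEF\xBB\xBF"; // UTF-8 byte order mark, used as sample input

$bomOnly = keepRecord([$bom], false, $bom);          // false: BOM-only first line is dropped
$blank   = keepRecord([null], false, $bom);          // false: ordinary empty record is dropped
$data    = keepRecord(['foo', 'bar'], false, $bom);  // true: data row is kept

var_dump($bomOnly, $blank, $data);
```

When empty records are explicitly included (the second argument is true), the BOM-only record is kept as well, so the includeEmptyRecords behaviour is unchanged in this sketch.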
