ODS Reader yields different results #184

madflow · 2016-03-13T17:37:20Z

Hi there!

While playing with the Reader part of the library, I noticed that the ODS reader treats input differently from its peers, XLSX and CSV.

1. Zeros (0) are treated as missing values, although they are perfectly valid values in a spreadsheet (PHP's empty() is in use...). I would argue that even "empty" values are perfectly valid and should yield empty entries in the rows returned by the row iterator, as they do in CSV and XLSX.

Example:

2. The values returned for a row are computed differently: the ODS reader creates arrays of unequal length.

Example:

$data = [
    ['A','B','C'],
    [0, '', ''],
    [1, 1, '']
];

// Code

Array
(
    [xlsx] => Array
        (
            [0] => Array
                (
                    [0] => A
                    [1] => B
                    [2] => C
                )

            [1] => Array
                (
                    [0] => 0
                    [1] => 
                    [2] => 
                )

            [2] => Array
                (
                    [0] => 1
                    [1] => 1
                    [2] => 
                )

        )

    [ods] => Array
        (
            [0] => Array
                (
                    [0] => A
                    [1] => B
                    [2] => C
                )

            [1] => Array
                (
                    [0] => 0
                )

            [2] => Array
                (
                    [0] => 1
                    [1] => 1
                    [2] => 
                )

        )

    [csv] => Array
        (
            [0] => Array
                (
                    [0] => A
                    [1] => B
                    [2] => C
                )

            [1] => Array
                (
                    [0] => 0
                    [1] => 
                    [2] => 
                )

            [2] => Array
                (
                    [0] => 1
                    [1] => 1
                    [2] => 
                )

        )

)
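In case it helps, here is a minimal sketch of why an empty() check would drop zeros. The two helper functions are hypothetical and only illustrate the language behavior; they are not taken from the ODS reader's code.

```php
<?php
// Hypothetical helpers for illustration; not the library's actual code.

// Loose check: empty() is true for 0, 0.0, '0', '', null, false and [].
function isMissingLoose($value): bool
{
    return empty($value);
}

// Strict check: only null and the empty string count as missing.
function isMissingStrict($value): bool
{
    return $value === null || $value === '';
}

// Print how each sample value is classified by both checks.
foreach ([0, '0', '', null, 1] as $value) {
    printf(
        "%-5s loose=%s strict=%s\n",
        var_export($value, true),
        var_export(isMissingLoose($value), true),
        var_export(isMissingStrict($value), true)
    );
}
```

With the strict variant a cell containing 0 survives, while null and the empty string are still recognized as empty.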

Maybe this is the intended behavior. If so, I would argue for changing it and introducing a legacy flag for the old behavior to avoid a regression.

If this is a known issue and ODS just needs some love, this could become a PR.

Thanks!

adrilo · 2016-03-14T18:36:48Z

Hi @madflow, thanks for filing this issue.

This is definitely not the correct behavior. All readers should behave consistently.
If you have time and are interested in helping, feel free to submit a PR. I'd be happy to guide you and get it approved.

madflow · 2016-03-15T08:44:54Z

I am interested in helping, although time and knowledge of the spreadsheet internals are of course an issue. I will have a look at the tests first. I reckon there should be a set of shared tests that all readers should pass equally.
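To make the shared-test idea concrete, here is a rough sketch of a consistency check over the rows each reader returns. The function name is hypothetical, and the value types in the sample data are an assumption, since the print_r output above does not show them.

```php
<?php
// Hypothetical helper, not part of the library: returns the formats whose
// rows differ from the rows of a chosen reference format.
function findInconsistentFormats(array $rowsByFormat, string $reference): array
{
    $inconsistent = [];
    foreach ($rowsByFormat as $format => $rows) {
        // Strict comparison: same values, same types, same order and length.
        if ($rows !== $rowsByFormat[$reference]) {
            $inconsistent[] = $format;
        }
    }
    return $inconsistent;
}

// Rows as reported in this issue (integer/string types are assumed).
$rowsByFormat = [
    'xlsx' => [['A', 'B', 'C'], [0, '', ''], [1, 1, '']],
    'ods'  => [['A', 'B', 'C'], [0],         [1, 1, '']],
    'csv'  => [['A', 'B', 'C'], [0, '', ''], [1, 1, '']],
];

print_r(findInconsistentFormats($rowsByFormat, 'xlsx'));
```

With the data from this issue, only ods is reported as inconsistent. A real shared test would read the same fixture through each reader instead of using hard-coded arrays.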

adrilo · 2016-03-15T20:10:16Z

Yes! Tests are in the "tests" directory (obviously). ODS files are composed of a bunch of XML files, zipped together. One way to get familiar with the ODS format is to create an ODS file with fake data, unzip the *.ods file and look at the extracted files. Each contains pieces of information that you have to merge together to get the actual data.
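If you would rather inspect the archive from PHP than unzip it by hand, something along these lines should work. It is only a sketch: it assumes the zip extension (ZipArchive) is installed, and the function name and the file path in the usage comment are made up for illustration.

```php
<?php
// Sketch: list the entries of an ODS archive and return the raw content.xml.
// Assumes ext-zip is available; readOdsContentXml() is a hypothetical name.
function readOdsContentXml(string $odsPath): string
{
    $zip = new ZipArchive();
    if ($zip->open($odsPath) !== true) {
        throw new RuntimeException("Cannot open $odsPath as a zip archive");
    }

    // Print the name of every file stored in the archive.
    for ($i = 0; $i < $zip->numFiles; $i++) {
        echo $zip->getNameIndex($i), PHP_EOL;
    }

    // content.xml is the entry that holds the sheets, rows and cells.
    $xml = $zip->getFromName('content.xml');
    $zip->close();

    if ($xml === false) {
        throw new RuntimeException('content.xml not found in archive');
    }
    return $xml;
}

// Usage (the path is a placeholder):
// echo readOdsContentXml('example.ods');
```

Looking at how zeros and empty cells are written in content.xml is a good starting point for the test file.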

The test files used in the tests are located in "tests/resources". What I recommend is to create a test file with zeroes and empty strings, add a test with the expected behavior (it should fail), and then update the ODS Reader code to fix the issue. Once the issue is fixed, the test should pass.

adrilo · 2016-03-18T00:50:43Z

Hey @madflow, your changes look good. Feel free to submit a pull request. I'll be happy to merge it.

madflow · 2016-03-18T09:21:15Z

The current commits only fix bug number 1 (zeros treated as missing values). The second bug has not been fixed in my branch yet (the tests are failing like they are supposed to) ;)

I can create a "Work in Progress" PR if you like.

adrilo · 2016-03-18T21:40:38Z

No need for a WIP PR. Just submit the PR when you are done :)

adrilo · 2016-03-21T17:28:13Z

Fixed as part of #189

adrilo added the "bug" and "help wanted" labels Mar 14, 2016

madflow pushed a commit to madflow/spout that referenced this issue Mar 16, 2016

Tests for box#184

187be5b

madflow pushed a commit to madflow/spout that referenced this issue Mar 16, 2016

Fix zeros treated as missing values box#184

2a25e7a

madflow pushed a commit to madflow/spout that referenced this issue Mar 18, 2016

More explicit rule for ignoring empty placeholder cells in Excel ODS box#184

8f499cb

madflow pushed a commit to madflow/spout that referenced this issue Mar 19, 2016

Tests for box#184

2b1160b

madflow pushed a commit to madflow/spout that referenced this issue Mar 19, 2016

Fix zeros treated as missing values box#184

3ee7099

madflow pushed a commit to madflow/spout that referenced this issue Mar 19, 2016

More explicit rule for ignoring empty placeholder cells in Excel ODS box#184

e60054f

adrilo added a commit that referenced this issue Mar 21, 2016

Merge pull request #189 from madflow/ods-missing-values

6c57125

Fixes for #184

adrilo closed this as completed Mar 21, 2016
