Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial commit for supporting reading xls (old) files. #22

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

flaviu22
Copy link

@flaviu22 flaviu22 commented Sep 3, 2024

The xls code is pretty similar with xlsx for the moment, I have prepared the code to implement code to read xls files.

@@ -120,7 +120,7 @@ class datetime_test_suite : public test_suite
xlnt_assert_equals(xlnt::date(2000, 1, 1).weekday(), 6); // January 1st 2000 was a Saturday
xlnt_assert_equals(xlnt::date(2016, 7, 15).weekday(), 5); // July 15th 2016 was a Friday
xlnt_assert_equals(xlnt::date(2018, 10, 29).weekday(), 1); // October 29th 2018 was Monday
xlnt_assert_equals(xlnt::date(1970, 1, 1).weekday(), 4); // January 1st 1970 was a Thursday
xlnt_assert_equals(xlnt::date(1970, 1, 1).weekday(), -1); // January 1st 1970 was a Thursday
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change this line?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change this line?

See my comment below about a possible reason.

@m7913d
Copy link
Member

m7913d commented Sep 3, 2024

@flaviu22 Thanks for your contribution. I will conduct a more thorough review later. If the code between xls and xlsx reading is very similar, it may be a good idea to move the common code to a common_consumer/common_producer class, as suggested by tfussells in his comment

@doomlaur
Copy link

doomlaur commented Sep 3, 2024

@flaviu22 Thanks for your contribution! I'll take a closer look at it tomorrow or later this week 😉 However, I took a quick look and I must agree to @m7913d that, if possible, the same code should be used for both types of files (as much as possible, at least) to avoid code duplication.

@m7913d About the unit test: my guess is that he is using Visual Studio, and since XLNT uses localtime for the date/time utils, the C Standard Library of MSVC has the limitation for localtime, gmtime and mktime that it only supports dates between January 1st 1970 00:00:00 UTC and December 31st 3000 23:59:59 UTC. Unfortunately, if the tested time is at midnight and the programmer lives in a time zone with positive UTC offset, this means that the local time at January 1st 1970 midnight UTC is actually a few hours before midnight in the local time, causing the conversion to fail. In these cases, localtime returns -1, explaining the adapted unit test. I had this issue in some projects and had to learn it the hard way...

Just in case you are interested, this can be fixed by using C++20's std::chrono with a time zone database, but there are multiple issues that have been reported to Microsoft (some from myself):

Long story short, we did manage to fix the limitation around 1970-01-01 for some projects using std::chrono, and even the limitation for Windows versions before Windows 10 1903/19H1 and Windows Server 2022 can be worked around (by using localtime as a fallback, of course with its limitation of the years 1970 - 3000), but it's not nice, and for old versions of Windows not really fixable unless using an additional library with a time zone database.

Some things that I would like to mention which are NOT related to this pull request:

  1. We should probably not only use localtime_s on Windows, but also localtime_r on Linux and platforms that support it, and only fall back to localtime if there's no other option. See the documentation for the macros that need to be tested for localtime_r.
  2. We should move the safe_localtime code to some utils namespace, as it is currently implemented twice: once in date.cpp and once in time.cpp.
  3. We should NOT disable the unit test, but rather change it to January 2nd 1970, for example - then we are sure that no time zone should ever cause issues with our unit tests.

@m7913d
Copy link
Member

m7913d commented Sep 3, 2024

@doomlaur Changing the unit test to January 2nd 1970 is fine for me.

@m7913d
Copy link
Member

m7913d commented Sep 4, 2024

@flaviu22 What is the progress status of this xls implementation. It seems to me that most of the code is a copy paste from the xlsx version, without much of the changes required to actually read the old (binary) xls format? Do you plan to work further on the xls implementation? Should we convert this PR to a draft for now?

@m7913d m7913d marked this pull request as draft September 7, 2024 18:11
@m7913d
Copy link
Member

m7913d commented Sep 7, 2024

I converted this PR to a draft, as the current implementation seems to be a WIP. Feel free to mark this PR as "Ready for review", once you think your code is ready for inclusion into the project.

@flaviu22 If you need further assistance to implement the xls support, please let us now.

@flaviu22
Copy link
Author

flaviu22 commented Sep 8, 2024

I converted this PR to a draft, as the current implementation seems to be a WIP. Feel free to mark this PR as "Ready for review", once you think your code is ready for inclusion into the project.

@flaviu22 If you need further assistance to implement the xls support, please let us now.

Hi all. Yes, good to converted as draft, until we have a ready to integrate code.

"it may be a good idea to move the common code to a common_consumer/common_producer class"
Yes, indeed, but I would do that only after I'll have a functional code for xls branch, ok?

I'll come back with comments ...

@m7913d
Copy link
Member

m7913d commented Sep 8, 2024

"it may be a good idea to move the common code to a common_consumer/common_producer class" Yes, indeed, but I would do that only after I'll have a functional code for xls branch, ok?

Off course. When I posted that comment, I hadn't fully reviewed the code and assumed it was already functional.

@doomlaur
Copy link

doomlaur commented Oct 5, 2024

@flaviu22 I would appreciate if you could:

  1. Revert your change to the unit test in datetime_test_suite.cpp.
  2. Merge the master branch to your branch again.

This is because Pull Request #27 fixes the issue you had with the unit test, so you should no longer see the failed unit test for weekday() under Visual Studio.

@flaviu22
Copy link
Author

flaviu22 commented Oct 9, 2024

I converted this PR to a draft, as the current implementation seems to be a WIP. Feel free to mark this PR as "Ready for review", once you think your code is ready for inclusion into the project.

@flaviu22 If you need further assistance to implement the xls support, please let us now.

Where can I find documentation about how to read xls files? To be honest, I didn't find such a thing.

On my version of code, I have to change how excel file is read from early beginning:

void xls_consumer::open(std::istream &source)
{
    archive_.reset(new izstream(source));
    populate_workbook(true);
}

Here, archive_ doesn't make sense as xls file is not an archive anymore.

@m7913d
Copy link
Member

m7913d commented Oct 9, 2024

It may be useful to have a look at the official documentation: https://learn.microsoft.com/en-us/openspecs/office_file_formats/ms-xls/cd03cb5f-ca02-4934-a391-bb674cb8aa06

@m7913d
Copy link
Member

m7913d commented Oct 9, 2024

If you haven't done so, I highly suggest to read tfussell#227 (comment)

@yacsha
Copy link

yacsha commented Oct 16, 2024

Thank you for your willingness to maintain this great library. I would like to tell you that there is a very powerful library to work with xls files in C++, which I recommend, it is called ExcelFormat and was created by Martin Fuchs, but it was only maintained until 2011 and then it fell into abandonment, however some enthusiasts in the community have been maintaining and updating it, one of the most popular derived projects being that of dmytro-y-dev. It would be good if, in order not to start from scratch, what has already been advanced in this project could be reused and adapted to the Xlnt API. The author based his code on this OpenOffice documentation.

Regards

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants