FileIO performs no range checking #158

D4N · 2017-11-09T22:41:46Z

#156 uncovered that the FileIO functions long read(byte* buf, long rcount) and int read(long offset, Position pos) perform no explicit checking for EOF. They just pass any potential errors via return values (which are more or less ignored everywhere).

I propose two possible solutions:

Add __attribute__((warn_unused_result)) together with -Werror for gcc & clang and _Check_return_ for msvc (with the appropriate flag for -Werror). This will enforce return value checking and the FileIO class keeps being a thin wrapper around the C-API.
Make FileIO explicitly check whether it read enough or seek'd enough and throw an exception on error. Then errors can be "ignored" in the code (i.e. do not explicitly try to catch the exception), but if an error occurs, the program does not continue to run with the implicit assumption of success.

The second solution is easier to implement but has the disadvantage that when explicitly running a seek or read and checking for errors becomes very verbose:

try {
  io.seek(ofs, pos);
catch (const EOF& e) {
  // handle eof error
}

which is also less efficient than:

if (!io.seek(ofs, pos)) { 
  // handle eof 
}

Opinions?

The text was updated successfully, but these errors were encountered:

clanmills · 2017-11-09T23:31:17Z

Dan Perhaps you have to branch and try both approaches to understand how much code requires to be changed. Option 1 seems less brutal. Option 2 seems more verbose. There is another matter with the basicio.cpp code with how it deals with 64 bits. The code was originally 32 but and has been massaged to compile as 64 bit. It needs to be cleaned up to be 64 bit. And then the I/O clients such as jpgimage.cpp will require some work. A user in Russia offered in April to help with this and said he might have time to work on this in 2018. I’ll email him tomorrow. We can add checks to printStructure() to address #156. However reviewing and reworking the I/O is a longer term project. There are about 18 major elements in Exiv2 (user support, build, test, I/O etc). It would be wonderful to have 18 engineers. Let me try to recruit this Russian person. Robin

…

On 9 Nov 2017, at 22:41, D4N ***@***.***> wrote: #156 uncovered that the FileIO functions long read(byte* buf, long rcount) and int read(long offset, Position pos) perform no explicit checking for EOF. They just pass any potential errors via return values (which are more or less ignored everywhere). I propose two possible solutions: Add __attribute__((warn_unused_result)) together with -Werror for gcc & clang and _Check_return_ for msvc (with the appropriate flag for -Werror). This will enforce return value checking and the FileIO class keeps being a thin wrapper around the C-API. Make FileIO explicitly check whether it read enough or seek'd enough and throw an exception on error. Then errors can be "ignored" in the code (i.e. do not explicitly try to catch the exception), but if an error occurs, the program does not continue to run with the implicit assumption of success. The second solution is easier to implement but has the disadvantage that when explicitly running a seek or read and checking for errors becomes very verbose: try { io.seek(ofs, pos); catch (const EOF& e) { // handle eof error } which is also less efficient than: if (!io.seek(ofs, pos)) { // handle eof } Opinions? — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

D4N · 2018-01-08T09:58:38Z

I have thought about this over the past weeks and I think that throwing an exception is the better approach here.

Returning an error code requires to check their error codes in every place, however when we hit an EOF during parsing, it is quite surely unrecoverable independently which individual call inside the function reached the EOF. Thus throwing an exception seems like the better thing to do, as it will lead to more readable code instead of the C way of doing things, which would be:

int retval = 0;
const int err = read(buf, count);
if (err) {
    retval = EOF_REACHED;
    goto cleanup;
}

cleanup:
    /* do deallocation and closing of files if applicable here */
    return retval;

Which I really don't want to see in a C++ library, albeit it is more efficient than throwing exceptions all over the place.

D4N · 2018-01-27T00:11:02Z

#212 shows that explicit checking for EOF after a read() makes the code very smelly.

Unfortunately we cannot start throwing exceptions around the place, as that will break existing code. For instance in Jp2Image::readMetadata() the return value of io.read() is explicitly checked and used as a loop condition. Throwing an exception on EOF there will make the whole function behave completely differently (and probably wrongly too).

Maybe have two sets of functions? One that throws and one that doesn't? Though that smells, too... Maybe we have to bite the bullet and refactor Jp2Image::readMetadata() (that function needs it badly anyway), although that is surely not going to be the only place.

cc @Kicer86

D4N · 2018-02-13T22:40:04Z

I have been digging around the code base for the last few days and I think FileIO should just throw exceptions when EOF is reached or when you try to read from a closed file. There are many places in the code base with constructs like this:

if (io_->open() != 0) {
    throw Error(9, io_->path(), strError());
}

or this:

if (io_->error() || io_->eof()) throw Error(14);

which shows that in most cases we throw an exception anyway.

If FileIO starts throwing exceptions it will surely break existing code, the Jp2Image parser for instance. However, I hope the benefits will outweigh the cost.

clanmills · 2018-02-17T14:53:05Z

@D4N I think your idea to throw sounds good. And perhaps io->read(count) should also throw if unable to obtain the requested bytes. Sure, this will break some existing code (eg the JPEG 2000 parser), however it's not difficult to identify and fix/refactor a few code regions.

I think we should delay this change until we are closer to releasing v0.27. This is a substantial change to the API. There are folks still back-porting fixes for v0.26 and this change could collide with back-ports. So, while I think this is a good idea, let's introduce this towards the end of the development of v0.27.

clanmills · 2021-04-13T10:18:36Z

This is a good idea. I've asked @pydera to work on BasicIo for v1.00. I'm going to close this issue to reduce the number of open issues. This will get attention for v1.00. #1538.

D4N added the enhancement feature / functionality enhancements label Nov 9, 2017

D4N mentioned this issue Feb 1, 2018

fix for crash in bigtiff (issue #208) #212

Merged

clanmills modified the milestone: v0.27 Nov 7, 2018

clanmills closed this as completed Apr 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FileIO performs no range checking #158

FileIO performs no range checking #158

D4N commented Nov 9, 2017

clanmills commented Nov 9, 2017 via email

D4N commented Jan 8, 2018

D4N commented Jan 27, 2018

D4N commented Feb 13, 2018

clanmills commented Feb 17, 2018

clanmills commented Apr 13, 2021

FileIO performs no range checking #158

FileIO performs no range checking #158

Comments

D4N commented Nov 9, 2017

clanmills commented Nov 9, 2017 via email

D4N commented Jan 8, 2018

D4N commented Jan 27, 2018

D4N commented Feb 13, 2018

clanmills commented Feb 17, 2018

clanmills commented Apr 13, 2021