Add checks for unsupported characters within image path and output path #4390

Pfleiderer-Adrian · 2024-01-10T12:27:46Z

Description

I encountered an issue while attempting to load images using the ITK / SimpleITK framework in my Python script. The traceback indicates that the problem arises when attempting to create an IO object for reading an image file that contains Umlauts (ä, ü, ö) in its path. See: #4388

Providing a more descriptive error message indicating that the path cannot be read due to special characters, such as Umlauts, would be helpful for users to understand and address the issue.

Changes

I have added a simple filepath check for illegal characters in the _NewImageReader and imwrite routines. Additionally, for the imwrite routine, a check was added to verify if the output directory exists.
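The output-directory check described above could look roughly like the following. This is a minimal sketch, not the code from the PR: the helper name `output_dir_exists` is hypothetical, and it assumes `filename` is a plain `str` path.

```python
import os


def output_dir_exists(filename: str) -> bool:
    """Return True if the directory that would contain `filename` exists.

    Hypothetical helper for illustration; not part of ITK.
    """
    # abspath() resolves a bare file name against the current working
    # directory, so "out.png" is checked against os.getcwd().
    parent = os.path.dirname(os.path.abspath(filename))
    return os.path.isdir(parent)
```

A writer routine could call such a helper only when building its error message, so that the normal write path is unaffected.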

PR Checklist

No API changes were made (or the changes have been approved)
No major design changes were made (or the changes have been approved)

github-actions

Thank you for contributing a pull request! 🙏

Welcome to the ITK community! 🤗👋☀️

We are glad you are here and appreciate your contribution. Please keep in mind our community participation guidelines. 📜
More support and guidance on the contribution process can be found in our contributing guide. 📖

This is an automatic message. Allow for time for the ITK community to be able to read the pull request and comment on it.

dzenanz

Ah, this check makes sense in the Python interface. Mostly looks good.

dzenanz · 2024-01-10T16:02:52Z

Wrapping/Generators/Python/itk/support/extras.py

+        try:
+            filename.encode('ascii')
+        except UnicodeEncodeError:
+            msg += "\nThe output path contains not supported special characters. \n" + f"Filename = {filename}"


not supported -> unsupported

Perhaps give user a hint about setting UTF-8 as the default code page?

dzenanz · 2024-01-10T16:03:33Z

Wrapping/Generators/Python/itk/support/template_class.py

+                try:
+                    inputFileName.encode('ascii')
+                except UnicodeEncodeError:
+                    msg += "\nThe image path contains not supported special characters. \n" + f"Filename = {inputFileName}"


Preferably give a hint about UTF8 code page.
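One way the suggested hint might be produced is sketched below. The function name `utf8_hint`, its parameters, and the wording of the hint are assumptions for illustration; only `sys.platform` and `locale.getpreferredencoding` are real standard-library calls.

```python
import locale
import sys


def utf8_hint(platform=None, encoding=None) -> str:
    """Return a hint about UTF-8 when it is likely to be relevant.

    Hypothetical helper: both arguments default to the values of the
    running interpreter and exist mainly so the logic can be tested.
    """
    platform = sys.platform if platform is None else platform
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    # Normalize spellings such as "UTF-8", "utf_8" and "utf8".
    normalized = encoding.lower().replace("-", "").replace("_", "")
    if platform == "win32" and normalized not in ("utf8", "cp65001"):
        return (
            f"\nThe active encoding is {encoding}. "
            "Setting UTF-8 as the default code page may allow this path to be used."
        )
    return ""
```

The returned string is empty on platforms where the hint would not apply, so it can be appended to `msg` unconditionally.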

blowekamp · 2024-01-10T17:56:23Z

There may be something in the SWIG layer that can be done with the conversion. There is some information here:
https://www.swig.org/Doc3.0/Python.html#Python_nn77

That needs to be gone through.

jhlegarreta · 2024-01-10T20:42:36Z

@Pfleiderer-Adrian thanks for having a look at this. A few comments:

Commit 7304267 does not address a bug; it is a STYLE change.
Please, add a meaningful commit message body in line with ITK commit message practices.

dzenanz · 2024-01-10T21:45:55Z

The second commit should be squashed into the first one.

blowekamp

A couple suggestions.

blowekamp · 2024-01-11T14:08:55Z

Wrapping/Generators/Python/itk/support/extras.py

+        msg += "\nThe output dir doesn't exist. \n" + f"Filename = {filename}"
+        raise RuntimeError(
+            f"Could not create IO object for writing file {filename}" + msg
+        )


This would be a useful check to occur in the C++ code.

I agree, but I assume that performing an equivalent check in C++ would be a lot harder. If the author has the will to explore this, it would be good.

blowekamp · 2024-01-11T14:09:18Z

Wrapping/Generators/Python/itk/support/extras.py

+        try:
+            filename.encode('ascii')
+        except UnicodeEncodeError:
+            msg += "\nThe output path contains not supported special characters. \n" + f"Filename = {filename}"


It may be useful to place this check into a function.
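A shared function along these lines could serve both the reader and the writer. This is only a sketch: the name `non_ascii_note` and the `role` parameter are hypothetical, and the message already uses the "unsupported" wording suggested in review.

```python
def non_ascii_note(path: str, role: str) -> str:
    """Return an explanatory note if `path` contains non-ASCII characters.

    `role` is a short label such as "image" or "output" that is inserted
    into the message. Returns an empty string for pure-ASCII paths.
    """
    try:
        path.encode("ascii")
    except UnicodeEncodeError:
        return (
            f"\nThe {role} path contains unsupported special characters."
            f"\nFilename = {path}"
        )
    return ""
```

With such a helper, the two call sites would reduce to something like `msg += non_ascii_note(filename, "output")` and `msg += non_ascii_note(inputFileName, "image")`.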

Pfleiderer-Adrian · 2024-01-12T09:22:08Z

Thank you for your feedback! I'll post an update this weekend. :)

N-Dekker · 2024-01-12T10:18:38Z

Thank you for diving into this issue, @Pfleiderer-Adrian. Do I understand correctly that this issue (#4388) is only a problem for Windows users, not for Linux users? (Specifically, only those Windows users who do not have UTF-8 support enabled and have selected a locale that does not support the filename?) I'm asking because it feels too strict to accept only ASCII characters from now on, for all users.

So is it really the intention to reject any non-ASCII character in file names, for any user? Aren't we then throwing the baby out with the bath water?

Would it be an idea to only do those checks after trying to use imageIO, and only when imageIO has really failed to do its job?
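The idea of diagnosing only after a failure can be sketched as follows. Here `reader` is a hypothetical stand-in for whatever call actually performs the I/O, and it is assumed to raise `RuntimeError` on failure; neither the name `read_with_diagnosis` nor this structure is taken from ITK.

```python
def read_with_diagnosis(path: str, reader):
    """Call `reader(path)` and add a path hint only if it fails.

    Non-ASCII paths that the underlying I/O can handle are accepted
    untouched; the character check runs only on the error path.
    """
    try:
        return reader(path)
    except RuntimeError as err:
        if not path.isascii():
            raise RuntimeError(
                f"{err}\nThe path contains non-ASCII characters, "
                f"which may not be supported on this system.\nFilename = {path}"
            ) from err
        # ASCII-only path: the failure has some other cause, re-raise as is.
        raise
```

Under this arrangement, users on systems where such paths already work would see no change in behavior.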

github-actions bot added the area:Python wrapping (Python bindings for a class) label Jan 10, 2024

Pfleiderer-Adrian mentioned this pull request Jan 10, 2024

Unable to load images with Umlauts in filepath (ITK / SimpleITK) #4388

Open

github-actions bot reviewed Jan 10, 2024

Adrian pfleiderer added 2 commits January 10, 2024 13:53

BUG: check if input and output path contains unsupported special characters

3cd6261

BUG: delete empty line

7304267

Pfleiderer-Adrian force-pushed the master branch from d1bdbce to 7304267 Compare January 10, 2024 12:56

dzenanz reviewed Jan 10, 2024

dzenanz requested review from thewtex and blowekamp January 10, 2024 16:04

blowekamp reviewed Jan 11, 2024
