Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

getroot() method of xml.etree.ElementTree return type should be nullable #7770

Closed
andrea-guarino-sonarsource opened this issue May 2, 2022 · 3 comments · Fixed by #8069
Closed

Comments

@andrea-guarino-sonarsource

Hello,
I'm one of the maintainers of Sonar Python analyzer and we use typeshed for our type inference engine. (btw thanks a lot for maintaining typeshed!)

While investigating a false positive reported by a user, I noticed that that getroot() method of ElementTreeclass has return type annotation -> Element (see here).

I think getroot() return type annotation should beElementTree | None instead, because it can returns None when root element is not provided:

import xml.etree.ElementTree as ET

tree = ET.ElementTree()
print(tree.getroot()) # prints None

In fact, e.g. 3.10 implementation of ElementTree is defined as follows:

class ElementTree:
    def __init__(self, element=None, file=None):
        # assert element is None or iselement(element)
        self._root = element # first node
        if file:
            self.parse(file)

    def getroot(self):
        """Return root element of this tree."""
        return self._root

I'm happy to open a PR if you think the reported issue is valid.

Thanks,
Andrea

@srittau
Copy link
Collaborator

srittau commented May 3, 2022

The problem with adding None to the return type is that users that know that a root exists (for example when parsing an ElementTree) have to do an extra check, something we like to avoid in typeshed. E.g.:

import xml.etree.ElementTree as ET

tree = ET.parse()
root = tree.getroot()  # "root" is never None

Our usual solution would be to return Element | Any to at least somewhat reflect the fact that getroot() can not only return Element, but something else as well.

Obligatory comment that python/typing#566 would help with that as well.

@linzouzou
Copy link

if "return Element | Any" were used, it's kind of overlapped since "Any" includes "Element".

@Akuli
Copy link
Collaborator

Akuli commented Dec 1, 2023

Indeed. You would expect Element | Any to be same as just Any, but it isn't.

For example, consider this: (mostly copy/paste from xml module docs)

import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
print(root.tagg)  # this line is a typo
  • Any means "please do not complain" to type checkers. If root has type Any, you will no error for this.
  • Element means "will always be an Element", which is wrong, and would cause type checkers to emit errors for code like if root is None.
  • Element | None means "you must check for None", which is correct but can get annoying. If I understand correctly (I don't use xml often), it is common to do things like ET.parse("file.xml").getroot().iter("whatever").
  • Element | Any means "must be prepared to handle an Element". You will get an error for root.tagg, because it is not valid when root is an Element. But type heckers are happy with if root is None checks, because we're saying it can also be something else than an Element.

In typeshed we unofficially call this "the Any trick". We tend to use it whenever something can be None, but requiring users to check for None would be more painful than helpful.

Another example of the Any trick is re stubs, where various functions return a match object or None. Regexes are often used so that the regex matches any string, or the string has already been validated so that the regex will match. But type checkers don't know this, so complaining about the None would get very annoying.

This was referenced Dec 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants