Return types of XMLPullParser.read_events and iterparse from xml.etree.ElementTree #8039

Closed
Prometheus3375 opened this issue Jun 8, 2022 · 3 comments · Fixed by #8076

@Prometheus3375
Contributor

Prometheus3375 commented Jun 8, 2022

typeshed currently annotates the return type of iterparse as Iterator[tuple[str, Any]] and that of XMLPullParser.read_events as Iterator[tuple[str, Element]].

Internally, iterparse uses XMLPullParser and then yields from XMLPullParser.read_events. But iterparse returns a special iterator object which has a nullable root attribute.
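To make the point about the iterator object concrete, here is a minimal sketch. The helper name `parse_and_get_root` is hypothetical, and because `root` is undocumented the code reads it defensively with `getattr` rather than assuming it is always present or populated.

```python
import io
from xml.etree.ElementTree import iterparse


def parse_and_get_root(xml_bytes):
    """Consume iterparse() and then peek at the undocumented `root` attribute."""
    it = iterparse(io.BytesIO(xml_bytes))  # default events: only "end"
    tags = [elem.tag for _event, elem in it]
    # `root` is not part of the documented API, so fall back to None
    # if an implementation does not provide it.
    root = getattr(it, "root", None)
    return tags, root


tags, root = parse_and_get_root(b"<a><b/></a>")
print(tags, None if root is None else root.tag)
```

A plain Iterator[...] annotation hides `root` from type checkers, which is the trade-off discussed here.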

XMLPullParser.read_events itself yields tuples whose second item depends on the event. From the docs:

start, end: the current Element.
comment, pi: the current comment / processing instruction
start-ns: a tuple (prefix, uri) naming the declared namespace mapping.
end-ns: None (this may change in a future version)
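The list above can be checked with a short script. This is only a sketch: the sample document and the helper name `payload_kinds` are made up for illustration, and `comment` / `pi` events are left out to keep it small.

```python
from xml.etree.ElementTree import Element, XMLPullParser

# Hypothetical sample document with one namespace declaration.
SAMPLE = '<root xmlns:x="http://example.com/x"><x:child/></root>'


def payload_kinds(xml_text):
    """Map each event name to the kind of object it carried."""
    parser = XMLPullParser(events=("start", "end", "start-ns", "end-ns"))
    parser.feed(xml_text)
    parser.close()  # flush anything still buffered before reading events
    kinds = {}
    for event, payload in parser.read_events():
        if isinstance(payload, Element):
            kinds[event] = "Element"
        elif payload is None:
            kinds[event] = "None"
        else:
            kinds[event] = type(payload).__name__
    return kinds


print(payload_kinds(SAMPLE))
```

So Iterator[tuple[str, Element]] is too narrow as soon as namespace events are requested.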

For now, the root attribute of the iterparse result is not documented (issue), so I suggest ignoring it and simply using Iterator[T] as the return type for both XMLPullParser.read_events and iterparse.

I propose several variants for T:

tuple[str, Any]
tuple[str, Element | tuple[str, str] | None]
tuple[Literal['start', 'end', 'comment', 'pi'], Element] | \
    tuple[Literal['start-ns'], tuple[str, str]] | \
    tuple[Literal['end-ns'], None]

I am not sure which one is better for type checkers.
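For comparison, the three variants can be spelled as runtime-valid aliases. This is a sketch only: the alias names and the `fits_tagged` / `sample_events` helpers are hypothetical, and `typing.Union` / `typing.Tuple` are used instead of the `|` syntax so the snippet also runs on older Python versions.

```python
from typing import Any, List, Literal, Tuple, Union
from xml.etree.ElementTree import Element, XMLPullParser

# Variant 1: the payload is opaque to the type checker.
EventAny = Tuple[str, Any]

# Variant 2: a single tuple type whose payload is a union.
EventPayloadUnion = Tuple[str, Union[Element, Tuple[str, str], None]]

# Variant 3: a union of tuples, keyed by the literal event name.
EventTagged = Union[
    Tuple[Literal["start", "end", "comment", "pi"], Element],
    Tuple[Literal["start-ns"], Tuple[str, str]],
    Tuple[Literal["end-ns"], None],
]


def fits_tagged(event: EventAny) -> bool:
    """Runtime check that an event has one of the variant 3 shapes."""
    name, payload = event
    if name in ("start", "end", "comment", "pi"):
        return isinstance(payload, Element)
    if name == "start-ns":
        return isinstance(payload, tuple) and len(payload) == 2
    if name == "end-ns":
        return payload is None
    return False


def sample_events() -> List[EventAny]:
    """Parse a small made-up document and return every event it produced."""
    parser = XMLPullParser(events=("start", "end", "start-ns", "end-ns"))
    parser.feed('<r xmlns:x="urn:x"><x:c/></r>')
    parser.close()
    return list(parser.read_events())


print(all(fits_tagged(e) for e in sample_events()))
```

Variant 3 is the most precise description, but whether a given type checker can narrow the payload from the event name is a separate question.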

@AlexWaygood
Member

In general, we prefer to avoid union return types in typeshed: see python/mypy#1693 for some detailed discussion. They end up being annoying for end users due to the number of checks they have to perform on the returned object in order to precisely narrow the type. So -- while I'm far from an expert on the xml modules -- I would guess that Iterator[tuple[str, Any]] would probably be the return type least likely to cause false-positive errors.
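A rough sketch of the narrowing burden, with hypothetical helper names. Both functions do the same thing at runtime; the difference is that with a union payload a type checker would be expected to reject `payload.tag` unless the caller narrows first, even when the caller only asked for `end` events.

```python
from typing import Any, Iterator, List, Tuple, Union
from xml.etree.ElementTree import Element, XMLPullParser

UnionEvent = Tuple[str, Union[Element, Tuple[str, str], None]]
AnyEvent = Tuple[str, Any]


def end_events(xml_text: str) -> Iterator[Tuple[str, Any]]:
    """Yield only "end" events, so every payload is an Element at runtime."""
    parser = XMLPullParser(events=("end",))
    parser.feed(xml_text)
    parser.close()
    return parser.read_events()


def tags_with_union(events: Iterator[UnionEvent]) -> List[str]:
    tags = []
    for _name, payload in events:
        # The isinstance check is needed to satisfy the type checker,
        # because the annotation allows a tuple or None here.
        if isinstance(payload, Element):
            tags.append(payload.tag)
    return tags


def tags_with_any(events: Iterator[AnyEvent]) -> List[str]:
    # With Any, attribute access is accepted without narrowing.
    return [payload.tag for _name, payload in events]


print(tags_with_union(end_events("<a><b/></a>")))
print(tags_with_any(end_events("<a><b/></a>")))
```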

@Prometheus3375
Contributor Author

Great, in that case only the return type of XMLPullParser.read_events needs to change. Should I create a PR, or is it better to wait for others to join the discussion?

@AlexWaygood
Member

Feel free to create a PR! :)
