[Issue 305] add new json parser #366

Merged
merged 22 commits into spdx:refactor-python-tools from issue-305_json_parser on Dec 28, 2022

Conversation

meretp
Collaborator

@meretp meretp commented Dec 9, 2022

fixes #305

Signed-off-by: Meret Behrens [email protected]

@meretp meretp force-pushed the issue-305_json_parser branch 5 times, most recently from 45d6c65 to c675847 Compare December 14, 2022 07:22
@meretp meretp marked this pull request as ready for review December 14, 2022 07:23
@meretp meretp marked this pull request as draft December 14, 2022 07:24
@meretp meretp marked this pull request as ready for review December 14, 2022 07:54
@meretp meretp changed the title [Issue 305] refactor json parser [Issue 305] add new json parser Dec 14, 2022
Collaborator

@armintaenzertng armintaenzertng left a comment

Thanks for this monumental effort :)
I have some initial remarks, mostly concerned with (hopefully) improving readability.

src/parser/json/actor_parser.py Outdated Show resolved Hide resolved
src/parser/logger.py Show resolved Hide resolved
src/parser/logger.py Outdated Show resolved Hide resolved
src/parser/json/json_parser.py Outdated Show resolved Hide resolved
src/parser/json/json_parser.py Outdated Show resolved Hide resolved
src/parser/json/dict_parsing_functions.py Outdated Show resolved Hide resolved
src/parser/json/dict_parsing_functions.py Outdated Show resolved Hide resolved
src/parser/json/dict_parsing_functions.py Outdated Show resolved Hide resolved
src/parser/json/dict_parsing_functions.py Outdated Show resolved Hide resolved
src/parser/json/dict_parsing_functions.py Outdated Show resolved Hide resolved
Collaborator

@armintaenzertng armintaenzertng left a comment

Looks good! :)
Most comments concern the naming of things but there are a few that go a little deeper.
I haven't looked at the tests yet, but I feel like there aren't enough of them.

src/model/typing/constructor_type_errors.py Outdated Show resolved Hide resolved
src/parser/error.py Outdated Show resolved Hide resolved

elif person_match:
    name: str = person_match.group(1).strip()
    email: Optional[str] = ActorParser.get_email_or_none(person_match)
Collaborator

Suggested change
- email: Optional[str] = ActorParser.get_email_or_none(person_match)
+ email: Optional[str] = person_match.group(4).strip() or None

Collaborator

This way, we can discard the get_email_or_none method

Collaborator Author

This wouldn't work. If person_match.group(4) is None (e.g. if the brackets for the email are missing entirely), calling .strip() on it raises an AttributeError. I also thought this could be handled in one line, but I didn't find a good way other than using the method.
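
To illustrate the objection, here is a minimal sketch. The pattern `PERSON_PATTERN` and the function `email_one_liner` are assumptions made up for this example, since the project's actual regex is not shown in this thread; only the group numbering (1 for the name, 4 for the email) follows the snippets above. It shows why calling `.strip()` on `group(4)` fails when the brackets are missing, and one possible way to guard against that in a single expression.

```python
import re
from typing import Optional

# Hypothetical pattern for illustration only: group 1 is the name,
# group 4 is the text inside the optional "(...)" email brackets.
PERSON_PATTERN = re.compile(r"^Person:\s*(([^(])+)(\((.*)\))?$")


def email_one_liner(match: re.Match) -> Optional[str]:
    # The suggested one-liner: raises AttributeError when group 4 is None.
    return match.group(4).strip() or None


def get_email_or_none(match: re.Match) -> Optional[str]:
    # Guarded variant: a missing group (None) and an empty "()" both become None.
    return (match.group(4) or "").strip() or None


with_email = PERSON_PATTERN.match("Person: Jane Doe (jane@example.com)")
without_brackets = PERSON_PATTERN.match("Person: Jane Doe")

print(get_email_or_none(with_email))
print(get_email_or_none(without_brackets))
try:
    email_one_liner(without_brackets)
except AttributeError as error:
    print(f"one-liner failed: {error}")
```

An optional regex group that did not take part in the match is reported as None by `Match.group`, so any one-line version has to substitute a string before calling `.strip()`.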

src/parser/json/annotation_parser.py Outdated Show resolved Hide resolved
src/parser/json/annotation_parser.py Outdated Show resolved Hide resolved
src/parser/logger.py Outdated Show resolved Hide resolved
src/parser/json/annotation_parser.py Outdated Show resolved Hide resolved
src/parser/json/creation_info_parser.py Outdated Show resolved Hide resolved
src/parser/json/package_parser.py Outdated Show resolved Hide resolved
src/parser/json/dict_parsing_functions.py Outdated Show resolved Hide resolved
Collaborator

@armintaenzertng armintaenzertng left a comment

We'll probably need more tests to catch invalid inputs (open an issue for this) but the current state will do for now, I think.

tests/parser/test_actor_parser.py Outdated Show resolved Hide resolved
tests/parser/test_actor_parser.py Outdated Show resolved Hide resolved
tests/parser/test_actor_parser.py Outdated Show resolved Hide resolved
tests/parser/test_annotation_parser.py Show resolved Hide resolved
tests/parser/test_annotation_parser.py Outdated Show resolved Hide resolved
tests/parser/test_file_parser.py Outdated Show resolved Hide resolved
tests/parser/test_json_parser.py Outdated Show resolved Hide resolved
tests/parser/test_json_parser.py Outdated Show resolved Hide resolved
tests/parser/test_package_parser.py Outdated Show resolved Hide resolved
tests/parser/test_relationship_parser.py Outdated Show resolved Hide resolved
Collaborator

@armintaenzertng armintaenzertng left a comment

Nice going, I like the breadth of the tests much better now :)
Just a few remarks left

src/parser/json/dict_parsing_functions.py Outdated Show resolved Hide resolved
src/parser/json/annotation_parser.py Outdated Show resolved Hide resolved
src/parser/json/dict_parsing_functions.py Outdated Show resolved Hide resolved
Comment on lines 64 to 67
def parse_relationships(self, relationship_dicts: List[Dict]) -> List[Relationship]:
    logger = Logger()
    relationship_list = []
    for relationship_dict in relationship_dicts:
        relationship_list = append_parsed_field_or_log_error(logger=logger, list_to_append_to=relationship_list,
                                                             field=relationship_dict, method_to_parse=self.parse_relationship)
    raise_parsing_error_if_logger_has_messages(logger)
    return relationship_list
Collaborator

The new double-lambda expressions may not be the prettiest (we might even consider helper methods that return the required functions there), but we lose quite a lot of weight in return, so I think it is worth it.
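
For readers unfamiliar with the pattern in the quoted `parse_relationships` loop, the following is a rough sketch of how such error-collecting helpers could work. The bodies of `append_parsed_field_or_log_error` and `raise_parsing_error_if_logger_has_messages` are not shown in this thread, so everything below is an assumption: `ParsingError`, this `Logger`, and the toy `parse_relationship` are stand-ins, and only the helper names and keyword arguments are taken from the snippet above.

```python
from typing import Any, Callable, Dict, List


class ParsingError(Exception):
    """Stand-in for the project's parsing error; carries a list of messages."""

    def __init__(self, messages: List[str]):
        super().__init__(messages)
        self.messages = messages


class Logger:
    """Stand-in for the project's Logger: it only collects messages."""

    def __init__(self) -> None:
        self.messages: List[str] = []


def append_parsed_field_or_log_error(logger: Logger, list_to_append_to: List[Any], field: Any,
                                     method_to_parse: Callable[[Any], Any]) -> List[Any]:
    # Parse one element; on failure, record the messages instead of aborting.
    try:
        list_to_append_to.append(method_to_parse(field))
    except ParsingError as err:
        logger.messages.extend(err.messages)
    return list_to_append_to


def raise_parsing_error_if_logger_has_messages(logger: Logger) -> None:
    # Report every collected problem at once.
    if logger.messages:
        raise ParsingError(logger.messages)


def parse_relationship(relationship_dict: Dict[str, str]) -> str:
    # Toy parser (assumption): only checks that "relationshipType" is present.
    if "relationshipType" not in relationship_dict:
        raise ParsingError([f"missing relationshipType in {relationship_dict}"])
    return relationship_dict["relationshipType"]


def parse_relationships(relationship_dicts: List[Dict[str, str]]) -> List[str]:
    logger = Logger()
    relationship_list: List[str] = []
    for relationship_dict in relationship_dicts:
        relationship_list = append_parsed_field_or_log_error(
            logger=logger, list_to_append_to=relationship_list,
            field=relationship_dict, method_to_parse=parse_relationship)
    raise_parsing_error_if_logger_has_messages(logger)
    return relationship_list
```

The design choice in this sketch is that a single malformed entry does not hide the others: all elements are attempted first, and one error listing every problem is raised afterwards.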

src/parser/json/relationship_parser.py Outdated Show resolved Hide resolved
src/parser/json/relationship_parser.py Outdated Show resolved Hide resolved
tests/parser/test_creation_info_parser.py Outdated Show resolved Hide resolved
tests/parser/test_package_parser.py Show resolved Hide resolved
tests/parser/test_relationship_parser.py Show resolved Hide resolved
tests/parser/test_relationship_parser.py Outdated Show resolved Hide resolved
Collaborator

@armintaenzertng armintaenzertng left a comment

Thanks for the effort! :)
Only two minor comments left from my side



def parse_field_or_no_assertion_or_none(field: Optional[str], method_for_field: Callable = lambda x: x) -> Union[
    SpdxNoAssertion, SpdxNone, Any]:
Collaborator

Writing Any in a Union does not make sense (as it just overwrites the other entries)

Suggested change
- SpdxNoAssertion, SpdxNone, Any]:
+ SpdxNoAssertion, SpdxNone, str, None]:

see also the next method below

Collaborator Author

@meretp meretp Dec 28, 2022

I see, but depending on the method_for_field, the return value could also be a LicenseExpressions, an Actor, or anything else. So I should probably just write Any as the return type?
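
One possible way to reconcile both points, offered as a sketch and not as what the PR ended up doing, is to make the function generic with a `TypeVar`, so the return type follows whatever `method_for_field` returns. The classes below are stand-ins for the project's `SpdxNoAssertion` and `SpdxNone`, the function body is an assumption based on the SPDX literals "NOASSERTION" and "NONE", and the `lambda x: x` default and the `Optional` on `field` are dropped to keep the example small.

```python
from typing import Callable, TypeVar, Union


class SpdxNoAssertion:
    """Stand-in for the project's class of the same name."""


class SpdxNone:
    """Stand-in for the project's class of the same name."""


T = TypeVar("T")


def parse_field_or_no_assertion_or_none(
        field: str, method_for_field: Callable[[str], T]) -> Union[SpdxNoAssertion, SpdxNone, T]:
    # The two special SPDX values are mapped to their marker objects.
    if field == "NOASSERTION":
        return SpdxNoAssertion()
    if field == "NONE":
        return SpdxNone()
    # Anything else is handed to the caller-supplied parser, whose
    # return type T becomes part of this function's return type.
    return method_for_field(field)


print(parse_field_or_no_assertion_or_none("42", int))
print(type(parse_field_or_no_assertion_or_none("NONE", int)).__name__)
```

With this signature, a static type checker should be able to infer `Union[SpdxNoAssertion, SpdxNone, int]` for the first call, without resorting to `Any`.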

Comment on lines 94 to 99
supplier: Optional[Union[Actor, SpdxNoAssertion]] = parse_field_or_log_error(logger,
package_dict.get("supplier"),
lambda
x: parse_field_or_no_assertion(
x,
self.actor_parser.parse_actor))
Collaborator

this one escaped your reformatting ;)

Collaborator Author

😱

…ueError due to wrong format

Signed-off-by: Meret Behrens <[email protected]>
…use assertCountEqual to compare lists

Signed-off-by: Meret Behrens <[email protected]>
@meretp meretp merged commit f3bca2d into spdx:refactor-python-tools Dec 28, 2022
@meretp meretp deleted the issue-305_json_parser branch December 28, 2022 14:42
This was referenced Dec 28, 2022
2 participants