-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor ScalingTransformer to Inherit from BaseNumericTransformer and Update Tests #284
Refactor ScalingTransformer to Inherit from BaseNumericTransformer and Update Tests #284
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @favouribude1, thanks for the work here. The changes to the transformer are good but there is still some more to do to refactor the tests into the new format. Please take another look at the issue #275 and an example like #268 to see what changes are needed. We should be able to delete a lot of tests here 🔥
@@ -427,7 +429,7 @@ def transform(self, X: pd.DataFrame) -> pd.DataFrame: | |||
return X | |||
|
|||
|
|||
class ScalingTransformer(BaseTransformer): | |||
class ScalingTransformer(BaseNumericTransformer): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P: thanks for the clean up here it is much nicer now!
@@ -123,7 +123,7 @@ def test_super_init_called(self, mocker): | |||
|
|||
|
|||
class TestCheckNumericColumns: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
B: we now set up our test classes to inherit a set of generic tests from other test modules. See the guidance in the issue #275 #268 for an example in this numeric module. You need to set up the test classes for init, fit and transform to inherit from a parent test class and add
@classmethod def setup_class(cls): cls.transformer_name = "ScalingTransformer"
to each so that the inherited tests can function (they look up initialised transformers and minimal dataframes from conftest.py using this)
After doing this check_numeric will be covered by tests in BaseNumericTransformerFitTests and BaseNumericTransformerTransformTests so these tests can be deleted form here once those tests have been inherited.
You can also delete any implementation tests so we are just testing functionality. There should be far fewer tests in this module when you are done!
def test_exception_raised(self): | ||
"""Test an exception is raised if non numeric columns are passed in X.""" | ||
df = d.create_df_2() | ||
def test_non_numeric_exception_raised(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: This test is already inherited from BaseNumericTransformerFitTests so I don't think it is needed here. Am I missing osmething that is not covered by the test in the base class? https://github.com/lvgig/tubular/blob/feature/refactor_ScalingTransformer_tests/tests/numeric/test_BaseNumericTransformer.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the base class tests and saw that it doesn’t cover the specific error messages and checks needed for ScalingTransformer. The base test is more general and doesn’t handle the exact way ScalingTransformer raises errors with non-numeric data.
When I removed the specific test from ScalingTransformer, it caused the test to fail. So, I kept the test in ScalingTransformer to ensure everything works as expected. If you’d like, I can look into adjusting the base tests to cover this, or we can stick with the current setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how so? as far as I can tell the handling should be the same. fit calls super.fit which calls the same _check_numeric method so seems like something is going wrong if the error is returning something different here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
had a bit more of a dig and it looks like we forgot to implement any tests for fit for LogTransformer so the inheritance of this test has not been tested before this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this case the test doesn't work because the x = initialized_transformers[self.transformer_name]
takes the default initialised transformer for ScalingTransformer and that transformer has only columns = ['a']. Since the df we want to check only has non numeric data in column b we are not throwing an error. adding b to columns in conftest.minimal_attribute_dict['ScalingTransformer] seems to fix the issue without breaking other tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this. I have now resolved it and it's working fine now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution @favouribude1! All looks good apart from one possibly redundant test I have flagged in comments.
There are also a couple of merge conflicts to resolve and you need to update the changelog.
CHANGELOG.rst
Outdated
@@ -46,6 +46,8 @@ Changed | |||
- DatetimeInfoExtractor.mappings_provided changed from a dict.keys() object to list so transformer is serialisable. `#258 <https://github.com/lvgig/tubular/pull/258>`_ | |||
- Created BaseNumericTransformer class to support test refactor of numeric file `#266 <https://github.com/lvgig/tubular/pull/266>`_ | |||
- Updated testing approach for LogTransformer `#268 <https://github.com/lvgig/tubular/pull/268>`_ | |||
- Updated `ScalingTransformer` default columns to `["a", "b"]` in `minimal_attribute_dict` to fix test issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs a line in the changelog, its just part of the below point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
otherwise happy with changes so happy to approve once we delete this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the unnecessary changelog entry
Thanks favour, happy with this now |
Summary of Refactoring
Modified the tests based on refactored code