Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add data labeler tf loader for any cnn softmax model #532

Merged
merged 16 commits into from
Jul 13, 2022

Conversation

JGSweets
Copy link
Contributor

Allows the DL to load any TF model which has a softmax output

@JGSweets JGSweets added Medium Priority Significant improvement or bug / feature reducing overall performance New Feature A feature addition not currently in the library labels Jul 12, 2022
@@ -17,237 +16,7 @@
_file_dir = os.path.dirname(os.path.abspath(__file__))

logger = dp_logging.get_child_logger(__name__)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to labeler_utils.py

@@ -0,0 +1,148 @@
import json
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new test for new labeler

@@ -0,0 +1,373 @@
import json
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new test for new model

@@ -22,7 +22,7 @@
import numpy as np
import tensorflow as tf

from dataprofiler.labelers.character_level_cnn_model import F1Score, FBetaScore
from dataprofiler.labelers.labeler_utils import F1Score, FBetaScore
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated for movement

@@ -267,3 +269,33 @@ def test_save_conf_mat(self, mock_dataframe):
self.assertDictEqual(expected_row_col_names, mock_dataframe.call_args[1])

mock_instance_df.to_csv.assert_called()


class TestTFFunctions(unittest.TestCase):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test new get layer name func

ind = labeler_utils.get_tf_layer_index_from_name(model, 'dense0')
self.assertEqual(0, ind)

def test_hide_tf_logger_warnings(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test ability for func to hide logs

@@ -0,0 +1,511 @@
import json
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new code

@taylorfturner taylorfturner enabled auto-merge (squash) July 13, 2022 13:01
Copy link
Contributor

@taylorfturner taylorfturner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM... only recommendation is run both isort and black on these files to ensure we are in alignment with what we are merging into.


# Make sure the necessary parameters are present and valid.
for param in parameters:
if param in ["default_label", "model_path", "pad_label"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if param in ["default_label", "model_path", "pad_label"]:
if param in list_of_necessary_params:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah great catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair, not usually the case, but works for this one since they are all strings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have slight concerns this makes it less clear as it doesn't list the values at the point of callout, but I'm fine switching it.

taylorfturner
taylorfturner previously approved these changes Jul 13, 2022
@taylorfturner taylorfturner merged commit d191eec into capitalone:main Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Medium Priority Significant improvement or bug / feature reducing overall performance New Feature A feature addition not currently in the library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants