Graph loading bug #603
@@ -102,25 +104,19 @@ def _find_target_string_in_column(self, column_names, keyword_list):
     return target_index

 @classmethod
-def csv_column_names(cls, file_path, options):
+def csv_column_names(cls, file_path, header, delimiter, encoding="utf-8"):
Updated the signature for reusability in _network.
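As a rough illustration of why explicit `header` and `delimiter` arguments are easier to reuse than an `options` dict, here is a minimal standalone sketch. It is not the project's implementation: the body, the helper `_row_at`, and the handling of text buffers are assumptions made for the example; only the parameter names come from the diff above.

```python
import csv
import os


def _row_at(lines, index, delimiter):
    """Hypothetical helper: return the CSV row at position `index`, or []."""
    for i, row in enumerate(csv.reader(lines, delimiter=delimiter)):
        if i == index:
            return row
    return []


def csv_column_names(file_or_buf, header, delimiter, encoding="utf-8"):
    """Illustrative only: read the column names found on row `header`."""
    if header is None:
        # No header row means there are no column names to report.
        return []
    if isinstance(file_or_buf, (str, os.PathLike)):
        with open(file_or_buf, encoding=encoding, newline="") as handle:
            return _row_at(handle, header, delimiter)
    # Assumed: anything else is a text buffer, rewound so earlier reads
    # do not affect the result.
    file_or_buf.seek(0)
    return _row_at(file_or_buf, header, delimiter)
```

Because the caller passes plain values, other code can call the function without first building an options dict.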
data_as_str = data_utils.load_as_str_from_file(
    self.input_file_path, self.file_encoding
)
self._header = CSVData._guess_header_row(
    data_as_str, self._delimiter, self._quotechar
)
This section is what fixes the bug. If graph data was lazily loaded, or loaded without the options from is_match, these values (e.g. the header row) were never determined once the data was passed into the system. This fixes it.
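A minimal sketch of the pattern this comment describes, assuming a toy reader: any option not supplied up front is resolved at load time, so a lazily loaded object still ends up with a header. `LazyReader`, its option keys, and the header heuristic are hypothetical stand-ins, not DataProfiler's `GraphData` or `CSVData._guess_header_row`.

```python
class LazyReader:
    """Hypothetical lazily loaded reader (illustration only)."""

    def __init__(self, text, options=None):
        options = options or {}
        self._text = text
        self._delimiter = options.get("delimiter")    # None means "not supplied"
        self._header = options.get("header", "auto")  # "auto" means "guess on load"
        self._data = None                             # nothing is parsed yet

    @staticmethod
    def _is_number(cell):
        try:
            float(cell)
        except ValueError:
            return False
        return True

    @classmethod
    def _guess_header_row(cls, text, delimiter):
        # Toy heuristic: row 0 is a header if none of its cells are numeric.
        lines = text.splitlines()
        if not lines:
            return None
        first_row = lines[0].split(delimiter)
        return None if any(cls._is_number(c) for c in first_row) else 0

    @property
    def data(self):
        # Parsing is deferred until the first access.
        if self._data is None:
            self._data = self._load()
        return self._data

    def _load(self):
        # Resolve missing options here, so lazy loading and loading
        # without options both end up fully configured.
        if self._delimiter is None:
            self._delimiter = ","
        if self._header == "auto":
            self._header = self._guess_header_row(self._text, self._delimiter)
        rows = [line.split(self._delimiter) for line in self._text.splitlines()]
        first_data_row = 0 if self._header is None else self._header + 1
        return rows[first_data_row:]
```

The design point is that option resolution lives in the load path rather than only in the code that matched the file type, so it cannot be skipped.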
@classmethod
def setUp(cls):
    for buffer in cls.buffer_list:
        buffer["path"].seek(0)
Need to seek(0) to reset the data buffers after use, between tests.
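To show why the rewind matters, here is a small self-contained sketch, assuming an in-memory buffer shared across test methods. The class name, buffer contents, and the use of a plain instance-level `setUp` are illustrative choices, not the project's test suite.

```python
import io
import unittest


class BufferReuseTest(unittest.TestCase):
    # A shared buffer, shaped like the entries in the PR's buffer_list.
    buffer_list = [{"path": io.StringIO("node_src,node_dst\n1,2\n")}]

    def setUp(self):
        # Rewind every shared buffer before each test; without this, the
        # second test to run would read from the end and get an empty string.
        for buffer in self.buffer_list:
            buffer["path"].seek(0)

    def test_first_read(self):
        content = self.buffer_list[0]["path"].read()
        self.assertTrue(content.startswith("node_src"))

    def test_second_read(self):
        content = self.buffer_list[0]["path"].read()
        self.assertTrue(content.startswith("node_src"))
```

Both tests read the same buffer to the end, so whichever runs second only passes because `setUp` resets the position first.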
@@ -147,11 +152,17 @@ def test_csv_column_names(self):
     "open_date_src",
     "open_date_dst",
 ]
-for input_file in self.input_file_names_pos:
+for input_file in self.file_or_buf_list:
The tests should be using all of the data.
 self.assertEqual(
-    GraphData.csv_column_names(input_file["path"], input_file["options"]),
+    GraphData.csv_column_names(
Updates the test to use the new call signature.
data = GraphData(input_file["path"])
self.assertEqual(type(data), GraphData)
self.assertIsNotNone(data.data)
Ensures the data can be loaded as expected.
I don’t mind keeping both, but the new one is better at validating.
Pre-commit too.
data_readers/graph_data.py L250-L252 is actually failing here when I run the notebook @JGSweets
@JGSweets, sort of a false alarm. Based on what I'm seeing in the NetworkX documentation, there isn't an attribute that returns the graph datasets, so not sure.
Allows lazy loading to work for the graph data class as it does for the other data classes, without causing bugs.