Added handling of conversion of names #34

jonwarghed · 2019-06-14T14:54:52Z

Handles conversion of names when given a CSV with incorrect column names.

closes issue #14
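
For readers skimming the thread, here is a minimal sketch of the kind of name conversion this PR describes. The code comment in the diff says `bq load` rejects column names containing characters such as #, / or -, and names longer than 128 characters. The helper name `sanitize_name`, the constant `MAX_NAME_LENGTH`, and the choice of replacing each disallowed character with an underscore are assumptions made for illustration, not necessarily what the merged code does.

```python
import re

# Length limit quoted in the PR's code comment (assumed to apply per column name).
MAX_NAME_LENGTH = 128


def sanitize_name(name):
    """Hypothetical helper: make a CSV column name acceptable to `bq load`.

    Replaces every character that is not a letter, digit, or underscore
    with an underscore, then truncates the result to MAX_NAME_LENGTH.
    """
    cleaned = re.sub(r'[^a-zA-Z0-9_]', '_', name)
    return cleaned[:MAX_NAME_LENGTH]


if __name__ == '__main__':
    for raw in ['order#id', 'a/b-c', 'already_ok']:
        print(raw, '->', sanitize_name(raw))
```

Replacing characters rather than dropping them keeps the sanitized name the same length as the original, which makes it easier to map a schema field back to its CSV header.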

bxparks · 2019-06-15T18:06:52Z

bigquery_schema_generator/generate_schema.py

+        # will not be accepted by `bq load` when it contains illegal characters.
+        # Characters such as #, / or -. Neither will it be accepted if the column name
+        # in the schema is larger than 128 characters. 
+        self.sanitize_names = sanitize_names


Can you remove the trailing white space here?

bxparks

Excellent PR, I appreciate that you added unit tests. Just a bunch of small nits below.

bxparks · 2019-06-15T18:07:16Z

bigquery_schema_generator/generate_schema.py

+        # Characters such as #, / or -. Neither will it be accepted if the column name
+        # in the schema is larger than 128 characters. 
+        self.sanitize_names = sanitize_names
+


Here too, remove the trailing whitespace.

bxparks · 2019-06-15T18:08:08Z

bigquery_schema_generator/generate_schema.py

            elif key == 'type' and value in ['QINTEGER', 'QFLOAT', 'QBOOLEAN']:
                new_value = value[1:]
            elif key == 'mode':
                if infer_mode and value == 'NULLABLE' and filled:
                    new_value = 'REQUIRED'
                else:
                    new_value = value
+            elif key == 'name' and sanitize_names:
+                print(value)


Remove the debugging print() statement; it will corrupt the schema output.
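
To illustrate the concern: if a tool writes its schema as JSON to stdout, any stray `print()` lands in the same stream and breaks the JSON for whatever consumes it. One common alternative, sketched here under the assumption that the schema is emitted on stdout, is to send diagnostics to stderr through `logging`. The function name `render_schema` is hypothetical and not part of this project.

```python
import json
import logging
import sys

# Send diagnostics to stderr so stdout carries only the schema.
logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)


def render_schema(schema):
    """Hypothetical helper: return the schema as a JSON string.

    Debug messages go through logging (stderr), never into the returned text.
    """
    for field in schema:
        logging.debug("processing field name: %s", field["name"])
    return json.dumps(schema, indent=2)


if __name__ == '__main__':
    example = [{"name": "order_id", "type": "INTEGER", "mode": "NULLABLE"}]
    # Only the JSON reaches stdout; the debug lines appear on stderr.
    print(render_schema(example))
```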

bxparks · 2019-06-15T18:08:42Z

tests/test_generate_schema.py

        records = chunk['records']
        expected_errors = chunk['errors']
        expected_error_map = chunk['error_map']
        expected_schema = chunk['schema']

        print("Test chunk %s: First record: %s" % (chunk_count, records[0]))
-
+        print(sanitize_names)


Remove debugging print() statement

jonwarghed · 2019-06-17T11:35:50Z

Hi again. It should look better now, let me know if there is anything else you think I should change.

bxparks · 2019-06-17T14:33:37Z

There is still one line that has a trailing space, but it's minor. I'll fix it after merging this, and I'll update the CHANGELOG and README.

Thanks for the PR!

jonwarghed · 2019-06-17T14:48:27Z

Thank you, I really like the work you put in.

Added handling of conversion of names when given a csv with incorrect column names

a98a525

bxparks reviewed Jun 15, 2019

bxparks requested changes Jun 15, 2019

Removing trailing whitespace and debug print statements

f46396e

bxparks merged commit e57b8d6 into bxparks:develop Jun 17, 2019
