Feature/add ignore case option to snowflake #289

@robby-rob-slalom robby-rob-slalom commented Apr 11, 2024

Description & motivation

PR for Add ignore_case option to snowflake infer schema #288

Checklist

TODO:

  • I have verified that these changes work locally
  • I have updated the README.md (if applicable)
  • I have added an integration test for my fix/feature (if applicable)

Collaborator

@dataders dataders left a comment

The change seems straightforward enough. However, we'll need at least one test case to prove this out, perhaps even a new seed table and an external Parquet file to test against?

@robby-rob-slalom
Author

robby-rob-slalom commented Apr 12, 2024

The change seems straightforward enough. However, we'll need at least one test case to prove this out, perhaps even a new seed table and an external Parquet file to test against?

I'm not sure where this file would go, but this is what I used to generate the example mixed-case Parquet files (edit: modified to match the existing schema in public_data):

-- partition with UPPERCASE column format
COPY INTO @stage/parquet_with_inferred_schema_and_mixed_column_case
FROM (
    SELECT
        1 AS "ID",
        'FOO' AS "NAME",
        'a' AS "SECTION"
)
PARTITION BY ('section=' || "SECTION")
FILE_FORMAT = (TYPE = PARQUET)
HEADER = TRUE
;

-- partition with lowercase column format
COPY INTO @stage/parquet_with_inferred_schema_and_mixed_column_case
FROM (
    SELECT
        2 AS "id",
        'bar' AS "name",
        'b' AS "section"
)
PARTITION BY ('section=' || "section")
FILE_FORMAT = (TYPE = PARQUET)
HEADER = TRUE
;

-- partition with PascalCase column format
COPY INTO @stage/parquet_with_inferred_schema_and_mixed_column_case
FROM (
    SELECT
        3 AS "Id",
        'FooBar' AS "Name",
        'c' AS "Section"
)
PARTITION BY ('section=' || "Section")
FILE_FORMAT = (TYPE = PARQUET)
HEADER = TRUE
;
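
For anyone reproducing this, the staged files can be inspected with Snowflake's INFER_SCHEMA table function, which accepts an IGNORE_CASE argument. This is a minimal sketch, not the code in this PR: the file format name mixed_case_parquet_format is hypothetical, and @stage is the same placeholder used above.

```sql
-- Hypothetical named file format; INFER_SCHEMA takes the name of a file format.
CREATE OR REPLACE FILE FORMAT mixed_case_parquet_format
    TYPE = PARQUET;

-- Default (IGNORE_CASE => FALSE): column names keep the case found in each file,
-- so ID, id, and Id are reported as separate columns.
SELECT column_name, type, filenames
FROM TABLE(
    INFER_SCHEMA(
        LOCATION => '@stage/parquet_with_inferred_schema_and_mixed_column_case',
        FILE_FORMAT => 'mixed_case_parquet_format'
    )
);

-- IGNORE_CASE => TRUE: column names are treated as case-insensitive
-- and returned in uppercase.
SELECT column_name, type, filenames
FROM TABLE(
    INFER_SCHEMA(
        LOCATION => '@stage/parquet_with_inferred_schema_and_mixed_column_case',
        FILE_FORMAT => 'mixed_case_parquet_format',
        IGNORE_CASE => TRUE
    )
);
```

Comparing the two result sets should show whether the three partitions collapse into one set of columns.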

@dataders
Collaborator

dataders commented May 1, 2024

@robby-rob-slalom is this still a draft? or do you think it's ready to be "formally" reviewed?

@robby-rob-slalom
Author

@robby-rob-slalom is this still a draft? or do you think it's ready to be "formally" reviewed?

The code change is ready. I may need some assistance with the integration test. Looking at the /public_data folder, there could be another folder like /json_mixed_case where one of the section files has uppercase keys and another has title case keys.
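
A rough sketch of how such files might be generated, following the same pattern as the Parquet example. The folder layout and key names here are assumptions for illustration, not files that exist in /public_data, and @stage is a placeholder.

```sql
-- section file with UPPERCASE keys (hypothetical path)
COPY INTO '@stage/json_mixed_case/section=a/'
FROM (
    -- a JSON unload needs a single VARIANT/OBJECT column; key case is preserved
    SELECT OBJECT_CONSTRUCT('ID', 1, 'NAME', 'FOO')
)
FILE_FORMAT = (TYPE = JSON)
;

-- section file with Title Case keys (hypothetical path)
COPY INTO '@stage/json_mixed_case/section=b/'
FROM (
    SELECT OBJECT_CONSTRUCT('Id', 2, 'Name', 'FooBar')
)
FILE_FORMAT = (TYPE = JSON)
;
```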

@cakkinep
Contributor

Didn't realize a PR for ignore_case had already been started. I have created a PR that handles this both for infer schema and when the schema is specified in the external table definition: #308
Tagging you, @dataders

@robby-rob-slalom
Author

Feature included in #308
