Databricks Loader: Setting config formats.fileFormat = "PARQUET" throws an UnsupportedFileSystemException #911

Closed
dkbrkjni opened this issue Jun 2, 2022 · 3 comments

Comments

@dkbrkjni
Contributor

dkbrkjni commented Jun 2, 2022

Running the Databricks loader image (snowplow/transformer-kinesis:4.0.0) in AWS Fargate with the following configuration:

{
    input = {
        region = ${AwsRegion}
        streamName = ${EnrichedEventsStreamName}
        appName = ${AppName}
    }
    output = {
        region = ${AwsRegion}
        path = "s3://"${S3Bucket}"/events/widerow"
    }
    queue = {
        region = ${AwsRegion}
        type = "sqs"
        queueName = ${EventTransformationQueueName}
    }
    formats = {
        transformationType = "widerow"
        fileFormat = "PARQUET"
    }
    windowing = "5 minutes"
}

This throws the following error:

org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3"

@pondzix
Contributor

pondzix commented Jun 13, 2022

Hey @dkbrkjni, thank you for reporting the issue! Indeed, there is a problem with accessing S3 when writing Parquet files. Adding hadoop-aws as a runtime dependency should resolve the issue. This week we're planning to release a new version of the loader containing the fix!

Note: in the transformer output configuration, s3 should be replaced with s3a, as this is the URI scheme currently supported by the hadoop-aws module.
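
As a rough sketch of that change, assuming the rest of the configuration from the original report stays as it is, only the scheme in the output path would differ:

```hocon
output = {
    region = ${AwsRegion}
    # "s3a" rather than "s3": per the note above, this is the scheme hadoop-aws supports
    path = "s3a://"${S3Bucket}"/events/widerow"
}
```

The bucket variable and the "/events/widerow" prefix are taken from the report above; nothing else in the block is changed.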

@dkbrkjni
Contributor Author

Hi pondzix

That sounds great!
Thanks for looking into this and fixing it so quickly.

@istreeter
Contributor

Hi @dkbrkjni, we released version 4.0.2, which should fix the problem you were having.

I will close this issue for now, but if you're still having any problems with 4.0.2, please let us know.

Thank you again for telling us about this issue!
