RDB Shredder: send shredding info to SQS when it's done #200

Closed
benjben opened this issue Nov 19, 2020 · 2 comments

benjben commented Nov 19, 2020

When the shredder has finished writing shredded data to S3, it should send a message to SQS with this schema.

The RDB loader will use this message to know directly where and what to load.

benjben commented Dec 18, 2020

Need to fix:

0/12/18 22:19:31 ERROR Client: Application diagnostics message: User class threw exception: java.lang.RuntimeException: Bucket name must start with s3:// prefix
	at com.snowplowanalytics.snowplow.storage.spark.ShredJob$.run(ShredJob.scala:177)
	at com.snowplowanalytics.snowplow.storage.spark.SparkJob.main(SparkJob.scala:32)
	at com.snowplowanalytics.snowplow.storage.spark.SparkJob.main$(SparkJob.scala:27)
	at com.snowplowanalytics.snowplow.storage.spark.ShredJob$.main(ShredJob.scala:65)
	at com.snowplowanalytics.snowplow.storage.spark.ShredJob.main(ShredJob.scala)
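
For illustration, the kind of check that produces this error could look roughly like the sketch below. `S3FolderSketch` and `parse` are hypothetical names, not the shredder's actual code; the sketch only assumes that the configured output folder has to be a string starting with `s3://`, and that validating it when the config is read reports a bad value before the job starts.

```scala
// Hypothetical helper, not the project's implementation.
object S3FolderSketch {
  private val Prefix = "s3://"

  /** Returns the folder with a trailing slash, or a message describing the problem. */
  def parse(raw: String): Either[String, String] = {
    val trimmed = raw.trim
    if (!trimmed.startsWith(Prefix))
      Left(s"Bucket name must start with $Prefix prefix, got [$raw]")
    else if (trimmed.length == Prefix.length)
      Left(s"Bucket name is missing after $Prefix prefix")
    else
      // Normalise so that callers can append object keys directly
      Right(if (trimmed.endsWith("/")) trimmed else trimmed + "/")
  }
}
```

Calling `S3FolderSketch.parse` on the configured path at startup would turn a value such as `my-bucket/shredded` into a `Left` with a readable message instead of a `RuntimeException` inside the Spark job.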

benjben commented Dec 22, 2020

 User class threw exception: java.lang.RuntimeException: RDB Shredder could not send shredded types [List(ShreddedType(SchemaKey(com.snowplowanalytics.snowplow,geolocation_context,jsonschema,Full(1,1,0)),TSV), ShreddedType(SchemaKey(nl.basjes,yauaa_context,jsonschema,Full(1,0,1)),TSV), ShreddedType(SchemaKey(org.w3,PerformanceTiming,jsonschema,Full(1,0,0)),TSV), ShreddedType(SchemaKey(com.snowplowanalytics.snowplow,atomic,jsonschema,Full(1,0,0)),TSV), ShreddedType(SchemaKey(com.snowplowanalytics.snowplow,ua_parser_context,jsonschema,Full(1,0,0)),TSV))] to SQS with error [The request must contain the parameter MessageGroupId. (Service: AmazonSQS; Status Code: 400; Error Code: MissingParameter; Request ID: e5f2328e-9d0d-51a3-8381-6ad5c13960b2; Proxy: null)]
at com.snowplowanalytics.snowplow.storage.spark.ShredJob$.run(ShredJob.scala:222)
at com.snowplowanalytics.snowplow.storage.spark.SparkJob.main(SparkJob.scala:32)
at com.snowplowanalytics.snowplow.storage.spark.SparkJob.main$(SparkJob.scala:27)
at com.snowplowanalytics.snowplow.storage.spark.ShredJob$.main(ShredJob.scala:65)
at com.snowplowanalytics.snowplow.storage.spark.ShredJob.main(ShredJob.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:728) 
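
SQS returns this `MissingParameter` error when a message is sent to a FIFO queue without a `MessageGroupId`. A minimal sketch of building the request with one, assuming the AWS SDK for Java v1 (`aws-java-sdk-sqs`) is on the classpath; `FifoMessageSketch`, `buildRequest` and the group id value are hypothetical and not taken from the shredder's code.

```scala
import com.amazonaws.services.sqs.model.SendMessageRequest

// Hypothetical helper, not the project's implementation.
object FifoMessageSketch {

  /** Builds a send request suitable for a FIFO queue.
    *
    * @param groupId messages sharing a group id are delivered in order
    * @param dedupId needed only when the queue does not use
    *                content-based deduplication
    */
  def buildRequest(
    queueUrl: String,
    body: String,
    groupId: String,
    dedupId: Option[String] = None
  ): SendMessageRequest = {
    val request = new SendMessageRequest()
      .withQueueUrl(queueUrl)
      .withMessageBody(body)
      .withMessageGroupId(groupId) // the parameter missing in the error above
    dedupId.fold(request)(id => request.withMessageDeduplicationId(id))
  }
}
```

The resulting request can then be passed to `sendMessage` on an `AmazonSQS` client, for example one created with `AmazonSQSClientBuilder.defaultClient()`. Using a single constant group id keeps all shredding messages in one ordered group, which is an assumption about what the loader needs rather than something stated in this issue.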
