Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add pattern, contentMediaType, and contentEncoding to Schema data class #1581

Merged

Conversation

BuzzCutNorman
Copy link
Contributor

@BuzzCutNorman BuzzCutNorman commented Apr 4, 2023

I would like to propose adding three new variables to the singer_sdk._singerlib.Schema data class and STANDARD_KEYS list.

  • pattern
  • contentMediaType
  • contentEncoding

These JSON Schema variables are avaiable in JSON Schema Draft 7 and are defined at the following links.

https://json-schema.org/understanding-json-schema/reference/string.html#regular-expressions
https://json-schema.org/understanding-json-schema/reference/non_json_data.html#contentmediatype
https://json-schema.org/understanding-json-schema/reference/non_json_data.html#contentencoding

This addition of these variables will assist in resolving the following issue.

BuzzCutNorman/tap-mssql#32
BuzzCutNorman/tap-mssql#33
BuzzCutNorman/tap-mssql#34


📚 Documentation preview 📚: https://meltano-sdk--1581.org.readthedocs.build/en/1581/

@aaronsteers
Copy link
Contributor

@BuzzCutNorman - Thank you for your contribution!

@edgarrmondragon - This is a bit out of my knowledge area. Do you mind reviewing and adding your own context/insight here?

Thanks!

@codecov
Copy link

codecov bot commented Apr 4, 2023

Codecov Report

Merging #1581 (9db25a9) into main (f544b40) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 9db25a9 differs from pull request most recent head 4d18a2b. Consider uploading reports for the commit 4d18a2b to get more accurate results

@@           Coverage Diff           @@
##             main    #1581   +/-   ##
=======================================
  Coverage   85.60%   85.61%           
=======================================
  Files          57       57           
  Lines        4717     4720    +3     
  Branches      801      801           
=======================================
+ Hits         4038     4041    +3     
  Misses        487      487           
  Partials      192      192           
Impacted Files Coverage Δ
singer_sdk/_singerlib/schema.py 100.00% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@edgarrmondragon edgarrmondragon self-assigned this Apr 4, 2023
Copy link
Collaborator

@edgarrmondragon edgarrmondragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @BuzzCutNorman!

This looks good, and it's a simple, stable improvement.

However, the SDK can't produce these JSON schema types yet and I assume that would require tricky changes to the StringType base class so I'm happy to leave those for a future PR.

That means that currently, the only way to get a string type with pattern set, is to customize the Singer catalog for the tap.

@BuzzCutNorman
Copy link
Contributor Author

BuzzCutNorman commented Apr 5, 2023

Hey @edgarrmondragon !

Thanks 🙏for approving this PR. I am very excited to see these in the SDK. 😃

I am working mainly with the SQL taps and targets and have found I can generate and translate pattern, contentMediaType and concontentEncoding in the to_jsonschema_type and to_sql_type methods. The last barrier I am facing is the encoding to base64 which I thought would be best done in the _conform_primitive_property method but I am second guessing that since I have the target side decoder in bulk_insert_records. I recall there is a post get records method on the tap side that I am going to try adding the base64 encoder to.

However, the SDK can't produce these JSON schema types yet and I assume that would require tricky changes to the StringType base class so I'm happy to leave those for a future PR.

I agree it is tricky to expand the StringType base class to allow for any of these. Once I get a handle on how they are in my confined SQL scenarios I want to see if they might be also used in non-SQL taps and targets as well as be added to the SDK Type classes. There are alot of options in the contentMedaType so that will be a challenge. Then there is were to add encoding/decoding and which ones to allow.

@edgarrmondragon edgarrmondragon enabled auto-merge (squash) April 10, 2023 16:27
@edgarrmondragon edgarrmondragon merged commit 1867c2b into meltano:main Apr 10, 2023
@BuzzCutNorman BuzzCutNorman deleted the feat-expand-schema-variables branch November 28, 2023 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants