
[EPIC] Source Airtable to Beta #11757

Closed
9 of 18 tasks
lazebnyi opened this issue Apr 6, 2022 · 22 comments

lazebnyi (Collaborator) commented Apr 6, 2022

The main purpose of this issue is to track release progress for the connector.

Acceptance criteria
  • Create a checklist document and place it on the Google Drive
  • Checklist filled out by the engineering team
  • Collect and list all tech debt-related issues in the section below
  • Create/Revise MLP document and place it here
  • Move checklist to the unblockers team
  • Reviewed/Updated by unblockers team
  • Move checklist to the product team for review
  • Reviewed/Updated by product team
  • Update connector documentation according to the comments in the checklist
  • Certification completed
bazarnov (Collaborator) commented May 30, 2022

@sherifnada @misteryeo

We've hit this blocker before releasing to Beta: https://github.com/airbytehq/alpha-beta-issues/issues/74
To resolve it, we would need access to the Metadata API; more info here and here.

Here is the answer we received from Airtable Support:

---------- Forwarded message ---------
From: 'Airtable Support' via Integration tests <[[email protected]](mailto:[email protected])>
Date: Thu, 26 May 2022 at 16:52
Subject: RE: Metadata API
To: [[email protected]](mailto:[email protected]) <[[email protected]](mailto:[email protected])>

Hi Augustin,

Thanks for writing in!

If you've completed [this form](https://protect-us.mimecast.com/s/GE2BCzpzoxUY4ZxBC49uIo?domain=airtable.com) to
request access to the Metadata API, then you've done all you need! As you may already know, at this time, the process for
approving these requests has been put on hold, so no new developer tokens are being created right now. Unfortunately I
don't have an estimated timeline to offer for when this process might restart, but rest assured that we have all of the
information we need from you for when it does.

Sorry for the limitation here! Our team is aware of the desire for this API,
and are working to improve this process so that we can provide you with this functionality.
Thank you for your patience!

We might be blocked on this release to Beta for a long time, since we cannot ship the connector as it is now: due to this issue, the connector produces wrong schemas and, as a result, fetches non-relevant records.
Therefore, we should completely hide this connector from the connectors list until the issue is resolved.

Please give your thoughts about this, thank you.

misteryeo (Contributor) commented

Thanks for this detailed rundown. To confirm, what you're saying is that without the Metadata API the connector is actually completely unusable so there's no reason to even keep the connector in Alpha. Is that correct @bazarnov?

bazarnov (Collaborator) commented Jun 1, 2022

Thanks for this detailed rundown. To confirm, what you're saying is that without the Metadata API the connector is actually completely unusable so there's no reason to even keep the connector in Alpha. Is that correct @bazarnov?

@misteryeo Yes, that's my opinion. The connector produces correct records, but the schemas are incorrect. To make them correct, we need Metadata API access to fetch the complete table schemas.

sherifnada (Contributor) commented

to be clear, the schemas are not incorrect, right? they are just incomplete?

The logic for inferring schemas just reads the first row from the table and uses that to infer the schema, assuming that everything is a string. Obviously this is bad, but it could be made a little less bad by reading more than just the first row
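The trade-off described above can be sketched as follows. This is not the connector's actual code, just a minimal illustration of first-row vs. multi-row inference; the record data and function names are made up:

```python
# Sketch: infer a schema by unioning the keys of the first `sample_size`
# records, typing every field as a nullable string (as the connector does).

def infer_schema(records, sample_size=1):
    """Union the keys of the sampled records; every field becomes a string."""
    properties = {}
    for record in records[:sample_size]:
        for field in record:
            properties.setdefault(field, {"type": ["null", "string"]})
    return {"type": "object", "properties": properties}

rows = [
    {"name": "Ada"},                    # first row: only one field populated
    {"name": "Grace", "email": "g@x"},  # a later row introduces a new field
]

print(sorted(infer_schema(rows, sample_size=1)["properties"]))  # ['name']
print(sorted(infer_schema(rows, sample_size=2)["properties"]))  # ['email', 'name']
```

Sampling more rows widens the schema, but any field that first appears outside the sample window is still missed.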

bazarnov (Collaborator) commented Jun 3, 2022

to be clear, the schemas are not incorrect, right? they are just incomplete?

Yes, the schemas are incomplete, given the way they are produced now.

The logic for inferring schemas just reads the first row from the table and uses that to infer the schema, assuming that everything is a string. Obviously this is bad, but it could be made a little less bad by reading more than just the first row

A simple use case: say we fetch the first 500 records, but record 501 introduces one more field. We miss it, and we have an incomplete schema again. There is no way to update the schema on the fly (otherwise the fix would be quite simple), and even if we could update it on the fly, there is no way to pass the updated schema to the destination. Or is there?

misteryeo (Contributor) commented

Thanks @bazarnov let's do this.

  1. Let's put this Beta certification on ice for now, given what you've uncovered.
  2. Can you update the docs with a callout that mentions this limitation due to the Metadata API? Please tag me to review.
  3. We'll keep monitoring new incoming issues that relate to this same problem.
  4. We'll leave the connector in Alpha, given that it can still work for some users and we already have a fairly high adoption rate for this connector.

Thanks!

ryjabe commented Oct 6, 2022

@misteryeo and @bazarnov it doesn't look like this ever got captured in the docs. We've had this nasty surprise creep up on a few users now, and it feels like it's something we should be more transparent about.

Any reason we didn't include it, or was it just a slip?

misteryeo (Contributor) commented

@YowanR @bazarnov let's make sure to document this per the comment above, please.

bazarnov (Collaborator) commented Oct 6, 2022

@misteryeo and @bazarnov it doesn't look like this ever got captured in the docs. We've had this nasty surprise creep up on a few users now, and it feels like it's something we should be more transparent about.

Any reason we didn't include it, or was it just a slip?

I wish I could say more than: it's a previously undiscovered bug, due to the implementation strategy, I believe (simple API request, simple response handling). As always, there were a couple of ways to deal with schemas for streams, and whoever coded this connector chose the most obvious one.

YowanR (Contributor) commented Oct 6, 2022

Thanks @bazarnov! Could you take a stab at editing the doc to reflect this, since you have the most context? Please tag me and I can review it as well :)

alafanechere (Contributor) commented Oct 10, 2022

I added #17794 as a to-fix bug following discussions with @ryjabe. It refers to the problem mentioned by @sherifnada in #11757 (comment).

alafanechere (Contributor) commented

@bazarnov it looks like Airtable made progress on their Metadata API: https://airtable.com/api/meta

bazarnov (Collaborator) commented

@bazarnov it looks like Airtable made progress on their Metadata API: https://airtable.com/api/meta

@alafanechere This is the old article, and the access still needs to be approved by Airtable:

If you want to develop an integration using the Metadata API, you must register [here](https://airtable.com/shrWl6yu8cI8C5Dh3) for access and receive a client secret. Enterprise Airtable accounts do not require a separate Metadata API client secret.

Thus, no progress has been made since the last time I checked. Am I missing something here?

NumberPiOso commented

@bazarnov it now looks like Airtable finally released their Metadata API. This is the blog post announcing it on November 15th, and the API endpoint to use is get-base-schema.

This is an example of the information I got from it:

{
    "id": "id_test_table",
    "name": "test_table",
    "primaryFieldId": "id1",
    "fields": [
        {
            "type": "multilineText",
            "id": "id1",
            "name": "field1"
        },
        {
            "type": "date",
            "options": {
                "dateFormat": {
                    "name": "local",
                    "format": "l"
                }
            },
            "id": "id2",
            "name": "field 2"
        },
     -- More fields
    ]
},
-- More tables with their corresponding fields

Is somebody actively working on this?
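The get-base-schema response above is enough to build complete JSON Schemas without inferring anything from records. A minimal sketch of that mapping, where the type table below is partial and illustrative (Airtable defines many more field types), and the function name is made up:

```python
# Sketch: convert one table entry from a get-base-schema response into a
# JSON Schema. The type map is a small illustrative subset, not Airtable's
# full list of field types.

AIRTABLE_TO_JSON = {
    "singleLineText": {"type": ["null", "string"]},
    "multilineText": {"type": ["null", "string"]},
    "number": {"type": ["null", "number"]},
    "checkbox": {"type": ["null", "boolean"]},
    "date": {"type": ["null", "string"], "format": "date"},
}

def table_to_json_schema(table):
    """Map each declared field to a JSON Schema type; default to string."""
    props = {
        field["name"]: AIRTABLE_TO_JSON.get(field["type"], {"type": ["null", "string"]})
        for field in table["fields"]
    }
    return {"type": "object", "properties": props}

# The table entry from the example response above:
table = {
    "id": "id_test_table",
    "name": "test_table",
    "fields": [
        {"type": "multilineText", "id": "id1", "name": "field1"},
        {"type": "date", "id": "id2", "name": "field 2"},
    ],
}
schema = table_to_json_schema(table)
```

Because the fields are declared in metadata, a field that never appears in the first N records still ends up in the schema.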

bazarnov (Collaborator) commented Dec 2, 2022

Very well! Thank you, @NumberPiOso. We will prioritize this one. @YowanR @lazebnyi @YuliiaNahoha @misteryeo

bazarnov (Collaborator) commented

The first part of the improvements is here: #20846. It focuses on migrating to the Metadata API and removing the base_id and tables properties from the input config (with this change, they are resolved dynamically based on the authenticated user's permissions).
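The dynamic discovery that replaces the base_id/tables config can be sketched as below. The endpoint paths in the comments are assumptions to be verified against Airtable's Web API docs, the HTTP fetching itself is omitted, and only the pure stream-name construction is shown on pre-fetched metadata JSON:

```python
# Sketch: build the list of streams from metadata the authenticated token
# can see, instead of reading base_id/tables from the connector config.

def stream_names(bases_meta, tables_meta_by_base):
    """Build one stream per table, namespaced by base name."""
    names = []
    for base in bases_meta["bases"]:  # e.g. GET /v0/meta/bases
        # e.g. GET /v0/meta/bases/{base_id}/tables (get-base-schema)
        tables = tables_meta_by_base[base["id"]]["tables"]
        names.extend(f"{base['name']}/{t['name']}" for t in tables)
    return names

# Illustrative metadata responses:
bases = {"bases": [{"id": "app1", "name": "CRM"}]}
tables = {"app1": {"tables": [{"id": "tbl1", "name": "Contacts"},
                              {"id": "tbl2", "name": "Deals"}]}}

print(stream_names(bases, tables))  # ['CRM/Contacts', 'CRM/Deals']
```

Keeping the name construction pure makes it easy to test without hitting the API.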

bazarnov (Collaborator) commented

The second part of the improvements is here: #20934, focused on implementing OAuth 2.0 for Airtable.

YowanR (Contributor) commented Jan 10, 2023

Hey @bazarnov! Does that mean we're close to certifying Airtable to Beta? Exciting! 🔥

bazarnov (Collaborator) commented

We need your confirmation on the checklist https://docs.google.com/spreadsheets/d/1MWeqcciZt_EuAUBr4T2v_kYeGqwR2nEOXlZnNPRKR8k/edit#gid=1288588270 before we announce officially.

YowanR (Contributor) commented Jan 11, 2023

A couple of thoughts on this certification:

  1. We moved this to Closed even though the checklist was not checked off yet. Let's wait until the checklist is complete before we close it.
  2. Looking at the backlog, there are 11 issues that we should at least take a look at before we certify this to Beta.
    I'm reopening the Epic and adding the issues to it. Some of these are probably obsolete, but we should do our due diligence before we move to Beta.
    cc @lazebnyi @YuliiaNahoha @bazarnov

YowanR reopened this Jan 11, 2023
bazarnov (Collaborator) commented

Will check the rest of the issues, thanks.

bazarnov (Collaborator) commented Feb 2, 2023

Finally! We can close this Epic: all Beta-related issues are resolved. Even more, we made this connector almost fully automatic (dynamic schemas, dynamic streams, SingleUseRefreshToken).

bazarnov closed this as completed Feb 2, 2023