A/B Test to disable suggesting all streams for new connections #22851

Closed
evantahler opened this issue Feb 10, 2023 · 14 comments

@evantahler
Contributor

evantahler commented Feb 10, 2023

After #21577, sources can now suggest only the important streams to users setting up new connections. Today, if a connector has not implemented suggestedStreams, all streams are selected by default. We want to set up a test for a small group of users that swaps this behavior: what if no streams were selected by default?
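
To make the two behaviors concrete, here is a minimal sketch of the default-selection rule being discussed. The type names, the `selectNoneByDefault` flag, and the function itself are assumptions for illustration and are not taken from the Airbyte codebase.

```typescript
// Hypothetical shapes; the real catalog and flag types are not shown in this issue.
interface DiscoveredStream {
  name: string;
}

interface DefaultSelectionInput {
  streams: DiscoveredStream[];
  // Present only when the connector implements suggestedStreams.
  suggestedStreams?: string[];
  // Assumed flag: true when the user is in the experiment's test group.
  selectNoneByDefault: boolean;
}

// Returns the names of the streams that start out selected in the UI.
function defaultSelectedStreams(input: DefaultSelectionInput): string[] {
  const { streams, suggestedStreams, selectNoneByDefault } = input;

  if (suggestedStreams !== undefined) {
    // The connector has suggestions: keep only suggested streams that were discovered.
    const suggested = new Set(suggestedStreams);
    return streams.filter((s) => suggested.has(s.name)).map((s) => s.name);
  }

  // No suggestions: today's behavior selects everything, the test group gets nothing.
  return selectNoneByDefault ? [] : streams.map((s) => s.name);
}
```

In this sketch a connector that does implement suggestedStreams behaves the same in both groups; only the fallback changes.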

Positive Hypothesis:

  • First sync success % likely goes up. We are syncing fewer streams, and only the 'well-tested' popular streams
  • First sync duration likely goes down. If we sync fewer streams, it takes less time
  • Users think more critically about the data they want to move, leading to more investment in the resulting destination's dataset
  • We gain data about which streams matter, so that we can then populate suggestedStreams for more connectors

Negative Hypothesis:

  • Adds additional friction for the first sync for most sources (as most sources don't yet have suggested streams)
  • Some sources, e.g. databases, without a static list of streams, will likely never have any streams suggested by default in this new world.

I think we can do this test entirely in the front end.

The work to be done is:

  1. Set up the LaunchDarkly feature flag and test group
  2. Fire Segment events for those users within the test group to store which streams they chose for each connector
    • We don't want data about users not in the test group
  3. Report on this data in Metabase:
    • Did sync success rate go up or down for the test group?
    • Did sync duration go up or down for the test group?
    • Did we learn which streams are popular for the connectors, and can we populate more suggestedStreams?
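
Step 2 could be sketched roughly as follows. The `Tracker` interface is only shaped like Segment's `track(event, properties)` call and is injected so the example stays self-contained; the variant names, event name, and property names are all assumptions, not the real flag or event schema.

```typescript
// Shaped like Segment's track(event, properties); injected rather than imported.
interface Tracker {
  track(event: string, properties: Record<string, unknown>): void;
}

// Hypothetical variant and event names, chosen for illustration only.
type Variant = "control" | "no-default-streams";
const STREAMS_CHOSEN_EVENT = "Connection Streams Chosen";

interface StreamChoice {
  connector: string;
  variant: Variant;
  availableStreams: string[];
  selectedStreams: string[];
}

// Sends the event for test-group users only; returns true when an event was sent.
function recordStreamChoice(tracker: Tracker, choice: StreamChoice): boolean {
  if (choice.variant === "control") {
    return false; // step 2: no data for users outside the test group
  }
  tracker.track(STREAMS_CHOSEN_EVENT, {
    connector: choice.connector,
    variant: choice.variant,
    selected_streams: choice.selectedStreams,
    available_stream_count: choice.availableStreams.length,
  });
  return true;
}
```

Including the variant in the payload keeps the events usable even if the group filter is later moved from the client to the reporting side.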

@misteryeo
Contributor

Is there some way that we can AB test this with different users and across select connectors to observe the impact here?

I'd like to loop in @nataliekwong here to make sure she's involved as this would impact activation rates.

My hypothesis is that the give up / abandonment rate for successfully setting up a connection might increase but amongst those who do finish the setup, the % sync success increases.

@bleonard
Contributor

We could do this first on the frontend and use a LaunchDarkly feature flag. The A/B test likely wouldn't reach significance in a time we're happy with, but it might be directionally interesting to see, and we could toggle it off if there was a problem.
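
For a rough sense of what "reaching significance" means for sync success rates, here is a sketch of a standard pooled two-proportion z-test. The function name and input shape are assumptions, and the numbers in the usage note are made up for illustration, not experiment data.

```typescript
interface GroupResult {
  syncs: number;     // number of first syncs observed in the group
  successes: number; // how many of those syncs succeeded
}

// Pooled two-proportion z statistic: positive when the test group succeeds more often.
function twoProportionZ(control: GroupResult, test: GroupResult): number {
  if (control.syncs <= 0 || test.syncs <= 0) {
    throw new Error("both groups need at least one sync");
  }
  const pControl = control.successes / control.syncs;
  const pTest = test.successes / test.syncs;
  const pooled = (control.successes + test.successes) / (control.syncs + test.syncs);
  const standardError = Math.sqrt(pooled * (1 - pooled) * (1 / control.syncs + 1 / test.syncs));
  // A zero standard error means every sync had the same outcome, so the rates are equal.
  return standardError === 0 ? 0 : (pTest - pControl) / standardError;
}
```

With made-up inputs of 70/100 successes in control and 80/100 in the test group, the statistic is about 1.63, which is below the 1.96 threshold for a two-sided test at the 5% level. A 10-point difference at that sample size would therefore be directional rather than significant.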

@nataliekwong
Contributor

Consolidating some thoughts between Ryan and myself from Slack thread:

Some known risks are:

  • You can currently set up a connection without selecting any streams to sync
  • Users are often confused by bulk editing, so selecting streams will be more difficult and time-consuming for users with more tables

That being said, given the first sync is so important to continued success, the tradeoffs here are worth exploring and I think it's worth creating an experiment with these in mind (we anticipate a larger dropoff at the connection settings).

I suggest starting with a few connectors so we can contain the experiment and put it behind a feature flag so we see the impact between the two groups. I don't think we necessarily need to wait to solve the first bullet above in order to move forward (Issue here).

My suggestion would be to choose 3 - 4 connectors so you can see how the experience differs across the types of connectors we offer, and since we want to actually be able to measure a difference between the groups ideally within a few weeks, choose connectors that have a higher number of users trying it out. We should pre-select 1 stream for them that we feel is pretty certainly going to succeed instead of giving a blank slate.

My suggestion would be:

  • API (Facebook Marketing, Google Ads, Hubspot) - high number of users and a smaller set of streams. We pre-select 1 stream but they can select more if they prefer.
  • Database (Postgres) - suggestion from Michel. No streams pre-selected as the schemas are not predefined.

@evantahler Seeing the PR - is this a type of project you/your team could take on? Or would you prefer Growth (@letiescanciano ) moves it forward?

@evantahler
Contributor Author

Thanks for all of the feedback everyone!

I think this probably still belongs in the @airbytehq/connector-operations wheelhouse, but this has grown from "a quick change" into a bit larger of a feature now :D. With that in mind, I don't know if we will have space for this in Q1B, but we'll keep it on our radar for the future. That said, if @letiescanciano wants to run with this, I'd be happy to consult!

I like the suggestion of A/B testing this, and moving the logic about which streams to suggest into the frontend for the duration of the experiment. With that in mind, I'll close #22856

@evantahler changed the title from "By default, if there are no SuggestedStreams, select no streams by default" to "A/B Test to disable suggesting all streams for a new connection" on Feb 24, 2023
@evantahler
Contributor Author

@nataliekwong and @alex-gron - I rephrased this story as a front-end experiment. Can you comment on the description? Anything to add or change?

@evantahler changed the title from "A/B Test to disable suggesting all streams for a new connection" to "A/B Test to disable suggesting all streams for new connections" on Feb 24, 2023
@alex-gron
Contributor

The description sounds great and makes sense to me!

I want to call out though that Metabase monitoring will not be possible until we have LaunchDarkly data available in the data warehouse. That work is prioritized for the end of Q1b. Do we yet know when this experiment would launch?

@bleonard Do you have any concerns with this from a Connector Sync success monitoring standpoint?
Do we need to filter the test users out of your dashboard while we are testing this?

@nataliekwong
Contributor

Thanks for reframing! Feel free to assign @letiescanciano as she's already starting to work on this.

> Fire segment events for those users within the test group to store which streams they chose for each connector
> We don't want data about users not in the test group

The LaunchDarkly variants get passed in Segment events, so I don't think we need to wait for it to be available in the data warehouse. I think we can send this data regardless of variant since we can always filter down by which variant they were in later on.
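
A minimal sketch of that "send everything, filter later" idea, assuming each recorded event carries the connector, the variant, and the selected streams. The event shape and function name are hypothetical.

```typescript
// Hypothetical shape of a recorded event; the real Segment payload is not shown here.
interface StreamChoiceEvent {
  connector: string;
  variant: string;
  selectedStreams: string[];
}

// Counts how often each connector/stream pair was selected within one variant.
function countSelectedStreams(events: StreamChoiceEvent[], variant: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const event of events) {
    if (event.variant !== variant) continue; // filter down to the variant of interest
    for (const stream of event.selectedStreams) {
      const key = `${event.connector}/${stream}`;
      counts.set(key, (counts.get(key) ?? 0) + 1);
    }
  }
  return counts;
}
```

The same counts, taken for the variant where nothing is pre-selected, are what would feed the "which streams matter" question from the description.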

@alex-gron
Contributor

Great call on Segment events! 👍 Makes sense to me

@evantahler
Contributor Author

@nataliekwong & @letiescanciano - updating my comment above: I'd love some help from your team to move this experiment forward, especially now that this is scoped to the front-end.

@nataliekwong
Contributor

The Growth team's process lives in Airtable, so I'll assign @letiescanciano as the owner here and she will update the issue with the PR when it's ready!

Airtable link in case you want to read on the progress in the interim.

@bleonard
Contributor

bleonard commented Mar 1, 2023

> The description sounds great and makes sense to me!
>
> I want to call out though that Metabase monitoring will not be possible until we have LaunchDarkly data available in the data warehouse. That work is prioritized for the end of Q1b. Do we yet know when this experiment would launch?
>
> @bleonard Do you have any concerns with this from a Connector Sync success monitoring standpoint? Do we need to filter the test users out of your dashboard while we are testing this?

I don't think so. If anything, they will likely have a higher success rate as they are likely to choose fewer streams, but I think they are just as relevant to monitor.

@evantahler
Contributor Author

@letiescanciano and @alex-gron as the experiment (https://github.com/airbytehq/airbyte-platform-internal/pull/4846) is running, if you happen to get strong signals that some streams are rarely used, send them my way and I'll start modifying connectors.

@letiescanciano
Contributor

@evantahler will let you know once I get the PR approved and released! :)

7 participants