Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: migrate legacy data #62

Merged
merged 12 commits into from
Jul 19, 2023
Merged

Conversation

falconandy
Copy link
Contributor

@falconandy falconandy commented Jul 2, 2023

Summary

Migrate legacy data (Amplitude-iOS SDK)

Based on Amplitude-Kotlin migration logic.
On first run since upgrade (recommended) - device/user id, session id, last event date/id, events, identifies, intercepted identifies. If migration was enabled later after upgrade - only events are migrated.
Some legacy fields are converted to new names/format: library -> from object to string, timestamp -> time, idfa, idfv.
After migration legacy sqlite data are cleaned except deviceId/userId - maybe they will be required later to link legacy/new ids, for example.

NOTE 1:
Probably a bug: instance name is not included in PersistentStorage's storagePrefix - a storage is shared between different instances. Not fixed in this PR - for now I created PersistentStorage instances explicitly with different names in new tests.
https://github.com/amplitude/Amplitude-Swift/blob/main/Sources/Amplitude/Configuration.swift#L67-L69

self.storageProvider = storageProvider ?? PersistentStorage(apiKey: apiKey)
self.identifyStorageProvider = identifyStorageProvider
    ?? PersistentStorage(apiKey: apiKey, storagePrefix: "\(PersistentStorage.DEFAULT_STORAGE_PREFIX)-identify")

NOTE 2:
For some reason there are 2 independent instanceNames:
Amplitude.instanceName - https://github.com/amplitude/Amplitude-Swift/blob/main/Sources/Amplitude/Amplitude.swift#L5
Configuration.instanceName - https://github.com/amplitude/Amplitude-Swift/blob/main/Sources/Amplitude/Configuration.swift#L14

I use Amplitude.instanceName but looks like one of them should be removed (or their values should be synchronized)

Checklist

  • Does your PR title have the correct title format?
  • Does your PR have a breaking change?: No

@falconandy falconandy marked this pull request as ready for review July 6, 2023 12:33
justin-fiedler
justin-fiedler previously approved these changes Jul 7, 2023
Copy link
Contributor

@justin-fiedler justin-fiedler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@falconandy changes look great. I appreciate the tests.

  1. Hmm good call out. I agree we should probably update to include the instanceName. I made a new ticket.
  2. Thanks for calling out the duplicate instance name, we should definitely try to consolidate. @liuyang1520 can you provide any details on why we have both? Thanks

@liuyang1520
Copy link
Collaborator

@falconandy changes look great. I appreciate the tests.

  1. Hmm good call out. I agree we should probably update to include the instanceName. I made a new ticket.
  2. Thanks for calling out the duplicate instance name, we should definitely try to consolidate. @liuyang1520 can you provide any details on why we have both? Thanks

Thanks @justin-fiedler and @falconandy for the call out! I did a quick check, and cannot find it is used anywhere except assignment. I think I have missed the part to add instanceName as part of the storage prefix key/folder. As for the two instanceName, I think we can keep the one in the Configuration and remove the one in the Amplitude, similar as Kotlin or TypeScript one.

@justin-fiedler , do you want me to help add the configuration.instanceName to the storage prefix key/folder? I am thinking we can avoid introducing breaking change by adding similar logic in the RemnantDataMigration to check current storage path (without instanceName), and then use the new storage path with instanceName. Let me know your thoughts!

@justin-fiedler
Copy link
Contributor

justin-fiedler commented Jul 7, 2023

I am thinking we can avoid introducing breaking change by adding similar logic in the RemnantDataMigration to check current storage path (without instanceName), and then use the new storage path with instanceName. Let me know your thoughts!

👍

do you want me to help add the configuration.instanceName to the storage prefix key/folder?

I think we (@falconandy ) can fix it in this PR and clean up the RemnantDataMigration and tests accordingly. Would definitely appreciate a code review once it's ready 🙏

As for the two instanceName, I think we can keep the one in the Configuration and remove the one in the Amplitude

Agreed. @falconandy can you please update this as well.

Thanks!

@falconandy
Copy link
Contributor Author

Hi @justin-fiedler @liuyang1520

As for the two instanceName, I think we can keep the one in the Configuration and remove the one in the Amplitude
Agreed. @falconandy can you please update this as well.

Done.

do you want me to help add the configuration.instanceName to the storage prefix key/folder?
I think we (@falconandy ) can fix it in this PR and clean up the RemnantDataMigration and tests accordingly. Would definitely appreciate a code review once it's ready pray

It looks like both Amplitude-Typescript and Amplitude-Kotlin do not use instanceName in event storage (unlike Amplitude-Android and Amplitude-iOS):

For example, Amplitude-Kotlin only adds apiKey suffix:

class EventsFileManager {
    ...
    private val fileIndexKey = "amplitude.events.file.index.$apiKey"
    ....

Amplitude-Kotlin uses instanceName for identity data only:

    override fun createIdentityConfiguration(): IdentityConfiguration {
        val configuration = configuration as Configuration
        val storageDirectory = configuration.context.getDir("${FileStorage.STORAGE_PREFIX}-${configuration.instanceName}", Context.MODE_PRIVATE)

        return IdentityConfiguration(
            instanceName = configuration.instanceName,
            apiKey = configuration.apiKey,
            identityStorageProvider = configuration.identityStorageProvider,
            storageDirectory = storageDirectory,
            logger = configuration.loggerProvider.getLogger(this)
        )
    }

Should we use instanceName in storage implementation in all maintained SDKs?

@justin-fiedler justin-fiedler self-requested a review July 10, 2023 22:57
@justin-fiedler justin-fiedler dismissed their stale review July 11, 2023 23:30

Extended scope of PR

@justin-fiedler
Copy link
Contributor

justin-fiedler commented Jul 11, 2023

Hi @falconandy @liuyang1520 I put together a doc to outline instance storage across the SDKs. I think that going forward we should use instanceName vs apiKey.

Please let me know your thoughts in Confluence. If we all agree then I think we can add another migration plugin to check for any pending events in the old apiKey storage and move them to instanceName storage. Thanks

@justin-fiedler
Copy link
Contributor

Hi @falconandy see the latest feedback in the doc. The team agreed to start using instanceName for storage in mobile.

I think it will be better to write a new migration (vs a new source). Let me know if you see issues with that implementation. Thanks!

Copy link
Contributor

@justin-fiedler justin-fiedler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above.

@falconandy
Copy link
Contributor Author

Hi @justin-fiedler. Merged with main branch (storage instance name changes). Please review 2 last commits.

Copy link
Contributor

@justin-fiedler justin-fiedler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @falconandy! I also tested on local simulator.

@justin-fiedler justin-fiedler merged commit d1c6b32 into main Jul 19, 2023
3 checks passed
@justin-fiedler justin-fiedler deleted the AMP-63197-migrate-legacy-data branch July 19, 2023 19:19
github-actions bot pushed a commit that referenced this pull request Jul 19, 2023
## [0.4.7](v0.4.6...v0.4.7) (2023-07-19)

### Bug Fixes

* made EventOptions.init public ([#64](#64)) ([74992ce](74992ce))
* migrate 'api key' storage data to 'instance name' storage ([#63](#63)) ([9199039](9199039))
* migrate legacy data ([#62](#62)) ([d1c6b32](d1c6b32))
@github-actions
Copy link

🎉 This PR is included in version 0.4.7 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants