[formrecognizer] reconcile storage of forms across languages #10973

Closed
kristapratico opened this issue Apr 21, 2020 · 1 comment
Labels
Client (this issue points to a problem in the data-plane of the library), Cognitive - Form Recognizer, Cognitive Services

Comments

@kristapratico
Member

Python currently creates a storage account and a Form Recognizer account dynamically for more complicated operations like training, then tears down all the resources at the end. This strategy was recommended by Mike H. In the future we may switch to a fixed storage account with a container holding all the training forms that we use across all languages.

With a fixed storage account we could remove the test dependency on storage, but we would need environment variables set for the container SAS URLs and/or storage credentials. We would also need to maintain these so that they don't expire or change and fail the tests.

My thoughts on this: ways to test training (option 1 is what we currently do)

  1. Have training/labeled files in the repo, upload them to blob storage, create a container SAS URL, then train.
  • PRO: create everything on the fly, tear down everything at the end, no environment variables. Current recommendation by Mike H.
  • CON: test dependency on storage, training/labeled files committed to the repo (~5MB per set), lots to do before we can actually test the training.
  2. Training/labeled files already uploaded to a shared storage account. Just create a SAS URL and train.
  • PRO: ensures that our SAS URL doesn't expire on us, training files don't exist in the repo.
  • CON: we have to maintain a shared storage account and keep its credentials in environment variables.
  3. Training files already uploaded to a shared storage account, with the container SAS URL read from an environment variable.
  • PRO: no test dependency on storage, training files don't exist in the repos.
  • CON: need to maintain a shared storage account with all the files, and maintain the container SAS URL so it doesn't expire and fail tests.
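
For the third option, the main risk named above is a SAS URL that silently expires. Here is a minimal sketch of a guard the test setup could run first, assuming the URL is stored in an environment variable. The variable name and the helper functions are hypothetical and do not come from the actual test framework. The sketch relies only on the fact that a SAS token carries its expiry time in the `se` query parameter as an ISO 8601 UTC timestamp.

```python
import os
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

# Hypothetical variable name; the issue does not say what the tests actually use.
SAS_URL_ENV_VAR = "FORM_RECOGNIZER_TRAINING_SAS_URL"


def sas_expiry(sas_url):
    """Return the expiry time from the SAS token's `se` field, or None if absent."""
    query = parse_qs(urlsplit(sas_url).query)
    values = query.get("se")
    if not values:
        return None
    # `se` is an ISO 8601 UTC timestamp, e.g. 2030-01-01T00:00:00Z or 2030-01-01.
    text = values[0].replace("Z", "+00:00")
    expiry = datetime.fromisoformat(text)
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry


def training_sas_url(min_remaining=timedelta(hours=1), now=None):
    """Read the container SAS URL from the environment and fail fast if it is stale."""
    sas_url = os.environ.get(SAS_URL_ENV_VAR)
    if not sas_url:
        raise RuntimeError(f"{SAS_URL_ENV_VAR} is not set")
    expiry = sas_expiry(sas_url)
    now = now or datetime.now(timezone.utc)
    # Treat a token that is about to expire as already unusable for a test run.
    if expiry is not None and expiry - now < min_remaining:
        raise RuntimeError(
            f"SAS URL in {SAS_URL_ENV_VAR} expires at {expiry.isoformat()}"
        )
    return sas_url
```

Failing in setup with a clear message is probably easier to diagnose than a training call rejected by the service, though it only catches expiry, not a rotated key or a deleted container.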
@kristapratico added the Cognitive Services, Client, and FormRecognizer labels Apr 21, 2020
@kristapratico self-assigned this Apr 21, 2020
@kristapratico
Member Author

Fixed storage account set up for training data, test framework now leveraging this. PR merged, closing issue.

@github-actions bot locked and limited conversation to collaborators Apr 12, 2023