Spanish clinical flair embeddings #2292
Comments
Hello @matirojasg, thanks for offering to add the models; people would surely find this useful! The standard way is to put the model on a server and open a pull request that adds auto-downloading to the FlairEmbeddings class. If you like, I can put your models on our faculty server and also do the pull request (this is mostly how we've been doing it). Alternatively, you can do the PR and/or put the models on your own server. Either works for me!
Thank you for the quick response. "If you want, I can put your models on our faculty server and also do the pull request (that's how we've been doing it most of the time)." I prefer this option. Should I share the files with you via Drive? Which files do you need exactly?
Hello @matirojasg, yes, if you send me an email with a link to a Drive folder containing the models, I can put them on our server! Thanks again!
Here is the link to the Spanish clinical models. Let me know if any file is missing or you can't access the Drive folder. https://drive.google.com/drive/folders/1M1b5FzZqEebTF7B2l58GQvciF4SXP5dT?usp=sharing Thanks!
Hi @matirojasg, I put them on our server: https://flair.informatik.hu-berlin.de/resources/embeddings/flair/ Would you like to do the PR for integration into Flair, or should I?
Could you do it, please? Thank you :)
GH-2292: add support for Spanish clinical Flair embeddings
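The auto-downloading idea discussed above boils down to mapping a short model name to a file on the server. The following is only a minimal standalone sketch of that lookup, not Flair's actual implementation: the base URL comes from this thread, while the short names, the file names, and the `resolve_model` helper are all hypothetical.

```python
from urllib.parse import urljoin

# Base URL as posted in this thread.
BASE_URL = "https://flair.informatik.hu-berlin.de/resources/embeddings/flair/"

# Hypothetical short names mapped to hypothetical file names;
# the thread does not state what the uploaded files are called.
MODEL_FILES = {
    "es-clinical-forward": "es-clinical-forward.pt",
    "es-clinical-backward": "es-clinical-backward.pt",
}


def resolve_model(name: str) -> str:
    """Return a download URL for a known short name.

    Any other value is returned unchanged, so it can be treated
    as a path to a local model file.
    """
    filename = MODEL_FILES.get(name.lower())
    if filename is None:
        return name
    return urljoin(BASE_URL, filename)


print(resolve_model("es-clinical-forward"))
print(resolve_model("/models/my-own-lm.pt"))
```

Keeping the mapping in a plain dictionary means that supporting a new model only needs one extra entry, which is roughly the size of change one would expect from a PR like this.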
Hi @matirojasg, can you please suggest a corpus size for fine-tuning the 'news-forward' language model on English tweets? I am currently thinking of using about 50 million words, as you mentioned. Would that be sufficient?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi, in the research group where I work (http://pln.cmm.uchile.cl/grav/en), we have trained Spanish Flair embeddings for the clinical domain.
We fine-tuned an existing LM (es-forward/es-backward, provided by @iamyihwa) for more than a week on a Chilean clinical dataset (created by the same group) of around 50 million words. The perplexity values are good, and text sampled from the model reads close to natural language.
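For readers unfamiliar with the perplexity figure mentioned above: it is the exponential of the average negative log-likelihood the model assigns to held-out text. Flair language models are character-level, so the unit here is a character. The sketch below is a generic illustration with made-up probabilities, not the evaluation code used for these models; the `perplexity` helper is hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood), using natural logs."""
    if not token_log_probs:
        raise ValueError("need at least one log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Illustrative only: a model that gives every character probability 0.25
# is as uncertain as a uniform choice among four characters.
uniform_log_probs = [math.log(0.25)] * 8
print(perplexity(uniform_log_probs))  # approximately 4

# A more confident model yields a lower perplexity.
confident_log_probs = [math.log(0.8)] * 8
print(perplexity(confident_log_probs) < perplexity(uniform_log_probs))
```

Lower is better, but values are only comparable between models that share the same vocabulary and evaluation text.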
We would like to know the steps to follow to upload these models to the site. Do we have to test it on the NER task for some medical dataset in Spanish and show you the results?
To our knowledge, there is no Spanish Flair embedding model for the clinical domain yet :)
If you want to know more about what this corpus is about, you can see the paper published last year: https://www.aclweb.org/anthology/2020.clinicalnlp-1.32/