[textanalytics] add samples for custom text actions (#21182)
* add samples for custom text actions

* add samples + links to samples readme

* wording

* switch to asyncio.run in async samples
kristapratico authored Oct 15, 2021
1 parent 22f822a commit d4c7167
Showing 7 changed files with 580 additions and 0 deletions.
9 changes: 9 additions & 0 deletions sdk/textanalytics/azure-ai-textanalytics/samples/README.md
@@ -31,6 +31,9 @@ These sample programs show common scenarios for the Text Analytics client's offe
|[sample_analyze_healthcare_entities.py][analyze_healthcare_entities_sample] and [sample_analyze_healthcare_entities_async.py][analyze_healthcare_entities_sample_async]|Analyze healthcare entities|
|[sample_analyze_actions.py][analyze_sample] and [sample_analyze_actions_async.py][analyze_sample_async]|Run multiple analyses together in a single request|
|[sample_extract_summary.py][extract_summary_sample] and [sample_extract_summary_async.py][extract_summary_sample_async]|As part of the analyze API, run extractive text summarization on documents|
|[sample_recognize_custom_entities.py][recognize_custom_entities_sample] and [sample_recognize_custom_entities_async.py][recognize_custom_entities_sample_async]|Use a custom model to recognize custom entities in documents|
|[sample_single_category_classify.py][single_category_classify_sample] and [sample_single_category_classify_async.py][single_category_classify_sample_async]|Use a custom model to classify documents into a single category|
|[sample_multi_category_classify.py][multi_category_classify_sample] and [sample_multi_category_classify_async.py][multi_category_classify_sample_async]|Use a custom model to classify documents into multiple categories|

## Prerequisites
* Python 2.7, or 3.6 or later is required to use this package (3.6 or later if using asyncio)
@@ -101,6 +104,12 @@ what you can do with the Azure Text Analytics client library.
[sample_analyze_healthcare_entities_with_cancellation_async]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/textanalytics/azure-ai-textanalytics/samples/async_samples/sample_analyze_healthcare_entities_with_cancellation_async.py
[extract_summary_sample]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/textanalytics/azure-ai-textanalytics/samples/sample_extract_summary.py
[extract_summary_sample_async]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/textanalytics/azure-ai-textanalytics/samples/async_samples/sample_extract_summary_async.py
[recognize_custom_entities_sample]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/textanalytics/azure-ai-textanalytics/samples/sample_recognize_custom_entities.py
[recognize_custom_entities_sample_async]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/textanalytics/azure-ai-textanalytics/samples/async_samples/sample_recognize_custom_entities_async.py
[single_category_classify_sample]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/textanalytics/azure-ai-textanalytics/samples/sample_single_category_classify.py
[single_category_classify_sample_async]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/textanalytics/azure-ai-textanalytics/samples/async_samples/sample_single_category_classify_async.py
[multi_category_classify_sample]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/textanalytics/azure-ai-textanalytics/samples/sample_multi_category_classify.py
[multi_category_classify_sample_async]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/textanalytics/azure-ai-textanalytics/samples/async_samples/sample_multi_category_classify_async.py
[pip]: https://pypi.org/project/pip/
[azure_subscription]: https://azure.microsoft.com/free/
[azure_text_analytics_account]: https://docs.microsoft.com/azure/cognitive-services/cognitive-services-apis-create-account?tabs=singleservice%2Cwindows
@@ -0,0 +1,97 @@
# coding: utf-8

# -------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for
# license information.
# --------------------------------------------------------------------------

"""
FILE: sample_multi_category_classify_async.py
DESCRIPTION:
This sample demonstrates how to classify documents into multiple custom categories. Here we have a few
movie plot summaries that must be categorized into movie genres like Sci-Fi, Horror, Comedy, Romance, etc.
Classifying documents is available as an action type through the begin_analyze_actions API.
To train a model to classify your documents, see TODO
USAGE:
python sample_multi_category_classify_async.py
Set the environment variables with your own values before running the sample:
1) AZURE_TEXT_ANALYTICS_ENDPOINT - the endpoint to your Cognitive Services resource.
2) AZURE_TEXT_ANALYTICS_KEY - your Text Analytics subscription key
3) AZURE_TEXT_ANALYTICS_PROJECT_NAME - your Text Analytics Language Studio project name
4) AZURE_TEXT_ANALYTICS_DEPLOYMENT_NAME - your Text Analytics deployed model name
"""


import os
import asyncio


async def sample_classify_document_multi_categories_async():
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.textanalytics.aio import TextAnalyticsClient
    from azure.ai.textanalytics import MultiCategoryClassifyAction

    endpoint = os.environ["AZURE_TEXT_ANALYTICS_ENDPOINT"]
    key = os.environ["AZURE_TEXT_ANALYTICS_KEY"]
    project_name = os.environ["AZURE_TEXT_ANALYTICS_PROJECT_NAME"]
    deployed_model_name = os.environ["AZURE_TEXT_ANALYTICS_DEPLOYMENT_NAME"]

    text_analytics_client = TextAnalyticsClient(
        endpoint=endpoint,
        credential=AzureKeyCredential(key),
    )

    documents = [
        "In the not-too-distant future, Earth's dying sun spells the end for humanity. In a last-ditch effort to "
        "save the planet, a crew of eight men and women ventures into space with a device that could revive the "
        "star. However, an accident, a grave mistake and a distress beacon from a long-lost spaceship throw "
        "the crew and its desperate mission into a tailspin.",

        "Despite his family's generations-old ban on music, young Miguel dreams of becoming an accomplished "
        "musician like his idol Ernesto de la Cruz. Desperate to prove his talent, Miguel finds himself "
        "in the stunning and colorful Land of the Dead. After meeting a charming trickster named Héctor, "
        "the two new friends embark on an extraordinary journey to unlock the real story behind Miguel's "
        "family history"
    ]
    async with text_analytics_client:
        poller = await text_analytics_client.begin_analyze_actions(
            documents,
            actions=[
                MultiCategoryClassifyAction(
                    project_name=project_name,
                    deployment_name=deployed_model_name
                ),
            ],
        )

        pages = await poller.result()

        document_results = []
        async for page in pages:
            document_results.append(page)
    for doc, classification_results in zip(documents, document_results):
        for classification_result in classification_results:
            if not classification_result.is_error:
                classifications = classification_result.classifications
                print("The movie plot '{}' was classified as the following genres:\n".format(doc))
                for classification in classifications:
                    print("'{}' with confidence score {}.".format(
                        classification.category, classification.confidence_score
                    ))
            else:
                print("Movie plot '{}' has an error with code '{}' and message '{}'".format(
                    doc, classification_result.code, classification_result.message
                ))


async def main():
    await sample_classify_document_multi_categories_async()


if __name__ == '__main__':
    asyncio.run(main())
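The result-handling part of the sample above drains the async pager into a list and then pairs each input document with its list of action results. The following is a minimal offline sketch of that pattern, assuming nothing about the service itself: `FakeClassification`, `FakeResult`, `fake_pager`, `collect_pages`, `describe`, and `demo` are hypothetical stand-ins invented for illustration and are not part of azure-ai-textanalytics. Only the attribute names (`is_error`, `classifications`, `category`, `confidence_score`, `code`, `message`) are taken from the sample.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, List


@dataclass
class FakeClassification:
    # Hypothetical stand-in for one predicted category.
    category: str
    confidence_score: float


@dataclass
class FakeResult:
    # Hypothetical stand-in for one action result for one document.
    is_error: bool = False
    classifications: List[FakeClassification] = field(default_factory=list)
    code: str = ""
    message: str = ""


async def fake_pager(pages: List[List[FakeResult]]) -> AsyncIterator[List[FakeResult]]:
    # Mimics the async iterable that the sample gets from `await poller.result()`.
    for page in pages:
        yield page


async def collect_pages(pager) -> list:
    # Same effect as the sample's `async for` + `append` loop.
    return [page async for page in pager]


def describe(documents, document_results) -> List[str]:
    # Pair each document with its results, branching on is_error as the sample does.
    lines = []
    for doc, results in zip(documents, document_results):
        for result in results:
            if result.is_error:
                lines.append("{!r}: error {} ({})".format(doc, result.code, result.message))
            else:
                genres = ", ".join(c.category for c in result.classifications)
                lines.append("{!r}: {}".format(doc, genres))
    return lines


def demo() -> List[str]:
    documents = ["plot one", "plot two"]
    pages = [
        [FakeResult(classifications=[FakeClassification("Sci-Fi", 0.9),
                                     FakeClassification("Drama", 0.6)])],
        [FakeResult(is_error=True, code="InvalidDocument", message="empty text")],
    ]
    document_results = asyncio.run(collect_pages(fake_pager(pages)))
    return describe(documents, document_results)
```

Note that `zip` stops at the shorter of its inputs, so a mismatch between the number of documents and the number of result pages would be truncated silently rather than reported. `asyncio.run` requires Python 3.7 or later.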
@@ -0,0 +1,108 @@
# coding: utf-8

# -------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for
# license information.
# --------------------------------------------------------------------------

"""
FILE: sample_recognize_custom_entities_async.py
DESCRIPTION:
This sample demonstrates how to recognize custom entities in documents.
Recognizing custom entities is available as an action type through the begin_analyze_actions API.
To train a model to recognize your custom entities, see TODO
USAGE:
python sample_recognize_custom_entities_async.py
Set the environment variables with your own values before running the sample:
1) AZURE_TEXT_ANALYTICS_ENDPOINT - the endpoint to your Cognitive Services resource.
2) AZURE_TEXT_ANALYTICS_KEY - your Text Analytics subscription key
3) AZURE_TEXT_ANALYTICS_PROJECT_NAME - your Text Analytics Language Studio project name
4) AZURE_TEXT_ANALYTICS_DEPLOYMENT_NAME - your Text Analytics deployed model name
"""


import os
import asyncio


async def sample_recognize_custom_entities_async():
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.textanalytics.aio import TextAnalyticsClient
    from azure.ai.textanalytics import RecognizeCustomEntitiesAction

    endpoint = os.environ["AZURE_TEXT_ANALYTICS_ENDPOINT"]
    key = os.environ["AZURE_TEXT_ANALYTICS_KEY"]
    project_name = os.environ["AZURE_TEXT_ANALYTICS_PROJECT_NAME"]
    deployed_model_name = os.environ["AZURE_TEXT_ANALYTICS_DEPLOYMENT_NAME"]

    text_analytics_client = TextAnalyticsClient(
        endpoint=endpoint,
        credential=AzureKeyCredential(key),
    )

    document = [
        "The Grantor(s), John Smith, who also appears of record as John A. Smith, for and in consideration of "
        "Ten dollars and Zero cents ($10.00) and other good and valuable consideration in hand paid, conveys, and "
        "warrants to Jane Doe, the following described real estate, situated in the County of King, State of "
        "Washington: Lot A, King County Short Plat Number AAAAAAAA, recorded under Recording Number AAAAAAAAA in "
        "King County, Washington."
    ]

    async with text_analytics_client:
        poller = await text_analytics_client.begin_analyze_actions(
            document,
            actions=[
                RecognizeCustomEntitiesAction(
                    project_name=project_name,
                    deployment_name=deployed_model_name
                ),
            ],
        )

        document_results = await poller.result()

        async for result in document_results:
            custom_entities_result = result[0]  # first document, first result
            if not custom_entities_result.is_error:
                for entity in custom_entities_result.entities:
                    if entity.category == "Seller Name":
                        print("The seller of the property is {} with confidence score {}.".format(
                            entity.text, entity.confidence_score)
                        )
                    if entity.category == "Buyer Name":
                        print("The buyer of the property is {} with confidence score {}.".format(
                            entity.text, entity.confidence_score)
                        )
                    if entity.category == "Buyer Fee":
                        print("The buyer fee is {} with confidence score {}.".format(
                            entity.text, entity.confidence_score)
                        )
                    if entity.category == "Lot Number":
                        print("The lot number of the property is {} with confidence score {}.".format(
                            entity.text, entity.confidence_score)
                        )
                    if entity.category == "Short Plat Number":
                        print("The short plat number of the property is {} with confidence score {}.".format(
                            entity.text, entity.confidence_score)
                        )
                    if entity.category == "Recording Number":
                        print("The recording number of the property is {} with confidence score {}.".format(
                            entity.text, entity.confidence_score)
                        )
            else:
                print("...Is an error with code '{}' and message '{}'".format(
                    custom_entities_result.code, custom_entities_result.message
                ))


async def main():
    await sample_recognize_custom_entities_async()


if __name__ == '__main__':
    asyncio.run(main())
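The chain of `if entity.category == ...` checks in the sample above could also be expressed as a lookup table. The sketch below is one possible condensation and not the approach the commit takes; `FakeEntity`, `CATEGORY_LABELS`, and `describe_entities` are hypothetical names, while the category strings and message wording are copied from the sample. The categories a real deployment returns depend on how the custom project was labeled.

```python
from collections import namedtuple

# Hypothetical stand-in for an entity object; only the three attributes
# the sample reads are modeled here.
FakeEntity = namedtuple("FakeEntity", ["text", "category", "confidence_score"])

# Category name (as used in the sample) -> phrase used in the printed message.
CATEGORY_LABELS = {
    "Seller Name": "seller of the property",
    "Buyer Name": "buyer of the property",
    "Buyer Fee": "buyer fee",
    "Lot Number": "lot number of the property",
    "Short Plat Number": "short plat number of the property",
    "Recording Number": "recording number of the property",
}


def describe_entities(entities):
    lines = []
    for entity in entities:
        label = CATEGORY_LABELS.get(entity.category)
        if label is None:
            # Like the sample, skip categories that are not listed.
            continue
        lines.append("The {} is {} with confidence score {}.".format(
            label, entity.text, entity.confidence_score))
    return lines
```

With this shape, supporting another category means adding one dictionary entry instead of another `if` block.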
@@ -0,0 +1,91 @@
# coding: utf-8

# -------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for
# license information.
# --------------------------------------------------------------------------

"""
FILE: sample_single_category_classify_async.py
DESCRIPTION:
This sample demonstrates how to classify documents into a single custom category. Here we have several
support tickets that need to be classified as internet, printer, email or hardware issues.
Classifying documents is available as an action type through the begin_analyze_actions API.
To train a model to classify your documents, see TODO
USAGE:
python sample_single_category_classify_async.py
Set the environment variables with your own values before running the sample:
1) AZURE_TEXT_ANALYTICS_ENDPOINT - the endpoint to your Cognitive Services resource.
2) AZURE_TEXT_ANALYTICS_KEY - your Text Analytics subscription key
3) AZURE_TEXT_ANALYTICS_PROJECT_NAME - your Text Analytics Language Studio project name
4) AZURE_TEXT_ANALYTICS_DEPLOYMENT_NAME - your Text Analytics deployed model name
"""


import os
import asyncio


async def sample_classify_document_single_category_async():
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.textanalytics.aio import TextAnalyticsClient
    from azure.ai.textanalytics import SingleCategoryClassifyAction

    endpoint = os.environ["AZURE_TEXT_ANALYTICS_ENDPOINT"]
    key = os.environ["AZURE_TEXT_ANALYTICS_KEY"]
    project_name = os.environ["AZURE_TEXT_ANALYTICS_PROJECT_NAME"]
    deployed_model_name = os.environ["AZURE_TEXT_ANALYTICS_DEPLOYMENT_NAME"]

    text_analytics_client = TextAnalyticsClient(
        endpoint=endpoint,
        credential=AzureKeyCredential(key),
    )

    documents = [
        "My internet has stopped working. I tried resetting the router, but it just keeps blinking red.",
        "I submitted 3 jobs to print but the printer is unresponsive. I can't see it under my devices either.",
        "My computer will not boot. Pushing the power button does nothing - just a black screen.",
        "I seem to not be receiving all my emails on time. Emails from 2 days ago show up as just received.",
    ]

    async with text_analytics_client:
        poller = await text_analytics_client.begin_analyze_actions(
            documents,
            actions=[
                SingleCategoryClassifyAction(
                    project_name=project_name,
                    deployment_name=deployed_model_name
                ),
            ],
        )

        pages = await poller.result()

        document_results = []
        async for page in pages:
            document_results.append(page)

    for doc, classification_results in zip(documents, document_results):
        for classification_result in classification_results:
            if not classification_result.is_error:
                classification = classification_result.classification
                print("The document text '{}' was classified as '{}' with confidence score {}.".format(
                    doc, classification.category, classification.confidence_score)
                )
            else:
                print("Document text '{}' has an error with code '{}' and message '{}'".format(
                    doc, classification_result.code, classification_result.message
                ))


async def main():
    await sample_classify_document_single_category_async()


if __name__ == '__main__':
    asyncio.run(main())
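All three async samples read the same four environment variables with `os.environ[...]`, which raises a `KeyError` on the first missing name. A small pre-flight check could report every missing variable at once. This is only a sketch and is not part of the commit: `REQUIRED_VARS` and `load_settings` are hypothetical names, and only the variable names themselves come from the samples' docstrings.

```python
import os

# Variable names as listed in the USAGE section of each sample's docstring.
REQUIRED_VARS = (
    "AZURE_TEXT_ANALYTICS_ENDPOINT",
    "AZURE_TEXT_ANALYTICS_KEY",
    "AZURE_TEXT_ANALYTICS_PROJECT_NAME",
    "AZURE_TEXT_ANALYTICS_DEPLOYMENT_NAME",
)


def load_settings(environ=None):
    """Return the required settings, or raise listing every missing variable."""
    # Accepting a mapping makes the helper testable without touching os.environ.
    environ = os.environ if environ is None else environ
    # Treat unset and empty values alike as missing.
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

A sample could call `load_settings()` before constructing the client and pass the returned values on, so that a misconfigured shell fails with one readable message.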