[TextAnalytics] Added sample for Extractive Text Summarization #23097

Merged
73 changes: 73 additions & 0 deletions sdk/textanalytics/Azure.AI.TextAnalytics/README.md
@@ -8,6 +8,7 @@ Azure Cognitive Services Text Analytics is a cloud service that provides advance
* Linked Entity Recognition
* Healthcare Recognition
* Running multiple actions in one or more documents
* Extractive Text Summarization

[Source code][textanalytics_client_src] | [Package (NuGet)][textanalytics_nuget_package] | [API reference documentation][textanalytics_refdocs] | [Product documentation][textanalytics_docs] | [Samples][textanalytics_samples]

@@ -161,6 +162,7 @@ The following section provides several code snippets using the `client` [created
* [Recognize Entities Asynchronously](#recognize-entities-asynchronously)
* [Analyze Healthcare Entities Asynchronously](#analyze-healthcare-entities-asynchronously)
* [Run multiple actions Asynchronously](#run-multiple-actions-asynchronously)
* [Perform Extractive Text Summarization Asynchronously](#perform-extractive-text-summarization-asynchronously)

### Detect Language
Run a Text Analytics predictive model to determine the language that the passed-in document or batch of documents are written in.
@@ -710,6 +712,75 @@ This functionality allows running multiple actions in one or more documents. Act
}
```

### Perform Extractive Text Summarization Asynchronously
Get a summary for the input documents by extracting their most relevant sentences. Note that this API can only be used as part of an [Analyze Operation](#run-multiple-actions-asynchronously).

```C# Snippet:TextAnalyticsExtractSummaryWithoutErrorHandlingAsync
// Get input document.
// Review thread on the input document:
//   @kinelski (Member, Author), Aug 4, 2021: Adding monster-size input I stole from JS. Running this API for long documents seems more appropriate.
//   Reviewer (Member): Def a nit: I think we can achieve the same goal with a smaller doc, no? Maybe remove some paragraphs, at least for the main README. It just looks weird.
//   Author (Member): Created #23105 for tracking.
string document = @"Windows 365 was in the works before COVID-19 sent companies around the world on a scramble to secure solutions to support employees suddenly forced to work from home, but “what really put the firecracker behind it was the pandemic, it accelerated everything,” McKelvey said. She explained that customers were asking, “’How do we create an experience for people that makes them still feel connected to the company without the physical presence of being there?”
In this new world of Windows 365, remote workers flip the lid on their laptop, bootup the family workstation or clip a keyboard onto a tablet, launch a native app or modern web browser and login to their Windows 365 account.From there, their Cloud PC appears with their background, apps, settings and content just as they left it when they last were last there – in the office, at home or a coffee shop.
And then, when you’re done, you’re done.You won’t have any issues around security because you’re not saving anything on your device,” McKelvey said, noting that all the data is stored in the cloud.
The ability to login to a Cloud PC from anywhere on any device is part of Microsoft’s larger strategy around tailoring products such as Microsoft Teams and Microsoft 365 for the post-pandemic hybrid workforce of the future, she added. It enables employees accustomed to working from home to continue working from home; it enables companies to hire interns from halfway around the world; it allows startups to scale without requiring IT expertise.
“I think this will be interesting for those organizations who, for whatever reason, have shied away from virtualization.This is giving them an opportunity to try it in a way that their regular, everyday endpoint admin could manage,” McKelvey said.
The simplicity of Windows 365 won over Dean Wells, the corporate chief information officer for the Government of Nunavut. His team previously attempted to deploy a traditional virtual desktop infrastructure and found it inefficient and unsustainable given the limitations of low-bandwidth satellite internet and the constant need for IT staff to manage the network and infrastructure.
We didn’t run it for very long,” he said. “It didn’t turn out the way we had hoped.So, we actually had terminated the project and rolled back out to just regular PCs.”
He re-evaluated this decision after the Government of Nunavut was hit by a ransomware attack in November 2019 that took down everything from the phone system to the government’s servers. Microsoft helped rebuild the system, moving the government to Teams, SharePoint, OneDrive and Microsoft 365. Manchester’s team recruited the Government of Nunavut to pilot Windows 365. Wells was intrigued, especially by the ability to manage the elastic workforce securely and seamlessly.
“The impact that I believe we are finding, and the impact that we’re going to find going forward, is being able to access specialists from outside the territory and organizations outside the territory to come in and help us with our projects, being able to get people on staff with us to help us deliver the day-to-day expertise that we need to run the government,” he said.
“Being able to improve healthcare, being able to improve education, economic development is going to improve the quality of life in the communities.”";

// Prepare analyze operation input. You can add multiple documents to this list and perform the same
// operation to all of them.
var batchInput = new List<string>
{
    document
};

TextAnalyticsActions actions = new TextAnalyticsActions()
{
    ExtractSummaryActions = new List<ExtractSummaryAction>() { new ExtractSummaryAction() }
};

// Start analysis process.
AnalyzeActionsOperation operation = await client.StartAnalyzeActionsAsync(batchInput, actions);

await operation.WaitForCompletionAsync();

// View operation status.
Console.WriteLine($"AnalyzeActions operation has completed");
Console.WriteLine();

Console.WriteLine($"Created On : {operation.CreatedOn}");
Console.WriteLine($"Expires On : {operation.ExpiresOn}");
Console.WriteLine($"Id : {operation.Id}");
Console.WriteLine($"Status : {operation.Status}");
Console.WriteLine($"Last Modified: {operation.LastModified}");
Console.WriteLine();

// View operation results.
await foreach (AnalyzeActionsResult documentsInPage in operation.Value)
{
    IReadOnlyCollection<ExtractSummaryActionResult> summaryResults = documentsInPage.ExtractSummaryResults;

    foreach (ExtractSummaryActionResult summaryActionResults in summaryResults)
    {
        foreach (ExtractSummaryResult documentResults in summaryActionResults.DocumentsResults)
        {
            Console.WriteLine($" Extracted the following {documentResults.Sentences.Count} sentence(s):");
            Console.WriteLine();

            foreach (SummarySentence sentence in documentResults.Sentences)
            {
                Console.WriteLine($" Sentence: {sentence.Text}");
                Console.WriteLine($" Rank Score: {sentence.RankScore}");
                Console.WriteLine($" Offset: {sentence.Offset}");
                Console.WriteLine($" Length: {sentence.Length}");
                Console.WriteLine();
            }
        }
    }
}
```
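
Each `SummarySentence` printed above reports an `Offset` and a `Length`. Assuming these values index into the input string in UTF-16 code units (the natural unit for .NET strings; this is an assumption, not something the snippet states), a sentence can be mapped back to its position in the original document. The helper `SliceSentence` below is hypothetical and works on plain strings, so it can be tried without calling the service.

```C#
using System;

public static class SummaryOffsets
{
    // Hypothetical helper: returns the part of the document that an offset and length refer to.
    public static string SliceSentence(string document, int offset, int length)
    {
        // Reject ranges that do not fall entirely inside the document.
        if (offset < 0 || length < 0 || offset > document.Length - length)
        {
            throw new ArgumentOutOfRangeException(nameof(offset), "Offset and length must fall inside the document.");
        }

        return document.Substring(offset, length);
    }
}
```

With the results loop above, a call such as `SummaryOffsets.SliceSentence(document, sentence.Offset, sentence.Length)` should return the same text as `sentence.Text` if the assumption about the index unit holds.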

## Troubleshooting

### General
@@ -773,6 +844,7 @@ Samples are provided for each main functional area, and for each area, samples a
- [Recognize Linked Entities][recognize_linked_entities_sample]
- [Recognize Healthcare Entities][analyze_healthcare_sample]
- [Run multiple actions][analyze_operation_sample]
- [Perform Extractive Text Summarization][extract_summary_sample]

### Advanced samples
- [Analyze Sentiment with Opinion Mining][analyze_sentiment_opinion_mining_sample]
@@ -827,6 +899,7 @@ This project has adopted the [Microsoft Open Source Code of Conduct][code_of_con
[analyze_sentiment_sample]: https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample2_AnalyzeSentiment.md
[analyze_sentiment_opinion_mining_sample]: https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample2.1_AnalyzeSentimentWithOpinionMining.md
[extract_key_phrases_sample]: https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample3_ExtractKeyPhrases.md
[extract_summary_sample]: https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/textanalytics/Azure.AI.TextAnalytics/README.md
[recognize_entities_sample]: https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample4_RecognizeEntities.md
[recognize_pii_entities_sample]: https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample5_RecognizePiiEntities.md
[recognize_linked_entities_sample]: https://github.com/Azure/azure-sdk-for-net/tree/main/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample6_RecognizeLinkedEntities.md
@@ -0,0 +1,118 @@
# Perform Extractive Text Summarization in Documents
This sample demonstrates how to run an Extractive Text Summarization action in one or more documents. To get started you will need a Text Analytics endpoint and credentials. See [README][README] for links and instructions.

## Creating a `TextAnalyticsClient`

To create a new `TextAnalyticsClient` to extract summary sentences from a document, you need a Text Analytics endpoint and credentials. You can use the [DefaultAzureCredential][DefaultAzureCredential] to try a number of common authentication methods optimized for both running as a service and development. In the sample below, however, you'll use a Text Analytics API key credential by creating an `AzureKeyCredential` object, which lets you update the API key later, if needed, without creating a new client.

You can set `endpoint` and `apiKey` based on an environment variable, a configuration setting, or any way that works for your application.

```C# Snippet:CreateTextAnalyticsClient
string endpoint = "<endpoint>";
string apiKey = "<apiKey>";
var client = new TextAnalyticsClient(new Uri(endpoint), new AzureKeyCredential(apiKey));
```
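
As one way to apply the advice above, the following sketch reads the endpoint and key from environment variables and later rotates the key on the same credential. The variable names `TEXT_ANALYTICS_ENDPOINT` and `TEXT_ANALYTICS_API_KEY` are assumptions made for illustration; the SDK does not define them.

```C#
using System;
using Azure;
using Azure.AI.TextAnalytics;

// Hypothetical variable names; use whatever your deployment defines.
string endpoint = Environment.GetEnvironmentVariable("TEXT_ANALYTICS_ENDPOINT")
    ?? throw new InvalidOperationException("TEXT_ANALYTICS_ENDPOINT is not set.");
string apiKey = Environment.GetEnvironmentVariable("TEXT_ANALYTICS_API_KEY")
    ?? throw new InvalidOperationException("TEXT_ANALYTICS_API_KEY is not set.");

// Keep a reference to the credential so the key can be rotated later.
var credential = new AzureKeyCredential(apiKey);
var client = new TextAnalyticsClient(new Uri(endpoint), credential);

// Later, after the key has been regenerated, update it in place.
// The existing client picks up the new key without being recreated.
credential.Update("<newApiKey>");
```

Holding on to the `AzureKeyCredential` instance, rather than constructing it inline, is what makes the later `Update` call possible.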

## Performing extractive text summarization in one or multiple documents

To perform extractive text summarization in one or multiple documents, set up an `ExtractSummaryAction` and call `StartAnalyzeActionsAsync` on the documents. The result is a long-running operation of type `AnalyzeActionsOperation`, which polls the API for the results.

```C# Snippet:TextAnalyticsExtractSummaryAsync
// Get input document.
string document = @"Windows 365 was in the works before COVID-19 sent companies around the world on a scramble to secure solutions to support employees suddenly forced to work from home, but “what really put the firecracker behind it was the pandemic, it accelerated everything,” McKelvey said. She explained that customers were asking, “’How do we create an experience for people that makes them still feel connected to the company without the physical presence of being there?”
In this new world of Windows 365, remote workers flip the lid on their laptop, bootup the family workstation or clip a keyboard onto a tablet, launch a native app or modern web browser and login to their Windows 365 account.From there, their Cloud PC appears with their background, apps, settings and content just as they left it when they last were last there – in the office, at home or a coffee shop.
And then, when you’re done, you’re done.You won’t have any issues around security because you’re not saving anything on your device,” McKelvey said, noting that all the data is stored in the cloud.
The ability to login to a Cloud PC from anywhere on any device is part of Microsoft’s larger strategy around tailoring products such as Microsoft Teams and Microsoft 365 for the post-pandemic hybrid workforce of the future, she added. It enables employees accustomed to working from home to continue working from home; it enables companies to hire interns from halfway around the world; it allows startups to scale without requiring IT expertise.
“I think this will be interesting for those organizations who, for whatever reason, have shied away from virtualization.This is giving them an opportunity to try it in a way that their regular, everyday endpoint admin could manage,” McKelvey said.
The simplicity of Windows 365 won over Dean Wells, the corporate chief information officer for the Government of Nunavut. His team previously attempted to deploy a traditional virtual desktop infrastructure and found it inefficient and unsustainable given the limitations of low-bandwidth satellite internet and the constant need for IT staff to manage the network and infrastructure.
We didn’t run it for very long,” he said. “It didn’t turn out the way we had hoped.So, we actually had terminated the project and rolled back out to just regular PCs.”
He re-evaluated this decision after the Government of Nunavut was hit by a ransomware attack in November 2019 that took down everything from the phone system to the government’s servers. Microsoft helped rebuild the system, moving the government to Teams, SharePoint, OneDrive and Microsoft 365. Manchester’s team recruited the Government of Nunavut to pilot Windows 365. Wells was intrigued, especially by the ability to manage the elastic workforce securely and seamlessly.
“The impact that I believe we are finding, and the impact that we’re going to find going forward, is being able to access specialists from outside the territory and organizations outside the territory to come in and help us with our projects, being able to get people on staff with us to help us deliver the day-to-day expertise that we need to run the government,” he said.
“Being able to improve healthcare, being able to improve education, economic development is going to improve the quality of life in the communities.”";

// Prepare analyze operation input. You can add multiple documents to this list and perform the same
// operation to all of them.
var batchInput = new List<string>
{
    document
};

TextAnalyticsActions actions = new TextAnalyticsActions()
{
    ExtractSummaryActions = new List<ExtractSummaryAction>() { new ExtractSummaryAction() }
};

// Start analysis process.
AnalyzeActionsOperation operation = await client.StartAnalyzeActionsAsync(batchInput, actions);

await operation.WaitForCompletionAsync();
```

The returned `AnalyzeActionsOperation` contains general information about the status of the operation. This information can be requested while the operation is running or after it has completed. For example:

```C# Snippet:TextAnalyticsExtractSummaryOperationStatus
// View operation status.
Console.WriteLine($"AnalyzeActions operation has completed");
Console.WriteLine();

Console.WriteLine($"Created On : {operation.CreatedOn}");
Console.WriteLine($"Expires On : {operation.ExpiresOn}");
Console.WriteLine($"Id : {operation.Id}");
Console.WriteLine($"Status : {operation.Status}");
Console.WriteLine($"Last Modified: {operation.LastModified}");
Console.WriteLine();
```
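
Because the status can be read while the operation is still running, it is also possible to poll manually instead of calling `WaitForCompletionAsync`. The following is a minimal sketch rather than the sample's own code; the helper name `PollUntilCompletedAsync` and the two-second delay are assumptions chosen for illustration.

```C#
using System;
using System.Threading.Tasks;
using Azure.AI.TextAnalytics;

public static class OperationPolling
{
    // Hypothetical helper: refreshes the operation status until the service reports completion.
    public static async Task PollUntilCompletedAsync(AnalyzeActionsOperation operation)
    {
        while (!operation.HasCompleted)
        {
            // Ask the service for the latest state of the long-running operation.
            await operation.UpdateStatusAsync();

            Console.WriteLine($"Status       : {operation.Status}");
            Console.WriteLine($"Last Modified: {operation.LastModified}");

            if (!operation.HasCompleted)
            {
                // Arbitrary pause between polls; tune it for your workload.
                await Task.Delay(TimeSpan.FromSeconds(2));
            }
        }
    }
}
```

Manual polling is mainly useful when progress should be reported to the user; otherwise `WaitForCompletionAsync` is the simpler choice.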

To view the final results of the long-running operation:

```C# Snippet:TextAnalyticsExtractSummaryAsyncViewResults
// View operation results.
await foreach (AnalyzeActionsResult documentsInPage in operation.Value)
{
    IReadOnlyCollection<ExtractSummaryActionResult> summaryResults = documentsInPage.ExtractSummaryResults;

    foreach (ExtractSummaryActionResult summaryActionResults in summaryResults)
    {
        if (summaryActionResults.HasError)
        {
            Console.WriteLine($" Error!");
            Console.WriteLine($" Action error code: {summaryActionResults.Error.ErrorCode}.");
            Console.WriteLine($" Message: {summaryActionResults.Error.Message}");
            continue;
        }

        foreach (ExtractSummaryResult documentResults in summaryActionResults.DocumentsResults)
        {
            if (documentResults.HasError)
            {
                Console.WriteLine($" Error!");
                Console.WriteLine($" Document error code: {documentResults.Error.ErrorCode}.");
                Console.WriteLine($" Message: {documentResults.Error.Message}");
                continue;
            }

            Console.WriteLine($" Extracted the following {documentResults.Sentences.Count} sentence(s):");
            Console.WriteLine();

            foreach (SummarySentence sentence in documentResults.Sentences)
            {
                Console.WriteLine($" Sentence: {sentence.Text}");
                Console.WriteLine($" Rank Score: {sentence.RankScore}");
                Console.WriteLine($" Offset: {sentence.Offset}");
                Console.WriteLine($" Length: {sentence.Length}");
                Console.WriteLine();
            }
        }
    }
}
```
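
Each sentence in the results carries a `RankScore`. If only the highest-ranked sentences are needed, they can be selected on the client. The sketch below assumes that a higher `RankScore` means a more relevant sentence; the generic helper `TopByScore` is hypothetical and not part of the SDK.

```C#
using System;
using System.Collections.Generic;
using System.Linq;

public static class SummaryRanking
{
    // Hypothetical helper: returns up to 'count' items with the highest score, best first.
    public static IReadOnlyList<T> TopByScore<T>(IEnumerable<T> items, Func<T, double> score, int count)
    {
        return items
            .OrderByDescending(score)
            .Take(count)
            .ToList();
    }
}
```

Inside the results loop above, `SummaryRanking.TopByScore(documentResults.Sentences, s => s.RankScore, 3)` would return at most three sentences, ordered from the highest score to the lowest.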

For the full example source files, see:

* [Synchronously ExtractSummary](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/textanalytics/Azure.AI.TextAnalytics/README.md)
* [Asynchronously ExtractSummary](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/textanalytics/Azure.AI.TextAnalytics/README.md)
* [Synchronously ExtractSummary Convenience](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/textanalytics/Azure.AI.TextAnalytics/README.md)
* [Asynchronously ExtractSummary Convenience](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/textanalytics/Azure.AI.TextAnalytics/README.md)

[DefaultAzureCredential]: https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/identity/Azure.Identity/README.md
[README]: https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/textanalytics/Azure.AI.TextAnalytics/README.md
@@ -1,5 +1,5 @@
# Running multiple actions
This sample demonstrates how to run multiple actions in one or more documents. Actions include entity recognition, linked entity recognition, key phrase extraction, Personally Identifiable Information (PII) Recognition, and sentiment analysis. To get started you will need a Text Analytics endpoint and credentials. See [README][README] for links and instructions.
This sample demonstrates how to run multiple actions in one or more documents. Actions include entity recognition, linked entity recognition, key phrase extraction, Personally Identifiable Information (PII) Recognition, sentiment analysis, and extractive text summarization. To get started you will need a Text Analytics endpoint and credentials. See [README][README] for links and instructions.

## Creating a `TextAnalyticsClient`
