
Extended semantic segmentation to image segmentation #27039

Merged: 18 commits into huggingface:main, Nov 23, 2023

Conversation

merveenoyan (Contributor)

This PR extends the semantic segmentation guide to cover the other two segmentation types (except for the big fine-tuning part) and compares them. cc @NielsRogge as discussed

HuggingFaceDocBuilderDev commented Oct 24, 2023

The documentation is not available anymore as the PR was closed or merged.

```py
results = panoptic_segmentation(Image.open(image))
results
```

As you can see below, every pixel gets classified and there are multiple instances for car again.
Member

How can we see in this output that every pixel is classified?

merveenoyan (Author)

I fixed this sentence.

Comment on lines +186 to +188
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/segmentation-comparison.png" alt="Segmentation Maps Compared"/>
</div>
Member

I'd maybe use the same order you used in the exposition: Reference, Semantic Segmentation, Instance Segmentation, Panoptic Segmentation.

The Instance Segmentation Output appears to contain more classes than "car" and "person", but the model output above didn't. Perhaps we could make it consistent?

merveenoyan (Author)

Surprisingly, that building is classified as a car, and this is one of the best (maybe the best) instance segmentation models on the Hub (Mask2Former). I'd rather not modify it?

amyeroberts (Collaborator) left a comment

Thanks for adding this!

+1 to all of @pcuenca's comments

MKhalusova (Contributor) left a comment

I agree with the comments from @pcuenca and @amyeroberts. We probably should also add a couple of headers.
Right now, the right-side navigation looks like this:

## Load SceneParse150 dataset
## Preprocess
## Evaluate
## Train
## Inference

I would suggest the following structure:

## Types of segmentation
## Fine-tune a semantic segmentation model
### Load SceneParse150 dataset
### Preprocess
### Evaluate
### Train
### Inference

Also, in the inference example at the end of the fine-tuning section, we can probably leave only the example of doing inference manually, since we already show inference examples with a pipeline at the beginning of the doc.

merveenoyan (Author)

I addressed all the comments. Sorry, I deprioritized it for a while.

amyeroberts (Collaborator) left a comment

Thanks! Just one small nit

## Types of Segmentation

Semantic segmentation assigns a label or class to every single pixel in an image. Let's take a look at a semantic segmentation model output. It will assign the same class to every instance of an object it comes across in an image, for example, all cats will be labeled as "cat" instead of "cat-1", "cat-2".
We can use transformers' image segmentation pipeline to quickly infer a semantic segmentation model. Let's take a look at the example image.
Contributor

Suggested change
- We can use transformers' image segmentation pipeline to quickly infer a semantic segmentation model. Let's take a look at the example image.
+ We can use Transformers' image segmentation pipeline to quickly infer with a semantic segmentation model called [SegFormer](model_doc/segformer). Let's take a look at the example image.

Not sure the link here will work

Member

I think it would
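
For context, a minimal sketch of the pipeline-based semantic segmentation inference being discussed; the SegFormer checkpoint and example image URL below are illustrative assumptions, not necessarily what the final guide uses:

```py
from transformers import pipeline
from PIL import Image
import requests

# Illustrative example image and SegFormer checkpoint; the guide may use different ones.
url = "https://huggingface.co/datasets/hf-internal-testing/fixtures_ade20k/resolve/main/ADE_val_00000001.jpg"
image = Image.open(requests.get(url, stream=True).raw)

semantic_segmentation = pipeline(
    "image-segmentation", "nvidia/segformer-b0-finetuned-ade-512-512"
)
results = semantic_segmentation(image)

# Every instance of a class shares a single entry: one label with one mask
# covering all pixels of that class (e.g. all cars end up in a single "car" mask).
print([result["label"] for result in results])
```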

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/semantic_segmentation_output.png" alt="Semantic Segmentation Output"/>
</div>

In instance segmentation, the goal is not to classify every pixel, but to predict a mask for **every instance of an object** in a given image. We will use [facebook/mask2former-swin-large-cityscapes-instance](https://huggingface.co/facebook/mask2former-swin-large-cityscapes-instance) for this.
Contributor

Suggested change
- In instance segmentation, the goal is not to classify every pixel, but to predict a mask for **every instance of an object** in a given image. We will use [facebook/mask2former-swin-large-cityscapes-instance](https://huggingface.co/facebook/mask2former-swin-large-cityscapes-instance) for this.
+ In instance segmentation, the goal is not to classify every pixel, but to predict a mask for **every instance of a class** in a given image. We will use [facebook/mask2former-swin-large-cityscapes-instance](https://huggingface.co/facebook/mask2former-swin-large-cityscapes-instance) for this.

I would add here that instance segmentation is very similar to object detection: you want to get a set of instances out of your image. The only difference between object detection and instance segmentation is that the former gets a bounding box per instance, whereas the latter gets a binary mask per instance.

Contributor

That's a really nice way to build intuition for instance segmentation!

merveenoyan (Author)

Thanks! I addressed it
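
To complement the quoted text, here's a minimal sketch of running the instance segmentation checkpoint named above through the same pipeline; `image` is assumed to be a PIL image already loaded as in the earlier sketch:

```py
from transformers import pipeline

# facebook/mask2former-swin-large-cityscapes-instance is the checkpoint referenced
# in the guide; `image` is assumed to be a PIL.Image loaded earlier.
instance_segmentation = pipeline(
    "image-segmentation", "facebook/mask2former-swin-large-cityscapes-instance"
)
results = instance_segmentation(image)

# Each detected instance gets its own entry and binary mask, so the same label
# (e.g. "car") can appear several times, unlike in semantic segmentation.
for result in results:
    print(result["label"], result["score"])
```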

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/instance_segmentation_output.png" alt="Instance Segmentation Output"/>
</div>

Panoptic segmentation combines semantic segmentation and instance segmentation, where every pixel is classified, and there are multiple masks for each instance of a class. We can use [facebook/mask2former-swin-large-cityscapes-panoptic](https://huggingface.co/facebook/mask2former-swin-large-cityscapes-panoptic) for this.
Contributor

Suggested change
- Panoptic segmentation combines semantic segmentation and instance segmentation, where every pixel is classified, and there are multiple masks for each instance of a class. We can use [facebook/mask2former-swin-large-cityscapes-panoptic](https://huggingface.co/facebook/mask2former-swin-large-cityscapes-panoptic) for this.
+ Panoptic segmentation combines semantic segmentation and instance segmentation, where every pixel is classified, and there are multiple masks for each instance of a class. We can use [facebook/mask2former-swin-large-cityscapes-panoptic](https://huggingface.co/facebook/mask2former-swin-large-cityscapes-panoptic) for this.

Panoptic segmentation technically assigns 2 labels per pixel: a semantic category and an instance ID.
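
Tying this back to the snippet quoted near the top (`results = panoptic_segmentation(Image.open(image))`), a minimal sketch of panoptic inference with the checkpoint mentioned above; `image` is again an assumed, already-loaded example image:

```py
from collections import Counter

from transformers import pipeline

# Checkpoint taken from the guide; `image` is an assumed PIL.Image.
panoptic_segmentation = pipeline(
    "image-segmentation", "facebook/mask2former-swin-large-cityscapes-panoptic"
)
results = panoptic_segmentation(image)

# The output mixes "stuff" classes (road, sky, ...) with one mask per "thing"
# instance (each car, each person), so every pixel is covered while labels
# such as "car" can still repeat.
label_counts = Counter(result["label"] for result in results)
print(label_counts)
```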

MKhalusova (Contributor) left a comment

Thanks for iterating on the guide! LGTM, only one minor nit: the notebook login and pip install section appears twice in the guide.

Comment on lines 29 to 41
Before you begin, make sure you have all the necessary libraries installed:

```bash
pip install -q datasets transformers evaluate
```

We encourage you to log in to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to log in:

```py
>>> from huggingface_hub import notebook_login

>>> notebook_login()
```
Contributor

This paragraph is repeated later in the fine-tuning section. It's probably best to have this information only once. I would suggest leaving the library installation instructions here (as we need the libraries installed for the inference examples to work), but the notebook login makes more sense in the fine-tuning section.

merveenoyan (Author)

Thanks a lot for letting me know, addressed this!

merveenoyan (Author)

@MKhalusova can you merge if you approve? 👉 👈 🥹

Member

Maybe rename file/URL to image_segmentation.md, for consistency with the contents. (Also in the yaml, of course) :)

merveenoyan (Author)

Done!

merveenoyan (Author) commented Nov 20, 2023

The doc tests check is still looking for semantic_segmentation.md although I've put a redirection in place 🧐
[Screenshot: 2023-11-20 at 17:42:47]

MKhalusova (Contributor)

We need green CI checks before we can merge. It looks like the utils/check_doctest_list.py check is failing. You can run it locally to debug and fix.
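
For anyone following along, the failing check can be reproduced locally from the root of a transformers clone with something like the following (a sketch; the exact invocation may vary across versions):

```bash
# Run the same doctest-list consistency check that CI runs
python utils/check_doctest_list.py
```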

pcuenca (Member) commented Nov 20, 2023

@MKhalusova it's my fault, I suggested renaming the file for consistency. I opened this PR to @merveenoyan's repo, which fixes the problem locally. It also fixes an issue with check_task_guides.py. Please let me know if there's a better way to deal with this kind of situation :)

I also saw a semantic_segmentation.md in the ko translation, but refrained from renaming it because there's no _redirects.yml for Korean. Should we add it?

MKhalusova (Contributor)

@pcuenca Generally, we tend to avoid renaming/removing files. I have no experience safely doing so, perhaps @stevhliu could advise?

pcuenca (Member) commented Nov 20, 2023

It's not a big deal, we can just go back to @merveenoyan's version before the rename if that's simpler.

mishig25 (Contributor)

Regarding redirects: take perf_infer_gpu_many: perf_infer_gpu_one as an example. There is a chicken-and-egg situation because hf.co/docs uses the _redirects.yml that is currently on main. Unless this PR gets merged, hf.co/docs will not redirect perf_infer_gpu_many to perf_infer_gpu_one for now.

merveenoyan (Author)

@MKhalusova according to Mishig's response, the check will stay red until we merge and only turn green afterwards, so maybe you can make the call in this case.

pcuenca (Member) commented Nov 21, 2023

There are two different things here.

Given the increased complexity and that @MKhalusova said we generally try to avoid renames, I'd suggest we remove the rename and keep the same filename it had before. Sorry for introducing noise!

merveenoyan (Author)

Can this be merged by someone with write access?

amyeroberts (Collaborator)

I can merge it :)

amyeroberts (Collaborator) left a comment

Thanks for iterating!

amyeroberts merged commit baabd38 into huggingface:main on Nov 23, 2023
8 checks passed