Add object detection capability and python API #3472

Merged 56 commits on Jun 8, 2021

@alonfaraj alonfaraj commented Mar 16, 2021

About

This PR adds the capability to detect objects with Unreal.
It supports setting a search radius around the camera and filtering objects by name in wildcard format.
These settings can be controlled separately for each combination of camera, image type, and vehicle.
It outputs the relevant information as described in DetectionInfo.
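
To illustrate how a radius limit and a wildcard name filter combine, here is a minimal standalone sketch. It does not use the PR's Unreal code; the `filter_objects` helper and the `scene` dictionary are hypothetical stand-ins, and distances are assumed to be in centimeters as in the API example.

```python
import fnmatch
import math


def filter_objects(objects, camera_pos, radius_cm, patterns):
    """Return sorted names of objects inside the radius that match any wildcard.

    objects: dict mapping name -> (x, y, z) position in cm (hypothetical layout).
    """
    selected = []
    for name, pos in objects.items():
        # Skip objects farther from the camera than the configured radius.
        if math.dist(camera_pos, pos) > radius_cm:
            continue
        # Keep the object if its name matches at least one wildcard pattern.
        if any(fnmatch.fnmatchcase(name, p) for p in patterns):
            selected.append(name)
    return sorted(selected)


scene = {
    "Car_1": (1000.0, 0.0, 0.0),
    "Car_2": (20000.0, 0.0, 0.0),  # 200 m away, outside an 80 m radius
    "Cylinder9_2": (440.0, 30.0, 102.0),
}
nearby_cars = filter_objects(scene, (0.0, 0.0, 0.0), 80 * 100, ["Car_*"])
print(nearby_cars)
```

Only objects that pass both checks are reported, which mirrors the per-camera radius and name settings described above.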

It currently supports only ImageType::Scene, but can be extended by attaching the DetectionComponent in BP_PIPCamera to the relevant camera and adding the corresponding lines in APIPCamera::PostInitializeComponents.

  • Detection APIs implementation:

    • simSetDetectionFilterRadius
    • simAddDetectionFilterMeshName
    • simClearDetectionMeshNames
    • simGetDetections
  • Detection struct:

class DetectionInfo(MsgpackMixin):
    name = ''
    geoPoint = GeoPoint()
    box2D = Box2D()
    box3D = Box3D()
    relative_pose = Pose()

TODO:

  • Implement some get API functions
  • Add Enable/Disable Detection capability API or from settings.json

Most of the detection and object filter code was copied from https://github.com/unrealgt/UnrealGT and changed for my own needs.

There is probably much more work to do, but I have been using it for a while and thought other users might find it useful.

How Has This Been Tested?

Tested on Blocks and ModularNeighborhood environments by running the detection python script in this PR (Windows).

Example API Call -

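import airsim

client = airsim.VehicleClient()  # connect to the running simulator
client.confirmConnection()

camera_name = "0"
image_type = airsim.ImageType.Scene

client.simSetDetectionFilterRadius(camera_name, image_type, 80 * 100)  # radius in [cm]
client.simAddDetectionFilterMeshName(camera_name, image_type, "Car_*")  # wildcard on mesh names
detections = client.simGetDetections(camera_name, image_type)
client.simClearDetectionMeshNames(camera_name, image_type)  # remove the name filters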

Example output -

Cylinder: <DetectionInfo> {   'box2D': <Box2D> {   'max': <Vector2r> {   'x_val': 617.025634765625,
    'y_val': 583.5487060546875},
    'min': <Vector2r> {   'x_val': 485.74359130859375,
    'y_val': 438.33465576171875}},
    'box3D': <Box3D> {   'max': <Vector3r> {   'x_val': 4.900000095367432,
    'y_val': 0.7999999523162842,
    'z_val': 0.5199999809265137},
    'min': <Vector3r> {   'x_val': 3.8999998569488525,
    'y_val': -0.19999998807907104,
    'z_val': 1.5199999809265137}},
    'geo_point': <GeoPoint> {   'altitude': 16.979999542236328,
    'latitude': 32.28772183970703,
    'longitude': 34.864785008379876},
    'name': 'Cylinder9_2',
    'relative_pose': <Pose> {   'orientation': <Quaternionr> {   'w_val': 0.9929741621017456,
    'x_val': 0.0038591264747083187,
    'y_val': -0.11333247274160385,
    'z_val': 0.03381215035915375},
    'position': <Vector3r> {   'x_val': 4.400000095367432,
    'y_val': 0.29999998211860657,
    'z_val': 1.0199999809265137}}}
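
As a rough sketch of how the `box2D` field in this output could be consumed, the snippet below converts a min/max pixel box into center and size. The `SimpleNamespace` object is only a stand-in that mimics the attribute names shown in the output (with values rounded), and `box2d_to_xywh` is a hypothetical helper, not part of the AirSim API.

```python
from types import SimpleNamespace


def box2d_to_xywh(box2d):
    """Convert a min/max pixel box to (x_center, y_center, width, height)."""
    width = box2d.max.x_val - box2d.min.x_val
    height = box2d.max.y_val - box2d.min.y_val
    x_center = box2d.min.x_val + width / 2.0
    y_center = box2d.min.y_val + height / 2.0
    return x_center, y_center, width, height


# Stand-in for the Box2D printed above, rounded to two decimals.
box = SimpleNamespace(
    min=SimpleNamespace(x_val=485.74, y_val=438.33),
    max=SimpleNamespace(x_val=617.03, y_val=583.55),
)
x_center, y_center, width, height = box2d_to_xywh(box)
```

Dividing these values by the image width and height would give normalized coordinates, which some annotation formats expect.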

Screenshots (if appropriate):

  • Blocks
    Unreal: blocks_ue4
    Python: blocks_python

  • ModularNeighborhood
    Unreal: MN_ue4
    Python: MN_python

@alonfaraj alonfaraj changed the title Add object detection ability and python API Add object detection capability and python API Mar 16, 2021
@alonfaraj
Contributor Author

@rajat2004 any idea why checks failed for Unity build?

@rajat2004
Contributor

You'll need to add the unimplemented methods to the Unity WorldSimApi.cpp and .h files so that it compiles; just follow the format of the other methods.

There's also a conflict that needs to be fixed: #3477 renamed the file and every place it was used. You will need to fix the spelling of RpcLibAdaptors wherever it was added in this PR.

Contributor

@rajat2004 rajat2004 left a comment

Just a brief review. I haven't yet looked at the detection part of the code; I will do so later.

One main point is that AirSim uses 4 spaces instead of tabs, so it would be best to convert to stay consistent. Another is that snake case is used for variables, which would also be good to fix.

AirLib/include/api/RpcLibClientBase.hpp (outdated)
AirLib/src/api/RpcLibServerBase.cpp (outdated)
PythonClient/detection/detection.py (outdated)
PythonClient/detection/detection.py (outdated)
Unreal/Plugins/AirSim/Source/AirBlueprintLib.cpp (outdated)
Unreal/Plugins/AirSim/Source/AirBlueprintLib.cpp (outdated)
Unreal/Plugins/AirSim/Source/ObjectFilter.cpp (outdated)
Unreal/Plugins/AirSim/Source/PawnSimApi.h (outdated)
@evroon
Contributor

evroon commented Mar 25, 2021

I don't want to undermine your hard work, but I want to note that there is a simpler way of getting bounding boxes around objects such as cars by using the segmentation images. I explained it here. This way, you get pixel-perfect bounding boxes. By projecting a 3D box, you will always end up (except for objects that have the shape of a box of course) with bigger bounding boxes than optimal.
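
For readers unfamiliar with the segmentation-based approach, a minimal sketch of the idea is to take the extent of all pixels that carry a given object ID. The nested-list `mask` and the `bbox_from_mask` helper below are hypothetical illustrations, not code from the linked explanation.

```python
def bbox_from_mask(mask, object_id):
    """Return (x_min, y_min, x_max, y_max) of pixels equal to object_id, or None."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value == object_id:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # the object is not visible in this mask
    return min(xs), min(ys), max(xs), max(ys)


# Tiny segmentation mask: 0 is background, 7 is the object of interest.
mask = [
    [0, 0, 0, 0, 0],
    [0, 7, 7, 0, 0],
    [0, 7, 7, 7, 0],
    [0, 0, 0, 0, 0],
]
tight_box = bbox_from_mask(mask, 7)
```

Because the box is derived from visible pixels only, it is tight around what the camera sees, which is also why occluded parts of an object are not covered.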

@PPakalns

@evroon Bounding box retrieval from segmentation images works only when objects do not overlap in the camera view. If there is a forest with a lot of trees, or a car in front of another car, these tightly packed objects cannot be differentiated in segmentation images. So this bounding box feature is very welcome in cases such as these.

@evroon
Contributor

evroon commented Mar 25, 2021

@PPakalns Yes that's true, in case of occlusion "my" method does not work. But in my case I use it to train yolov4 for example and then you don't need such data AFAIK. I value correct bounding boxes more.
So I'm just curious what the use case is for having bounding boxes for (partially) occluded objects. Is it for training NNs or something else?

Btw I think you can also make my method work by taking multiple segmentation images of the same frame and changing the visibility of objects, but that is more complicated and less performant of course.

@PPakalns

PPakalns commented Mar 25, 2021

@evroon I will be using it for prototyping an object detection model where objects can be located tightly next to each other in the scene, like the standard case of people passing in front of each other. In my case, I am prototyping a survey drone where it is important to correctly recognize each separate object, and these objects can occlude each other a little. The segmentation image approach makes it hard to annotate such objects with separate bounding boxes because their regions overlap.

@alonfaraj At the moment, DetectionInfo returns only the object's geolocation. For generating annotated data it would be useful if the object's position and orientation relative to the camera could be returned as well. I will try to add that information myself :)

UPDATE: It looks like this information can be calculated by using the name returned in DetectionInfo together with the AirSim API calls that return object, vehicle, and camera poses.
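
A minimal sketch of that calculation, assuming both poses are already expressed in the same world frame with unit quaternions in (w, x, y, z) order. The helper names are hypothetical, and the frame conventions should be checked against the poses actually returned by the AirSim API.

```python
def quat_multiply(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def quat_conjugate(q):
    """Conjugate, which equals the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def relative_pose(cam_pos, cam_quat, obj_pos, obj_quat):
    """Express the object pose in the camera frame."""
    inv = quat_conjugate(cam_quat)
    # World-frame offset from camera to object, as a pure quaternion.
    offset = (0.0,) + tuple(o - c for o, c in zip(obj_pos, cam_pos))
    # Rotate the offset into the camera frame: q^-1 * v * q.
    rotated = quat_multiply(quat_multiply(inv, offset), cam_quat)
    return rotated[1:], quat_multiply(inv, obj_quat)


# Camera at the origin, yawed 90 degrees about z; object 5 units along world y.
half = 0.5 ** 0.5
cam_quat = (half, 0.0, 0.0, half)
position, orientation = relative_pose(
    (0.0, 0.0, 0.0), cam_quat, (0.0, 5.0, 0.0), cam_quat
)
```

In this example the object ends up 5 units straight ahead of the camera with an identity relative orientation, since it shares the camera's heading.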

Tomorrow I will test this code and see how it works. Thanks @alonfaraj for this implementation 🥇

@MoBaT

MoBaT commented Mar 26, 2021

@alonfaraj This is great! I was building the exact same thing when I stumbled upon this. I have a few suggestions:

  1. Can the simSetDetectionFilterRadius and simAddDetectionFilterMeshName APIs be changed to accept a camera name? I would like to have different detections and radii on different cameras. Looking at your implementation, it's a very easy addition. So I suggest:
client.simSetDetectionFilterRadius("0", 80 * 100) # in [cm]
client.simAddDetectionFilterMeshName("0", "Car_*")
  2. Change simAddDetectionFilterMeshName to accept a regex string instead of just a single wildcard, similar to the simListSceneObjects API.

  3. Add a simClearDetections("regex") call or, to make it easier, a simClearDetections() with no search criteria. I would like this for scenes with dynamic objects, where I want to poll for new objects at a certain frequency by clearing and re-adding detections.

  4. Modify simGetDetections to return a Detection that also includes the 3D bounding box info.
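
To make the wildcard versus regex distinction in these suggestions concrete, here is a small standalone sketch using Python's standard library. The object names are made up, and this only illustrates pattern semantics; it says nothing about how the Unreal side matches names.

```python
import fnmatch
import re

names = ["Car_1", "Car_Blue", "Cylinder9_2", "MyCar_3"]

# A shell-style wildcard can be translated into an equivalent regex.
wildcard_regex = re.compile(fnmatch.translate("Car_*"))
wildcard_matches = [n for n in names if wildcard_regex.match(n)]

# A regex can express constraints a single wildcard cannot,
# for example "Car_" followed by digits only.
digits_only = re.compile(r"Car_\d+")
regex_matches = [n for n in names if digits_only.fullmatch(n)]
```

Here the wildcard selects both `Car_1` and `Car_Blue`, while the regex narrows the selection to `Car_1`.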


@PPakalns PPakalns left a comment

Additionally, detection results for some objects flicker: the object is detected in some frames and not in others, even when the camera position has not changed. I will look into the cause.

Unreal/Plugins/AirSim/Source/ObjectFilter.cpp (outdated)
Unreal/Plugins/AirSim/Source/PawnSimApi.cpp (outdated)
@alonfaraj alonfaraj closed this Mar 29, 2021
@alonfaraj alonfaraj reopened this Mar 29, 2021
@alonfaraj
Contributor Author

alonfaraj commented Mar 29, 2021

@MoBaT Thanks for the suggestions!
I will probably make those changes soon.

About 4 - what is the purpose of 3D BB? Should it be in Geo as well?

@alonfaraj
Contributor Author

@PPakalns Thank you very much for testing and fixing those bugs!
I will make some fixes soon.

@evroon Seems like you already discussed it, but I like your approach too :)
As @evroon said, my scenario is mostly one where objects are partially occluded by others, and in this case I think it might be easier to use a "real" detection rather than a segmentation.

@alonfaraj
Contributor Author

@zimmy87 Seems like it's all set now.

Contributor

@zimmy87 zimmy87 left a comment

Mostly comments on naming guidelines

Unity/AirLibWrapper/AirsimWrapper/Source/PawnSimApi.cpp (outdated)
Unreal/Plugins/AirSim/Source/DetectionComponent.cpp (outdated)
Unreal/Plugins/AirSim/Source/DetectionComponent.cpp (outdated)
Unreal/Plugins/AirSim/Source/DetectionComponent.h (outdated)
Unreal/Plugins/AirSim/Source/ObjectFilter.cpp (outdated)
Unreal/Plugins/AirSim/Source/ObjectFilter.cpp (outdated)
AirLib/include/vehicles/car/api/CarRpcLibAdaptors.hpp (outdated)
@alonfaraj alonfaraj requested a review from zimmy87 June 6, 2021 09:14
@alonfaraj
Contributor Author

@zimmy87 Thanks for the review! Hopefully I didn't miss anything

adding setup_path.py for convenience in calling detection.py
Contributor

@zimmy87 zimmy87 left a comment

latest revision looks good to me; will move ahead with merge once all checks pass

@zimmy87 zimmy87 merged commit 83119a6 into microsoft:master Jun 8, 2021
@jonyMarino
Collaborator

Hi, @alonfaraj! Congratulations on this merged Pull Request. You are in the top 5 AirSim contributors! However, this contribution would have a much greater impact if it had associated documentation. Can you create a new PR with documentation?

@alonfaraj
Contributor Author

@jonyMarino Thank you!
Sure, no problem.

@rajat2004
Contributor

I should have mentioned this earlier: the API has an image_type argument, but the detection component is only present for Scene in PIPCamera.cpp. Different image types also don't really make sense, since the detections will be the same for all images. Should the image_type arg be kept at all?

@alonfaraj
Contributor Author

alonfaraj commented Jun 13, 2021

@rajat2004 you're right, this PR currently supports only Scene, as mentioned in the first post above, and it's easy to extend it to all other types. I thought it would be mostly relevant to Scene, so I didn't add it for the other types.

I added the image_type argument because different image types (for the same camera) can have different parameters such as resolution, FOV etc. which can lead to different detection results.
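
A minimal pinhole-projection sketch shows why resolution and FOV matter here: the same 3D point lands on different pixels when the FOV changes. The axis convention (x forward, y right, z down), the use of a horizontal FOV, and the `project_to_pixel` helper are all assumptions made for illustration, not taken from the PR's code.

```python
import math


def project_to_pixel(point, width, height, fov_deg):
    """Project a camera-frame point (x forward, y right, z down) to (u, v) pixels."""
    x, y, z = point
    if x <= 0:
        return None  # the point is behind the camera
    # Focal length in pixels derived from the horizontal field of view.
    focal = (width / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    u = width / 2.0 + focal * y / x
    v = height / 2.0 + focal * z / x
    return u, v


point = (4.0, 1.0, 0.0)  # 4 units ahead, 1 unit to the right
wide = project_to_pixel(point, 640, 480, 90.0)
narrow = project_to_pixel(point, 640, 480, 60.0)
```

With the narrower FOV the point moves farther from the image center, so a projected bounding box would grow and could even leave the frame.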

I'm wondering whether it is necessary to add it for all other types, or whether the image_type argument should be removed instead.

@rajat2004
Contributor

Yeah, makes sense to have different detections for each image type as well. Since the API already has the image_type arg, adding support for other types will be good

@LIU-Xueming

Hi, @alonfaraj
Thank you very much for your work, it is very helpful to me!

But could you please explain these parameters in detail, such as:
geoPoint = GeoPoint()
relative_pose = Pose()

I am confused about these two parameters

Thanks a lot!

@alonfaraj
Contributor Author

Thank you @LIU-Xueming,

geoPoint is the geographical coordinate of the detected object. It's relevant if you specify OriginGeopoint; you can read more about it here.
relative_pose is the position and orientation of the detected object, relative to the camera that generates the detection.

@zohaibjan

Hi, @alonfaraj

I have a quick question. How will you account for occluded objects? Is there a way to determine whether an object is occluded or not? I potentially want to exclude the bounding box of an object if it is occluded.

Thank you.
