
Add OpenVino Detector #3768

Merged 9 commits into blakeblackshear:dev on Dec 3, 2022

Conversation

@NateMeyer (Contributor) commented Sep 5, 2022

This PR is dependent on the Object Detector Abstraction change #3656.

This adds the OpenVINO runtime and a default model to the build image, along with a detector implementation that uses the OpenVINO framework. I have developed and tested this on my i5-1135G7, using the Xe iGPU.

The necessary configuration to run the included model is explained in the Detectors page of the documentation.
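For context, the core OpenVINO Python calls a detector like this wraps look roughly as follows. This is a minimal sketch assuming the bundled SSDLite model and the 2022.x openvino.runtime API, not the PR's code verbatim:

import numpy as np
from openvino.runtime import Core

core = Core()
# Compile the IR model for a device: "CPU", "GPU", or "AUTO"
compiled_model = core.compile_model("/openvino-model/ssdlite_mobilenet_v2.xml", "GPU")
infer_request = compiled_model.create_infer_request()

# Dummy 300x300 frame in NHWC layout; real frames come from the camera pipeline,
# and the expected dtype/layout depend on how the model was converted
frame = np.zeros((1, 300, 300, 3), dtype=np.float32)
infer_request.infer([frame])

# SSD-style output: [1, 1, N, 7] rows of [image_id, label, conf, xmin, ymin, xmax, ymax]
detections = infer_request.get_output_tensor().data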

TODO:

  • Build OpenVino Wheel for amd64
  • Build OpenVino Wheel for ARM (32 and 64)
  • Determine what is needed to run on VPU/MyriadX hardware
  • Rebase to dev branch
  • Update Documentation with supported hardware

  • Update default model (further model support can be reviewed/developed later)

@NateMeyer (Contributor, Author)

I have two questions about this:

  1. I've only tested the CPU and GPU devices with this. Is there someone who has an NCS2 that can try the VPU device?
  2. When looking through the docker build packages, I realized I don't know if or how this will build for the arm/arm64 architectures. I believe the CPU and GPU devices only work on Intel parts (i.e., the amd64 arch), but in theory the VPU should work on an arm board. I tried briefly to run a multiarch build, but it failed; I may need some more setup to get that to build.

@NickM-27 (Collaborator) commented Sep 7, 2022

  1. I've only tested the CPU and GPU devices with this. Is there someone who has an NCS2 that can try the VPU device?

I know a couple of users have mentioned they have one; not sure which of them will be able to test. You might want to make a discussion post about it.

  2. When looking through the docker build packages, I realized I don't know if or how this will build for the arm/arm64 architectures. I believe the CPU and GPU devices only work on Intel parts (i.e., the amd64 arch), but in theory the VPU should work on an arm board. I tried briefly to run a multiarch build, but it failed; I may need some more setup to get that to build.

It's not clear to me what you're asking. Are you saying the multiarch build fails somewhere? Are all the dependencies needed for the VPU in the ov package that is added?

If not, then I'd imagine you'd just need to add intel-opencl-icd to arm as well.

@NateMeyer (Contributor, Author)

  2. When looking through the docker build packages, I realized I don't know if or how this will build for the arm/arm64 architectures. I believe the CPU and GPU devices only work on Intel parts (i.e., the amd64 arch), but in theory the VPU should work on an arm board. I tried briefly to run a multiarch build, but it failed; I may need some more setup to get that to build.

It's not clear to me what you're asking. Are you saying the multiarch build fails somewhere? Are all the dependencies needed for the VPU in the ov package that is added?

If not, then I'd imagine you'd just need to add intel-opencl-icd to arm as well.

The intel-opencl-icd package is only needed for running on the Intel GPU, and is only available for the amd64 arch.

The multiarch build is failing for me; I assume it is just a setup issue on my end. Are there some instructions for running the multiarch build? From my quick googling, it looks like I need to set up qemu on my system to run the ARM builds.

I think the openvino python library should have the necessary pieces for using the VPU plugin, except for the driver, which would have to be installed on the host system. I would then expect the enumerated VPU device could be passed through to the container.
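For reference, passing the stick through would presumably be the usual USB device mapping in docker-compose. This is a sketch; the device path and the need for host udev rules are assumptions, not something verified in this PR:

services:
  frigate:
    devices:
      - /dev/bus/usb:/dev/bus/usb   # assumed NCS2 passthrough; host udev rules still required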

@NickM-27 (Collaborator) commented Sep 7, 2022

You will want to use https://github.com/blakeblackshear/frigate/blob/release-0.11.0/Makefile#L29 to build multiarch. If there is an error, you'll need to paste it here and I can advise.

@NateMeyer (Contributor, Author)

The error I get is:

 => ERROR [wheels 2/8] RUN apt-get -qq update     && apt-get -qq install -y     apt-transport-http  0.4s
 => CANCELED [ov-converter 2/8] RUN apt-get -qq update     && apt-get -qq install -y wget python3   0.5s
------
 > [wheels 2/8] RUN apt-get -qq update     && apt-get -qq install -y     apt-transport-https     gnupg     wget     && apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 9165938D90FDDD2E     && echo "deb http://raspbian.raspberrypi.org/raspbian/ bullseye main contrib non-free rpi" | tee /etc/apt/sources.list.d/raspi.list     && apt-get -qq update     && apt-get -qq install -y     python3     python3-dev     wget     build-essential cmake git pkg-config libgtk-3-dev     libavcodec-dev libavformat-dev libswscale-dev libv4l-dev     libxvidcore-dev libx264-dev libjpeg-dev libpng-dev libtiff-dev     gfortran openexr libatlas-base-dev libssl-dev    libtbb2 libtbb-dev libdc1394-22-dev libopenexr-dev     libgstreamer-plugins-base1.0-dev libgstreamer1.0-dev     gcc gfortran libopenblas-dev liblapack-dev:
#0 0.355 exec /bin/sh: exec format error
------
error: failed to solve: executor failed running [/bin/sh -c apt-get -qq update     && apt-get -qq install -y     apt-transport-https     gnupg     wget     && apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 9165938D90FDDD2E     && echo "deb http://raspbian.raspberrypi.org/raspbian/ bullseye main contrib non-free rpi" | tee /etc/apt/sources.list.d/raspi.list     && apt-get -qq update     && apt-get -qq install -y     python3     python3-dev     wget     build-essential cmake git pkg-config libgtk-3-dev     libavcodec-dev libavformat-dev libswscale-dev libv4l-dev     libxvidcore-dev libx264-dev libjpeg-dev libpng-dev libtiff-dev     gfortran openexr libatlas-base-dev libssl-dev    libtbb2 libtbb-dev libdc1394-22-dev libopenexr-dev     libgstreamer-plugins-base1.0-dev libgstreamer1.0-dev     gcc gfortran libopenblas-dev liblapack-dev]: exit code: 1
make: *** [Makefile:25: arm64] Error 1

when it gets to the first RUN step in the arm64 target.

It is complaining that it can't run the arm binaries in the arm image. I installed qemu and followed the setup at https://github.com/multiarch/qemu-user-static, which got me past this first error. To prove out my build system, I built the release-0.11.0 branch, which successfully built each architecture. Finally, I had to create a buildx builder to create the multiarch image at the end, with docker buildx create --name localBuilder --use. After all that, the build make target finishes.
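Collected in one place, the setup steps described above (the first command is from the multiarch/qemu-user-static README):

# Register qemu binfmt handlers so arm binaries run under emulation
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
# Create and select a buildx builder for assembling the multiarch image
docker buildx create --name localBuilder --use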

Back on my openvino branch, I now get to the next issue: the Python package available from PyPI only supports the amd64 architecture.

In order to support OpenVINO on arm, it looks like we'll have to add another stage to the build to compile the runtime from source, until someone publishes a python package. The repository from Intel will build on ARM, but only supports the VPU device: OpenVINO Build Instructions for Raspbian OS.

There is another plugin, maintained by OpenCV, that supports the ARM CPU for OpenVINO, but I'm not sure that will bring much value to this project: OpenVINO ARM CPU Plugin Repo.

At this point, I think I'll keep this to amd64 only. If there is a strong desire to build the openvino runtime on arm, we can look at adding it then.

@NickM-27 (Collaborator) commented Sep 8, 2022

Yeah, the NCS2 is already end-of-life, so I don't know if it makes sense to worry about it.

@NickM-27 (Collaborator) commented Sep 8, 2022

By the way, what inference times are you seeing?

@NateMeyer (Contributor, Author)

The model benchmark in the OpenVINO toolbox claims my GPU will do 263 fps with this model. The debug page is showing an inference speed of 8-12 ms.

Is this just measuring latency? My GPU is barely loaded during this, but I am only running a single camera at the moment. Throughput could be higher than the inference speed suggests if the async requests or batching were tuned.
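For what it's worth, OpenVINO's async API is the usual way to trade latency for throughput. A rough sketch with the 2022.x openvino.runtime API, illustrative only and not something this PR implements:

import numpy as np
from openvino.runtime import AsyncInferQueue, Core

core = Core()
# The THROUGHPUT hint lets the plugin pick stream/batch settings for utilization
compiled = core.compile_model("model.xml", "GPU", {"PERFORMANCE_HINT": "THROUGHPUT"})

results = []
queue = AsyncInferQueue(compiled, jobs=4)  # four in-flight requests
queue.set_callback(lambda req, _: results.append(req.get_output_tensor().data.copy()))

frames = [np.zeros((1, 300, 300, 3), dtype=np.float32) for _ in range(16)]
for f in frames:
    queue.start_async([f])
queue.wait_all()  # all requests complete; results holds 16 output tensors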

@NickM-27 (Collaborator) commented Sep 8, 2022

The debug page shows the average time from starting a detection to getting a result. Maybe with more cameras it would be loaded higher, but that is quite good.

@yeahme49 (Contributor)

Just wanted to say I tested this on my i5-4590 using the CPU plugin, and I'm getting 15-18 ms with 10-30% CPU usage (2 cores on a Proxmox LXC container). Much better than the built-in CPU detector, where I would get around 80-100 ms (sometimes as high as 200 ms) and 80% CPU usage. I believe my iGPU is unsupported, so I can't test that (plus I think it's disabled due to having a dedicated GPU installed).

This plugin gets about the same, if not slightly better, speed as the TensorRT CUDA plugin with my P400 card (#2548 / #3016).

I definitely hope this gets merged into Frigate, awesome job on this!

@NickM-27 (Collaborator)

(plus I think it's disabled due to having a dedicated GPU installed).

Assuming you're running on Linux, it doesn't work that way; they'll both work just fine together. What error were you getting when trying it with the GPU?

@NateMeyer We will definitely need documentation on which Intel iGPUs are supported for this.

@NateMeyer (Contributor, Author)

Yeah, it should support these GPUs per Intel's "Supported Devices" documentation, but we can be explicit about it too.

This still needs either ARM support for openvino or more platform checks in the detector code. As it is, it will crash on the arm64 and armv7 builds.
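A guard of the sort mentioned could be as simple as the following hypothetical check (illustrative; not what the PR ended up shipping):

import platform

def assert_openvino_supported() -> None:
    # Fail fast on arm builds where the openvino wheel isn't installed
    if platform.machine() not in ("x86_64", "AMD64"):
        raise RuntimeError("OpenVINO detector is only available in amd64 images")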

@yeahme49 (Contributor)

(plus I think it's disabled due to having a dedicated GPU installed).

Assuming you're running on Linux, it doesn't work that way; they'll both work just fine together. What error were you getting when trying it with the GPU?

@NateMeyer We will definitely need documentation on which Intel iGPUs are supported for this.

frigatevino    | Process detector:vino:
frigatevino    | Traceback (most recent call last):
frigatevino    |   File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
frigatevino    |     self.run()
frigatevino    |   File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
frigatevino    |     self._target(*self._args, **self._kwargs)
frigatevino    |   File "/opt/frigate/frigate/object_detection.py", line 121, in run_detector
frigatevino    |     object_detector = LocalObjectDetector(
frigatevino    |   File "/opt/frigate/frigate/object_detection.py", line 66, in __init__
frigatevino    |     self.detect_api = OvDetector(
frigatevino    |   File "/opt/frigate/frigate/detectors/openvino.py", line 16, in __init__
frigatevino    |     self.interpreter = self.ov_core.compile_model(
frigatevino    |   File "/usr/local/lib/python3.9/dist-packages/openvino/runtime/ie_api.py", line 266, in compile_model
frigatevino    |     super().compile_model(model, device_name, {} if config is None else config)
frigatevino    | RuntimeError: Failed to create plugin /usr/local/lib/python3.9/dist-packages/openvino/libs/libopenvino_intel_gpu_plugin.so for device GPU
frigatevino    | Please, check your environment
frigatevino    | [CLDNN ERROR]. clGetPlatformIDs error -1001

If I use the OpenVINO hello_query_device.py sample, it only shows CPU, no GPU device:

[ INFO ] Available devices:
[ INFO ] CPU :
[ INFO ]        SUPPORTED_PROPERTIES:
[ INFO ]                AVAILABLE_DEVICES:
[ INFO ]                RANGE_FOR_ASYNC_INFER_REQUESTS: 1, 1, 1
[ INFO ]                RANGE_FOR_STREAMS: 1, 2
[ INFO ]                FULL_DEVICE_NAME: Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz
[ INFO ]                OPTIMIZATION_CAPABILITIES: FP32, FP16, INT8, BIN, EXPORT_IMPORT
[ INFO ]                CACHE_DIR:
[ INFO ]                NUM_STREAMS: 1
[ INFO ]                AFFINITY: UNSUPPORTED TYPE
[ INFO ]                INFERENCE_NUM_THREADS: 0
[ INFO ]                PERF_COUNT: False
[ INFO ]                INFERENCE_PRECISION_HINT: UNSUPPORTED TYPE
[ INFO ]                PERFORMANCE_HINT: UNSUPPORTED TYPE
[ INFO ]                PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]

I also tried running one of OpenVINO's docker images that has all the runtimes and drivers installed, and I get the same result. It looks like Intel NEO compute-runtime compatibility is required, and that is Broadwell and above (https://github.com/intel/compute-runtime#supported-platforms).
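For reference, the same check without the full sample boils down to one call (a sketch, assuming the 2022.x API):

from openvino.runtime import Core

# Prints e.g. ['CPU'] on pre-Broadwell parts, and ['CPU', 'GPU'] once the
# NEO compute runtime can drive the iGPU
print(Core().available_devices)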

@NickM-27 (Collaborator)

@yeahme49 Perfect, thanks; that will be helpful for our docs.

@NickM-27 (Collaborator)

@NateMeyer Looks like we'll want to update the docs to say Intel GPUs are supported 5th gen and up.

@NateMeyer (Contributor, Author)

Intel's documentation looks like it's 6th gen and up. I'll get this info added after we settle the NCS2 and ARM support.

@NickM-27 (Collaborator)

Intel's documentation looks like it's 6th gen and up. I'll get this info added after we settle the NCS2 and ARM support.

Sounds good, seems like you're making headway with it.

@titilambert

Hello, I have an NCS2. I guess I can run a test.
Could you provide a configuration example?
Thanks

@NateMeyer (Contributor, Author) commented Sep 23, 2022

Hello, I have an NCS2. I guess I can run a test. Could you provide a configuration example? Thanks

Hi @titilambert, there are some instructions in the discussion topic #3797. Just add the detector and model blocks to your normal config, and point to my test image. Post a comment over there and I can help with any other questions. Thanks!
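For anyone else landing here, the detector and model blocks take roughly this shape. This is a sketch mirroring the defaults for the bundled SSDLite model; treat the values as assumptions and defer to the Detectors documentation:

detectors:
  ov:
    type: openvino
    device: AUTO            # or CPU / GPU / MYRIAD for the NCS2
    model:
      path: /openvino-model/ssdlite_mobilenet_v2.xml

model:
  width: 300
  height: 300
  input_tensor: nhwc
  input_pixel_format: bgr
  labelmap_path: /openvino-model/coco_91cl_bkgr.txt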

@CarlosML

@NateMeyer, thank you very much for your work! I have been waiting to be able to use OpenVINO with Frigate, since in my country it is impossible to get a Google Coral USB. Also, when previously using the OpenVINO object detection demo, I noticed that NanoDet models were not giving me the false positives that I am getting with ssdlite. Since I wanted to test NanoDet with Frigate but don't know anything about how OpenVINO or Frigate works internally, I modified the code to add the Model API from OpenVINO's Open Model Zoo, as it abstracts adapting the tensor to the model and processing its output. No doubt it is inefficient, not least because the tensor_transform function explicitly says it only works with B,H,W,C permutations while the NanoDet models need B,C,H,W; yet when passing the tensor to the Model API model adapter, inference is performed correctly anyway. Even with the very anemic GPU of a Celeron J4025 I get 65-70 ms latency with nanodet-plus-m-1.5x-416 (with nanodet-m-1.5x-416 I had slightly lower latency, 50-55 ms), using OpenVINO 2022.2.0.

In case anyone is interested in trying NanoDet, this gist has the diffs of the modifications I made. In config.yml you pass model_type as "NanoDet", "NanoDetPlus", or "SSD", and you have to mount openvino-model externally in docker-compose.yml to pass in the corresponding IRs (NanoDet uses a different label file, coco_80cl.txt). I tried to modify the Dockerfile to perform the conversion and add it to the Docker image, but when installing the OpenVINO Python dependencies for the PyTorch model conversion, it crashes because of missing libraries (libGL).

I think it is a great advantage for Frigate to support OpenVINO because, besides supporting several generations of iGPUs, the latest version announces Intel Arc support.

@NateMeyer (Contributor, Author)

Hi @CarlosML, I'm glad you've found it useful! I played with a few different models when I was working on this, but I don't think I tried NanoDet. My intent with this detector is that you convert a model to the IR format and configure the input format in the config (BCHW vs. BHWC, etc.). I haven't yet added configurable processing on the detection output; you've worked around this by using the Model API from OMZ. That looks like a neat API, and I hadn't considered using it before.

Ideally, we could have the shape of the results be configurable and generic enough that the description in the model config would apply to different detector frameworks (e.g., TensorFlow Lite, RKNN, etc.). I'm open to suggestions on how that would look. SSD, NanoDet, and YOLO have fairly different output tensors that need to be parsed a bit differently. Perhaps someone could add a model-parsing utility with common functions to get labeled detection boxes from the various model types; I think these could be done in a detector-agnostic way.
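As a starting point, a detector-agnostic parser for SSD-style outputs could be as small as this hypothetical sketch (names are illustrative, not from this PR):

import numpy as np

def parse_ssd_output(output: np.ndarray, threshold: float = 0.5):
    # Turn an SSD [1, 1, N, 7] tensor into (label, score, box) tuples; each
    # row is [image_id, label, conf, xmin, ymin, xmax, ymax], boxes in 0..1
    detections = []
    for _, label, conf, xmin, ymin, xmax, ymax in output[0][0]:
        if conf >= threshold:
            detections.append((int(label), float(conf), (xmin, ymin, xmax, ymax)))
    return detections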

@CarlosML

One issue, though I don't know how important it is, is that Frigate by default uses SSDLite MobileDet, which I understand has a higher accuracy score than the MobileNetV2+SSDLite that is in OMZ. I understand the latter would be similar to the one Frigate downloads from the Coral repository (minus some post-processing). Late last year I tried to convert it with OpenVINO and it failed; it still fails 'automatically' with the latest version, but the suggestion I was given in the OpenVINO repository seems to work fine. Here is the output of mo.py if anybody is interested in trying it.

@BradleyFord

Yeah, the NCS2 is already end-of-life, so I don't know if it makes sense to worry about it.

True, but this crowdfunding campaign just launched, and it is based on the same processor.
https://www.crowdsupply.com/eyecloud/openncc-ncb

@NickM-27 (Collaborator) commented Sep 29, 2022

Yeah, the NCS2 is already end-of-life, so I don't know if it makes sense to worry about it.

True, but this crowdfunding campaign just launched, and it is based on the same processor.
https://www.crowdsupply.com/eyecloud/openncc-ncb

Odds are it'll have a different SDK/API to interface with it.

Never mind, it seems it is going to be able to support the same SDK.

@NateMeyer (Contributor, Author)

Intel says they are deprecating NCS2 support in OpenVINO after 2022.3:

Deprecation Notice:

Intel® Corporation has discontinued the Intel® Movidius™ Neural Compute Stick 2 (Intel® NCS2). Version 2022.3 LTS will be the last version of the toolkit to support Intel NCS2.

https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/system-requirements.html

@NateMeyer (Contributor, Author)

You're installing those from the testing repo? What version is getting installed? This is what I see in the current image:

$ docker run ghcr.io/natemeyer/frigate:0.12.0-openvino-8230b34 apt list --installed | grep intel-

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

intel-media-va-driver-non-free/now 21.1.1+ds1-1 amd64 [installed,local]
intel-opencl-icd/now 20.44.18297-1 amd64 [installed,local]

@NickM-27 (Collaborator)

Looks good. We'll definitely want to update the recommended hardware docs so users know that these options exist, but I think it makes sense to wait, given that Nvidia TensorRT devices will probably be added to that as well.

The only other thought is that you might want to look at #4395 and see if that can also work for the stages being added in this PR.

@NateMeyer (Contributor, Author)

@EduardoLaranjo I believe the updated driver support will be included in #4368

@NickM-27 (Collaborator)

@NateMeyer this will need to be rebased on dev since there have been a bunch of docker changes. Should make for a cleaner build though

@NateMeyer (Contributor, Author)

@NateMeyer this will need to be rebased on dev since there have been a bunch of docker changes. Should make for a cleaner build though

Yes, I've been watching for it to settle out. Are there any more issues or changes needed for the build/docker/dev environment, or should it be ready now?

@NickM-27 (Collaborator)

Yes, I've been watching for it to settle out. Are there any more issues or changes needed for the build/docker/dev environment, or should it be ready now?

I think there may be a few stragglers, but I think (hope) the structure is finished and there will be no more organizational changes.

@NateMeyer force-pushed the ov-detector branch 2 times, most recently from 1e975b9 to fdcc476 on November 24, 2022
@NateMeyer (Contributor, Author)

Rebased. Built a new test image at ghcr.io/natemeyer/frigate:0.12.0-openvino-fdcc476

@nickp27 commented Dec 3, 2022

@NateMeyer love your work; this has been amazing on my Intel thin client for lowering power and CPU usage. One thing that has plagued me is that ssdlite_mobilenet_v2 seems to be causing a ridiculous number of false positives, picking up a firepit and some pot plants as a person, but at such a high percentage that I can't eliminate it with thresholds (see below):

[Screenshot: false-positive detections at high confidence]

Are there any guides or instructions on what I need to change in the Dockerfile to test another model? (I understand from the above that anything with the same post-processing can just be swapped out.) I assume it's simply a matter of changing the omz-download and omz-converter lines.

@blakeblackshear (Owner)

Is there any outstanding work for this PR, or is it ready to merge? Have you tested a multiarch build?

@NateMeyer (Contributor, Author)

Is there any outstanding work for this PR, or is it ready to merge? Have you tested a multiarch build?

I think this is ready. The only outstanding work may be tweaking the included model, as @nickp27 points out above. I have been building the multiarch image, but I've only run it with qemu personally (amd64, arm64, and arm32). There was some work in #3797 to run it with the NCS2 stick on amd64 and arm64. The image seemed to work OK on arm64, but I think we were running into some issues with the NCS2 on Kubernetes.

Are there any guides or instructions on what I need to change in the Dockerfile to test another model? (I understand from the above that anything with the same post-processing can just be swapped out.) I assume it's simply a matter of changing the omz-download and omz-converter lines.

@nickp27, clearly you need to move your plants so they're not so lifelike 😆. I'll dig up my notes on which models dropped in. If this gets merged, I think you can open a regular issue and may get other suggestions about masking it out; I'm not as familiar myself with creating zones. Are you thinking of rebuilding the image? Alternatively, you can run just the ov-converter stage and mount the converted model in the prebuilt image, as sketched below.
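For the second option, exporting the conversion stage might look like the following. The stage name comes from this PR's build output above, but the exact flags are an assumption:

# Export the ov-converter stage's files (BuildKit required), then bind-mount
# the converted model into the prebuilt image at /openvino-model
docker buildx build --target ov-converter --output type=local,dest=./ov-model .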

@blakeblackshear merged commit e5fe323 into blakeblackshear:dev on Dec 3, 2022
@NateMeyer (Contributor, Author)

@nickp27 here is a list of some models I think will drop in from the Intel OMZ:

  • ssdlite_mobilenet_v2 (default)
  • mobilenet-ssd
  • efficientdet-d0-tf
  • ssd300

These models may have different sizes and labelmaps that need matching adjustments in the model config (see the sketch below). Also, I didn't have much luck with the ssd300 model; I didn't debug why I wasn't getting detection boxes even though the debug log seemed to show detections. Others may behave the same.

I wanted to try a nanodet or yolo model, but this work was left for another time.
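For example, swapping in a model with a different input size would mean adjusting the model block along these lines (the values are illustrative assumptions, not tested settings):

model:
  width: 512                                    # e.g. for a 512x512 model
  height: 512
  labelmap_path: /openvino-model/coco_80cl.txt  # if the model uses the 80-class map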

@nickp27 commented Dec 4, 2022

@NateMeyer Thanks for that! I tried converting yolov4-tiny to match the development work being done on TensorRT, but I haven't figured out how to parse the multiple outputs just yet. I'll give a combination of new models and some filters a go!

@thekillerpt

Not sure if this is the right place, but just to provide some feedback: HA is running on a laptop with a 2630QM, an i7 2nd-generation Sandy Bridge CPU, and 5 Reolink cameras detecting at 640x352 and capturing at 2K. I tried enabling OpenVINO just by using the config example in the beta docs to give it a go.

CPU utilization dropped from 60% (at most, with only 4 cores allowed for detectors) to 25% (at most).

Inference speed dropped from 100 ms to 75 ms.

Thanks a lot for this! :-) No need for a Coral or a machine upgrade for now.
