Sharing setups with CodeProject AI on Coral hardware (Spring 2024)

silencery · Apr 22, 2024

Hey ipcamtalk. Sorry if this has been brought up, but things move quickly, so just wanted to get some opinions on the best hardware accelerator options for under $100USD? I was previously looking at coral m.2 since I have one lying around, but it won't work in my current Intel-based system. I can spare 1 PCIe slot in my system, so I'd like to ask what the best options would be right now. Used Nvidia Quadro? USB stick? Something else I'm not aware of?

I don't really have a target inference level, just something better than CPU or 630 iGPU which is currently around 150-200ms. I guess 50ms or faster would be nice?

Thanks!

Edit: This conversation took a turn to focus solely more on Google Coral TPU setups, so editing the title accordingly.

Edit (5/11/2024): Here's the Coral/CP.AI setup I've settled with for now. If you're new to BlueIris and CP.AI, remember to read this before starting: FAQ: Blue Iris and CodeProject.AI Server

Hardware

Coral M.2 dual TPU. There's also an option for a single TPU or other form factors. USB version has been documented to be unstable. M.2 version seems to be the most stable and performant
PCIe adapter: You can get a generic adapter, but only one core will work. This specific adapter will support dual TPUs Dual Edge TPU Adapter - m.2 2280 B+M key

CP.AI configuration

CP.AI version: 2.6.2
Modules: Object Detection (Coral) v2.2.2
Model: Medium yolov5 (according to the UI, but it could be actually using a different model?)
According to @mailseth, it's suggested to enable multi-TPU support for better stability even if you're running single core TPU

BlueIris Configuration

BI version: 5.9.1.0
Main config:
- Auto-start with Blue Iris: unchecked (this is just my preference, I run CPi.AI as a standalone windows service on the same box. this was to address the delayed startup bug with BI)
- Use custom models: unchecked (no idea if this makes any difference)
- Default object detection: checked
Per camera config:
- To confirm: Same as CPU version. whatever you want to scan for as usual. In my case, i'm using "person,car,truck,bicycle"
- Custom models: Blank. I had to remove all my custom models to get the TPU version working. I wanted to scan for just vehicles and people and remove object detection with "objects:0" but this caused all confirmations to fail.
- Static object analysis: Because you have to keep custom models blank, there's no way to filter out stuff you don't care about. The yolov5 coral model may turn up tons of results such as "potted plant, banana, banana." This means if you keep static object analysis on, you may flood the TPU with too many requests. Unless you need it, it's probably best to keep it off. Thanks to @koops for this reminder

Results

TUNING REMINDER: You can test any AI setup by watching a clip in BlueIris: Right click on the video, go to "Testing & Tuning" and then choose "Analyze with AI" You should see realtime detections from there. While the video is playing, you can also watch the CP.AI console logs to see how it's responding.
Speed: Object Detection inference times are averaging 20-30ms for medium model.
Accuracy: Using yolov5 and medium model size, object detection accuracy is reasonably good for my purposes (detecting people and vehicles). As others have reported, accuracy isn't as good as CPU, but with filtering, it's fine. I've not seen any person being missed, although it also returns lots of false low confidence positives (elephant, airplane, banana, banana, banana....).
Stable? yes, at least for a week so far
Single TPU working? Yes (but enable multi-TPU support anyway since it's more stable)
Dual TPU working? Yes
Overall impressions: Despite not being able to use custom models, the coral setup is overall faster and still accurate enough for my use case. Having it allows me to offload CPU cycles to other stuff, which was my goal at the outset. Pretty nice!
I am NOT using the coral for LPR or facial recognition. Still using CPU for those.

AlwaysSomething · Apr 22, 2024

Not to change the subject but why won't the Coral M.2 work for you? I have one that I've been using since the beginning of the year. I just bought a Dual TPU with a PCIe adapter so I can use a Medium model size since that worked beest for me. However, I'm having issues with the Dual in my BI PC so went back to the M.2 for now. I think there is a bug in CPAI 2.6.2 when enabling multi tpu.

FYI - I bought one of these adapters for the Coral M.2 to use in a PCIe slot on a different machine and it works:

https://www.amazon.com/gp/product/B07JJTVGZM/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1

I didn't have an M.2 slot available and for $10 I figured I'd give it a shot. Been working fine so far. Might be an option until you get some responses here.

silencery · Apr 23, 2024

Oh, great question.

My topic was intended to be a conversation reference for others in the same boat right now. I was looking around and noticed most of the recommended M.2 to PCIe adapters were in the $30 range and Coral Dual TPUs are now back in stock for $40 ($70 total). Got me wondering if a used GPU might be high performing enough to be better for about the same price point to be cheap enough as a toy to play with ($100).

HOWEVER, that adapter you linked is cheap enough to be a bit of a gamechanger. What a great find. If a coral Dual TPU + adapter can be used in most machines (assuming PCIe support is better than M.2 support), the target price point gets pushed down even lower which is certainly a bonus.

mailseth · Apr 23, 2024

There is also this one, but your motherboard has to be compatible with bifurcation.

https://a.co/d/cEtN1G7

silencery · Apr 23, 2024

Awesome. My board does support birfucation. Do you have first-hand experience with that adapter? I'm wondering if the Dual version of the coral works on it.

mailseth · Apr 23, 2024

I do not have experience. It was something I looked into and it seemed interesting until I learned I needed bifurcation and didn’t have it. With that one in particular you may need 4-way bifurcation, which is even more rare, but as I understand it if you only have 2-way that just means that only two of the four slots work.

I’d guess that you would need this adapter with the above linked adapter with the Dual TPU because the Dual TPU itself uses to separate PCIe interfaces.

Dual Edge TPU Adapter - m.2 2280 B+M key

Dual Edge TPU Adapter is designed for Coral m.2 Accelerator with Dual Edge TPU to be used on a system with m.2 2280 B- or M-key slot available.

www.makerfabs.com

But I really don’t know. You’d need to test it out and get back to us.

mailseth · Apr 23, 2024

I’m hoping that the guy finishes this board:

Dual-Edge-TPU-Adapter/PCIex4_2xDualTPU_s.jpeg at main · magic-blue-smoke/Dual-Edge-TPU-Adapter

Dual Edge TPU Adapter to use it on a system with single PCIe port on m.2 A/B/E/M slot - magic-blue-smoke/Dual-Edge-TPU-Adapter

github.com

And then we can drop a few duals into a single slot.

AlwaysSomething · Apr 23, 2024

I bought both versions of the dual adapter, the M.2 and the PCIe. You can use the DUAL TPU M.2 adapter on top of the adapter I posted above and it works. In other words the following will work:

Coral Dual TPU -> M.2 adapter from Makerfabs -> M.2 to PCIe adapter I posted above.

However, if you are going to go the PCIe route you mind as well buy the PCIe adapter meant for the Dual TPU (one less adapter). In my case I tried it out since I already had the hardware and it works.

BTW - The reason I have this all is I bought all my stuff in the middle of the night impulsively (and lacking sleep or brains) so that is why. I should have returned most of it but the cost to ship back wasn't worth the refund so I kept everything and figured I may be able to use on spare PCs later.

silencery · Apr 23, 2024

Thanks for clarifying. Gotta love those late night purchases! We've all been there.
This reminds me all over again that the Dual TPU should be used in dual channel mode in order to use both processing units. An adapter which properly supports the necessary 2x PCIe channels for the Dual TPU to work costs minimum $30, so back to square one:

I wonder if a used quadro for the $100 price range would be better than a coral M.2 right now since a coral is already 5 years old anyway.

mailseth · Apr 23, 2024

The most important part of my calculus ended up being power consumption, which is pretty hard to measure. Running a GPU on a steady stream of video for a year is going to cost a bit. For me, for example, I calculated out that if the card ended up using an extra ~40 watts, I'd spend $100-$200 per year in electricity just to power the thing. The exact power usage of a GPU is going to be very debatable, however, and I don't know of a good way of knowing it in advance. However, we do know that the TPUs are much cheaper and tend to run at under 1 watt each.

silencery · Apr 23, 2024

Yeah, absolutely. power, heat, and noise budget are all part of the calculus. Costs are one thing, but noise could be an even bigger dealbreaker for some.

Too bad benchmarks don't exist for CPAI. Trying to get a handle on what to expect out of various platforms

silencery · May 10, 2024

Just wanted to report back on this. Ordered the dual TPU PCIe adapter. Ran into far more hiccups than I should have along the way on both hardware and software side of things, but finally got it installed and working. I've been able to meet my goal of moving all my object detection to the dual TPU to free up CPU for other things (transcoding, LPR, facial recognition, etc)

I don't have a GPU to compare against, but the very early reaction is mostly positive.

Object Detection inference times are averaging 20-30ms for medium model.
Using yolov5 and medium model size, object detection accuracy is reasonably good for my use case (detecting people around our house). As others have reported, accuracy isn't as good as CPU, but with filtering, it's fine.

Does anyone happen to have any best practice recommendations for CP.AI 2.6.2 setup with dual coral? Which model to use (yolov5, yolov8, mobilenet, SSD), custom models, model size?
Can you filter out stuff you don't need with coral models? With CPU processing and the built-in models, I was able to add filters with custom models per camera. I had to remove them to get the coral running correctly.

Thanks!

mailseth · May 10, 2024

‘Best’ practices is going to be what works best for you. People tend to be pretty happy with the custom IPcam models, but they aren’t really available right now on the TPU. YOLOv8 is the best in theory, but is going to require more compute and so may not be the best for you. Similarly, there may be bugs in whatever the current version is.

The multi-TPU code is a work in progress. The version on my machine runs faster due to fewer threads and OpenCV using AVX instructions. But I don’t know when it will roll out and what will break. So experiment and let us know?

silencery · May 10, 2024

Yeah, i know coral support was added just recently, and things are still really fluid, so I fully expect it to be all really experimental. That's why I'm asking what others are seeing so we can share some data

Definitely will experiment on my side and share any results I can. Thanks @mailseth

For starters, getting over the setup hump was already a bit of a process. Here is what I observed from the setup:

The first 1-2 days, object detection under dual TPU was really unreliable. I encountered all different kinds of situations: object detection engine crashing, TPU not detected, falling back to CPU, only single TPU detected, long detections (4000+ ms), or just plain timeouts
I'm not even sure what fixed things, but what seems to have stabilized the setup was changing the coral model to yolov5. Once things stabilized, everything has been really fast.
As noted above, this is based on a virtualized windows install on proxmox with the dual TPU passed through (not shared). CP.AI version 2.6.2.

AlwaysSomething · May 10, 2024

For my testing I found EfficientDet Lite and Medium size to accurately detect objects but as Seth said everyone will have different opinions.

Not to go off topic but I think CPAI 2.6.2 is broken in that it is always using one model regardless what you choose. I think it was MobileNet SSD. When testing using the Explorer I noticed my times would match that when NOT using the Custom Detect. You can also see the first "run" would take longer if using another model since it has to load it into memory (2nd run would be quicker since already loaded in memory). In other words running MobileNet SSD under Custom would match the time and accuracy from the regular detect even if you had a different model configured. Hope that makes sense.

The last version I had I think was 2.5.1 which was working for the Model selection so that's how I ended up on EfficientDet Lite and Medium.

Unfortunately, I couldn't revert because the installer couldn't find a file anymore. I am in the process of moving off of bare metal to Proxmox for that very reason (I can keep a backup version without needing to rely on installers or files still being available).

silencery · May 10, 2024

AlwaysSomething said:
Not to go off topic but I think CPAI 2.6.2 is broken in that it is always using one model regardless what you choose. I think it was MobileNet SSD. When testing using the Explorer I noticed my times would match that when NOT using the Custom Detect. You can also see the first "run" would take longer if using another model since it has to load it into memory (2nd run would be quicker since already loaded in memory). In other words running MobileNet SSD under Custom would match the time and accuracy from the regular detect even if you had a different model configured. Hope that makes sense.

Not topic at all. Very much appreciated. I was suspecting the same thing and I was hoping someone could confirm.

I also wasn't sure which version of CP.AI would be the most stable, and I haven't had the time to play with all the different releases.

Since it's taken a slightly different turn than anticipated, I should probably re-title this thread Spring 2024 Coral CP.AI, haha.

AlwaysSomething · Saturday at 10:37 AM

silencery said:
Spring 2024 Coral CP.AI, haha.

With all the versions and variables I feel we almost need threads like that. :lol:

silencery · Saturday at 8:37 PM

Done. Updated top post with my results and changed title of the thread to focus just on Coral on CPAI

koops · 2024-05-16T03:16:57-0400

I have a similar configuration i've just moved to in the last two weeks. (using your original post as a template).

Previous
I was previously using CPAI via cpu intel only i7-9700.
detections were yolov5 custom ipcamtalk models detection time 120ms - 300ms.

Now
Hardware

Coral M.2 single TPU. M.2 Accelerator B+M key | Coral. Specific purchase link I used (AUS based) : Coral Google Mini PCIe M.2 Accelerator B/M Development Kit G650-04686-01 | RS

Software configuration

CP.AI version: 2.6.2
Modules: Object Detection (Coral) v2.2.2
Model: Medium yolov5. Note: my BI configuration says "Default Object Detection : small". Now I manually changed the model from small to medium within the codeproject control panel. I likely probably should have changed this from within BI itself.

Results

Speed: Object Detection inference times are averaging 19-116ms for medium model. (i have some internal cameras so they are picking up furniture models). Most outdoor cameras are sub 50ms.
Accuracy: Seems to be maybe as good as cpu detection but I dont really have any metrics to back this up.
Stable? yes, at least for 14 days hours so far

Other
I didn't investigate this enough to realize that that there could be issues with single TPU implementations. Specifically the error "Unable to run inference: There is at least 1 reference to internal data". From googling this I think it means that TPU is already busy and is unable to process the current request.
Checked the codeproject AI log window and noticed what appeared to be near constant analysis.
Reduced this by checking my BI settings and seeing that I had "Static object analysis" turned on. This reduced the issue significantly. I'd guess that the dual TPU implementations do not see this nearly as much?

One thing I noticed when moving from the custom models to the fixed yolov5 was mode TPU time spent on objects that arent particualrlly useful in a security camera situation.
None of the objects are in my target list but do result in the TPU being 50% more utilized. Once again likely something that can be reduced with targeted use of BI options like static object detection so not a coral AI issue directly.

Picture just because I thought this analysis was funny.

mailseth · 2024-05-16T03:26:21-0400

You should be always able to use the multi-TPU implementation, even if you only have one. Basically the single TPU just refers to the old code base and multi TPU the newer code base that does better threading. That should fix the error that you were seeing.

Sharing setups with CodeProject AI on Coral hardware (Spring 2024)

Pulling my weight

Getting the hang of it

Pulling my weight

Getting the hang of it

Pulling my weight

Getting the hang of it

Getting the hang of it

Getting the hang of it

Pulling my weight

Getting the hang of it

Pulling my weight

Pulling my weight

Getting the hang of it

Pulling my weight

Getting the hang of it

Pulling my weight

Getting the hang of it

Pulling my weight

n3wb

Getting the hang of it