DeepStack Case Study: Performance from CPU to GPU version

jaydeel

BIT Beta Team
Joined
Nov 9, 2016
Messages
1,132
Reaction score
1,240
Location
SF Bay Area
I tried using two instances on my set up and I did not see much of a difference at all with response time. I'm sure my results are not typical since I'm running an older i5 with a P400 GPU.
These are my observations as well (i7-4770 with a P400 GPU). I’ve analyzed thousands of events comparing 1 vs 2 instances; the average processing times are 140-145 ms regardless (ignoring events over 250 ms, which are typically 5-15% of the dataset).

Furthermore, when using 2 instances, I’ve observed similar results both with and without assigning DeepStack ports per camera.

I emailed support asking “Does Blue Iris assign a DeepStack port automatically if multiple instances exist and ports are not assigned to individual cameras?”. The answer was “Yes”.

I then asked if the Blue Iris log entry could be modified to show which DeepStack instance port was used, e.g., “DeepStack: port:82 truck:65% [196,17 353,96] 143ms”, and was informed that my email was tagged as a feature request. More requests might get this implemented.
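For anyone wanting to tabulate their own numbers, here is a minimal Python sketch of how log entries like the one above could be parsed and averaged with the 250 ms cutoff. It assumes the entry layout shown in the example; the "port:NN" field is only the requested feature, not something Blue Iris is known to log, so the pattern treats it as optional. The function names are made up for illustration.

```python
import re
from statistics import mean

# Assumed entry layout, based on the example in the post. The "port:NN"
# field is the requested (hypothetical) addition, so it is optional here.
LOG_PATTERN = re.compile(
    r"DeepStack:\s+(?:port:(?P<port>\d+)\s+)?"
    r"(?P<label>\w+):(?P<confidence>\d+)%\s+"
    r"\[(?P<box>[^\]]*)\]\s+(?P<ms>\d+)ms"
)

def parse_entry(line):
    """Return the fields of one DeepStack log entry, or None if no match."""
    m = LOG_PATTERN.search(line)
    if m is None:
        return None
    port = m.group("port")
    return {
        "port": int(port) if port else None,
        "label": m.group("label"),
        "confidence": int(m.group("confidence")),
        "ms": int(m.group("ms")),
    }

def average_ms(entries, cutoff_ms=250):
    """Mean processing time, ignoring events slower than cutoff_ms."""
    kept = [e["ms"] for e in entries if e["ms"] <= cutoff_ms]
    return mean(kept) if kept else None
```

If the port ever does show up in the log, grouping the parsed entries by "port" before averaging would show whether each instance is actually being used.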
 

sebastiantombs

Known around here
Joined
Dec 28, 2019
Messages
11,511
Reaction score
27,691
Location
New Jersey
@sebastiantombs Any particular reason why you run both at night and not just the dark model?
I run both at night due to the view angles and the way headlight bloom affects DS detection. Sometimes the stock model works, sometimes the dark model works and sometimes they both work. It only adds about 60 ms of detection time, so there's not much of a hit on either the CPU or GPU.
 

jaydeel

I’ve analyzed thousands of events comparing 1 vs 2 instances; the average processing times are 140-145 ms regardless
Quoting myself to make a correction (I was going from memory instead of data, D’oh!).

After tabulating my data since late September, it appears that going to 2+ DeepStack instances lowered the average processing time from ~165 ms to ~143 ms, a small difference that is within 1 standard deviation of the dataset.

1638901685377.jpeg
Note: the “Outliers” column is =C4/D4-1
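The "within 1 standard deviation" comparison can be reproduced with a few lines of Python. This is only a sketch: it assumes the spread is the sample standard deviation of the combined dataset, which may not be how the spreadsheet computes it, and the function name is hypothetical.

```python
from statistics import mean, stdev

def within_one_std(times_a, times_b):
    """Compare two lists of processing times (ms).

    Returns (difference of the means, standard deviation of the combined
    data, True if the difference is no larger than that deviation).
    """
    diff = abs(mean(times_a) - mean(times_b))
    spread = stdev(list(times_a) + list(times_b))
    return diff, spread, diff <= spread
```

Feed it the per-event times for the 1-instance and 2+ instance periods. A True flag means the change in the average is small relative to the event-to-event scatter.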
 

jaydeel

One last note… I’ve decided to return to 1 instance because I have found that restarting Blue Iris (after updates, etc.) is slowed considerably (CPU at 100% for minutes) by loading additional DeepStack Python modules. I’ve not verified this with testing, but it appears to me that additional instance Python modules are opened not just once, but also by each camera.
 

sebastiantombs

I was just in the office and grabbed a screen cap from Task Manager: four instances of python.exe. Note the CPU utilization on the top two.

python instances.JPG
 

105437

BIT Beta Team
Joined
Jun 8, 2015
Messages
2,029
Reaction score
932
Wait, so if I have 2 or more instances set, BI will randomly assign each camera to an instance? I thought you had to specifically assign the instance for each camera in the camera settings.
 

sebastiantombs

That may have been incorporated in a recent revision. When I first implemented the second instance I didn't see much difference until I assigned cameras specifically to the second instance. All of that assumes my memory isn't faulty.
 

jaydeel

I was just in the office and grabbed a screen cap from Task Manager: four instances of python.exe. Note the CPU utilization on the top two.
I'm talking about what I see in Task Manager while stopping, then restarting the Blue Iris service.

Here's what I see:
  • 1 instance takes about 10 seconds to stabilize (i.e., the CPU usage returns to its normal range).
  • 2 instances take about 22 seconds to stabilize.
  • 3 instances take about 80 seconds to stabilize.
The increase is much steeper than linear, so the trend looks exponential. I'm trying to create screen video captures to illustrate.
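A quick way to sanity-check the growth rate from the three timings above: a linear trend would add roughly the same number of seconds per instance, while a pure exponential would multiply by roughly the same factor each time. The sketch below just computes both; the helper names are made up.

```python
# Stabilization times reported above, in seconds, keyed by instance count.
startup_seconds = {1: 10, 2: 22, 3: 80}

def step_differences(times):
    """Seconds added by each extra instance (constant if growth is linear)."""
    counts = sorted(times)
    return [times[b] - times[a] for a, b in zip(counts, counts[1:])]

def step_ratios(times):
    """Factor applied by each extra instance (constant if exponential)."""
    counts = sorted(times)
    return [times[b] / times[a] for a, b in zip(counts, counts[1:])]

differences = step_differences(startup_seconds)  # 12 s, then 58 s
ratios = step_ratios(startup_seconds)            # about 2.2x, then about 3.6x
```

The added time per instance is clearly not constant, so the growth is faster than linear. The ratio also rises between steps, though three data points are too few to pin down the exact curve.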
 

sebastiantombs

The screenshot I posted was taken after at least a week of running nonstop with two instances. That tells me that BI, or maybe DS, is spawning a second python.exe for each instance, for a total of four instances of python.exe for analyzing screenshots.
 

jaydeel

Wait, so if I have 2 or more instances set, BI will randomly assign each camera to an instance? I thought you had to specifically assign the instance for each camera in the camera settings.
It was after I observed the same results with and without assigned ports that I emailed Ken and received the response above.

I would not assume that the DeepStack port is randomly assigned. Support's response to a second question included this feedback: "DeepStack is a server application listening on a port instructed at start-up. Sending requests to which server is managed by BI."

After reading this, the phrasing of the "Override server" setting made more sense... it "overrides" the default Blue Iris decision.

Note that the help PDF does not state that port assignment is required.

1638926349560.png
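Since each DeepStack instance is described as a server listening on its own port, one simple check is whether anything is accepting connections on the ports you expect. This Python sketch only tests for an open TCP port; it does not confirm the listener is DeepStack, and it says nothing about how BI chooses between instances. Any port numbers you pass in are your own configuration.

```python
import socket

def is_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

def listening_ports(ports, host="127.0.0.1"):
    """Filter a list of candidate ports down to the ones that answer."""
    return [p for p in ports if is_listening(p, host)]
```

For example, listening_ports([82, 83]) would return whichever of those two placeholder ports currently has a listener on the local machine.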
 

jaydeel

The screenshot I posted was taken after at least a week of running nonstop with two instances. That tells me that BI, or maybe DS, is spawning a second python.exe for each instance, for a total of four instances of python.exe for analyzing screenshots.
Was your screenshot taken in the same context as mine below?
Note: I've sorted on the Processes column to group all existing python.exe processes.
1638926891197.png

Note that mine shows 6 running and 1 just-terminated Python processes. I'm periodically running other Python scripts, so not all of these are originating from Blue Iris-DeepStack.

This is a dynamic table for me. I just grabbed it again and now there are many more python.exe processes...
Interestingly the same PIDs remain active; perhaps only these are the DeepStack ones?
If so, I've got 6 while supposedly running a single DeepStack instance.
I've got 5 cameras using DeepStack right now, so maybe that's why?
Head scratcher.
1638927132397.png
 

sebastiantombs

Yes, same context, just more "zoomed in." With other instances of python running, for other purposes, how are you determining which are for BI?
 

jaydeel

With other instances of python running, for other purposes, how are you determining which are for BI?
Fair question. To isolate Blue Iris + DeepStack actions, I did the following:
  1. Disabled the other python scripts (they run from Task Scheduler).
  2. Rebooted.
  3. Monitored resources before and after restarting the Blue Iris service.

Screenshot before - 6 python processes open.

1638986438475.png

Screenshot after - 6 instances immediately terminated (note that the PIDs match those above), then 6 new ones were created.

1638986486092.png

This seems pretty definitive to me.

BTW, I ran the same experiment before and after disabling DeepStack (all profiles) for one of my 5 cameras that are using DeepStack.
The result was the same - 6 python processes running for 1 DeepStack instance.
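The before/after PID comparison described above can also be scripted instead of read off screenshots. A minimal sketch, assuming Windows and the built-in tasklist command with CSV output; the function names are hypothetical, and it counts every python.exe on the machine, so other Python scripts still need to be stopped first.

```python
import csv
import io
import subprocess

def parse_tasklist_csv(text):
    """Extract PIDs from `tasklist /FO CSV /NH` output."""
    pids = set()
    for row in csv.reader(io.StringIO(text)):
        # Data rows look like: "python.exe","1234","Services","0","45,312 K"
        # Informational lines (e.g. no matching tasks) are skipped.
        if len(row) >= 2 and row[1].isdigit():
            pids.add(int(row[1]))
    return pids

def python_pids():
    """Snapshot the PIDs of running python.exe processes (Windows only)."""
    result = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq python.exe", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return parse_tasklist_csv(result.stdout)

def compare(before, after):
    """Classify PIDs as terminated, created or surviving between snapshots."""
    return {
        "terminated": sorted(before - after),
        "created": sorted(after - before),
        "survived": sorted(before & after),
    }
```

Call python_pids() once before restarting the Blue Iris service and once after things settle, then pass both sets to compare().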

I'll keep digging.
 

toastie

Getting comfortable
Joined
Sep 30, 2018
Messages
254
Reaction score
82
Location
UK
Some things I'm not clear about here. I have BI and 11.3 additionals with DeepStack apparently working OK on a PC with an i7-7700 CPU. I followed the instructions when I migrated from DeepStack CPU to DeepStack GPU, after moving my SFF from the onboard Intel HD 630 with Quick Sync hardware acceleration to a dedicated NVIDIA T600 PCI card (640 NVIDIA CUDA cores).

After following the instructions I got things working, but on IPCT there seems to be some additional advice that users of DeepStack need to follow.

Do I need to install Visual Studio for DeepStack to function properly on W10?
Do I need to install Zlib for DeepStack to function properly? There are many warnings against installing zlib because of possible malware risks.

Is W10 Python fully available to DeepStack with 11.3 additionals? Do I have to install PyTorch, numpy, Pillow, scipy, torch, torchvision (all _amd64) on my Intel CPU based system, or is that only for certain NVIDIA GPU cards and for those with AMD CPUs? Anyway, what are torch and torchvision all about? Why are they missing from the requirements to use a GPU with DeepStack on the DeepStack website?
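One way to see which of those packages a given Python installation can already import is a short check like the one below. It is only a sketch: it reports on whichever interpreter runs the script, and this thread does not establish whether DeepStack uses that interpreter or its own. The list uses import names, which can differ from the download names (Pillow is imported as PIL).

```python
from importlib import util

# Import names for the packages mentioned in the question above.
CANDIDATES = ["torch", "torchvision", "numpy", "PIL", "scipy"]

def installed(modules):
    """Map each top-level module name to True if this interpreter can find it."""
    return {name: util.find_spec(name) is not None for name in modules}

if __name__ == "__main__":
    for name, found in installed(CANDIDATES).items():
        print(f"{name}: {'found' if found else 'missing'}")
```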
 

sebastiantombs

I installed Visual Studio when I installed the GPU version. The rest of those add-ons seem to be related to training a new model from what I've seen. As long as you followed the instructions on the DeepStack forum/page you should be fine.
 

sebastiantombs

Maybe the Quadro handles things differently than a gaming card like the 970, 1050, 1060 and so on. It is designed for graphics intensive applications, not gaming.
 