Sometimes you just have to start over from scratch

kklee

Pulling my weight
Joined
May 9, 2020
Messages
187
Reaction score
203
Location
Vancouver, BC
TL;DR: Weird problems due to an odd H/W failure led to a rebuild on a new PC

I thought I'd share my tale of what I found when my BI system started to behave strangely.

My system is an i7-7700K with 16GB RAM, an SSD system drive, and two surveillance disks (4TB WD and 8TB Skyhawk). This system has been running for years and was very stable, running Deepstack as well with decent CPU utilization (around 30% when idle). There are 10 cameras, a mix of 2MP, 4MP, and 8MP. Most of the cameras are configured for motion trigger with AI analysis via DS. There are clones of 75% of the cameras recording continuously from the main stream.

Over the past month or so, DS seemed to consume more and more CPU; I thought it was related to enabling the dark model. Idle CPU would fluctuate between 40% and 60%, and hit 100% when DS was running.

I then added a Quadro P400 GPU and configured DS for GPU, which seemed to help with the CPU usage, but it was still peaking around 80% during DS analysis, which was odd.

After a couple of days of fiddling around with BI configurations, the CPU issue got even worse. Eventually, DS kept timing out, returning error 100. Disabling AI didn't seem to make a difference. CPU usage would fluctuate wildly without any motion detections, spiking to 100% every few seconds and dropping back down to around 40%.

I ran a hardware monitoring utility and it showed that the CPU was running hot at 90C and was current throttled. I also tried running the Intel XTU tuning utility and it showed the same thing.

Problem found: whenever the CPU went above 40%, throttling would kick in due to temperature, bottlenecking the system. It looks like I have a hardware failure of some kind, probably made worse by the heat we've been having here. I pulled the big Noctua heatsink off and checked the thermal paste in case it had dried out, but it was okay.
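
For anyone wanting to check for the same pattern (hot whenever load passes roughly 40%), here is a minimal Python sketch. It assumes you have exported (CPU %, temperature) pairs from whatever monitoring utility you use; the function name, the default thresholds, and the sample values are illustrative assumptions, not output from BI or XTU.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (cpu_percent, temp_c), e.g. from a monitoring log


def hot_under_load_ratio(samples: List[Sample],
                         load_threshold: float = 40.0,
                         temp_limit_c: float = 90.0) -> float:
    """Fraction of samples above the load threshold that are at or over the temperature limit."""
    loaded = [s for s in samples if s[0] > load_threshold]
    if not loaded:
        return 0.0
    hot = [s for s in loaded if s[1] >= temp_limit_c]
    return len(hot) / len(loaded)


# Hypothetical readings, made up for illustration only.
samples = [(35.0, 62.0), (45.0, 91.0), (60.0, 93.0), (100.0, 95.0), (42.0, 84.0)]
ratio = hot_under_load_ratio(samples)
print(f"{ratio:.0%} of loaded samples were at or above the limit")
```

A ratio near 100% at only moderate load suggests a cooling or hardware problem rather than a BI configuration problem.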

Fortunately, I had another CPU/Motherboard handy, so I built a new BI server using an i7-8700K with the Quadro P400 GPU.

I did a fresh installation of Windows on a new NVME drive, and moved the two spinning disks to the new system.

I installed BI and DS for GPU and restored a copy of the BI config from the original machine.

CPU still fluctuates quite a bit when idle on the new system, but idles as low as 20% now. I deleted all the cameras and recreated everything just in case, but BI utilization still jumps around quite a bit. Still a major improvement over the old system. I guess it's just the way I have things configured. Everything's optimized: HA, substreams, direct to disk, etc.

Everything is working great now, and I have a little headroom if I add cameras or upgrade to higher resolution ones.
 

wittaj

IPCT Contributor
Joined
Apr 28, 2019
Messages
25,189
Reaction score
49,083
Location
USA
Regarding your HW failure, many of us have noticed and posted that since the introduction of DeepStack, many people have been having issues with Hardware Acceleration. Fortunately, with substreams now, HA isn't as big of a deal, so many of us have simply turned it off.

But for a 7th generation machine with 10 cameras at 30% when idle and an 8th generation at 20% idle, it is clear you have not done every optimization or you are using the computer for other things.

Do EVERY optimization in the wiki and you will be sub 10%, maybe even sub 5% CPU at idle.

I am on a 4th generation with way more cameras than you and my idle is 10%. Another member here runs 50 cameras on a 4th generation CPU at 30%.

And by EVERY, I mean EVERY. Too many people come here complaining of high CPU usage and claim they have done every optimization in the wiki, and once they post screenshots, we see they are not using the substreams, which is probably one of the biggest CPU savers. Do not skip one because you think it isn't important or won't make that big of a difference. Even dropping frame rate a few FPS can make a big difference. No reason to run more than 15 FPS, and many of us have cams running at 10 to 12 FPS.
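
To see why substreams and frame rate matter, look at the pixel rate, which is roughly megapixels per frame times frames per second (the MP/s figure on the BI status page). A rough Python sketch follows; the 8MP main stream and the 640x480 substream resolution are illustrative assumptions, not anyone's actual setup, and real CPU load will not scale perfectly linearly with this number.

```python
def megapixels_per_second(width: int, height: int, fps: float) -> float:
    """Pixel throughput of one stream in megapixels per second."""
    return width * height * fps / 1_000_000


# Hypothetical streams for comparison.
main_30 = megapixels_per_second(3840, 2160, 30)  # 8MP main stream at 30 FPS
main_15 = megapixels_per_second(3840, 2160, 15)  # same stream at 15 FPS
sub_15 = megapixels_per_second(640, 480, 15)     # assumed substream at 15 FPS

print(f"8MP @ 30 FPS: {main_30:.1f} MP/s")
print(f"8MP @ 15 FPS: {main_15:.1f} MP/s")
print(f"Sub @ 15 FPS: {sub_15:.1f} MP/s")
```

Halving the frame rate halves the pixel rate, and in this example the substream carries under 4% of the pixels of the 15 FPS main stream.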

If you do not understand what something does in BI, then ask. Too many people also change a setting without realizing what it does and actually make their performance worse.

Please post a screenshot of your BI camera status page that shows FPS, MP/s, etc. so we can see what is going on. 20% for 10 cameras on an 8th generation CPU is too high.
 

kklee

I'm a long-time BI user (going back to V3), so I'm very familiar with BI and optimizations.

The computer is dedicated to BI, nothing else running on it.

Cameras are optimized at 15fps and configured where possible with keyframe ratio of 1, with the exception of a couple that have to run at 30 fps. HA, substreams, direct to disk, etc, all done. Continuous recordings are split to 2 different spinning disks.

Please note that I'm not complaining about CPU utilization. I'm happy with the way it's performing and I'm not planning on chasing any more optimizations; it's probably as good as it's going to get, given how I want the cameras and recordings to be configured.

The point of the original post was that CPU issues could be caused by a hardware problem that was not obvious. I still need to do a post mortem on the bad CPU/motherboard to determine what failed.
 

wittaj

It will be interesting to hear from your post mortem whether it was a hardware failure of a component, or Hardware Acceleration issues (which have been happening since DS was introduced) mimicking a component failure. My cameras went crazy in BI and would show a variety of errors in the log, and once I disabled HA the problems went away. Many have experienced the same thing.
 

kklee

That's an interesting observation.

I'm currently running a stress test, which doesn't include the GPU, on the suspect system. The interesting thing is that the CPU is no longer hitting 90C and bottlenecking; it maxed out at 83C. Now I'm wondering if it is a GPU-related issue with the i7-7700K.

More sleuthing required... I may toss another HD into the old system and configure a few cameras to see what happens. Fortunately, I have a spare BI license.
 

wittaj

Wow. It certainly is a weird issue. With HA turned on and the cameras erroring out after an update, if I rolled back to 5.4.6.3 then all the cameras would work again. Update, and they would get crazy errors relating to HA or some other weird issue. As soon as I turned HA off on the problematic cameras, all worked fine again.

I have slowly started adding HA back in just to see if I can pinpoint why or when it happens. I am probably getting close, because the last camera I turned it back on for is starting to get weird errors again.
 

Flintstone61

Known around here
Joined
Feb 4, 2020
Messages
6,664
Reaction score
11,038
Location
Minnesota USA
I think I turned HA off on my i5-8500 BI machine and slowly added it back cam by cam, and found the Jidetech PTZs were an issue, but only in certain versions of BI. I can't access my system to see what version I'm currently running. I do not use DS. I do not understand the advantage or purpose of clones. Although I wish I could clone my gf so one of them would stay home from the casino.
 

wittaj

Clones add much more functionality to your system.

Before I got an autotrack PTZ, I created the "poor man's" autotrack for a PTZ by using the "clone and zone" method. I had 10 cloned cameras, each with a designated zone, and if something entered that zone, it called up the PTZ preset for that particular area. The results were fairly promising. When I got an autotrack camera, the non-autotrack went to the backyard and I set up a similar situation, and the PTZ will mimic autotracking if anyone is on my patio or a portion of my yard.
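
The "clone and zone" idea boils down to a lookup from the clone that triggered to a PTZ preset. A purely illustrative Python sketch of that logic is below; the clone names, preset numbers, and function are hypothetical, and in BI itself this is set up through each clone's zone and trigger settings rather than written as code.

```python
from typing import Dict, Optional

# Hypothetical mapping: one clone per zone, each tied to one PTZ preset number.
ZONE_PRESETS: Dict[str, int] = {
    "ptz_clone_driveway": 1,
    "ptz_clone_patio": 2,
    "ptz_clone_side_yard": 3,
}


def preset_for_trigger(clone_name: str) -> Optional[int]:
    """Return the PTZ preset to call when the given clone triggers, or None if unmapped."""
    return ZONE_PRESETS.get(clone_name)


print(preset_for_trigger("ptz_clone_patio"))
print(preset_for_trigger("front_door"))
```

Each clone watches only its own zone, so whichever clone triggers tells you where the PTZ should point.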

You can also use clones to further customize your system. You can set one up for IVS triggers and alerts and then set up a clone that triggers for any motion without alerting you for everything.

I cloned another camera that is only active from 1am to 6am and will send a text message to my neighbor down the street if someone is walking past my house (we are on a dead-end subdivision street, so the only people walking around at that hour are door checkers). He is overboard with tactical gear and jumps at the chance to use his night vision goggles and whatnot LOL.

Those are just a few examples.
 

sebastiantombs

Known around here
Joined
Dec 28, 2019
Messages
11,511
Reaction score
27,696
Location
New Jersey
I use clones of two cameras that have the street partially in their FOV to track street traffic. I use clones to detect motion in specific "sensitive" areas. That allows different detection schemes and sensitivities that wouldn't work well in the wider, 24/7 "master" view. None of my clones display on the console; they're all hidden and in a group called "hidden". Boy, is that original or what? It makes it easier to view the console without all the extra clutter, and makes it easy to find the right camera for tuning purposes.
 

kklee

In my case, I use clones for 24x7 main stream recording as MP4. The masters are configured for capturing motion events (as BVR) with DS analysis. I also configured the clones as hidden to reduce the console view clutter.

A note related to my original post about moving BI to a new system: I've seen CPU drop as low as 9% on idle, so it's definitely improved on the new system.
 