4.7.8 - August 28, 2018 - Support for Nvidia CUDA hardware decoding

Bavery

n3wb
Joined
Feb 27, 2018
Messages
10
Reaction score
8
Actually it's much more than 20 dollars. Even if you only pay 10c a kWh, the 70W difference between running your PC and buying a $100 PC that will consume 30W is about 60 bucks a year.
Understood. I was just referring to the potential cost difference for my existing system with and without CUDA turned on. Yes, it doesn't make financial sense by itself, and in hindsight I should have used this as an opportunity to replace the motherboard/CPU instead of an excuse to pull the trigger on that shiny new Nvidia card I already had my eye on. In this specific scenario the difference is small enough that my buyer's remorse is minimal, and I'll get plenty of other enjoyment out of the GPU elsewhere, outside of Blue Iris.
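For anyone who wants to check that math, here's a minimal sketch of the power-cost arithmetic (assuming the box runs 24/7 and the quoted 10c/kWh rate; the numbers are the ones from the post above):

```python
# Back-of-envelope check of the "about 60 bucks a year" figure:
# a 70 W difference, running around the clock, at an assumed $0.10/kWh.
watts_saved = 70            # extra draw of the existing PC vs. a ~30 W box
hours_per_year = 24 * 365   # a Blue Iris machine typically runs 24/7
price_per_kwh = 0.10        # assumed electricity rate, $/kWh

kwh_per_year = watts_saved / 1000 * hours_per_year
annual_cost = kwh_per_year * price_per_kwh
print(f"{kwh_per_year:.0f} kWh/year -> ${annual_cost:.2f}/year")
# prints: 613 kWh/year -> $61.32/year
```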
 

fenderman

Staff member
Joined
Mar 9, 2014
Messages
36,901
Reaction score
21,269
Understood. I was just referring to the potential cost difference for my existing system with and without CUDA turned on. Yes, it doesn't make financial sense by itself, and in hindsight I should have used this as an opportunity to replace the motherboard/CPU instead of an excuse to pull the trigger on that shiny new Nvidia card I already had my eye on. In this specific scenario the difference is small enough that my buyer's remorse is minimal, and I'll get plenty of other enjoyment out of the GPU elsewhere, outside of Blue Iris.
You should dedicate a PC to Blue Iris... that way you can enjoy the card as well as run an efficient BI system. This will also maximize the stability of the BI system.
 

eluraamat

Young grasshopper
Joined
Nov 27, 2017
Messages
36
Reaction score
8
Could anyone please confirm that 4K decoding will work with Nvidia?
I noticed this thread yesterday and wanted to test it.

I have a very old GTX 580 and put it in the machine. It only supports decoding up to 1080p. Can newer cards support 4K (8MP) cameras?

Thanks,
 

jmg

Young grasshopper
Joined
Mar 4, 2018
Messages
54
Reaction score
4
Not a huge drop in CPU usage on my end either. I'm running an i7-7700K and a Titan Xp; CPU usage dropped from around 24% to 18%, with the Titan barely off idle.
Playback overall does seem snappier... going to try increasing frame rates on the cameras to see if it's still snappy.

If using CUDA, does it matter if the "limit decode" option is selected?
 

bp2008

Staff member
Joined
Mar 10, 2014
Messages
12,673
Reaction score
14,017
Location
USA
If using CUDA, does it matter if the "limit decode" option is selected?
Of course it matters. All the pros and cons of "limit decode" still apply regardless of the type of hardware acceleration being used.
 

jmg

Young grasshopper
Joined
Mar 4, 2018
Messages
54
Reaction score
4
OK, I could have worded the question better... I was wondering if the impact of selecting "limit decoding" is minimized using CUDA vs. Intel. I'll try it out and see. Regardless, I'll likely stick with CUDA decoding; BI does seem more responsive with it enabled. Not sure if it's the new version, or maybe it's the Titan itself.
 

bp2008

Staff member
Joined
Mar 10, 2014
Messages
12,673
Reaction score
14,017
Location
USA
If anything it is the other way around @jmg. When limit decoding is enabled, it is harder to see the difference from hardware acceleration, direct-to-disk, etc. Limit decoding is the single biggest CPU reducer in Blue Iris's configuration, but it has the biggest downsides too.
 

jmg

Young grasshopper
Joined
Mar 4, 2018
Messages
54
Reaction score
4
Thanks, I see that. I turned off "limit decode" on all 11 cameras, and CPU usage (with Intel decoding on) jumped from around 20% to 35-40%.
With CUDA decoding on, CPU usage dropped back down to under 20%, and the GPU is barely off idle.
 

Javier

n3wb
Joined
Feb 1, 2017
Messages
14
Reaction score
3
I'm running a system with 17 enabled cameras at a total of 1160 MP/s.
I have several cameras disabled due to CPU limits (I need to keep the CPU at around 70-80% to leave room for remote access, live view, or Windows use).
7700K CPU overclocked to 4.9 GHz on water cooling.
GTX 1050 Ti for Windows and the display.

I tested a GTX 1080 and a GTX 1070 Ti. The 1080 had around 4% less GPU usage, sitting at around 68%, still with plenty of headroom left.
CPU use was exactly the same on both cards.
Decoding was enabled on this GPU, and Windows graphics were handled by the 1050 Ti.
After adding the GPU I could enable around 200 MP/s more than previously, or 17% more capacity.
Still, my system was CPU bound with plenty of GPU power left.
I played with distributing the load across the GPUs, sending some cameras to GPU1 and others to GPU2. It works correctly, but I'm still limited by the CPU.

Honestly, I don't see the use for a GPU unless you want to quickly add 17% more capacity and don't want to buy a new computer.
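To put those MP/s figures in context, here is a rough sketch of how an aggregate number like 1160 MP/s comes about: each camera contributes roughly (width × height / 1,000,000) × fps. The camera list below is made up purely for illustration; Blue Iris reports the real total itself.

```python
# Rough estimate of aggregate decode load in megapixels per second (MP/s).
# Each camera contributes (width * height / 1e6) * fps; the cameras listed
# here are hypothetical and only illustrate how the per-camera numbers add up.
cameras = [
    {"name": "driveway 4K",  "width": 3840, "height": 2160, "fps": 15},
    {"name": "porch 4MP",    "width": 2560, "height": 1440, "fps": 15},
    {"name": "garage 1080p", "width": 1920, "height": 1080, "fps": 30},
]

total = 0.0
for cam in cameras:
    mps = cam["width"] * cam["height"] / 1e6 * cam["fps"]
    total += mps
    print(f"{cam['name']:>14}: {mps:6.1f} MP/s")

print(f"{'total':>14}: {total:6.1f} MP/s")
```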
 

Javier

n3wb
Joined
Feb 1, 2017
Messages
14
Reaction score
3
Hey folks...just installed this build to try it out on my system.

This is my Core i9-7980XE with 128GB RAM and an Nvidia Titan X (Pascal), which according to the specs has 3584 CUDA cores and 12GB of RAM. My Blue Iris install has 35 cameras, ranging from some 1080p PTZs all the way up to 4K. Half of my cameras are 4K, with one of them being a 4K PTZ.

Before installing the new build, I was running at about 53% with the GPU load sitting around 15%.

I then installed the new build and only enabled Nvidia Cuda for 12 of my 4K cameras. I had to stop because the GPU load was hitting 100% at that point.

My CPU load went from 53% to 31% (on average), so a good solid 20-point decrease. Pretty nice!!!

I'm excited by the potential for this, considering how much additional headroom this might afford me. I will also be super curious to test this again when my new RTX card arrives, as the 2080 Ti has 4352 Cuda cores. As soon as I've tried it out, I will post results.

Thanks to Ken for enabling this long-time wish list item... I know there are a BUNCH of people who are super happy tonight!
You can install multiple GPUs, assign each camera to an individual GPU, and distribute the load. I tested it and it works.

At how many MP/s did you max out the Titan X? I ran a GTX 1080 with 1160 MP/s and got 68% GPU usage.
 

jmg

Young grasshopper
Joined
Mar 4, 2018
Messages
54
Reaction score
4
I have a Titan Xp, I'll check... where are the settings for MP/s? I'm running 12 cameras on my Titan, still only at 35%, which makes me think I'm vastly underutilizing it.
 

Svanteson

n3wb
Joined
Oct 2, 2018
Messages
1
Reaction score
0
Location
Stavanger
Would adding more than one Nvidia card help? I have like four Quadro K2200 cards just collecting dust.
 

eluraamat

Young grasshopper
Joined
Nov 27, 2017
Messages
36
Reaction score
8
No, I have tested it, and here is my conclusion:
Do not use CUDA decoding on an Intel system that has an integrated GPU.

Test one: 8600K @ 5 GHz vs 4930K (stock and at 4.6 GHz), with CUDA on a 1060 3GB.
With the console closed, there is not a big difference in CPU/GPU usage.
BUT during video playback, the 8th-gen CPU's usage rises to almost twice that of the 4930K. For example, 8MP playback: the 8600K sits at 70-80% CPU (GPU 99-100%), while the 4930K in the same case sits at 40-45% CPU (GPU 95-99%).
Both have a CPU Mark of around 16200-16300.

Test two: 6600K vs a very old i7-875K that is almost half the speed of the 6600K @ 4700 MHz.
The 6600K even has trouble (the test system runs around 750 MP/s) viewing real-time 4K video in Chrome (it hits 100% CPU usage).
The i7-875K handles it much, much better.

Even if I move the 1060 to the Intel system with the integrated GPU and turn the integrated GPU off, CPU usage still hits the roof during playback. It makes no difference whether the integrated GPU is enabled or not.
 

eluraamat

Young grasshopper
Joined
Nov 27, 2017
Messages
36
Reaction score
8
I got a 1070 Ti today for testing.

The 1070 Ti should be almost twice as good as a 1060 3GB, but with BI there is no difference. To be honest, I could not see any difference in my setup.
The only thing that makes a difference (on both cards) is core clock overclocking, which does matter with BI.

At the moment the 1060 seems to be the best choice for BI, because of its clock speed and the ability to overclock it a bit ;)
 

eluraamat

Young grasshopper
Joined
Nov 27, 2017
Messages
36
Reaction score
8
Looks like BI can't take full advantage of higher-end CUDA cards.

For example: 1060 3GB vs 1070 Ti 8GB.
The 1060 3GB has 1152 CUDA cores and the 1070 Ti has 2432, so more than twice as many.
The 1060 has a GPU boost clock of 1708 MHz (the memory clock doesn't help much when overclocked), the 1070 Ti 1683 MHz.
So the 1060 has a slight advantage at stock clocks, but the 1060 and the 1070 Ti that I have overclock almost identically.

The only thing that makes a difference in BI is GPU clock speed. NOT memory speed, not CUDA cores, and in my case not the amount of memory (because BI uses at most 29xx MB of memory, which the 1060 has).
And overclocking these cards makes a big enough difference.

If BI could use more CUDA cores that would be good, but until then there is no point in having the higher-end cards.

In the 10xx series the 1080 has a higher GPU clock, but its price makes no sense, because BI can't use all of that core count.
The 1060 has a higher clock speed than the 1070 and a much lower price, so there is a clear winner.

My conclusion: it's better to have two 1060 cards than one 1070-class card or a 1080. Not only will that be more powerful, it will be cheaper too.
 

eluraamat

Young grasshopper
Joined
Nov 27, 2017
Messages
36
Reaction score
8
"NVIDIA GPUs contain one or more hardware-based decoder and encoder(s) (separate from the CUDA cores) which provides fully-accelerated hardware-based video decoding and encoding for several popular codecs. With decoding/encoding offloaded, the graphics engine and the CPU are free for other operations. "

That's why a better GPU with more CUDA cores will not help.
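If you want to see that split for yourself, one option is to watch the fixed-function decoder (NVDEC) separately from the general GPU engine while BI is decoding. A minimal sketch, assuming the third-party nvidia-ml-py (pynvml) bindings are installed:

```python
# Poll NVDEC utilization separately from overall GPU utilization via NVML.
# Run while Blue Iris is decoding: the decoder should show load even though
# the CUDA/3D engine stays near idle. Assumes nvidia-ml-py (pynvml) is installed.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first Nvidia GPU

try:
    for _ in range(10):
        dec_util, _period_us = pynvml.nvmlDeviceGetDecoderUtilization(handle)
        gpu_util = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu
        print(f"NVDEC: {dec_util:3d}%   GPU (3D/CUDA): {gpu_util:3d}%")
        time.sleep(1)
finally:
    pynvml.nvmlShutdown()
```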
 

Netwalker

Getting the hang of it
Joined
Aug 8, 2017
Messages
46
Reaction score
28
Enough with the testing, or wondering whether Ken can make improvements to the Nvidia-based decoding - I think the Wikipedia article about Quick Sync says it all:

Intel Quick Sync Video - Wikipedia

“A benchmark from Tom's Hardware showed in 2011 that Quick Sync could convert a 449 MB, four-minute 1080p file to 1024×768 in 22 seconds. The same encoding using only software took 172 seconds but it is not clear what software encoder was used and how it was configured. The same encoding took 83 or 86 seconds GPU-assisted, using an Nvidia GeForce GTX 570 and an AMD Radeon HD 6870, respectively, both of which were at that time contemporary high-end GPUs.”

GPUs may have improved a bit since 2011, but so has Quick Sync.

“The Kaby Lake & Coffee Lake microarchitecture adds full fixed-function H.265/HEVC Main10/10-bit encoding and decoding acceleration & full fixed-function VP9 8-bit & 10-bit decoding acceleration & 8-bit encoding acceleration.”

So it's not that the Nvidia implementation is bad or anything. Quick Sync is really just damn good, and it keeps getting better. Quick Sync has been extensively optimized for low power usage (laptops & tablets), which also makes it hands down the better BI option for hardware-based decoding.
 

ryan99alero

Young grasshopper
Joined
Sep 2, 2018
Messages
38
Reaction score
22
Location
Earth
Current system:
HP DL380p Gen8
Dual E5-2670 2.6 GHz 8-core
256GB RAM
Mixed storage: about 1TB of local 15K SAS for OS installs and another 30TB of iSCSI for storage.
Backups are done using Veeam Cloud Connect to AWS for backup and DR.

Since server CPUs, or at least my server CPU, don't support Intel Quick Sync, that leaves me with buying an Nvidia Tesla or Grid series GPU, as a regular 10-series card doesn't work with hypervisors. I'm trying to avoid having to buy another computer just to run BI efficiently. If I didn't like the simplicity of BI, I'd just switch to Aimetis like I run at work. Has anyone done any testing with any of the server-platform Nvidia cards? I'm surprised by the CPU usage of the software.

If I were to buy another computer, does BI do clustering and/or failover? I'm still in the beginning phase of my home security system, but I also have it integrated with Apple HomeKit by way of a HomeBridge server acting as a proxy.
 