Script to Detect Colour, Make and Model of Car and People from IP Camera (Updated July 2019)

hayrone · Jul 24, 2019

ilovecoffee said:
I figured out how to do the first round of predictions completely without a cloud service, using Tensorflow on Windows in Python. It's not a complicated setup at all. Why no one has made a GUI for this type of use-case is beyond me...

You do require an Nvidia graphics card (a fairly modern one, I think GeForce 960 onwards; I'm using a GTX 1060) to do the training and predictions. Predictions are not as fast as doing it in the cloud: it takes approximately 3.2 seconds to get a result from a GTX 1060 card. This is ideal if you want to train on certain trigger images that you know occur in your BlueIris setup, such as a familiar face, a car, a person in a specific area, a package on a doorstep, or even a horde of rats.

I can write it up so that all you need to do is install a few things, and then just run a batch file to train images, and another one to return a prediction via Pushbullet or an HTTP Request of your choice (so you can use IFTTT or Smartthings for example). Because everyone here runs Windows by default, it makes it easy.

If anyone is interested let me know.

A write-up would be terrific. Do you ONLY require the Nvidia GPU for training? My BI runs on a different system from the one with my 1060, so I could easily train on the GPU system and hopefully move the model over to the actual BI system. Is that possible?
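
For the "HTTP Request of your choice" part of the quoted post, a minimal sketch using only the Python standard library might look like the following. It assumes an IFTTT Webhooks applet; the event name `blueiris_prediction`, the placeholder key and the helper names are hypothetical and are not taken from ilovecoffee's script.

```python
import json
import urllib.request


def build_ifttt_request(event, key, prediction, confidence):
    """Build (but do not send) a POST request for the IFTTT Webhooks service."""
    url = "https://maker.ifttt.com/trigger/{}/with/key/{}".format(event, key)
    # IFTTT Webhooks accepts up to three free-form fields: value1..value3.
    payload = {"value1": prediction, "value2": "{:.2f}".format(confidence)}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_prediction(event, key, prediction, confidence, timeout=10):
    """Send the prediction and return the HTTP status code."""
    request = build_ifttt_request(event, key, prediction, confidence)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical values; replace the key with your own before sending.
    req = build_ifttt_request("blueiris_prediction", "YOUR_KEY", "Person", 0.93)
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

Building the request separately from sending it makes it easy to check the URL and payload before anything goes over the network.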

ilovecoffee · Jul 25, 2019

No, you don't require an Nvidia GPU for training; from what I understand it's just slower with only your CPU. You can also still use Google AutoML to train a model and download it. I'm trying to figure out the best settings for training, as my own model is not as accurate as Google's AutoML model. I'll post it once I've got the kinks worked out.

ilovecoffee · Jul 30, 2019

So some results:

1) Creating a TensorFlow model that can detect two categories, such as "No Person" and "Person", works well when trained on a local machine. The same could probably be done for "Vehicle" and "No Vehicle".

2) Creating a model that can detect multiple categories, such as "Bob's Car", "Alice's Car", "Mail man's car" and "Some Other Car", seems to fail miserably. I am not a data scientist...or even a coder, so I haven't had any luck getting meaningful results with more complex categories. I have tried to get Auto-Keras working, which is an API that determines the best learning model for images, but it is so riddled with compatibility issues that it's nearly impossible to set up on a Windows machine.

3) I learnt that with Google's AutoML, which is infinitely easier, you can train on a bunch of images and categories and export the model as a "tflite" model (TensorFlow Lite, a fast-performing model format). This costs money. When you first sign up, though, you get 15 free hours of training time. Training a model with, say, 500 images will probably use 2 hours of training time. If you don't have free credit, it costs approximately $5 to $10 to train a model. Is that worth it? It depends; for me, yes. It's a one-time fee, and you don't rely on the cloud to classify your images after the fact.

If you train a TFLite model, and export it, you can run the classification code right on your machine and it's super-fast, like 10ms to get a result.

zibadun · Aug 4, 2019

hayrone said:
Do you ONLY require the Nvidia GPU for training? My BI runs on a different system from the one with my 1060, so I could easily train on the GPU system and hopefully move the model over to the actual BI system. Is that possible?

Once the model is computed, running predictions should not require a lot of CPU cycles. Most cell phones are able to do that easily without draining the battery, for example scene or face detection in live view. I'm thinking about using a Raspberry Pi or BeagleBone to process images.

ilovecoffee · Aug 5, 2019

zibadun said:
Once the model is computed, running predictions should not require a lot of CPU cycles. Most cell phones are able to do that easily without draining the battery, for example scene or face detection in live view. I'm thinking about using a Raspberry Pi or BeagleBone to process images.

Yes, in fact I've created a script that uses the Darknet YOLOv3 trained model to detect objects in alert images. It's super fast (it doesn't need to keep running like AI Tool, and it classifies in a few milliseconds). It can detect people, cars and a bunch of other things.

For me it's useful for people (it's very accurate), but for cars I want to know what car it is, and I want to train it to recognize my family's cars.

I'm not sure what the general community wants to achieve; give me some ideas and maybe I can create it for you all.

zibadun · Aug 5, 2019

ilovecoffee said:
I'm not sure what the general community wants to achieve

I've used your idea to ID vehicles via Sighthound. If it finds a USPS truck in the image, my script sends a message and photo to a private Telegram channel, notifying me of the USPS delivery. All other motion detection is sent to a second, "muted" Telegram channel (I don't want to be bothered for every motion, but I can go into the channel and review pictures at any time).

It's hard to tell what else may be useful. Package on the porch is a good idea!
73!
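
The two-channel routing described in this post can be sketched as follows. This is only an illustration under stated assumptions: the chat IDs and function names are hypothetical, it sends text only (the post also sends a photo), and it uses the Telegram Bot API `sendMessage` method with `disable_notification` as one possible way to keep the second channel quiet, which may differ from how zibadun muted it.

```python
import urllib.parse
import urllib.request

# Hypothetical chat IDs: one channel that notifies, one that stays muted.
ALERT_CHAT_ID = "-1001111111111"
MUTED_CHAT_ID = "-1002222222222"


def route_detection(labels, watch_for="USPS truck"):
    """Return (chat_id, silent) for the list of labels found in an image."""
    if watch_for in labels:
        return ALERT_CHAT_ID, False
    return MUTED_CHAT_ID, True


def build_message_request(bot_token, labels, watch_for="USPS truck"):
    """Build a Telegram Bot API sendMessage request without sending it."""
    chat_id, silent = route_detection(labels, watch_for)
    params = {
        "chat_id": chat_id,
        "text": "Detected: " + (", ".join(labels) or "motion only"),
        # disable_notification delivers the message without sound.
        "disable_notification": "true" if silent else "false",
    }
    url = "https://api.telegram.org/bot{}/sendMessage".format(bot_token)
    data = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")
```

Passing the returned request to `urllib.request.urlopen` would send it; the routing decision itself is kept in `route_detection` so it can be checked without a bot token.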

zibadun · Aug 9, 2019

@ilovecoffee
I was able to create a google model and export the .tflite file

Do you have an example in python of how to classify an image using this exported model? I have the python env setup and tensorflow locally. The google docs go deep fast.

Also, I wonder if it's possible to retrain Google's model on my own GPU with more images.

ilovecoffee · Aug 9, 2019

zibadun said:
@ilovecoffee
I was able to create a google model and export the .tflite file

Do you have an example in python of how to classify an image using this exported model? I have the python env setup and tensorflow locally. The google docs go deep fast.

Also, I wonder if it's possible to retrain Google's model on my own GPU with more images.

Yes I do, I'll post it here on Monday when I'm back at the office.

I've also got custom object detection working, with a model I've trained myself. It's a bit more time consuming because you have to use an app to draw boxes around each object in every image (it took me a good 20 minutes of torture to go through 400ish images) but you can train it yourself easily. Otherwise you can use the built in objects like "person" or "car" with no training.

zibadun · Aug 9, 2019

Good deal thanks

I'm trying to figure out how to load the .pb file exported from Google into TensorFlow and continue training it. I got Google to train for two labels (with 1 node-hour for free) and the results are pretty good. I wanted to add another label or more images, and that's when Google starts asking for $ to retrain; retraining can only be done within 14 days, otherwise you have to start from scratch. I think this is called transfer learning. I'll post here if I get it to work.

zibadun · Aug 10, 2019

It looks like Google's own AutoML Vision image classification training is based on this example:

https://colab.research.google.com/g...lob/master/community/en/flowers_tf_lite.ipynb

They are using MobileNet V2 as the base, which explains how they are able to build a new model with just a few new images. I think this training is reproducible at home with a GPU card.
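
The idea of reusing MobileNet V2 as a frozen base and training only a small head on your own images can be sketched with Keras. This is a minimal sketch assuming TensorFlow 2.x; the function name `build_classifier`, the 224-pixel input size and the head layers are illustrative choices, not a description of what AutoML does internally.

```python
import tensorflow as tf


def build_classifier(num_classes, image_size=224, weights="imagenet"):
    """MobileNet V2 feature extractor with a small trainable head."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=(image_size, image_size, 3),
        include_top=False,   # drop the 1000-class ImageNet head
        weights=weights,     # "imagenet" downloads pretrained weights
    )
    base.trainable = False   # freeze the pretrained layers

    inputs = tf.keras.Input(shape=(image_size, image_size, 3))
    # training=False keeps the batch-norm statistics of the base fixed.
    x = base(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer class labels
        metrics=["accuracy"],
    )
    return model
```

Training would then be a call to `model.fit(...)` on your labelled alert images, with pixel values scaled to the -1..1 range that MobileNet V2 expects. Only the new head is trained, which is why a few hundred images can be enough.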

ilovecoffee · Aug 12, 2019

zibadun said:
Good deal thanks

I'm trying to figure out how to load the .pb file exported from Google into TensorFlow and continue training it. I got Google to train for two labels (with 1 node-hour for free) and the results are pretty good. I wanted to add another label or more images, and that's when Google starts asking for $ to retrain; retraining can only be done within 14 days, otherwise you have to start from scratch. I think this is called transfer learning. I'll post here if I get it to work.

So here is my example. This is a function, so you can paste it into your existing file and call it with automl_tensorflowlite(variables here), and it will return the label. It also has an option to avoid duplicate responses for the same thing back-to-back. In AutoML I made a label called "Nothing" to train it on what nothing looks like, or on things to ignore; that is also why I have the if statement "if prediction != "Nothing"".

Code:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import pickle

import numpy as np
from PIL import Image
from tensorflow.lite.python.interpreter import Interpreter

# File that remembers the last prediction between runs.
PREDICTION_CACHE = 'automl-tflite-prediction.pickle'


def load_labels(filename):
    """Read one label per line from the label file exported with the model."""
    with open(filename, 'r') as f:
        return [line.strip() for line in f]


def automl_tensorflowlite(image, model_path, label_file,
                          min_confidence_score=0.5, avoid_duplicate="False"):
    # Normalisation constants for float models (maps 0..255 to -1..1).
    input_mean = 127.5
    input_std = 127.5

    interpreter = Interpreter(model_path=model_path)
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # Check the type of the input tensor: float32 or quantized uint8.
    floating_model = input_details[0]['dtype'] == np.float32

    # Input shape is NxHxWxC, so H is index 1 and W is index 2.
    height = input_details[0]['shape'][1]
    width = input_details[0]['shape'][2]
    # Force 3 channels so greyscale or RGBA images do not break the shape.
    img = Image.open(image).convert('RGB').resize((width, height))

    # Add the batch (N) dimension.
    input_data = np.expand_dims(np.asarray(img), axis=0)

    if floating_model:
        input_data = (np.float32(input_data) - input_mean) / input_std

    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()

    output_data = interpreter.get_tensor(output_details[0]['index'])
    results = np.squeeze(output_data)
    # Quantized models output 0..255; float models already output 0..1.
    if not floating_model:
        results = results / 255.0

    labels = load_labels(label_file)
    top_k = results.argsort()[-5:][::-1]
    top_node = top_k[0]
    confidence_score = float(results[top_node])

    print("-------------------------------")
    print("Model Testing Results")
    print("-------------------------------")
    for i in top_k:
        print('{:08.6f}: {}'.format(float(results[i]), labels[i]))

    if confidence_score <= min_confidence_score:
        print("Nothing known found, moving on...")
        return None
    prediction = labels[top_node]

    # Accepts both the string "True" and the boolean True.
    if str(avoid_duplicate) == "True":
        try:
            with open(PREDICTION_CACHE, 'rb') as f:
                previous_prediction = pickle.load(f)
        except (FileNotFoundError, EOFError):
            # No usable cache yet (first run or empty file).
            previous_prediction = None
        if prediction == previous_prediction:
            print('Data is the same!')
            if prediction != "Nothing":
                # Same object as last time: stop the script so that no
                # duplicate notification is sent.
                raise SystemExit
        with open(PREDICTION_CACHE, 'wb') as f:
            pickle.dump(prediction, f, pickle.HIGHEST_PROTOCOL)
    return prediction
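
The duplicate check in the function above stops the whole script with SystemExit. If the function is called from a long-running script instead of once per alert, a variant that reports the repeat to the caller may be easier to work with. The following is a minimal sketch of that idea; the name `is_repeat`, the JSON cache file and the `ignore` parameter are hypothetical and not part of the original code.

```python
import json
import os


def is_repeat(prediction, cache_path="last-prediction.json", ignore=("Nothing",)):
    """Return True if prediction equals the previously stored one.

    Labels listed in `ignore` are never reported as repeats. The cache
    file is updated on every call.
    """
    previous = None
    if os.path.exists(cache_path):
        try:
            with open(cache_path, "r") as f:
                previous = json.load(f)
        except ValueError:
            # Empty or corrupt cache file: treat as no previous prediction.
            previous = None
    with open(cache_path, "w") as f:
        json.dump(prediction, f)
    return prediction == previous and prediction not in ignore
```

A caller would then skip the notification when `is_repeat(prediction)` is true, without ending the process.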

zibadun · Aug 12, 2019

@ilovecoffee

Thank you for the code sample. It worked great on the first try! Awesome stuff.
So far I'm only detecting the USPS truck, but I will add more goodness now that the workflow is known.

I'm letting my camera detect motion and upload a picture by FTP to the NAS (Synology). A script running on the NAS instantly detects the new file in the FTP share and processes it (i.e. there is no Blue Iris involved). So far it works great!
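
A script that picks up new files from an FTP share can be sketched with simple polling. This is not zibadun's script, which is not shown in the thread and may use file-system notifications instead; the function names and the one-second poll interval are assumptions for illustration.

```python
import os
import time


def find_new_images(folder, seen, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted paths of image files in folder that are not in seen."""
    new_files = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if path in seen or not name.lower().endswith(extensions):
            continue
        if os.path.isfile(path):
            new_files.append(path)
    seen.update(new_files)
    return new_files


def watch(folder, handle_image, poll_seconds=1.0):
    """Poll folder forever and call handle_image(path) for each new file."""
    seen = set()
    find_new_images(folder, seen)  # ignore files that already exist
    while True:
        for path in find_new_images(folder, seen):
            handle_image(path)
        time.sleep(poll_seconds)
```

Here `handle_image` would be the classification call. One caveat with this approach is that a file can show up in the listing while the FTP upload is still in progress, so a short delay or a retry before opening the image may be needed.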

silencery · Sep 9, 2019

@ilovecoffee thanks for putting this together! This is an interesting thing to play with.

Just to clarify, automl can't actually distinguish between arriving or leaving, can it?
I ask because I looked over the automl docs and it doesn't seem to have this ability yet. However, you mentioned these folders, so just wanted to double-check.

Thanks again!

ilovecoffee · Sep 9, 2019

silencery said:
@ilovecoffee thanks for putting this together! This is an interesting thing to play with.

Just to clarify, automl can't actually distinguish between arriving or leaving, can it?
I ask because I looked over the automl docs and it doesn't seem to have this ability yet. However, you mentioned these folders, so just wanted to double-check.

Thanks again!

The alert images BlueIris takes when I'm coming and when I'm going look different, and that is how the AI is able to tell the difference.

zibadun · Sep 9, 2019

If the cars are facing in a different direction when coming and going, that should be enough for AutoML to classify based on this feature. You can also trigger on the absence of a particular car (missing object search). There is really almost no limit on what can be done with AutoML.
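
Triggering on the absence of a car can be sketched as watching for changes in presence between consecutive alert images. This is only an illustration: it assumes you already have, for each image in chronological order, the set of labels the model reported, and the function name `presence_events` is hypothetical.

```python
def presence_events(observations, label):
    """Return a list of (index, event) pairs for one label.

    observations is a sequence of label sets, one per alert image, in
    chronological order. event is "arrived" when the label appears after
    being absent, and "left" when it disappears after being present.
    """
    events = []
    was_present = None
    for index, labels in enumerate(observations):
        is_present = label in labels
        if was_present is not None and is_present != was_present:
            events.append((index, "arrived" if is_present else "left"))
        was_present = is_present
    return events
```

With this approach the direction the car faces does not matter, only whether it is visible in the frame from one alert to the next.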

silencery · Sep 10, 2019

That's what I was thinking. Thanks for confirming!

Unfortunately our environment is different (we pull straight into/out of our driveway always the same direction), but nice to know of different ways this can be applied.

ilovecoffee · Sep 11, 2019

silencery said:
That's what I was thinking. Thanks for confirming!

Unfortunately our environment is different (we pull straight into/out of our driveway always the same direction), but nice to know of different ways this can be applied.

But are the vehicles in different parts of the image for coming or going in the alert jpegs? I bet that would be enough...one at the top of the driveway and one at the foot of the driveway.

silencery · Sep 11, 2019

I assumed that would be another way to define direction.

Unfortunately, in our environment, the driveway cameras are zoomed in pretty tight. The cars take up a large part of the frame, so there probably won't be enough "runway" so to speak to determine a difference in direction between arriving/leaving. I've just started the collection process for still images, so I'll still run them through the automl processor to play with the results. Can't wait to try this out!

Uwvid · Dec 16, 2020

ilovecoffee said:
So some results:

1) Creating a TensorFlow model that can detect two categories, such as "No Person" and "Person", works well when trained on a local machine. The same could probably be done for "Vehicle" and "No Vehicle".

2) Creating a model that can detect multiple categories, such as "Bob's Car", "Alice's Car", "Mail man's car" and "Some Other Car", seems to fail miserably. I am not a data scientist...or even a coder, so I haven't had any luck getting meaningful results with more complex categories. I have tried to get Auto-Keras working, which is an API that determines the best learning model for images, but it is so riddled with compatibility issues that it's nearly impossible to set up on a Windows machine.

3) I learnt that with Google's AutoML, which is infinitely easier, you can train on a bunch of images and categories and export the model as a "tflite" model (TensorFlow Lite, a fast-performing model format). This costs money. When you first sign up, though, you get 15 free hours of training time. Training a model with, say, 500 images will probably use 2 hours of training time. If you don't have free credit, it costs approximately $5 to $10 to train a model. Is that worth it? It depends; for me, yes. It's a one-time fee, and you don't rely on the cloud to classify your images after the fact.

If you train a TFLite model, and export it, you can run the classification code right on your machine and it's super-fast, like 10ms to get a result.

Regarding the export to TensorFlow Lite: I have no option on my dashboard for export. When I go to "Test & Use", the only option is to deploy to the cloud, with a warning about the costs that will be incurred. Have they removed the option, or am I missing something?
