What is AI and how does it work?

Can you define AI? What about deep learning and machine learning? These terms are used a lot lately, and they can be confusing. Here, we break down what each term means and how it is used in video surveillance.

What is AI? AI is Artificial Intelligence

The concept of artificial intelligence has been around for decades. The word “artificial” means “not natural; manmade,” and “intelligence” means the ability to think or understand. AI is a technology that allows a machine, such as a video surveillance system, to perform tasks that traditionally require human intelligence to execute. The goal of AI is to train technology to do things that humans can presently do better.

AI is not a system in and of itself: it is something that is implemented in a system. To compare: you wouldn’t say that infrared technology is a type of video surveillance system. Rather, infrared enables camera sensors to capture clear images in low-light conditions. In the same way, AI is a technology that lets cameras improve video analytics by integrating data processing and image processing.

AI helps prevent information overload, especially in security systems that cover large facilities and have a lot of cameras. With AI, video data is constantly interpreted, and unusual objects or activity are brought to the security operators’ attention. This lets human staff take an active part in security while minimizing the need to watch routine day-to-day activity.

Machine Learning

Machine learning is a method of refining a computer algorithm by providing it with data and teaching it to improve its performance by making adjustments based on that data. In other words, machine learning algorithms get better in response to the data they are exposed to over time.
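To make "improving by making adjustments based on data" concrete, here is a minimal sketch in Python. It is not taken from any surveillance product: it assumes a toy task (fitting a straight line to a handful of made-up points) and uses gradient descent, one common way of making those adjustments. The function names, the sample data, and the learning rate are all illustrative assumptions.

```python
def mean_squared_error(samples, w, b):
    """Average squared gap between predictions w*x + b and the true y values."""
    return sum(((w * x + b) - y) ** 2 for x, y in samples) / len(samples)


def fit_line(samples, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b by repeatedly nudging w and b to reduce the error."""
    w, b = 0.0, 0.0  # start with no knowledge of the data
    n = len(samples)
    history = []
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in samples:
            error = (w * x + b) - y
            grad_w += 2 * error * x / n
            grad_b += 2 * error / n
        # Adjust the parameters a small step in the direction that lowers the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        history.append(mean_squared_error(samples, w, b))
    return w, b, history


# Made-up training data that follows y = 2x + 1.
samples = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9)]
w, b, history = fit_line(samples)
print(f"learned w={w:.3f}, b={b:.3f}")
print(f"error after first pass: {history[0]:.4f}, after last pass: {history[-1]:.8f}")
```

The error recorded after the last pass is far smaller than after the first, which is the "gets better in response to data" behavior described above, just at a very small scale.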

Through machine learning, a machine can analyze, understand, and identify a pattern in data, as well as make a decision, without being managed by a human. This means a computer can perform tasks that would be impossible for a human to perform, due to sheer scope or processing capability. Facial recognition is a good example: current systems can analyze hundreds of faces per second across multiple cameras.

Deep Learning

Deep learning is a more intricate type of machine learning that is inspired by the neural network structure of the human brain. An Artificial Neural Network (ANN) arranges simple computing units in multiple layers. Data passes through the layers in sequence, with each layer transforming the output of the one before it, which improves accuracy in analysis.
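As a rough illustration of data moving through layers, the sketch below passes three input numbers through a tiny two-layer network in plain Python. Everything here is an assumption made for illustration: the weights are hand-picked rather than learned, the network is far smaller than anything used in practice, and the function names are hypothetical.

```python
import math


def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + bias
        for row, bias in zip(weights, biases)
    ]


def relu(values):
    """Keep positive values and replace negative ones with zero."""
    return [max(0.0, v) for v in values]


def forward(inputs, layers):
    """Pass the inputs through each layer in turn and return a score in (0, 1)."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)  # non-linearity between layers
    return 1.0 / (1.0 + math.exp(-activations[0]))  # sigmoid on the final value


# Hand-picked, illustrative weights: 3 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
score = forward([1.0, 2.0, 3.0], layers)
print(f"score: {score:.3f}")
```

In a real deep learning system, the weights would be learned from large amounts of data rather than typed in by hand, and there would be many more layers and units.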

Applying deep learning to video surveillance helps in behavioral analytics, especially since rule-based analytics have limitations, and thus can have unacceptable false alarm rates. By examining activity over an extended period, the system can use deep learning to establish normal patterns of behavior of people and other moving objects. It is possible that after such analysis, anomalous behavior like a vehicle driving on a sidewalk or a person scaling a building could be detected without explicit rules defining it as such.
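The idea of learning what is normal and then flagging departures from it can be sketched without any deep learning at all. The Python example below is a deliberately simplified stand-in, not the method a deep learning system would use: it assumes a single hypothetical measurement (moving objects seen per hour in a sidewalk zone), learns its average and spread from made-up history, and flags values that fall far outside that range. The threshold of three standard deviations is also an assumption.

```python
from statistics import mean, stdev


def learn_baseline(history):
    """Summarize past observations as (average, standard deviation)."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, z_threshold=3.0):
    """Flag a value that lies more than z_threshold deviations from the average."""
    average, spread = baseline
    if spread == 0:
        return value != average  # no variation seen, so any change stands out
    return abs(value - average) / spread > z_threshold


# Made-up hourly object counts observed during a normal period.
history = [12, 15, 11, 14, 13, 12, 16, 14]
baseline = learn_baseline(history)

print(is_anomalous(14, baseline))  # a typical hour
print(is_anomalous(40, baseline))  # far busier than anything seen before
```

Note that no rule says "40 objects is too many." The limit comes from the observed history, which is the point the paragraph above makes about detecting anomalies without explicit rules.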

The Big Picture

Now you understand that AI is powered by machine learning and deep learning, and none of it can function without data. Imagine this scenario: you have millions of terabytes of data that are virtually useless unless someone can find correlations in the data. This is exactly what happens in a video surveillance system that is not using machine learning or deep learning to pull meaningful information out of the data that’s being collected. Big data combined with deep learning, on the other hand, has the potential to transform video surveillance from a passive visual surveillance solution to a much more active one.

Here’s an example. A people-counting camera can provide valuable information on how many people enter and exit a defined area. It can also detect loitering, keep capacity counts, and signal an alarm when predefined thresholds are exceeded. Combine that with Point of Sale (POS) information, and counting how many people pass by certain retail displays can provide valuable insight. You could better determine the effectiveness of an end-cap display, for example by tabulating how many people stop to look versus those who don’t, and comparing that to actual purchases. Adding facial recognition can help detect additional patterns in browsing that would otherwise go unnoticed, such as men with beards stopping and looking at female-oriented product displays. Whether or not you understand why this is happening, it illuminates an interest trend nonetheless.
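The arithmetic behind the end-cap example is simple enough to sketch in Python. The numbers and function names below are hypothetical: the sketch assumes the camera system reports how many people passed the display and how many stopped, and that the POS system reports how many purchases of the displayed product were made in the same period.

```python
def display_effectiveness(passersby, stoppers, purchases):
    """Return (stop rate, conversion rate) for a retail display."""
    if passersby <= 0:
        raise ValueError("passersby must be positive")
    stop_rate = stoppers / passersby  # share of passersby who stopped to look
    # Share of those who stopped that went on to buy; zero if nobody stopped.
    conversion_rate = purchases / stoppers if stoppers else 0.0
    return stop_rate, conversion_rate


def over_capacity(entries, exits, limit):
    """True when the number of people currently inside exceeds the limit."""
    return entries - exits > limit


# Made-up counts for one day at one end-cap display.
stop_rate, conversion_rate = display_effectiveness(
    passersby=200, stoppers=50, purchases=10
)
print(f"stop rate: {stop_rate:.0%}, conversion: {conversion_rate:.0%}")

# Made-up door counts checked against a capacity threshold of 80 people.
print(over_capacity(entries=120, exits=30, limit=80))
```

Comparing these two rates across displays, or across days for the same display, is one straightforward way to turn raw counts into the kind of insight described above.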