Computer vision is a branch of artificial intelligence (AI) that enables a machine to see and understand the world as we do. With the help of this technology, a computer can imitate the human visual system to get data from pictures and videos and use machine learning to make decisions based on that data.
The definition might sound complex, but the purpose of computer vision is simple: to automate tasks that humans can do and, in some cases, to perform them with even greater accuracy, reliability, and safety.
Thanks to recent advances in AI, machine learning, and deep learning, computer vision-enabled products and technology are seeing huge growth. In fact, the market is projected to reach $48 billion by 2022, according to a report by Tractica. All this to say, the demand for computer vision-related jobs—including computer engineers, developers, programmers, and scientists—is greater than ever.
If any of these career paths interest you, understanding computer vision is a must. This article will provide a basic rundown, but getting your IT degree will give you a much deeper knowledge and equip you with the skills necessary to join this fast-growing field.
The first digital image scanner was introduced in 1959 and is considered the precursor to artificially intelligent image recognition. This early technology was able to transform images into grids of numbers that computers could recognize. Soon after its introduction, Larry Roberts, considered the father of the internet and computer vision, started exploring the possibilities of extracting 3D geometrical information from 2D perspective views of blocks. By the 1970s, his research had formed the early foundations for many of the computer vision and object detection algorithms that exist today.
Multiple advances in computer vision came from Roberts’ studies, such as methods for capturing and recording objects and recognition-by-components, which suggested the human eye can recognize objects by breaking them down into their main components.
By the early 1990s, more advanced technology paved the way for computer vision powered by deep learning algorithms called convolutional neural networks (CNNs). A CNN works in a way loosely modeled on how the human brain interprets an image. It assigns importance to various aspects of an image and classifies them, which reduces the time it takes a computer to perform image processing tasks. The most common application of CNNs is image recognition, the process of identifying objects and features in an image.
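To make the idea of a network "assigning importance" to parts of an image more concrete, here is a minimal sketch of the sliding-filter operation that gives convolutional networks their name. It is not a full CNN: there is no learning and no classification step. The tiny image, the hand-picked filter values, and the function name `convolve2d` are all assumptions made for illustration.

```python
# A tiny grayscale "image" as a grid of numbers (0 = dark, 9 = bright).
# The left half is dark and the right half is bright, so it has one vertical edge.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# Hand-picked 3x3 filter that responds to dark-to-bright changes from left to right.
# In a real CNN these weights are learned from data, not written by hand.
vertical_edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]


def convolve2d(img, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return
    the grid of weighted sums, often called a feature map."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(img) - k_rows + 1
    out_cols = len(img[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += img[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


feature_map = convolve2d(image, vertical_edge_kernel)
for row in feature_map:
    print(row)  # prints [0, 27, 27, 0] for each of the two rows
```

The large values in the middle of each row mark where the edge sits, while the flat regions score zero. A trained network stacks many such filters and learns their values from labeled examples.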
Today, computer vision is being integrated into major products and technologies we use every day. Apps like Google Maps leverage computer vision to identify real-world streets, businesses, and office buildings; Facebook uses computer vision to identify people in photos; and Apple’s iPhone allows us to unlock our phones with facial recognition technology powered by computer vision.
It’s important to know the difference between computer vision and image processing because the two terms are often misunderstood and used interchangeably.
Image processing is used to enhance images by applying algorithms that smooth, sharpen, stretch, or adjust the contrast of an image. Meanwhile, computer vision often uses the same image processing algorithms, but for the purpose of recognizing objects or faces.
To put it simply: the goal of image processing is to modify an image, while the goal of computer vision is to recognize real-world objects. The two often rely on many of the same underlying algorithms.
As mentioned, computer vision has seen a huge boom in recent years. Here’s a look at which industries are being impacted the most:
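The distinction can be shown with a small sketch in which both steps start from the same grid of pixel values. The image, the threshold of 5, and the function names `box_blur` and `find_bright_object` are hypothetical choices for illustration. Real systems use far more capable methods than a fixed brightness threshold.

```python
# A 4x5 grayscale image with one bright 2x2 "object" on a dark background.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 0, 0, 0, 0],
]


def box_blur(img):
    """Image processing: return a new image in which each pixel is the
    average of itself and its in-bounds neighbors (a simple smoothing filter)."""
    rows, cols = len(img), len(img[0])
    blurred = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            neighbors = [
                img[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(sum(neighbors) / len(neighbors))
        blurred.append(out_row)
    return blurred


def find_bright_object(img, threshold=5):
    """Toy vision step: return the bounding box (top, left, bottom, right)
    of all pixels brighter than the threshold, or None if there are none."""
    hits = [
        (r, c)
        for r, row in enumerate(img)
        for c, value in enumerate(row)
        if value > threshold
    ]
    if not hits:
        return None
    hit_rows = [r for r, _ in hits]
    hit_cols = [c for _, c in hits]
    return (min(hit_rows), min(hit_cols), max(hit_rows), max(hit_cols))


smoothed = box_blur(image)            # the output is another image
location = find_bright_object(image)  # the output is a statement about the scene
print(location)  # prints (1, 1, 2, 2)
```

The blur returns a modified picture, while the detector returns a description of what is in the picture and where. That difference in output is the difference between the two fields.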
Since so many diagnoses are done by image processing (such as MRI scans, x-rays, CT scans, etc.), computer vision can help reduce the time it takes doctors to analyze medical images and improve the accuracy of their diagnoses—all of which can be life-saving for patients.
The entertainment industry has long used focus groups as a way to gauge how people respond to TV shows, commercials, and movies. But with the rise of computer vision, viewer surveys and interviews may become a thing of the past. Now, computer vision and deep learning algorithms can track and make sense of viewers’ facial expressions and eye movements and translate those reactions into quantifiable data.
Computer vision can provide a range of data analytics for marketers who monitor brand exposure during sporting events. It can help tackle massive amounts of raw video content that comes from sports venues, track logos across various media platforms and broadcasting channels, and then calculate the value of every case of exposure.
Computer vision can be deployed to check live or recorded surveillance clips that help law enforcement gain essential information. Security or surveillance systems using computer vision can scan live footage of public places and identify illegal actions or harmful objects, or use facial detection algorithms to search groups of people and find persons of interest.
In recent years, image recognition has been a part of several robotics-based smart home products. One great example of this is iRobot’s Roomba vacuum. Using computer vision and built-in cameras, the Roomba creates a map of your home and identifies specific pieces of furniture to avoid for better navigation. It can also recognize which spaces need to be vacuumed and which can be left alone.
Similar to how computer vision is being used in robotics, the auto industry is using this technology to enable self-driving cars. With the use of real-time images and 3D maps, self-driving vehicles can identify obstacles to help avoid accidents and even brake to stop a projected collision. Computer vision technology can also gather large sets of data using cameras and sensors to assess traffic conditions, road maintenance, and location information and make real-time decisions regarding alternative routes.
Computer vision has come a long way since its inception in the 1960s. However, it’s still young and prone to challenges.
It’s constantly evolving.
Creating a machine that sees is a difficult task on its own, but it’s made even more difficult when the hardware, applications, and algorithms that support this technology are always changing.
It’s not always accurate.
One of the major challenges in this field is that the technology is still not comparable to the human visual system, which is what it essentially tries to imitate. For example, an object’s texture, shape, size, colors, shadows, etc. can all affect how a computer “sees” the image in front of it and limit its accuracy.
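As a rough illustration of how something as ordinary as a shadow can throw off a simple system, the sketch below counts "object" pixels with a fixed brightness threshold, first on a clean scene and then on the same scene with its right half darkened. The scene, the threshold, the way the shadow is simulated, and the name `count_bright_pixels` are assumptions for this toy example. Modern models are more robust than this, but they face the same kind of problem.

```python
def count_bright_pixels(img, threshold=5):
    """Count the pixels whose brightness exceeds a fixed threshold."""
    return sum(1 for row in img for value in row if value > threshold)


# A bright 2x2 object (value 8) on a dark background.
scene = [
    [0, 0, 0, 0],
    [0, 8, 8, 0],
    [0, 8, 8, 0],
    [0, 0, 0, 0],
]

# The same scene with a simulated shadow over the right half:
# every pixel in columns 2 and 3 loses half of its brightness.
shadowed = [
    [value // 2 if c >= 2 else value for c, value in enumerate(row)]
    for row in scene
]

print(count_bright_pixels(scene))     # prints 4: the whole object is found
print(count_bright_pixels(shadowed))  # prints 2: the shaded half is missed
```

Nothing about the object changed, only the lighting, yet the detector now reports half as much of it. A person looking at the same scene would not be fooled.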
It can be costly.
Neural networks used for computer vision applications need a lot of data to produce good results. Although images are available online in greater quantities than ever, these networks need high-quality labeled training data. That gets expensive because the labeling has to be done by people.
Despite its challenges, the demand for computer vision (and the people who can help implement it) is growing faster than ever. Ready to join one of the most in-demand and exciting fields in technology? Get started today with an online degree from WGU.