As entrepreneurs, we are always on the lookout for the next big thing. In the software development industry, there are so many areas to explore. Currently, one particular technology that’s capturing the attention of many software companies is Computer Vision. But what is Computer Vision? How can companies use it to increase revenue? Let’s find out.
How Did Computer Vision Start?
The Computer Vision field was born in research laboratories in the US in the mid-1950s, but its first commercial application came in the 1980s with Optical Character Recognition (OCR) systems. These systems would visually recognize printed alphanumeric characters and symbols.
During this time period, the optical mouse was also introduced, the forerunner of the optical and laser mice that we use now. This was the first commercially successful application of the smart camera, an imaging device paired with a mini-computer that not only captured and transmitted video, but also communicated other useful data computed by the algorithms it was programmed and trained for. These products proved the commercial viability of Computer Vision and started the trend of companies and applications using this technology.
How is Computer Vision Booming Right Now?
Technological breakthroughs in the past 20 years enabled Computer Vision applications to become more accessible, more scalable, faster, and cheaper. Here are some of these advancements:
Cheaper and More Powerful Cameras, Storage and Computing Devices
Computing devices and camera sensors have become smaller, cheaper, and more powerful than ever, thanks to advances in chip manufacturing and computer architecture. In the not-so-distant past, people would have separate gadgets for capturing pictures, recording videos, surfing the Internet, taking calls, and sending messages. Today, all of these tasks can be done on a smartphone. Most smartphones also have two or more cameras, with sensors that are sometimes more powerful than those of the average digital camera. They also offer as much as 1 TB of storage (enough for 500 full-length movies or 100,000 songs) for all your images and videos. And the best part is that two-thirds of the people in the world own such a device.
Faster Data Communication Technologies and Cloud Computing
Not only is it easier to generate images and videos (thanks to the accessibility of cameras and computers), but it is also now easier to exchange and store this data. Only fifteen years ago, it took more than a day to send a 2 GB movie file through Broadband Internet, and it was almost impossible to send a movie through a phone. Now, we can send the same movie file in less than 30 seconds over a Fiber Optic Gigabit Internet connection, or in about 5 minutes over Mobile Internet, thanks to 4G technology. And we are still improving on this with technologies like 5G, a mobile internet technology 20 times faster than 4G.
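To see where numbers like these come from, here is a rough back-of-the-envelope sketch in Python. It assumes ideal conditions (no protocol overhead, a link running at full speed the whole time), and the 55 Mbps figure for 4G is a hypothetical speed picked only for illustration, not a measurement.

```python
def transfer_seconds(file_gigabytes: float, link_megabits_per_second: float) -> float:
    """Return the ideal time in seconds to send a file over a link."""
    # 1 GB = 1000 MB and 1 byte = 8 bits, so convert the file size to megabits.
    file_megabits = file_gigabytes * 1000 * 8
    return file_megabits / link_megabits_per_second


movie_gb = 2.0

# A gigabit fiber connection moves 1000 megabits every second.
gigabit_fiber = transfer_seconds(movie_gb, 1000.0)

# Assumed 4G speed (hypothetical, for illustration only).
mobile_4g = transfer_seconds(movie_gb, 55.0)

# Speed implied by "more than a day" for the same file, in megabits per second.
implied_old_speed = (movie_gb * 1000 * 8) / (24 * 60 * 60)

print(f"Gigabit fiber: {gigabit_fiber:.0f} seconds")
print(f"4G at 55 Mbps: {mobile_4g / 60:.1f} minutes")
print(f"A full day implies about {implied_old_speed:.2f} Mbps")
```

Real-world transfers are slower than this ideal, but the sketch shows why a file that once took a day now takes seconds or minutes.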
A combination of faster communication technologies, cheaper and more powerful storage and computing devices, and more robust operating software to coordinate multiple computers enabled Cloud Computing Technology to progress. Cloud Computing is a paradigm where, instead of maintaining your own computing and storage servers, you turn to a third-party provider to take care of your computing and storage needs, and you are billed based on how much you’ve used or plan to use. This made both computing and storage cheaper and more accessible.
Abundance and Accessibility of Data
Because of all the improvements that we’ve mentioned, people, institutions, and companies find it more and more convenient to do a multitude of activities in the digital space. Real-world activities are now being done through the Internet: sharing pictures and videos on Social Networking Websites like Facebook, Twitter, Instagram, and Snapchat, streaming music, movies, and TV series on YouTube and Netflix, buying and selling items on Amazon and Alibaba, and advertising on Google Ads.
This integration of real-world and digital space generates tons of data to analyze, particularly pictures and videos. According to Domo, 2018 saw a huge uptick in images and videos generated on platforms like Snapchat (294% increase from 2017, with 2,083,333 snaps per minute), Giphy (100% increase from 2017, serving 1,388,889 GIFs per minute), and Instagram (5.65% increase from 2017, with 49,380 photos per minute). This created a sea of valuable data that Computer Vision applications can learn from.
Accessibility of Advanced Data Processing Tools
We cannot make use of the sea of data that we have mentioned if we don’t have tools to process it. In the past, people had to start from scratch in creating their own Computer Vision and Machine Learning algorithms before they could program and test their real-world Computer Vision applications. This made the development of such applications a long, hard, and costly process. But 20 years ago, open-source projects for both Computer Vision and Machine Learning started to pop up. Intel started the OpenCV library in C++ for real-time Computer Vision functions. This saved both development time and licensing fees, so developers can now focus on creating the applications themselves.
Also, in just the past few years, several Machine Learning libraries started to gain traction: Google’s TensorFlow, Keras, and Facebook’s PyTorch. TensorFlow offers both C++ and Python interfaces, while PyTorch and Keras are used primarily from Python. All are Open Source libraries capable of running Artificial Neural Network (ANN) algorithms, the basis of what is commonly known as Deep Learning. ANN is the most popular Machine Learning approach as of now because, given huge amounts of data, it can learn patterns that are undetectable even by humans. Before, such algorithms were not only hard to implement, but the implementations that did exist were accessible only by paying huge royalties or fees. Since the above-mentioned libraries are Open Source, integrating ANN into all fields, including Computer Vision, is now easier, faster, and cheaper, making it possible for more companies to use this technology to solve real-world problems.
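To give a feel for what "learning patterns from data" means, here is a toy sketch in plain Python of a single artificial neuron (a perceptron) learning the logical AND of two inputs. This is only an illustration under simplifying assumptions: it is not how the libraries above are implemented, and real networks stack many such neurons and train on far more data.

```python
# Toy training data (illustrative only): inputs and the answer we want.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, bias, inputs):
    """Fire (return 1) when the weighted sum of the inputs is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train(samples, epochs=20):
    """Perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + error * x for w, x in zip(weights, inputs)]
            bias += error
    return weights, bias


weights, bias = train(samples)
predictions = [predict(weights, bias, inputs) for inputs, _ in samples]
print(predictions)  # [0, 0, 0, 1]
```

Nobody told the neuron the rule for AND. It started with zero weights and adjusted them from the examples alone, which is the same idea Deep Learning applies at a much larger scale.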
How Valuable is the Computer Vision Industry?
According to ResearchAndMarkets.com, the Computer Vision Industry is poised to reach USD 25.32 billion by 2023 from USD 3.62 billion in 2018, with software applications holding the largest share of the Market within those years.
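As a quick sanity check on that forecast, the short Python sketch below works out the compound annual growth rate implied by the two figures. It assumes steady year-over-year growth between 2018 and 2023, which is a simplification of how any real market grows.

```python
start_value = 3.62   # USD billions in 2018, from the forecast above
end_value = 25.32    # USD billions in 2023, from the forecast above
years = 2023 - 2018

# Compound annual growth rate: the constant yearly rate that turns
# start_value into end_value over the given number of years.
cagr = (end_value / start_value) ** (1 / years) - 1

print(f"Implied annual growth: {cagr:.1%}")
```

The result is roughly 47 to 48 percent a year, meaning the forecast assumes the market grows by almost half every year.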
Investments have also been flowing into Computer Vision startups. According to AngelList, there are 707 startup companies working in Computer Vision, with an average valuation of USD 5.3 million each. These numbers will only increase due to the huge potential that the Computer Vision field has.
How is Computer Vision Used by Companies?
As we have mentioned previously, Computer Vision is a huge field. Its end goal is for computers to recognize visual cues and derive information from them better than humans can. State-of-the-art technology is still far from surpassing Human Vision, but we already have lots of commercially available applications that are helping out in these fields:
Manufacturing and Robotics
As we mentioned in the history above, one of the first commercial applications of Computer Vision was the Smart Camera. The Manufacturing Industry was the earliest and most extensive adopter of this technology. Manufacturers use it to implement Machine Vision, which applies Computer Vision techniques to control mechanical components and automate assembly line processes like Inspection and Quality Assurance, Actuating Control Parts, and Process Monitoring. Smart Cameras can be programmed to detect and measure features, orientation, and defects on items on a manufacturing line, read and verify barcodes or printed characters, and perform other repetitive tasks that need accuracy, consistency, and speed.
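Here is a simplified sketch of what "detect and measure features and defects" can look like in code. It uses plain Python and a tiny hypothetical grayscale image: bright pixels stand for the part, and dark pixels inside the part stand for a defect. The image, the threshold of 128, and the pass/fail rule are all assumptions made for illustration. A production system would use a library such as OpenCV and real camera frames.

```python
# Hypothetical 5x6 grayscale image of a part (0 = black, 255 = white).
image = [
    [0,   0,   0,   0,   0, 0],
    [0, 200, 210, 205, 198, 0],
    [0, 202,  40,  35, 207, 0],
    [0, 199, 204, 201, 203, 0],
    [0,   0,   0,   0,   0, 0],
]


def inspect(image, threshold=128, max_defect_pixels=0):
    """Measure the part and count dark pixels inside its bounding box.

    Assumes the image contains at least one pixel at or above the threshold.
    """
    bright = [(r, c) for r, row in enumerate(image)
              for c, value in enumerate(row) if value >= threshold]
    rows = [r for r, _ in bright]
    cols = [c for _, c in bright]
    top, bottom, left, right = min(rows), max(rows), min(cols), max(cols)

    # Any dark pixel inside the part's bounding box counts as a defect.
    defects = [(r, c) for r in range(top, bottom + 1)
               for c in range(left, right + 1) if image[r][c] < threshold]

    return {
        "width": right - left + 1,
        "height": bottom - top + 1,
        "defect_pixels": len(defects),
        "passed": len(defects) <= max_defect_pixels,
    }


report = inspect(image)
print(report)
```

For this sample image the part measures 4 by 3 pixels and contains 2 defect pixels, so it fails inspection. The same check runs identically on every item, which is exactly the consistency that manual inspection struggles to match.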
Consumer Electronics and Mobile
In recent years, Consumer Electronics Manufacturers have been releasing products that are Smart Things. Smart Things are regular appliances, gadgets, and tools with added sensing, computing, and networking capabilities that enable them to adjust automatically to their surroundings and the users’ needs, consume energy efficiently, finish tasks faster, and do other cool stuff your regular appliances couldn’t do before. A lot of the new Smart Things that are coming out have Computer Vision capabilities, like Dyson’s 360 Eye robot vacuum. This autonomous vacuum uses its 360-degree vision to map out its surroundings so it can avoid obstacles and clean a closed area thoroughly and efficiently.
We also cannot forget about our Mobile Phones, whose front and back cameras become more and more powerful with every release. In fact, the camera capabilities of smartphones have doubled in just the past 5 years. Smartphone companies like Apple, Samsung, Huawei, Sony, and Google are constantly competing to see whose smartphone has the most advanced camera.
These companies are not only improving their sensor hardware, but also using Computer Vision technology on the software side to improve the quality of the images and videos produced. Their Camera applications can do face detection, video stabilization through gyroscope data integration, motion anticipation, autofocus, low light detection, and depth of field (or bokeh) simulation. Eventually, smartphones will catch up with DSLR cameras in terms of image quality, despite the DSLRs’ hardware advantage, by using Computer Vision techniques to integrate all the hardware the smartphone has.
Many app development companies are now looking for mobile developers with experience in Computer Vision and related skills like machine learning. Having developers with knowledge in these areas gives these companies a unique edge over their competition.
Automotive
You’ve probably heard of the Driverless Cars and Driverless Trucks by Tesla, Google, Uber, BMW, and other Car Manufacturers. Computer Vision is a main driver of this technology, and all Autonomous Vehicles are peppered with different kinds of cameras all over their bodies for this purpose. But there are important issues that must be addressed before we can commercialize the technology. One issue is that Computer Scientists currently cannot pinpoint exactly how the computer learns and identifies patterns in a Deep Learning system. That is not acceptable for Car Manufacturers because they have to know exactly what went wrong with their Smart Car when it is involved in an accident.
This does not mean that Computer Vision is not yet applied in the Automotive Industry. Current car models have rear cameras to help drivers when reversing. Advanced Driver Assistance Systems (ADAS) are the state-of-the-art application of Computer Vision, especially in Blind Spot Detection, Collision Avoidance, Lane Departure Warning, Driver Drowsiness Detection, and Night Vision.
Retail and Advertising
Retail has been catching up with the automation trend too. Amazon Go and Zippin are implementing cashier-less stores where all items in stock are tracked via Computer Vision, so that when a customer takes an item off the shelf and leaves the store, the item is automatically charged to the customer. In China, Alibaba is also trying to innovate in the Retail Industry with Computer Vision by enabling users to pay just by scanning their faces.
Computer Vision also helps advertisers get more context from users through Visual input instead of just Textual input, thus enabling them to target users more effectively. Companies like McDonald’s, BMW, and Toyota use their on-site cameras to identify customer features, such as gender, age, clothing, and accessories, to show them the appropriate advertisements or product suggestions. Meanwhile, Instagram and GumGum have been analyzing the images and videos posted on a page to adjust the digital ads they display.
Security and Surveillance
Computer Vision is a critical component in improving Security and Surveillance technology. Currently, we rely on humans watching on-site or over a real-time video feed to analyze incidents, detect suspicious or criminal activity, manage traffic flow and safety, and keep watch in general. Such a task is tiring for humans because it is time-intensive and repetitive, and our performance drops whenever we’re tired. So companies like IC Realtime and Boulder AI have been developing Surveillance Systems that incorporate advanced Computer Vision techniques to add face, object, motion, and event recognition. Their Internet-connected cameras generate data based on the objects or actions they are watching. The cameras send this data either to a cloud server or straight to the user application, which alerts users to suspicious behavior happening in the area.
Some companies like Lolli & Pops and RetailNext also utilize Surveillance Systems that capture Retail Data like the demographics of the customers entering the store, the number of customers inside the store at a given time, which items are taken or put back on the shelf the most, etc., and generate analytics out of this data.
Sports and Entertainment
Sports have huge potential to become heavy users of Computer Vision because there is a lot of motion involved in this field. Tracking these actions and generating data from them would be a great help to athletes, coaches, and fans alike. For example, the NBA has been using Camera Tracking systems and Computer Vision techniques since 2009, first through its initial partner SportVU and then through Second Spectrum by 2018, to track player and ball movement, identify the actions being performed, infer stats, metrics, and probabilities, and then visualize the resulting data for the benefit of players, coaching staff, general managers, fans, broadcasters, and bettors.
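Once a tracking system has turned video into player coordinates, inferring stats is mostly arithmetic. The Python sketch below computes distance covered and average speed from a handful of hypothetical positions. The coordinates and the one-sample-per-second rate are made up for illustration and do not come from any real tracking system.

```python
import math

# Hypothetical tracked positions of one player on the court, in metres.
positions = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0), (11.0, 16.0)]

# Assumed sampling rate (illustrative): one tracked position per second.
SAMPLES_PER_SECOND = 1.0


def distance_covered(points):
    """Add up the straight-line distance between consecutive positions."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


total_metres = distance_covered(positions)
elapsed_seconds = (len(positions) - 1) / SAMPLES_PER_SECOND
average_speed = total_metres / elapsed_seconds

print(f"Distance covered: {total_metres:.1f} m")
print(f"Average speed: {average_speed:.1f} m/s")
```

Real systems sample many times per second and track every player and the ball at once, but richer metrics are built up from simple measurements like these.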
The Entertainment Industry is another field that uses Computer Vision, for applications like generating CGI, classifying video archives, and even giving life to your favorite creatures like Gollum from The Lord of the Rings and the Na’vi from Avatar. Moviemakers have been long-time users of Computer Vision technology. They use Computer Vision and specialized facial cameras to capture actors’ facial expressions and recreate them on realistic computer-generated characters.
Agriculture
Agriculture has similar applications to Manufacturing and Robotics because companies like RSIP Vision and Prospera from Israel are incorporating Industrial Automation techniques, especially Computer Vision algorithms and Smart Cameras, on farms. Applications such as Automated Crop Identification, Monitoring, Sorting and Grading, Precision Field Robots, and Livestock Monitoring all utilize Computer Vision to help farms to be more efficient in land use, chemical application, manpower, and time, while increasing their yield.
Companies like Queensland Drones from Australia and Gamaya from Switzerland also utilize drones combined with Computer Vision techniques like Optical Tracking and Remote Sensing to help farmers monitor large tracts of land, field geography, soil composition, irrigation, weed growth, and the crops themselves. Drones are also used to precisely apply pesticide, fertilizer, and water to the crops that need it. This method is much faster than the farmer doing it by hand, and also much more efficient than using other farm machinery like tractors.
Healthcare
Healthcare also has the potential to become a huge beneficiary of Computer Vision, especially in Medical Imaging. Companies like MaxQ AI, Microsoft’s InnerEye, and Arterys have been using the technology to identify tumors and other anomalies from CT Scans, X-Rays, and MRIs. Another company, Gauss, uses an iPad app called Triton to measure a patient’s blood loss just by scanning the blood-soaked sponges used during surgery. These technologies make diagnosis much faster and more accurate than purely human analysis.
Real Estate
The application of Computer Vision in real estate is still quite new, but it is already being used to help people find property and estimate property value. It does this by analyzing pictures of homes to extract features such as how many bedrooms and bathrooms they have or what sort of countertop and kitchen they have. Aside from helping people who are looking for a suitable place to live, such data also helps financial institutions determine the credit rating of the home’s owner. Zillow already uses this technology to make its value estimates 15% more accurate than those made by humans.
Computer Vision as a field is still growing and there is still a myriad of opportunities for commercial applications. Some of the companies we’ve mentioned already benefit from adopting these technologies, while others have yet to produce results. But despite the field still being in its infancy, companies, big and small, have already started placing their bets on this fresh and exciting field.
For you to be able to use Computer Vision for your company, you would first need people who are highly proficient in this field. In our next blog posts, we will show you the basics of how to implement a simple Computer Vision application, and how Full Scale can help you out with this.