Machine Learning in Computer Vision

Machine learning is the study of algorithms and statistical models that computer systems use to perform a task without explicit instructions, relying on patterns and inference instead. It is a subset of artificial intelligence and uses data to make decisions. Machine learning is commonly classified into supervised, semi-supervised, and unsupervised learning. Because it learns from patterns, it is well suited to computer vision and pattern recognition, and it also has applications in advertising and software engineering. Put simply, machine learning lets computers learn a task with minimal assistance from software programmers, which allows it to be used in a wide variety of industries.

Supervised Learning

Supervised learning is a machine learning task that learns a function mapping an input to an output from labeled examples, in which each input object is paired with a desired output value. A wide range of algorithms exists for different supervised learning problems. One application is object recognition, which is used in computer vision and artificial intelligence. In short, supervised learning trains the computer to associate an input with a desired output.
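To make the idea of pairing inputs with desired outputs concrete, here is a minimal sketch of one of the simplest supervised methods, a nearest-neighbor classifier. The function name, the toy measurements, and the labels are all made up for illustration.

```python
import math

def nearest_neighbor_predict(train_inputs, train_labels, query):
    """Return the label of the training input closest to `query`."""
    best_index = min(
        range(len(train_inputs)),
        key=lambda i: math.dist(train_inputs[i], query),
    )
    return train_labels[best_index]

# Hypothetical labeled examples: (width, height) of an object in centimeters,
# each paired with the desired output for that input.
train_inputs = [(1.0, 1.2), (1.1, 0.9), (4.0, 4.2), (3.8, 4.5)]
train_labels = ["coin", "coin", "plate", "plate"]

# A new, unlabeled input is assigned the label of its closest training example.
prediction = nearest_neighbor_predict(train_inputs, train_labels, (3.9, 4.0))
print(prediction)
```

Real systems use far larger datasets and more capable models, but the pattern of labeled inputs in and predicted labels out stays the same.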

What is Computer Vision

Computer vision allows computers to gain an understanding of digital images and videos. It seeks to automate tasks that human vision can achieve. This involves methods of acquiring, processing, analyzing, and understanding digital images, and of extracting data from the real world to produce information. Its sub-domains include object recognition, video tracking, and motion estimation, which give it applications in medicine, navigation, and object modeling. Put simply, computer vision uses a device with a camera to take pictures or videos and then analyzes them.

The Relationship between Machine Learning and Computer Vision

Machine learning and computer vision are two fields that have become closely related. Machine learning has improved computer vision with regard to recognition and tracking. Computer vision, in turn, has broadened the scope of machine learning.

Machine learning offers effective methods for image acquisition, image processing, and object focus, all of which are used in computer vision. A computer vision system involves a digital image or video, a sensing device, an interpreting device, and the interpretation. Machine learning comes into the picture at the interpreting device and interpretation stages.

Of the two, machine learning is the broader field, as is evident from the fact that its algorithms can be applied to many other fields. Computer vision, on the other hand, primarily deals with digital images and videos. Computer vision is also related to the fields of artificial intelligence, information engineering, physics, neurobiology, and signal processing. The fields most closely related to it are image processing and image analysis. The analysis of a digital recording is done using machine learning principles.

Tasks involving Computer Vision

Some typical tasks of computer vision are recognition and motion analysis. Many of these involve the use of machine learning. Computer vision tasks use a variety of methods for acquiring, processing, and analyzing digital images to produce data.

Recognition in Computer Vision

Recognition in computer vision involves object recognition, identification, and detection. Some specialized tasks of recognition are optical character recognition, image retrieval, and facial recognition.

Object recognition – Object recognition involves finding and identifying objects in a digital image or video. A common application is face detection and recognition. Object recognition can be approached through either machine learning or deep learning.

Machine learning approach – Object recognition using machine learning requires features to be defined before they are classified. A common feature-based method is the scale-invariant feature transform (SIFT). SIFT extracts key points of objects from reference images and stores them in a database. To recognize an object in a new image, the key points of that image are compared with those in the database to find matches.
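The database lookup can be sketched as follows. This covers only the matching step, not key point extraction, and it uses made-up four-number descriptors (real SIFT descriptors have 128 values). The function name, labels, and the 0.75 ratio are assumptions for illustration; the ratio test itself, which rejects a match when the closest and second-closest entries are nearly equally distant, is a standard companion to SIFT matching.

```python
import math

def match_descriptors(query_descriptors, database, ratio=0.75):
    """Match each query descriptor against (label, descriptor) database entries.

    A match is accepted only when the nearest entry is clearly closer than
    the second nearest; otherwise the key point is treated as ambiguous.
    """
    matches = []
    for query in query_descriptors:
        ranked = sorted(database, key=lambda entry: math.dist(entry[1], query))
        best, second = ranked[0], ranked[1]
        if math.dist(best[1], query) < ratio * math.dist(second[1], query):
            matches.append(best[0])
        else:
            matches.append(None)
    return matches

# Hypothetical database of stored key point descriptors.
database = [
    ("mug", (0.9, 0.1, 0.0, 0.2)),
    ("book", (0.1, 0.8, 0.3, 0.0)),
    ("phone", (0.0, 0.2, 0.9, 0.4)),
]

# The first query is close to "mug"; the second sits halfway between
# "mug" and "book", so the ratio test rejects it.
queries = [(0.85, 0.15, 0.05, 0.2), (0.5, 0.45, 0.15, 0.1)]
matches = match_descriptors(queries, database)
print(matches)
```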

Deep learning approach – Object recognition using deep learning does not need features to be defined by hand. The common deep learning approaches are based on convolutional neural networks. A convolutional neural network is a type of deep neural network, which is an artificial neural network with multiple layers between the input and the output. An artificial neural network is a computing system inspired by the biological neural networks in the brain. ImageNet, a visual database designed for object recognition research, is a well-known benchmark for these networks. The performance of convolutional neural networks on it is said to be close to that of humans.
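The core operation inside a convolutional layer can be sketched in a few lines. This is only an illustration under simplifying assumptions: a single tiny grayscale image, one hand-set 2x2 kernel, and a ReLU activation. In a real network the kernel values are learned from data and many such layers are stacked.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kernel_h)
                for j in range(kernel_w)
            )
            row.append(total)
        output.append(row)
    return output

def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0, value) for value in row] for row in feature_map]

# Toy image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-set kernel that responds where brightness increases from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(conv2d(image, kernel))
print(feature_map)
```

The feature map is nonzero only in the middle column, where the dark-to-bright edge sits. Responses like this are the kind of feature that a network learns on its own instead of having it defined in advance.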

Motion Analysis

Motion analysis in computer vision processes a digital video to produce information. Simple processing can detect the motion of an object. More complex processing tracks an object over time and can determine the direction of its motion. It has applications in motion capture, sports, and gait analysis.
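The simple case, detecting that something moved, is often illustrated with frame differencing: comparing two consecutive frames pixel by pixel. The sketch below assumes tiny grayscale frames stored as nested lists, and the threshold of 30 is arbitrary; it is one basic technique, not the only way motion is detected.

```python
def changed_pixels(prev_frame, next_frame, threshold=30):
    """Return (row, col) positions whose brightness changed by more than `threshold`."""
    return [
        (r, c)
        for r, row in enumerate(next_frame)
        for c, value in enumerate(row)
        if abs(value - prev_frame[r][c]) > threshold
    ]

# Two hypothetical 3x3 grayscale frames: a bright object moves one pixel right.
prev_frame = [
    [10, 10, 10],
    [200, 10, 10],
    [10, 10, 10],
]
next_frame = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]

changed = changed_pixels(prev_frame, next_frame)
motion_detected = bool(changed)
print(changed, motion_detected)
```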

Motion capture – Motion capture involves recording the movement of objects. Markers are worn near joints to identify motion. It has applications in animation, sports, computer vision, and gait analysis. Typically, only the movements of the actors are recorded and the visual appearance is not included.

Gait analysis – Gait analysis is the study of locomotion and the activity of muscles using instruments. It involves quantifying and interpreting the gait pattern. Several cameras linked to a computer are required. The subject wears markers at various reference points of the body. As the subject moves, the computer calculates the trajectory of each marker in three dimensions. It can be applied in sports biomechanics.
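Once the three-dimensional marker positions are known, quantities such as joint angles can be computed from them. The sketch below shows one such calculation, the angle at the knee formed by hip, knee, and ankle markers. The marker coordinates are invented, and the choice of joint angle as the quantity of interest is an assumption for illustration.

```python
import math

def joint_angle(a, b, c):
    """Return the angle in degrees at point `b` formed by the 3D points a-b-c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm_ba = math.sqrt(sum(x * x for x in ba))
    norm_bc = math.sqrt(sum(x * x for x in bc))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_ba * norm_bc)))
    return math.degrees(math.acos(cosine))

# Hypothetical marker positions in meters (x, y, z).
hip = (0.0, 0.0, 1.0)
knee = (0.0, 0.0, 0.5)
ankle_straight = (0.0, 0.0, 0.0)
ankle_bent = (0.5, 0.0, 0.5)

straight_angle = joint_angle(hip, knee, ankle_straight)
bent_angle = joint_angle(hip, knee, ankle_bent)
print(straight_angle, bent_angle)
```

Repeating this for every frame of a recording gives the knee angle over time, which is one way a gait pattern can be quantified.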

Applications of Computer Vision using Machine Learning

Video Tracking

Video tracking is a process of locating a moving object over time. Object recognition is used to aid in video tracking. As sports involve a lot of movement, video tracking and object recognition are ideal for tracking the movement of players.
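As a rough sketch of locating an object over time, the code below finds the centroid of the bright pixels in each frame and reads the direction of travel from the change in position. It assumes a single bright object on a dark background and toy frames stored as nested lists; production trackers rely on object recognition and are far more robust.

```python
def centroid(frame, threshold=128):
    """Return the (row, col) centroid of pixels at or above `threshold`, or None."""
    points = [
        (r, c)
        for r, row in enumerate(frame)
        for c, value in enumerate(row)
        if value >= threshold
    ]
    if not points:
        return None
    return (
        sum(r for r, _ in points) / len(points),
        sum(c for _, c in points) / len(points),
    )

def track(frames, threshold=128):
    """Return the object's centroid in every frame, in order."""
    return [centroid(frame, threshold) for frame in frames]

def net_displacement(trajectory):
    """Return (row change, col change) between the first and last sightings."""
    seen = [point for point in trajectory if point is not None]
    if len(seen) < 2:
        return None
    (r0, c0), (r1, c1) = seen[0], seen[-1]
    return (r1 - r0, c1 - c0)

# Three hypothetical frames: the object moves one column to the right each time.
frames = [
    [[0, 0, 0, 0], [255, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 255, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 255, 0], [0, 0, 0, 0]],
]

trajectory = track(frames)
displacement = net_displacement(trajectory)
print(trajectory, displacement)
```

A positive column change means the object moved to the right across the frames.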

Autonomous Vehicles

Computer vision is used in autonomous vehicles such as self-driving cars. Cameras mounted on top of the car provide a 360-degree field of view with a range of up to 250 meters. The cameras aid in lane finding, road curvature estimation, obstacle detection, traffic sign detection, and more. To do this, the computer vision system has to perform object detection and classification.


Sports

Computer vision is used in sports to improve the broadcast experience, athlete training, analysis and interpretation, and decision making. Sports biomechanics is the quantitative study and analysis of athletes and sports. For broadcast improvement, virtual markers can be drawn across the field or court. For athlete training, creating a skeleton model of an acrobat and estimating the center of mass allows for improvement in form and posture. Finally, for sports analysis and interpretation, players are tracked in live games, which provides real-time information.
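The center of mass estimate mentioned above is, at its simplest, a mass-weighted average of the positions of the body segments in the skeleton model. The sketch below assumes a two-dimensional model with three segments. The mass fractions and positions are invented for illustration and are not real anthropometric data.

```python
def center_of_mass(segments):
    """Return the mass-weighted average position of (mass, (x, y)) segments."""
    total_mass = sum(mass for mass, _ in segments)
    x = sum(mass * position[0] for mass, position in segments) / total_mass
    y = sum(mass * position[1] for mass, position in segments) / total_mass
    return (x, y)

# Hypothetical skeleton: (fraction of body mass, segment position in meters).
segments = [
    (0.5, (0.0, 1.0)),  # torso
    (0.3, (0.0, 0.5)),  # legs
    (0.2, (1.0, 0.0)),  # outstretched arm
]

com_x, com_y = center_of_mass(segments)
print(com_x, com_y)
```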

In order to achieve basketball analytics, computer vision is used to acquire the data. Video tracking and object recognition are used to track the movement of the players. Motion analysis methods are also used to assist in motion tracking. Deep learning using convolutional neural networks is used to analyze the data.

Second Spectrum, the official tracking partner of the NBA, uses big data, machine learning, and computer vision to provide analytics and to build machines that understand sports. Using optical tracking data, they have found that three-pointers and close shots are more effective than mid-range shots and that potential rebounds are clustered close to the basket.
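One way to see why shot location matters is to compare expected points per attempt, which is the point value of a shot multiplied by the probability of making it. The shooting percentages below are hypothetical round numbers chosen for illustration. They are not Second Spectrum or NBA data.

```python
def expected_points(points_per_make, make_probability):
    """Return the average points scored per attempt."""
    return points_per_make * make_probability

# Hypothetical shot types: (points per make, probability of making the shot).
shot_types = {
    "close": (2, 0.60),
    "mid-range": (2, 0.40),
    "three-pointer": (3, 0.36),
}

value_per_shot = {
    name: expected_points(points, probability)
    for name, (points, probability) in shot_types.items()
}
ranking = sorted(value_per_shot, key=value_per_shot.get, reverse=True)
print(value_per_shot, ranking)
```

Under these assumed percentages, a three-pointer is worth more per attempt than a mid-range shot even though it is made less often, because each make counts for an extra point.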


These are just some of the many applications involving computer vision and machine learning. There is vast room for improvement in the algorithms used, and object recognition methods in particular can still be improved. In basketball, much more analysis can be done to aid each team's decision making, and the same can be said of other sports. Computer vision and machine learning make these decisions possible.

At Full Scale, we believe in technology and innovation and in how they help us grow into the future. Our pool of experts in machine learning and computer vision is available to help you build the systems and technology you need to SCALE UP your business.

Contact us now and let’s realize your vision!
