AI that can learn on its own is a technological marvel in its own right. But the dizzying pace of invention and development has left little room for awe at simple self-learning. Now the buzz revolves around machines that can exercise proper discretion, machines that can decide between right and wrong. Understandably so: the potential applications are plentiful and mind-boggling, but this blog centres on just one of them: image recognition. First, though, let us understand the basis of the entire discerning-AI industry.
Typical AI/ML systems work by extensively studying training data sets. The data is 'made sense of' and can then be manipulated and processed to produce the result we want. However, that is about it; such a system can hardly go beyond what it was fed as training data. Discerning machines, on the other hand, can distinguish between objects, entities, and certain truths or falsities. This can be achieved using a learning mechanism called deep learning.
Deep learning algorithms work hand in glove with neural networks, the machine equivalent of the human brain. They learn under the supervision of example training sets, which consist of domain classes and labels (truth values). Once trained, the neural networks don't just stop there. A feedback loop adjusts the network's weights and biases, enabling the algorithm to judge right or wrong, true or false, correct or incorrect. The end result? A discerning, self-learning technology that can determine the correctness of problem solutions. Moreover, such an expert system can divide a problem into a comprehensive solution set and then branch out in parallel over a number of possible solution paths. This avoids the dead-end scenario where we find out too late that a solution path was incorrect. The advantages of such parallelism are immense.
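To make the feedback loop concrete, here is a toy sketch (not production deep learning) of supervised learning with weight adjustment: a single-neuron perceptron trained on a tiny labelled data set for the logical AND function. The data set and learning rate are illustrative assumptions.

```python
import numpy as np

# Toy labelled training set: inputs and their truth labels (logical AND).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

weights = np.zeros(2)
bias = 0.0
lr = 0.1  # learning rate

# Feedback loop: predict, compare against the label, nudge the weights.
for _ in range(20):
    for xi, target in zip(X, y):
        pred = 1 if xi @ weights + bias > 0 else 0
        error = target - pred          # the "right or wrong" signal
        weights += lr * error * xi     # weight update
        bias += lr * error             # bias update

print([1 if xi @ weights + bias > 0 else 0 for xi in X])  # → [0, 0, 0, 1]
```

Real deep networks replace this single neuron with many layers and use gradient-based backpropagation, but the principle is the same: the error signal flows back and reshapes the weights.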
Factoring in Image Recognition
Image Recognition systems are intelligent systems that harness deep learning's ability to 'divide and conquer'. They can perform a complete analysis of an image and break it down into separate components. Logos, faces, objects, boundaries and so on are separated from the image and can then be processed individually. This field of AI is widely known as computer vision, on account of the machine's ability to understand what it is seeing.
The bedrock of IR is classification, a fundamental task in supervised learning. In a nutshell, classification means assigning labels to things: matching patterns and sorting items into category buckets.
Mimicry of the Human Eye
Colours and shapes enter the eyes as signals and are processed by the brain's visual cortex. We have already discussed that neural networks are imitations of the human brain, so it is only natural to model IR systems on the human eye. Much as the cortex processes those signals and understands what the eye is seeing, an IR system perceives the image it is exposed to as a vector or raster image. Vector images store shapes ('colour polygons') as numeric values in multi-dimensional arrays; raster images are grids of pixels, each with a colour value assigned to it.
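A raster image, for instance, is just a three-dimensional array of numbers: height by width by colour channels. A minimal sketch with a synthetic two-by-two image (the pixel values are arbitrary examples):

```python
import numpy as np

# A tiny 2x2 "raster image": rows x columns x RGB channels.
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 0] = [255, 0, 0]   # top-left pixel is pure red
image[1, 1] = [0, 0, 255]   # bottom-right pixel is pure blue

print(image.shape)   # (2, 2, 3): height, width, channels
print(image[0, 0])   # the RGB triple of the top-left pixel
```

Everything an IR system does downstream, from noise removal to classification, operates on arrays like this one.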
The first task of image recognition, after it has been fed an image, is to get rid of all unnecessary data points in that image. IR treats this garbage data as noise, whose presence might otherwise lead to incorrect analysis later. Next in line are shape detection and edge detection, whereby the recognition system begins breaking down what it has understood from the image. The output of this initial pre-processing is a feature vector containing various descriptors of the image.
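As a simplified illustration of this pre-processing stage, the sketch below runs a crude gradient-based edge detector over a synthetic grayscale image and packs two descriptors into a feature vector. The image and the choice of descriptors are illustrative assumptions; real systems use richer filters and far longer vectors.

```python
import numpy as np

# Synthetic 4x4 grayscale image: dark left half, bright right half.
img = np.array([
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
], dtype=float)

# Horizontal gradient: large values mark vertical edges.
grad = np.abs(np.diff(img, axis=1))

# Toy feature vector: mean intensity plus the strongest edge response.
feature_vector = np.array([img.mean(), grad.max()])
print(feature_vector)  # mean intensity 4.5, strongest edge 9.0
```

The strong response down the middle column is exactly the boundary a shape detector would latch onto.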
Moving forward, the image recognition system is tasked with prediction, i.e. estimating the class label or category of each individual object (background or foreground, part of an object or not, and so on). The neural network that forms the backbone of the IR system must be trained extensively before it can reliably discern the various objects in an image. Common techniques for image classification include bag-of-words, support vector machines (SVM), face landmark estimation (for face recognition), k-nearest neighbours (KNN) and logistic regression.
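Of the classifiers listed above, k-nearest neighbours is the simplest to sketch. Below is a minimal pure-NumPy version: a query point is labelled by majority vote among its k closest training points. The two feature clusters are hypothetical stand-ins for feature vectors extracted from images.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)      # Euclidean distances
    nearest = y_train[np.argsort(dists)[:k]]          # labels of k closest
    values, counts = np.unique(nearest, return_counts=True)
    return values[np.argmax(counts)]                  # majority label

# Hypothetical feature clusters: class 0 near the origin, class 1 near (5, 5).
X_train = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([0.5, 0.5])))  # → 0
print(knn_predict(X_train, y_train, np.array([5.5, 5.5])))  # → 1
```

In a real IR pipeline the training rows would be feature vectors from labelled images, and libraries such as scikit-learn provide optimised implementations of KNN, SVM and logistic regression.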
Some applications of Image Recognition
- Google Photos: Organizing a photo gallery on the basis of places, people and even occasion.
- Reverse Search: Taking a photograph and being directed to Internet resources that provide information about it.
- Social Media: Facebook boasts 98% accuracy in its image recognition algorithms. It uses IR to identify people, places and image content. Among its novel uses of IR are describing image content for blind users and filtering out potentially offensive content.
- Biometrics: A widely known use of Image Recognition is face detection, which powers sign-in systems, phone security, criminology and heaps of other applications, limited only by imagination.
Claim Genius’s Participation in the world of Image Recognition
We are members of the insurance industry, and it is our philosophy to apply our expert systems to automating convoluted tasks. Using Image Recognition, Claim Genius' GeniusCLAIM tool instantly and accurately determines damage severity, affected parts, repair/replace decisions and total-loss determination, giving carriers a comprehensive, point-of-accident view of vehicle status. To facilitate an elegant user experience, we have introduced a mobile app for uploading accident-site photographs. From the photographs, the system performs AI parts grading and returns a complete damage analysis.
If you are a carrier looking to strengthen your relationship with your insured clients, our tools and services can help you revolutionise the whole claims experience. Better technology, better business!