Non Modelling Methods for Image Representation Learning

Sarvesh Khetan

3 min read · Nov 14, 2024

Method 1 : Image Flattened Representations
a. For GrayScale Images
b. For RGB Images
Method 2 : Kernel / Filter Based Representations
a. Motivation
b. For GrayScale Images
c. For RGB Images
d. Drawbacks

Method 1 : Image Flattened Representations

For GrayScale Images
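
As a minimal sketch of the idea, assuming the image is stored as a 2-D NumPy array of shape (H, W), flattening simply lays the rows end to end to give a vector of length H * W. The function name `flatten_grayscale` and the pixel values are made up for illustration.

```python
import numpy as np

def flatten_grayscale(image: np.ndarray) -> np.ndarray:
    """Flatten an (H, W) grayscale image into a 1-D vector of length H * W."""
    if image.ndim != 2:
        raise ValueError("expected a 2-D (H, W) grayscale image")
    # Row-major order: the rows are laid end to end.
    return image.reshape(-1)

# Toy 3x3 image (values are hypothetical).
gray = np.array([[0, 50, 100],
                 [150, 200, 250],
                 [25, 75, 125]], dtype=np.uint8)

vector = flatten_grayscale(gray)
print(vector.shape)
```

Every pixel becomes one entry of the representation, so a 3x3 image gives a 9-dimensional vector.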

For RGB Images

Above I have averaged the three channels, but you can also take the sum of the three channels instead.
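
Both options can be sketched as follows, assuming a channel-last array of shape (H, W, 3) and assuming the channels are combined first and the result is flattened afterwards. The helper name `flatten_rgb` and its `reduce` parameter are hypothetical, not part of any library.

```python
import numpy as np

def flatten_rgb(image: np.ndarray, reduce: str = "mean") -> np.ndarray:
    """Collapse an (H, W, 3) RGB image to one channel, then flatten to length H * W."""
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an (H, W, 3) RGB image")
    # Work in float so that summing uint8 values cannot overflow.
    pixels = image.astype(np.float64)
    if reduce == "mean":
        single = pixels.mean(axis=2)
    elif reduce == "sum":
        single = pixels.sum(axis=2)
    else:
        raise ValueError("reduce must be 'mean' or 'sum'")
    return single.reshape(-1)

# Toy 2x2 RGB image (values are hypothetical).
rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [30, 60, 90]]], dtype=np.uint8)

mean_vec = flatten_rgb(rgb, "mean")
sum_vec = flatten_rgb(rgb, "sum")
```

The two variants differ only by a constant factor of 3, so the choice mostly affects the scale of the values.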

Method 2: Kernel / Filter Based Representations

Motivation

The drawback of the flattening method is that it discards the spatial information in the image, and spatial information is crucial. To solve this problem, researchers took inspiration from NLP, where the embedding of a word is built from both the word itself and its surrounding context. Similarly, for images, we build the embedding of a pixel from the pixel itself and its surrounding / neighbouring pixels, which brings spatial information into the representation.

For GrayScale Images

Hence the Image Flattened Representation is nothing but a special case of this method: a single kernel of size 1×1 (with weight 1).
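
Here is a rough sketch of a kernel based representation for a grayscale image. It assumes zero padding at the borders so that every pixel gets an embedding, and it uses a 3x3 averaging kernel whose weights are picked purely for illustration. The function name `kernel_representation` is hypothetical.

```python
import numpy as np

def kernel_representation(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over an (H, W) image with zero padding and flatten the result."""
    kh, kw = kernel.shape
    if kh % 2 == 0 or kw % 2 == 0:
        raise ValueError("kernel sides must be odd so it can be centred on a pixel")
    ph, pw = kh // 2, kw // 2
    # Assumption: pad with zeros so border pixels also have a full neighbourhood.
    padded = np.pad(image.astype(np.float64), ((ph, ph), (pw, pw)))
    out = np.zeros(image.shape, dtype=np.float64)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # Neighbourhood centred on pixel (i, j), weighted by the kernel.
            window = padded[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * kernel)
    return out.reshape(-1)

gray = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]], dtype=np.float64)

box = np.full((3, 3), 1.0 / 9.0)   # hand-picked averaging kernel
identity = np.ones((1, 1))         # single 1x1 kernel with weight 1

smoothed = kernel_representation(gray, box)
same_as_flat = kernel_representation(gray, identity)
```

With the 1x1 kernel, `same_as_flat` is identical to `gray.reshape(-1)`, which is the flattened representation from Method 1.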

For RGB Images
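
One plausible way to extend the idea to RGB, sketched here under my own assumptions rather than as the definitive recipe, is to give the kernel one slice per channel, i.e. shape (kh, kw, 3), and sum the weighted values over both the neighbourhood and the channels. The layout (channel-last), the zero padding and the name `kernel_representation_rgb` are all assumptions.

```python
import numpy as np

def kernel_representation_rgb(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide a (kh, kw, C) kernel over an (H, W, C) image and flatten the (H, W) result."""
    kh, kw, kc = kernel.shape
    if image.ndim != 3 or image.shape[2] != kc:
        raise ValueError("image and kernel must have the same number of channels")
    if kh % 2 == 0 or kw % 2 == 0:
        raise ValueError("kernel sides must be odd so it can be centred on a pixel")
    ph, pw = kh // 2, kw // 2
    # Assumption: zero-pad height and width only, never the channel axis.
    padded = np.pad(image.astype(np.float64), ((ph, ph), (pw, pw), (0, 0)))
    h, w = image.shape[:2]
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(h):
        for j in range(w):
            # Weighted sum over the neighbourhood and over all channels.
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw, :] * kernel)
    return out.reshape(-1)

# Toy 3x3 RGB image (values are hypothetical).
rgb = np.arange(27, dtype=np.float64).reshape(3, 3, 3)

# A 1x1 kernel with weight 1/3 per channel reproduces channel averaging.
avg_1x1 = np.full((1, 1, 3), 1.0 / 3.0)
vec = kernel_representation_rgb(rgb, avg_1x1)
```

Under these assumptions, the 1x1 kernel with equal channel weights gives the same vector as averaging the channels and flattening, mirroring the grayscale case.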

Drawbacks

This is a manual approach, so it needs domain expertise to specify which kernel should be used for a specific use case. For example, if you want to detect whether a person is sleeping in class, a vital feature is the distance between the eyelids, so you would need to design a kernel that can measure it.
