Introduction to AutoEncoders (UnderComplete / OverComplete / Sparse)

GenAI Modelling

3 min read · Jun 14, 2024

Problem Description

Our aim is to create fake data: given a dataset of, say, 1000 datapoints with N features each, we need to create a 1001st datapoint that closely resembles those 1000 datapoints rather than being something random.

Solution

We know we need to use a feed-forward neural network. The dataset has N features, so the input layer will have N inputs. The fake datapoint we generate must also have N features, so the output layer will have N outputs. This is what the architecture would look like…
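To make the shape of this network concrete, here is a minimal sketch in plain Python. The helper names (`init_layer`, `dense`, `build_network`, `forward`), the hidden layer sizes, and the tanh activation are all assumptions made for illustration and are not taken from the original architecture. The weights are random and untrained, so the output is not yet a meaningful fake datapoint; the sketch only shows that N features go in and N features come out.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create one dense layer: an n_out x n_in weight matrix and n_out biases."""
    scale = 1.0 / math.sqrt(n_in)  # small random weights (assumed initialisation)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases, activation=None):
    """Compute activation(W x + b) for a single input vector x."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    if activation is not None:
        out = [activation(v) for v in out]
    return out


def build_network(n_features, hidden_sizes, seed=0):
    """Stack layers so that input and output both have n_features units."""
    rng = random.Random(seed)
    sizes = [n_features, *hidden_sizes, n_features]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(network, x):
    """Pass x through every layer; tanh on hidden layers, linear output layer."""
    for i, (weights, biases) in enumerate(network):
        is_last = i == len(network) - 1
        x = dense(x, weights, biases, activation=None if is_last else math.tanh)
    return x


N = 5                                            # number of features (hypothetical)
network = build_network(N, hidden_sizes=[8, 8])  # two hidden layers (assumed sizes)
sample = [0.1, -0.4, 0.7, 0.0, 0.3]              # one made-up datapoint with N features
output = forward(network, sample)                # untrained output, also N features
```

Whatever hidden sizes are chosen, `len(output)` equals `N`, which is the only constraint the problem statement places on the network so far.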

The above architecture does solve the task at hand, i.e. it generates fake data of N dimensions, but only partially, because it has some issues. Let’s understand these issues through a simpler architecture with only 1 hidden layer.