Revise Key Terms and Concepts of Deep Learning in 5 minutes


Tensor: It is a mathematical object and can be a number, vector, matrix, or an n-dimensional array.
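A minimal sketch of the different cases (I'm assuming PyTorch throughout these snippets, since the post's DataLoader and transform terminology matches it):

```python
import torch

scalar = torch.tensor(3.0)               # 0-d tensor: a single number
vector = torch.tensor([1.0, 2.0, 3.0])   # 1-d tensor: a vector
matrix = torch.ones(2, 3)                # 2-d tensor: a matrix
batch  = torch.zeros(4, 3, 32, 32)       # 4-d tensor: an n-dimensional array
print(scalar.ndim, vector.ndim, matrix.ndim, batch.ndim)   # 0 1 2 4
```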

Padding: Increases the input image's size by adding the given number of pixels around its border before it is processed by the kernel of a CNN.

Stride: The value that determines how many pixels the kernel jumps over while moving across the input.

Max-Pooling: Decreases the height and width of the output tensor from each convolution layer by replacing each block with its maximum value.
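Padding, stride, and max-pooling can all be seen in one small sketch; the printed shapes show how each one changes the output size (the channel counts and the 32x32 image size are just placeholders):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)   # one 3-channel 32x32 image

# padding=1 adds a 1-pixel border; stride=2 makes the kernel jump 2 pixels at a time
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=2, padding=1)
pool = nn.MaxPool2d(kernel_size=2)   # keeps only the max of every 2x2 block

y = conv(x)
print(y.shape)         # torch.Size([1, 16, 16, 16])
print(pool(y).shape)   # torch.Size([1, 16, 8, 8])
```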

Use of DataLoader: Splits the dataset into batches of a given size and also provides utilities like shuffling and random sampling while forming a batch.
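A quick sketch of a DataLoader over a toy tensor dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# toy dataset: 100 samples with 10 features each, plus binary labels
dataset = TensorDataset(torch.randn(100, 10), torch.randint(0, 2, (100,)))

# shuffle=True reshuffles the samples every epoch while forming the batches
loader = DataLoader(dataset, batch_size=16, shuffle=True)

for xb, yb in loader:
    print(xb.shape, yb.shape)   # torch.Size([16, 10]) torch.Size([16])
    break
```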

Use of Validation set: Helps in evaluating the model during training, i.e., adjusting hyperparameters and picking the best version of the model. It also lets us spot overfitting when it occurs.
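A rough sketch of how a validation set is used to keep the best version of the model (model, train_one_epoch, train_loader, val_loader, and loss_fn are hypothetical placeholders, not code from this post):

```python
import copy
import torch

best_val_loss, best_state = float("inf"), None

for epoch in range(10):
    train_one_epoch(model, train_loader)    # hypothetical training step

    model.eval()                            # evaluation mode: no dropout, fixed batch-norm stats
    with torch.no_grad():
        val_loss = sum(loss_fn(model(xb), yb).item() for xb, yb in val_loader) / len(val_loader)
    model.train()

    if val_loss < best_val_loss:            # keep the best version of the model seen so far
        best_val_loss = val_loss
        best_state = copy.deepcopy(model.state_dict())
```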

Can accuracy be a loss function for a classification problem?

No. Accuracy is not a differentiable function, so we can't compute gradients from it, and it also can't provide feedback for improvement because it doesn't take the predicted probabilities into account.
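A small demonstration of the second point: two sets of logits that give the same accuracy but very different cross-entropy losses, because the loss looks at the probabilities while accuracy does not:

```python
import torch
import torch.nn.functional as F

target = torch.tensor([0])

# both sets of logits predict class 0, so both give 100% accuracy ...
confident = torch.tensor([[4.0, -2.0]])
barely    = torch.tensor([[0.1,  0.0]])

# ... but cross-entropy still tells them apart via the probabilities
print(F.cross_entropy(confident, target).item())  # small loss (~0.002)
print(F.cross_entropy(barely, target).item())     # much larger loss (~0.64)
```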

Is ReLU a linear function?

No. It's not a linear function: it returns the input if the input > 0 and returns 0 for all inputs < 0.
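A quick check of the non-linearity: a linear function f must satisfy f(a + b) = f(a) + f(b), and ReLU does not:

```python
import torch

relu = torch.nn.ReLU()
a, b = torch.tensor(-1.0), torch.tensor(2.0)

print(relu(a + b))         # tensor(1.)
print(relu(a) + relu(b))   # tensor(2.)  -> not equal, so ReLU is not linear
```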

Some problems with feed-forward Neural Networks?

The assumption made there is that every pixel is independent, which is not correct: spatial and local relationships are not considered.

How do you avoid overfitting?

Giving more data to the model, batch normalization, and other regularization techniques. Early stopping can also be used when we observe the validation set loss increasing.
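A rough early-stopping sketch (train_one_epoch, evaluate, model, and the loaders are hypothetical placeholders):

```python
patience, bad_epochs, best_val_loss = 3, 0, float("inf")

for epoch in range(100):
    train_one_epoch(model, train_loader)      # hypothetical training step
    val_loss = evaluate(model, val_loader)    # hypothetical: average loss on the validation set

    if val_loss < best_val_loss:
        best_val_loss, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:            # validation loss kept rising: stop training
            print(f"early stopping at epoch {epoch}")
            break
```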

What is data augmentation?

Randomly chosen transformations applied while loading the training data, such as a random horizontal flip, padding with reflect mode, and a random crop.
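A sketch of those transformations using torchvision (the 32x32 crop size is just an assumed CIFAR-style image size):

```python
from torchvision import transforms

# the augmentations mentioned above, re-drawn randomly every time an image is loaded
train_tfms = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4, padding_mode="reflect"),
    transforms.ToTensor(),
])
```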

What is Batch normalization?

It is a technique that makes neural networks train faster and more stably by adding extra layers to a deep neural network. The new layer standardizes and normalizes the input it receives from the previous layer. It's done per batch, not per single input.
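A small sketch with a batch-norm layer; with the default settings, each channel of the batch comes out with roughly zero mean and unit standard deviation:

```python
import torch
import torch.nn as nn

x = torch.randn(8, 16, 32, 32)   # a batch of 8 feature maps with 16 channels

bn = nn.BatchNorm2d(16)          # normalizes each channel over the whole batch
y = bn(x)

print(y.mean(dim=(0, 2, 3)))     # roughly 0 for every channel
print(y.std(dim=(0, 2, 3)))      # roughly 1 for every channel
```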

What is Weight decay?

It's a regularization technique that prevents the weights from becoming too large by adding an additional term to the loss function, so all weights stay within some range.
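In PyTorch this extra term is usually an L2 penalty exposed as a weight_decay parameter on the optimizer; a minimal sketch with a hypothetical linear model:

```python
import torch

model = torch.nn.Linear(10, 2)   # hypothetical model

# weight_decay adds a penalty proportional to the squared weights,
# shrinking them a little on every update
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```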

What is LeakyReLU?

It's an activation layer similar to ReLU, but the difference is that it gives a small negative number when the input is less than 0. In ReLU the output is always 0 if the input is less than 0, but here y = ax, where a lies between 0 and 1.
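A small comparison of LeakyReLU and ReLU on the same inputs (the slope a = 0.01 here is PyTorch's default negative_slope):

```python
import torch
import torch.nn as nn

leaky = nn.LeakyReLU(negative_slope=0.01)   # y = 0.01 * x for x < 0, y = x otherwise
x = torch.tensor([-2.0, 0.0, 3.0])

print(leaky(x))        # tensor([-0.0200,  0.0000,  3.0000])
print(nn.ReLU()(x))    # tensor([0., 0., 3.])
```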
