Deep Neural Networks
The era between the AlexNet moment (2012) and "Attention Is All You Need" (2017), and the foundational machinery that modern models still build on.
Reading path
- Building Blocks — perceptron → MLP → backprop → losses → initialization (see the minimal sketch after this list).
- Optimization — SGD, Adam, learning-rate schedules.
- Regularization & Generalization — dropout, batch norm, augmentation, double descent.
- Convolutional Networks — convolution, LeNet/AlexNet, VGG/Inception/ResNet.
- Recurrent & Sequence Models — RNN, LSTM/GRU, seq2seq, Bahdanau attention.
- Generative Models (pre-diffusion) — autoencoders, VAEs, GANs, normalizing flows, PixelRNN/CNN.
- Graph Neural Networks — GCN, message passing, GAT.
- Reinforcement Learning — full track imported from NoteNextra · CSE510.
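As a taste of the first two tracks, here is a minimal sketch, assuming only NumPy: a one-hidden-layer MLP with hand-derived backpropagation, trained by plain full-batch SGD on XOR. The hidden size, learning rate, and step count are illustrative choices, not prescriptions from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: XOR, the classic problem a single perceptron cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Xavier-style scaling for the tanh hidden layer (sizes are illustrative).
W1 = rng.normal(0.0, 1.0, size=(2, 8)) / np.sqrt(2)
b1 = np.zeros(8)
W2 = rng.normal(0.0, 1.0, size=(8, 1)) / np.sqrt(8)
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(5000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)           # hidden activations
    p = sigmoid(h @ W2 + b2)           # predicted probabilities
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

    # Backward pass: the chain rule, worked out by hand.
    dz2 = (p - y) / len(X)             # BCE + sigmoid gives this clean gradient
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0)
    dh = dz2 @ W2.T
    dz1 = dh * (1.0 - h ** 2)          # derivative of tanh
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)

    # Plain SGD update (in-place on each parameter array).
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= lr * grad

    if step % 1000 == 0:
        print(f"step {step}: loss {loss:.4f}")

print(np.round(p.ravel(), 3))  # approaches [0, 1, 1, 0]
```

The Building Blocks track derives each of these pieces properly; the Optimization track then replaces the bare `param -= lr * grad` update with momentum, Adam, and learning-rate schedules.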
Continue from here
- For vision-specific deep architectures, jump to Computer Vision →.
- For language models and the post-attention era, jump to Large Language Models → and then The Transformer Era →.