Deep Neural Networks
The era between the AlexNet moment (2012) and "Attention Is All You Need" (2017), and the foundational machinery that modern models still build on.
Reading path
- Building Blocks — perceptron → MLP → backprop → losses → initialization (see the minimal sketch after this list).
- Optimization — SGD, Adam, learning-rate schedules.
- Regularization & Generalization — dropout, batch norm, augmentation, double descent.
- Convolutional Networks — convolution, LeNet/AlexNet, VGG/Inception/ResNet.
- Recurrent & Sequence Models — RNN, LSTM/GRU, seq2seq, Bahdanau attention.
- Generative Models (pre-diffusion) — autoencoders, VAEs, GANs, normalizing flows, PixelRNN/CNN.
- Graph Neural Networks — GCN, message passing, GAT.
- Reinforcement Learning — full track imported from NoteNextra · CSE510.
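As a taste of the first two tracks, here is a minimal sketch, assuming only NumPy: a one-hidden-layer MLP with hand-derived backpropagation, trained by plain full-batch SGD on XOR. The hidden size, learning rate, and step count are illustrative choices, not prescriptions from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: XOR, the classic problem a single perceptron cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Xavier-style scaling for the tanh hidden layer (sizes are illustrative).
W1 = rng.normal(0.0, 1.0, size=(2, 8)) / np.sqrt(2)
b1 = np.zeros(8)
W2 = rng.normal(0.0, 1.0, size=(8, 1)) / np.sqrt(8)
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(5000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)           # hidden activations
    p = sigmoid(h @ W2 + b2)           # predicted probabilities
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

    # Backward pass: the chain rule, worked out by hand.
    dz2 = (p - y) / len(X)             # BCE + sigmoid gives this clean gradient
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0)
    dh = dz2 @ W2.T
    dz1 = dh * (1.0 - h ** 2)          # derivative of tanh
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)

    # Plain SGD update (in-place on each parameter array).
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= lr * grad

    if step % 1000 == 0:
        print(f"step {step}: loss {loss:.4f}")

print(np.round(p.ravel(), 3))  # approaches [0, 1, 1, 0]
```

The Building Blocks track derives each of these pieces properly; the Optimization track then replaces the bare `param -= lr * grad` update with momentum, Adam, and learning-rate schedules.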
Continue from here
- For vision-specific deep architectures, jump to Computer Vision →.
- For language models and the post-attention era, jump to Large Language Models → and then The Transformer Era →.