
The Perceptron

The perceptron is the historical ancestor of every neural network. Frank Rosenblatt introduced it in 1958 as a mechanical model of biological perception; Minsky & Papert's 1969 book Perceptrons showed its limits and helped trigger the first AI winter; the multi-layer perceptron and backpropagation eventually revived it.

The algorithm

Given labelled data $\{(x_i, y_i)\}$ with $y_i \in \{-1, +1\}$, initialise $w \leftarrow 0$. Iterate over examples:

  • If $y_i (w \cdot x_i) \le 0$ (a mistake), update $w \leftarrow w + y_i x_i$.

That is the entire learning rule.
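
A minimal NumPy sketch of the rule above (the function name, epoch cap, and clean-pass check are illustrative additions, not part of the original algorithm statement):

```python
import numpy as np

def perceptron(X, y, epochs=100):
    """Train a perceptron. X: (n, d) float array; y: (n,) array of -1/+1."""
    w = np.zeros(X.shape[1])              # initialise w <- 0
    for _ in range(epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            if y_i * (w @ x_i) <= 0:      # non-positive margin: a mistake
                w += y_i * x_i            # update w <- w + y_i * x_i
                mistakes += 1
        if mistakes == 0:                 # a clean pass over the data: done
            return w
    return w                              # epoch cap hit (non-separable data)
```

On separable data the clean pass arrives within Novikoff's mistake bound; on non-separable data only the epoch cap stops the loop.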

Convergence (Novikoff, 1962)

If the data is linearly separable with margin $\gamma$ and $\lVert x_i \rVert \le R$ for all $i$, the perceptron makes at most $(R/\gamma)^2$ mistakes, independent of the dataset size.
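
The bound follows from two standard inequalities, sketched below with $w^*$ a unit vector achieving margin $\gamma$ and $M$ the number of mistakes so far (this is the usual textbook argument, not a claim about Novikoff's original presentation):

```latex
\begin{align*}
  w \cdot w^* &\ge M\gamma
    && \text{each update adds } y_i (x_i \cdot w^*) \ge \gamma \\
  \lVert w \rVert^2 &\le M R^2
    && \text{each update adds } 2\,y_i (w \cdot x_i) + \lVert x_i \rVert^2 \le R^2 \\
  M\gamma \le w \cdot w^* &\le \lVert w \rVert \le \sqrt{M}\,R
    && \Longrightarrow\; M \le (R/\gamma)^2
\end{align*}
```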

Why it eventually mattered

  • The mistake-bound proof is the prototype for online learning.
  • Stacking perceptrons gave the multi-layer perceptron, which, trained by backpropagation, became deep learning.
  • The dual form (storing only the examples it errs on, the future "support vectors") anticipates SVMs; see the sketch after this list.
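
Since the dual form is slated for expansion anyway, here is a minimal sketch under the usual formulation, where $w = \sum_j \alpha_j y_j x_j$ is never stored and $\alpha_j$ counts mistakes on example $j$ (the names and epoch cap are illustrative; any positive-definite kernel can replace the dot product):

```python
import numpy as np

def kernel_perceptron(X, y, kernel=np.dot, epochs=100):
    """Dual perceptron. X: (n, d) array; y: (n,) array of -1/+1 labels."""
    n = len(X)
    alpha = np.zeros(n)                   # alpha[j] = mistakes made on example j
    K = np.array([[kernel(xi, xj) for xj in X] for xi in X])  # Gram matrix
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            # f(x_i) = sum_j alpha_j y_j K(x_j, x_i) replaces w . x_i
            if y[i] * np.sum(alpha * y * K[:, i]) <= 0:
                alpha[i] += 1             # dual update: one more mistake on i
                mistakes += 1
        if mistakes == 0:
            break
    return alpha
```

Examples with $\alpha_j = 0$ never influence predictions, which is exactly the "only support vectors matter" property the bullet above refers to.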

Stub status

Seed introduction. Expand with the dual form, the kernel perceptron, and Minsky–Papert's XOR critique.

Released under the MIT License. Content imported and adapted from NoteNextra.