Randomize all 60,000 samples each epoch to prevent last-class bias.
MNIST Neural Network.
A handwritten digit classifier built from first principles in Java: initialization, forward propagation, ReLU, softmax, cross-entropy, backpropagation, gradient clipping, and serialization.
Draw a digit
Your strokes are downsampled into the same 784 normalized pixel values used by the Java Guess class.
Prediction
LIVE INFERENCE This demo loads weights exported from trained_brain.ser and runs the network forward pass in your browser.
No libraries.
No black boxes.
The goal was not simply to classify digits. It was to understand every operation hidden behind a modern machine-learning API.
Each 28×28 image becomes 784 normalized inputs. Two hidden layers learn useful features through ReLU activation. Ten raw output logits are converted into probabilities with a numerically stable softmax, and the highest probability becomes the prediction.
How the model learns
Compute hidden activations and ten raw output logits.
Measure error with cross-entropy against the one-hot label.
Propagate output error backward through every weight.
Clamp gradients to ±0.1, then apply learning rate 0.01.
Exploding gradients
Large unstable updates prevented useful learning.
Always predicting 9
Ordered training made the last class dominate recent updates.