03 Machine learning · Feb-May 2025

MNIST Neural Network.

A handwritten digit classifier built from first principles in Java: initialization, forward propagation, ReLU, softmax, cross-entropy, backpropagation, gradient clipping, and serialization.

MODEL 784 → 128 → 64 → 10

STACK Pure Java

RESULT 90% test accuracy

INPUT / 28×28 GRAYSCALE

Draw a digit

Your strokes are downsampled into the same 784 normalized pixel values used by the Java Guess class.
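A minimal sketch of that downsampling step, assuming a square canvas of 0-255 intensities whose side is a multiple of 28, reduced by block averaging. The class and method names are illustrative; the project's own Guess class may prepare its input differently.

```java
final class CanvasDownsampler {
    static final int SIDE = 28;

    /**
     * Averages each block of canvas pixels into one of 784 outputs and
     * scales the result to the 0-1 range (assumed input range: 0-255).
     */
    static double[] downsample(int[][] canvas) {
        int n = canvas.length;
        if (n == 0 || n % SIDE != 0) {
            throw new IllegalArgumentException("canvas side must be a positive multiple of 28");
        }
        int f = n / SIDE; // block width and height
        double[] out = new double[SIDE * SIDE];
        for (int r = 0; r < SIDE; r++) {
            for (int c = 0; c < SIDE; c++) {
                double sum = 0.0;
                for (int dr = 0; dr < f; dr++) {
                    for (int dc = 0; dc < f; dc++) {
                        sum += canvas[r * f + dr][c * f + dc];
                    }
                }
                // Row-major index, normalized by block area and max intensity.
                out[r * SIDE + c] = sum / (f * f * 255.0);
            }
        }
        return out;
    }
}
```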

OUTPUT / SOFTMAX

Prediction

CONFIDENCE One softmax probability per digit, 0 through 9, displayed once the weights have loaded.

01 784 pixels: Normalize 0-1

02 128 neurons: Weighted sum + ReLU

03 64 neurons: Learned features + ReLU

04 10 logits: Stable softmax

LIVE INFERENCE This demo loads weights exported from trained_brain.ser and runs the network forward pass in your browser.
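A rough sketch of how one layer's weights could round-trip through standard Java serialization. The WeightSnapshot class and its fields are hypothetical, and the actual layout of trained_brain.ser is not shown on this page; an in-memory buffer stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical holder for one layer's parameters. */
final class WeightSnapshot implements Serializable {
    private static final long serialVersionUID = 1L;

    final double[][] weights;
    final double[] biases;

    WeightSnapshot(double[][] weights, double[] biases) {
        this.weights = weights;
        this.biases = biases;
    }

    /** Serializes the snapshot; a FileOutputStream would write a .ser file instead. */
    static byte[] toBytes(WeightSnapshot snapshot) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(snapshot);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Restores a snapshot from bytes produced by toBytes. */
    static WeightSnapshot fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (WeightSnapshot) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```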

No libraries.
No black boxes.

The goal was not simply to classify digits. It was to understand every operation hidden behind a modern machine-learning API.

Each 28×28 image becomes 784 normalized inputs. Two hidden layers learn useful features through ReLU activation. Ten raw output logits are converted into probabilities with a numerically stable softmax, and the highest probability becomes the prediction.
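The forward pass described above can be sketched in a few small methods. This is an illustration, not the project's code: the names are invented, and the weight layout (w[j] holds the weights feeding neuron j) is an assumption.

```java
final class ForwardPass {
    /** One dense layer: z[j] = b[j] + sum over i of w[j][i] * x[i]. */
    static double[] dense(double[][] w, double[] b, double[] x) {
        double[] z = new double[b.length];
        for (int j = 0; j < b.length; j++) {
            double sum = b[j];
            for (int i = 0; i < x.length; i++) {
                sum += w[j][i] * x[i];
            }
            z[j] = sum;
        }
        return z;
    }

    /** ReLU: negative values become zero. */
    static double[] relu(double[] z) {
        double[] a = new double[z.length];
        for (int i = 0; i < z.length; i++) {
            a[i] = Math.max(0.0, z[i]);
        }
        return a;
    }

    /** Softmax with the max logit subtracted first so exp() cannot overflow. */
    static double[] softmax(double[] logits) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : logits) {
            max = Math.max(max, v);
        }
        double[] p = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            p[i] = Math.exp(logits[i] - max);
            sum += p[i];
        }
        for (int i = 0; i < p.length; i++) {
            p[i] /= sum;
        }
        return p;
    }

    /** Index of the largest probability, i.e. the predicted digit. */
    static int argmax(double[] p) {
        int best = 0;
        for (int i = 1; i < p.length; i++) {
            if (p[i] > p[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

A full prediction would chain dense and relu twice, apply dense once more for the ten logits, then call softmax and argmax. Subtracting the maximum leaves the probabilities unchanged because the same factor cancels in numerator and denominator.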

TRAINING / 10 EPOCHS

How the model learns

01 Shuffle

Randomize all 60,000 samples each epoch to prevent last-class bias.

02 Forward

Compute hidden activations and ten raw output logits.

03 Loss

Measure error with cross-entropy against the one-hot label.

04 Backprop

Propagate output error backward through every weight.

05 Clip + update

Clamp gradients to ±0.1, then apply learning rate 0.01.
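Steps 03 to 05 can be sketched for a single sample as follows. The clip bound and learning rate come from the text above; the method names, the element-wise clipping, and the w[j][i] weight layout are assumptions made for illustration.

```java
final class TrainingStep {
    static final double CLIP = 0.1;
    static final double LEARNING_RATE = 0.01;

    /** Cross-entropy against a one-hot label reduces to -log(p[label]). */
    static double crossEntropy(double[] probs, int label) {
        return -Math.log(Math.max(probs[label], 1e-12)); // floor avoids log(0)
    }

    /** Gradient of softmax + cross-entropy w.r.t. the logits: p - oneHot. */
    static double[] outputDelta(double[] probs, int label) {
        double[] delta = probs.clone();
        delta[label] -= 1.0;
        return delta;
    }

    /**
     * Error for the previous layer: transpose(W) * delta, zeroed wherever the
     * ReLU output was not positive. Call this before updating w.
     */
    static double[] hiddenDelta(double[][] w, double[] delta, double[] hiddenAct) {
        double[] d = new double[hiddenAct.length];
        for (int i = 0; i < hiddenAct.length; i++) {
            if (hiddenAct[i] <= 0.0) {
                continue; // ReLU derivative is zero here
            }
            double sum = 0.0;
            for (int j = 0; j < delta.length; j++) {
                sum += w[j][i] * delta[j];
            }
            d[i] = sum;
        }
        return d;
    }

    /** Clamps one gradient value to the range [-CLIP, CLIP]. */
    static double clip(double g) {
        return Math.max(-CLIP, Math.min(CLIP, g));
    }

    /** Gradient-descent update with each gradient clipped element-wise. */
    static void update(double[][] w, double[] b, double[] delta, double[] input) {
        for (int j = 0; j < delta.length; j++) {
            for (int i = 0; i < input.length; i++) {
                w[j][i] -= LEARNING_RATE * clip(delta[j] * input[i]);
            }
            b[j] -= LEARNING_RATE * clip(delta[j]);
        }
    }
}
```

With both constants fixed, no single weight can move by more than 0.1 × 0.01 = 0.001 per sample, which is what keeps the updates bounded.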

PROBLEM 01

Exploding gradients

Large unstable updates prevented useful learning.

FIX Xavier initialization plus gradient clipping at ±0.1.
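A minimal sketch of Xavier initialization, assuming the uniform (Glorot) variant with bound sqrt(6 / (fanIn + fanOut)). The page does not say which variant the project uses, and the class name is illustrative.

```java
import java.util.Random;

final class XavierInit {
    /** Fills a fanOut x fanIn matrix with values drawn uniformly from [-limit, limit). */
    static double[][] init(int fanIn, int fanOut, Random rng) {
        double limit = Math.sqrt(6.0 / (fanIn + fanOut));
        double[][] w = new double[fanOut][fanIn];
        for (int j = 0; j < fanOut; j++) {
            for (int i = 0; i < fanIn; i++) {
                // nextDouble() is in [0, 1); rescale to [-limit, limit).
                w[j][i] = (rng.nextDouble() * 2.0 - 1.0) * limit;
            }
        }
        return w;
    }
}
```

Scaling the range by layer width keeps early activations and gradients at a comparable magnitude from layer to layer, which is why it pairs well with clipping.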

PROBLEM 02

Always predicting 9

Ordered training made the last class dominate recent updates.

FIX Fisher-Yates-style sample shuffling before every epoch.
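A Fisher-Yates shuffle over sample indices might look like the sketch below. Shuffling an index array rather than the images themselves is an assumption, as are the names.

```java
import java.util.Random;

final class SampleShuffler {
    /** Returns the indices 0..n-1 in order. */
    static int[] identity(int n) {
        int[] idx = new int[n];
        for (int i = 0; i < n; i++) {
            idx[i] = i;
        }
        return idx;
    }

    /** In-place Fisher-Yates: swap each position with a random one at or before it. */
    static void shuffle(int[] indices, Random rng) {
        for (int i = indices.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1); // uniform in 0..i inclusive
            int tmp = indices[i];
            indices[i] = indices[j];
            indices[j] = tmp;
        }
    }
}
```

Calling shuffle at the start of each epoch and reading samples in index order mixes all ten classes into every stretch of updates.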

60,000 MNIST training images

109K+ learned weights and biases

0 machine-learning libraries
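The 109K+ figure can be checked with a short calculation, assuming fully connected layers with one bias per neuron. The helper below is illustrative only.

```java
final class ParameterCount {
    /** Sums weights (in * out) and biases (out) for each consecutive layer pair. */
    static int count(int... layerSizes) {
        int total = 0;
        for (int l = 1; l < layerSizes.length; l++) {
            total += layerSizes[l - 1] * layerSizes[l] + layerSizes[l];
        }
        return total;
    }
}
```

For 784 → 128 → 64 → 10 this gives 100,480 + 8,256 + 650 = 109,386 parameters.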
