Saturday 30 March 2019

Machine Learning: Rubber Ducking a Neural Network

In the book The Pragmatic Programmer, a programmer used a rubber duck to help him find the issues in his code.

He described his code to the duck, line by line, simply enough for a rubber duck to understand. After describing a couple of lines, he understood the issue himself.

I'll try to do the same for a small neural network program that I have. It is based on a tutorial for a very simple neural network:
The input data is weighted and summed up to generate a prediction.

In the example from the tutorial, a farmer uses a dataset of eight flowers that are red or blue. Each flower has a width and a length, plotted with the length on the X axis and the width on the Y axis.
The training data consists of blue flowers (0) and red flowers (1).
The gray flowers are to be classified by the neural network.
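As a sketch, the prediction for a single flower could look like the following Python snippet. The sigmoid squashing function and the variable names are my assumptions based on the tutorial; the actual program may differ.

    import math

    def sigmoid(z):
        # Squash the weighted sum into the range (0, 1):
        # values near 0 mean blue, values near 1 mean red.
        return 1 / (1 + math.exp(-z))

    def predict(length, width, w1, w2, b):
        # Weight and sum up the input data, then squash it into a prediction.
        z = w1 * length + w2 * width + b
        return sigmoid(z)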
I start by guessing the weights and the bias for the neural network. After that, I iterate over all known flowers and estimate their colors using the current weights. I sum up the errors as a metric of the progress of the neural network. The weights and bias are then adjusted by calculating some derivatives (the slope of a cost function).

Each iteration tries to bring the predictions closer to the targets.
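Below is a minimal sketch of such a training loop, assuming a squared-error cost, a fixed learning rate and one randomly picked flower per update (all assumptions on my part); the flower measurements are made up, since the post doesn't list them.

    import math
    import random

    def sigmoid(z):
        return 1 / (1 + math.exp(-z))

    # Hypothetical training data: (length, width, color), where 0 = blue and 1 = red.
    data = [(1.0, 1.0, 0), (2.0, 1.0, 0), (2.0, 0.5, 0), (2.5, 1.0, 0),
            (3.0, 1.5, 1), (3.0, 1.0, 1), (3.5, 0.5, 1), (4.0, 1.0, 1)]

    # Start by guessing the weights and the bias.
    w1, w2, b = random.random(), random.random(), random.random()
    learning_rate = 0.1

    for i in range(10_000):
        # Predict the color of one flower with the current parameters.
        length, width, target = random.choice(data)
        pred = sigmoid(w1 * length + w2 * width + b)

        # Squared-error cost against the known color.
        cost = (pred - target) ** 2
        if i % 1000 == 0:
            print(f"iteration {i}: cost {cost:.4f}")

        # Derivatives via the chain rule: the slope of the cost
        # with respect to the weighted sum.
        dcost_dpred = 2 * (pred - target)
        dpred_dz = pred * (1 - pred)   # derivative of the sigmoid
        dcost_dz = dcost_dpred * dpred_dz

        # Step each parameter a small way down its slope.
        w1 -= learning_rate * dcost_dz * length
        w2 -= learning_rate * dcost_dz * width
        b -= learning_rate * dcost_dz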

I ran the program several times on the same dataset but with different numbers of iterations over the training data.
Iterations   Parameters                  Flower 0                Flower 1                Training data
             w1      w2      b           Prediction  Cost        Prediction  Cost        Cost
0 (guess)    -0.134  -0.033  -0.253      0.381       0.145       0.291       0.502       NaN
1            0.109   0.142   0.165       0.616       0.379       0.690       0.096       2.51
10           0.345   -0.334  -0.954      0.317       0.100       0.566       0.188       1.742
100          0.900   -0.530  -3.126      0.091       8.228e-03   0.598       0.162       1.394
1 000        1.576   -0.316  -5.865      21.51e-03   462.5e-06   0.713       82.45e-03   1.316
10 000       2.250   -0.102  -8.392      5.948e-03   3.538e-06   0.836       26.78e-03   1.316
10 000       2.250   -0.103  -8.391      5.945e-03   3.536e-06   0.836       26.73e-03   1.316
10 000       2.251   -0.106  -8.390      5.945e-03   3.534e-06   0.837       26.66e-03   1.315
100 000      2.907   0.120   -10.78      1.833e-03   3.360e-06   0.918       6.660e-03   1.239
1 000 000    3.547   0.373   -13.12      0.595e-03   3.543e-06   0.961       1.491e-03   1.192

As expected, the first iteration is basically a guess. It takes a lot of iterations to get predictions that are close to the actual values: for flower 0 (blue), it takes thousands of iterations, and for flower 1 (red), it takes hundreds of thousands of iterations.

I also ran a prediction on the same training data but with a test flower that had a slightly shorter blade. That flower was much harder for the neural network to predict (it said 51% red). As one can see from the training data, the rightmost gray (unknown) flower is surrounded by red ones. However, it is still possible that the blue region in the middle reaches out to that flower. For the leftmost gray flower, it is easier to predict where it belongs.
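With the final parameters from the table above, such a prediction could be reproduced like this; the gray flower's measurements are hypothetical, since the post doesn't list them.

    import math

    def predict(length, width, w1, w2, b):
        z = w1 * length + w2 * width + b
        return 1 / (1 + math.exp(-z))

    # Parameters after 1 000 000 iterations, taken from the table above.
    w1, w2, b = 3.547, 0.373, -13.12

    # Hypothetical measurements for the rightmost gray flower.
    length, width = 3.7, 1.0

    # A result near 0.5 means the network cannot tell red from blue.
    print(predict(length, width, w1, w2, b))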

Iterating hundreds of thousands of times takes some time. I need to find better ways to estimate the new parameters, such as optimized algorithms and built-in library functions.
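One possible optimization is to replace the flower-by-flower Python loop with vectorized NumPy operations that update the parameters against the whole dataset at once. This is full-batch gradient descent rather than the one-flower-at-a-time updates above, so the numbers would not match the table exactly; the data is the same hypothetical set as before.

    import numpy as np

    # Hypothetical flowers: one row per flower, columns are length and width.
    X = np.array([[1.0, 1.0], [2.0, 1.0], [2.0, 0.5], [2.5, 1.0],
                  [3.0, 1.5], [3.0, 1.0], [3.5, 0.5], [4.0, 1.0]])
    y = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 0 = blue, 1 = red

    w = np.random.rand(2)  # w1 and w2
    b = np.random.rand()
    learning_rate = 0.1

    for i in range(10_000):
        # Predict all eight flowers in one shot.
        pred = 1 / (1 + np.exp(-(X @ w + b)))

        # Gradient of the mean squared error over the whole dataset.
        dcost_dz = 2 * (pred - y) * pred * (1 - pred)
        w -= learning_rate * (X.T @ dcost_dz) / len(y)
        b -= learning_rate * dcost_dz.mean()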

I see three factors that will make machine learning difficult:
  • Machine learning takes a lot of computational resources.
  • The data is often imperfect due to bad sensors, operator errors and other factors.
  • The world itself is often irregular, with stochastic processes and unknown unknowns that will confuse the learning of neural networks.
Why code the algorithm myself instead of using any of the existing ones? The purpose of this experiment is to learn neural networks from the ground up, not to make cool predictions without understanding what I'm doing.

In the next blog post, I'll try to use some real-world data to see whether the outcome can be predicted.
