This is the third part of the stock project, where I will use the validated data to train a machine learning system that predicts the performance of a stock from the data that was available on the date of the stock record.
The first step is to make some design decisions for the algorithm. The first version of the program will only look at the current stock record for features (X), and the result (y) will be the daily change in the stock price.
If I set the prediction window to one week, one example of features and results would be:
The results from a query in the database will look like this:
The X data can consist of the values of P, P/E, P/C, Yield, PMI, RSI and the time to or since the last dividend. The y data for the stock record of 2018-10-31 will be the change in price (%) over the prediction window divided by the number of days in the window.
For training data to be useful, there must be a price record in the future that can be used as y.
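To make the y calculation concrete, here is a minimal sketch in plain Python. The prices are made up, the function name `daily_change_pct` is hypothetical, and I am assuming calendar days and "the first record on or after the end of the window" as the future price; the real project may count days differently.

```python
from datetime import date, timedelta

# Hypothetical price records (date, closing price); the values are made up.
records = [
    (date(2018, 10, 31), 100.0),
    (date(2018, 11, 1), 101.0),
    (date(2018, 11, 7), 103.5),
    (date(2018, 11, 8), 102.0),
]

def daily_change_pct(records, window_days=7):
    """Return {record_date: average daily change in %} for every record that
    has a price at least `window_days` later. Records must be sorted by date."""
    targets = {}
    for i, (day, price) in enumerate(records):
        target_day = day + timedelta(days=window_days)
        # First later record on or after the end of the window, if any.
        future = next(((d, p) for d, p in records[i + 1:] if d >= target_day), None)
        if future is None:
            continue  # no future price yet, so this record cannot be used for training
        future_day, future_price = future
        days = (future_day - day).days
        change_pct = (future_price / price - 1.0) * 100.0
        targets[day] = change_pct / days
    return targets

y = daily_change_pct(records)
```

With these made-up numbers the 2018-10-31 record gets a y of about 0.5 (a 3.5 % rise over 7 days), while the last two records are dropped because there is no price a week after them.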
Regression or Neural Networks
Both regression and neural networks have some advantages for this dataset.
Linear regression
A linear regression would be an intuitive way to predict the change in stock price. But there are some zeros in the underlying data that I fear will skew the results. A linear model also cannot capture XOR relationships between features unless interaction terms are added by hand.
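The XOR problem can be shown with a tiny experiment. This is only an illustration on the four-row XOR truth table, not on stock data, and the helper `fit_linear` is a hypothetical hand-rolled gradient descent rather than anything from a library. The best linear fit ends up predicting 0.5 for every input, while adding the product of the two features by hand makes the relation exactly linear.

```python
# XOR truth table: y is 1 when exactly one of the two inputs is 1.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [0.0, 1.0, 1.0, 0.0]

def fit_linear(X, y, lr=0.1, steps=5000):
    """Fit y ~ b + w1*x1 + w2*x2 by gradient descent on the mean squared error."""
    b, w1, w2 = 0.0, 0.0, 0.0
    n = len(X)
    for _ in range(steps):
        gb = gw1 = gw2 = 0.0
        for (x1, x2), target in zip(X, y):
            err = (b + w1 * x1 + w2 * x2) - target
            gb += 2 * err / n
            gw1 += 2 * err * x1 / n
            gw2 += 2 * err * x2 / n
        b, w1, w2 = b - lr * gb, w1 - lr * gw1, w2 - lr * gw2
    return b, w1, w2

b, w1, w2 = fit_linear(X, y)
# The plain linear model cannot separate the cases: every prediction is ~0.5.
linear_pred = [b + w1 * x1 + w2 * x2 for x1, x2 in X]

# With a hand-made interaction term x1*x2 the relation becomes exactly linear.
interaction_pred = [x1 + x2 - 2 * x1 * x2 for x1, x2 in X]
```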
Multi-layer neural networks can model complex relationships such as XOR relations and can learn to handle zeroed records. I'll start with this approach.
SciKit-Learn
Sklearn has a neural-network-ish regressor that I will investigate.
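As a rough starting point, this is how such a regressor could be tried out. I am assuming the regressor in question is `MLPRegressor` from `sklearn.neural_network`. The data below is random stand-in data with seven feature columns, not real stock records, and the layer sizes and solver are arbitrary first guesses.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 records, 7 feature columns
# (think P, P/E, P/C, Yield, PMI, RSI, days to dividend). Not real stock data.
X = rng.normal(size=(200, 7))
y = 0.3 * X[:, 0] - 0.2 * X[:, 5] + rng.normal(scale=0.1, size=200)

# Keep the time order: train on the earlier records, test on the later ones.
split = 150
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Scale the features first; neural networks are sensitive to feature scale.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(16, 8), solver="lbfgs",
                 max_iter=1000, random_state=0),
)
model.fit(X_train, y_train)

pred = model.predict(X_test)      # predicted daily change for the held-out records
r2 = model.score(X_test, y_test)  # R^2 on the held-out records
```

Splitting by position instead of shuffling keeps future records out of the training set, which matters when every y value is built from a later price.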