Stock Prediction [7]: Creating the Model
Our dataset has now been created. We have a CSV full of numbers, and it’s time for the cool part – using Deep Learning to adjust article sentiment based on the publishing source. In Part Seven of our Stock Prediction using Artificial Intelligence series, we learn how to split and organize the data, build a model, and train it to make future predictions.
We first need to organize our data. In Deep Learning, this is a common technique called Data Splitting, where we divvy up the data into a “Training” batch and a “Testing” batch. We can think of this concept in simple mathematical terms:
Imagine a dataset {(0,0) (1,2) (2,4) (3,6) (4,8) (5,10) (6,12)}.
We would take the first five data points {(0,0) (1,2) (2,4) (3,6) (4,8)} and feed them into our model. These points are called the Training Set.
The model then plays around with the data and finds the function y = 2x. Then with this model/function, we can plug in 5 and 6 and see if it returns 10 and 12, respectively. These two points, {(5,10) (6,12)}, are called our Testing Set.
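As a toy illustration of that idea (separate from our stock pipeline, and using numpy’s polyfit instead of a neural network), fitting the Training Set and then checking it against the Testing Set could look like this:

import numpy as np

# Training Set: the first five data points
x_train = np.array([0, 1, 2, 3, 4])
y_train = np.array([0, 2, 4, 6, 8])

# Fit a straight line y = m*x + b to the Training Set
m, b = np.polyfit(x_train, y_train, 1)    # m comes out ~2, b ~0

# Testing Set: evaluate the fitted function on points it has never seen
x_test = np.array([5, 6])
print(m * x_test + b)                     # prints ~[10. 12.]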
In a similar way, we need to divide our newly created dataset into a Training and Testing batch. 70-30 is generally a good split, so we will allocate roughly 70% of the dataset for Training purposes and 30% of the dataset for Testing.
split = 0.7
NumRows = len(df)  # total number of rows in the dataset
print("Number of Instances: " + str(NumRows))

# Columns 0 and 1 are the inputs (X), column 2 is the output (Y).
# The first 70% of rows form the Training Set, the remainder the Testing Set.
# (.values replaces the now-deprecated as_matrix() call.)
X_train = df.iloc[0:int(NumRows*split), [0, 1]].values
X_test = df.iloc[int(NumRows*split):NumRows, [0, 1]].values
Y_train = df.iloc[0:int(NumRows*split), [2]].values
Y_test = df.iloc[int(NumRows*split):NumRows, [2]].values
With our four separate batches (X_train, Y_train, X_test, and Y_test), we can now create our model. We use Keras’ Sequential model for this. Our first layer is a Dense layer with 12 neurons, our second layer has 8 neurons, and our output layer has 1 neuron, since it returns a single value. We use Binary Crossentropy loss (‘binary_crossentropy’) and the ‘Adam’ optimizer. Dr. Jason Brownlee has a great post on what Adam is and why it’s so powerful. We then call model.fit on our X_train and Y_train batches to get the model to find a ‘function’ for the data points we just fed it.
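Put together, the model definition and training step look roughly like the sketch below. The ‘relu’ and ‘sigmoid’ activations, the epoch count, and the batch size are illustrative assumptions, not values taken from the original script:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(12, input_dim=2, activation='relu'))  # first hidden layer: 12 neurons, 2 input features
model.add(Dense(8, activation='relu'))                # second hidden layer: 8 neurons
model.add(Dense(1, activation='sigmoid'))             # output layer: a single value

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=150, batch_size=10)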
Once this is finished, we serialize the model architecture to a JSON file and the weights to an HDF5 file. This basically saves the model (our new function) so that in the future we can load it up in a couple of seconds instead of having to re-train the entire model every time we want to use it.
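In Keras, that save-and-reload step looks something like this (the file names here are just placeholders):

from keras.models import model_from_json

# Save: architecture to JSON, weights to HDF5
with open("model.json", "w") as json_file:
    json_file.write(model.to_json())
model.save_weights("model.h5")

# Later: load the saved model back in instead of re-training
with open("model.json", "r") as json_file:
    loaded_model = model_from_json(json_file.read())
loaded_model.load_weights("model.h5")
loaded_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])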
We can test our model using X_test and see how many of its predictions match up with Y_test.
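With Keras, that check can be done with model.evaluate, along these lines:

# Evaluate the trained model on the held-out Testing Set
scores = model.evaluate(X_test, Y_test)
print("Test accuracy: %.2f%%" % (scores[1] * 100))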
We can then use model.predict to get a Y output for any new X input we give it.
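For example (the feature values below are placeholders, not real article data):

import numpy as np

new_x = np.array([[0.5, 1.0]])    # one sample with the same two input columns as X_train
prediction = model.predict(new_x)
print(prediction)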
How did you decide on the number of layers and neurons? Is it basically trial and error to see what works best?
Hi Hiro – Yes, the number of layers is largely based on “trial and error” – there is no golden rule on how many hidden layers a model should have. Generally, more complicated inputs require more layers – keep in mind, though, that larger models are more computationally expensive. Good luck!