Using Deep Learning for Stock Market Prediction seems fun, but before you can do that, you have to set up your computer’s Python environment! If you already have a Deep Learning environment set up, make sure you also have the Stock-Specific Packages (listed below).
Setup/Dependencies:
… as well as some
Stock-Specific Packages: (install via pip)
We recommend installing the first list of packages using Anaconda. If you are not using Anaconda, pip works as well. Also note that installing Scipy automatically installs its dependencies, such as Numpy!
Installations work with:
conda install <package_name>=<version>
pip install <package_name>==<version>
For example:
conda install theano=0.8.2
pip install keras==1.2.0
After installing Keras, type in:
vi ~/.keras/keras.json
Change the backend from “tensorflow” to “theano”. You can do this using vi commands. In short, typing “i” enters insert mode, which lets you insert text before the cursor. Pressing the “esc” key and then typing “:x” saves the file and exits vi.
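If you would rather not edit the file in vi, the same change can be made from Python. This is only a minimal sketch: the helper name set_keras_backend is hypothetical, and it assumes Python 3 and that the file is plain JSON with a "backend" key, as described above.

```python
import json
import os


def set_keras_backend(config_path, backend="theano"):
    """Set the "backend" key in a Keras JSON config file and return the new config.

    Hypothetical helper for illustration; other keys in the file are kept as they are.
    """
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)

    config["backend"] = backend

    # Create the parent directory if it does not exist yet.
    config_dir = os.path.dirname(config_path)
    if config_dir:
        os.makedirs(config_dir, exist_ok=True)

    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)
    return config


# Usage (uncomment to edit the real file):
# set_keras_backend(os.path.expanduser("~/.keras/keras.json"))
```

The usage line is left commented out so that running the snippet does not touch your real configuration by accident.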
In a new Jupyter Notebook, here is the code used in our project, as well as imports we may need in the future:
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn import datasets, linear_model, preprocessing
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt
import pylab
import numpy as np
import pandas as pd
from pandas import read_csv
import requests
from contextlib import closing
import math
import csv
import os
#Alpha Vantage - Data Point Requests:
from alpha_vantage.globalstockquotes import GlobalStockQuotes
apikey = 'insert_apikey_here_later'
gsq = GlobalStockQuotes(key=apikey)
#News Articles URL Searches: - news-corpus-builder
#from gnewsclient import gnewsclient
from news_corpus_builder import NewsCorpusGenerator
corpus_dir = './CorpusLogs/'
NewsGen = NewsCorpusGenerator(corpus_dir,'flat')
#News Article Text Retriever:
from newspaper import Article
#Slope Algorithm
import math
#Numpy Random
seed = 7
np.random.seed(seed)
#Variables
look_ahead = 20 #How far to look ahead when calculating slope
ROAD MAP:
The basic model works like this: We take a stock symbol and get a list of recent news URLs specific to that stock. We make sure that these news sources come from a certain list of trusted companies. Then we convert them into Body Text. We define a dictionary of Positive and Negative words, check it against the Body Text by counting the number of Positive and Negative words, and return a score that says whether the article is more Positive or Negative.
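The filtering and scoring steps described above could be sketched roughly as follows. Everything here is illustrative: the word lists, the trusted domains, and the function names is_trusted and sentiment_score are placeholders, and the score (Positive count minus Negative count) is just one simple way to express "more Positive or Negative", not necessarily the one used in the project.

```python
import re
from urllib.parse import urlparse

# Placeholder values for illustration only; the project's real lists are not shown here.
POSITIVE_WORDS = {"gain", "growth", "profit", "surge", "beat"}
NEGATIVE_WORDS = {"loss", "decline", "lawsuit", "drop", "miss"}
TRUSTED_DOMAINS = {"reuters.com", "bloomberg.com"}


def is_trusted(url):
    """Return True if the URL's host is in the (assumed) trusted list."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host in TRUSTED_DOMAINS


def sentiment_score(body_text):
    """Count Positive and Negative words and return their difference.

    A result above zero means more Positive words, below zero more Negative.
    """
    words = re.findall(r"[a-z']+", body_text.lower())
    pos = sum(1 for w in words if w in POSITIVE_WORDS)
    neg = sum(1 for w in words if w in NEGATIVE_WORDS)
    return pos - neg
```

In a real pipeline the Body Text would come from the article retriever imported earlier, and the word lists would be much larger.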
For training purposes, we need to create a Dataset using the method above. We already have our X values to plug into the function: a Pos/Neg score, along with the article source. Now we need our Y. In other words, how did the article affect the stock prices? To do this, we take a historical CSV chart of the stock’s data points and take the slope of the Adjusted Close price over the date range around the article’s publication date, which shows us whether the stock is on an upward or downward trend.
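One way to turn a run of Adjusted Close prices into a single trend number is a least-squares slope over a fixed window. The sketch below is an assumption about how this could be done, not the project's actual slope algorithm; the function name trend_slope is hypothetical, and the window length reuses the look_ahead value of 20 from the setup code.

```python
def trend_slope(prices, start, look_ahead=20):
    """Least-squares slope of prices[start:start + look_ahead].

    `prices` is a list of Adjusted Close values in date order, and `start`
    is the index of the article's publication date. The result is in price
    units per trading day: positive for an upward trend, negative for a
    downward one.
    """
    window = prices[start:start + look_ahead]
    n = len(window)
    if n < 2:
        raise ValueError("need at least two prices to compute a slope")

    # Use 0, 1, ..., n-1 as x values (one step per trading day).
    mean_x = (n - 1) / 2.0
    mean_y = sum(window) / float(n)
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(window))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

The sign of the returned value would then serve as the Y label for the article: above zero for an upward trend, below zero for a downward one.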
We can then define and train our model and use it for real-time analysis!
Where can I get the MasterDataSet.csv file? I tried the website listed, https://drive.google.com/file/d/0B4niqV00F3msaFZGUEZNTGtBbIU/view, but I get page not found.