This project is no longer maintained.
I created this project during my university years, but I no longer have the time or the will to continue it by myself.
The repo will be put in read-only mode, but you are free to clone or fork it to continue the work I've done.
This project is a machine learning program that recognizes patterns in Forex or stock data.
Currently, all the data loaded comes from the GBPUSD1d.txt file, which contains bids and asks for 1-day tick data.
You can also try out the system with the 1-month tick data (1.63 million entries) simply by changing the following code:
# settings.py
# Use GBPUSD1d for 1-day tick data (~62k lines) or GBPUSD1m for 1-month tick data (~1.63 mln lines)
file_name = "data/GBPUSD1d.txt"
This data was downloaded from the Sentdex.com website, at the URL http://sentdex.com/GBPUSD.zip (direct download).
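As a rough illustration of how such a tick file could be read, here is a minimal sketch. It assumes each line has the form `timestamp,bid,ask`, which this README does not state, so check the actual file before relying on it. The function name `parse_ticks` and the sample lines are made up for the example and are not taken from the project.

```python
def parse_ticks(lines):
    """Return the mid price (average of bid and ask) for each tick line.

    Assumption: every non-empty line looks like "timestamp,bid,ask".
    """
    mid_prices = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _timestamp, bid, ask = line.split(",")
        mid_prices.append((float(bid) + float(ask)) / 2)
    return mid_prices


# Made-up sample lines, only to show the assumed layout.
sample = [
    "20130501000000,1.55358,1.55371",
    "20130501000001,1.55357,1.55369",
]
print(parse_ticks(sample))
```

To read the real file, you could pass an open file object instead of the list, for example `parse_ticks(open("data/GBPUSD1d.txt"))`.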
This code was written following the YouTube playlist Machine Learning for Forex and Stock analysis and algorithmic trading, uploaded by the YouTube channel sentdex, adapting it to Python 3 and making some minor performance improvements. I did this as a way to get into practical machine learning after taking the Machine Learning course of the Master's Degree in Computer Science at the University of Padua.
In order to run the code you need to have the following libraries and programs installed on your computer:
- Python 3.6
- numpy
- matplotlib

You can install matplotlib by following the official matplotlib guide, and numpy by following numpy's official guide.
The program will run the following tasks:
1. Take the first `end_point` (inside `settings.py`) lines of data.
2. Recognize all the different patterns in the data, each one made of `dots_for_pattern` entries, and store them.
3. Take `dots_for_pattern` entries that have never been seen before (30 steps into the future).
4. Recognize the patterns that are most similar to the new data. This means that:
   - Each pattern must be made of points that are at least 50% similar to the corresponding point inside the new data
   - Each pattern must be overall at least `pattern_similarity_value`% similar to the new pattern
5. Save into the `patterns` folder a graph with the following data plotted inside:
   - All the patterns that have been recognized, each one with a different color
   - The pattern that was searched for, as a thicker turquoise line
   - The predictions of the outcome of all the recognized patterns
     - A green dot for each outcome that predicts a rise
     - A red dot for each outcome that predicts a fall
   - The average predicted outcome value as a blue dot
   - The real outcome value as a turquoise dot
6. Increment by one the considered set of data and repeat from point (1).
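The matching step in the list above can be sketched roughly as follows. This is not the project's actual code: the function names, the choice to express patterns as percent changes, and the similarity formula (100 minus the absolute percent difference between corresponding points) are assumptions made for illustration. Only the 50% per-point threshold and the `pattern_similarity_value` overall threshold come from the description above.

```python
def percent_change(start, current):
    """Percent change from start to current (0.0 if start is zero)."""
    if start == 0:
        return 0.0
    return (current - start) / abs(start) * 100.0


def build_pattern(prices, end, dots_for_pattern):
    """Percent changes of the dots_for_pattern prices ending at index `end`,
    each measured against the price just before that window (assumed layout)."""
    base = prices[end - dots_for_pattern]
    return [percent_change(base, prices[i])
            for i in range(end - dots_for_pattern + 1, end + 1)]


def similarity(pattern, target):
    """Per-point similarities and their average, both in percent.
    Assumed formula: 100 minus the absolute percent difference."""
    per_point = [100.0 - abs(percent_change(p, t))
                 for p, t in zip(pattern, target)]
    return per_point, sum(per_point) / len(per_point)


def find_matches(patterns, target, pattern_similarity_value,
                 min_point_similarity=50.0):
    """Keep the patterns whose every point is at least min_point_similarity %
    similar to the target and whose average similarity is at least
    pattern_similarity_value %."""
    matches = []
    for pattern in patterns:
        per_point, overall = similarity(pattern, target)
        if (min(per_point) >= min_point_similarity
                and overall >= pattern_similarity_value):
            matches.append(pattern)
    return matches


# Hypothetical values, only to exercise the functions.
stored_patterns = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [-1.0, 5.0, 0.5]]
new_pattern = [1.0, 2.0, 3.0]
print(find_matches(stored_patterns, new_pattern, pattern_similarity_value=90.0))
```

In this sketch the third stored pattern is rejected by the per-point check alone, which shows why the description lists the two thresholds separately: a pattern with one wildly different point should not qualify just because its other points raise the average.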
I'm open to all kinds of improvements, as long as they are submitted as well-documented pull requests.
Copyright 2018 Riccardo Montagnin.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.