Previously on “How to design a machine learning trading bot”
We started with “Collecting Data”:
We found out what OHLCV data is, and we learned why we need both historical data and online (live) data.
Then we continued with “Data Analysis”:
We saw how important it is to clean the data, including feature engineering. We learned that to build a stable machine learning model we need to prepare the data in the right way, and finally we found out how visualizing the data can help us reach our goal.
Then we moved on to finding a pattern:
We noted that it is easy to fall into the trap of finding patterns in data the way a horoscopist would, so you should always follow the scientific method and act like an astronomer. Then we found a very simple pattern, “SMA20”, and discussed how to label the data as [0,1].
Now, it’s time to make a model. As always, I should mention that in this season we design our product, and in the next season we will develop it.
In this regard, you should always design your pipeline on paper, or even in your mind, before you start development. At least, this is how I work, and so far it has worked for me ;)
Step 4: Building a Model based on the determined pattern
Let's start with the format of the data we have at this point.
We have OHLCV plus SMA20 and a column named target, like this:
Source: https://gist.github.com/25mordad/64a39ecc0ef71140140251b61db93572
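To make the shape of this table concrete, here is a minimal sketch in plain Python that adds an SMA20 value and a target column to a list of OHLCV rows. The function name `add_sma_and_target`, the dictionary keys, and the candle values are made up for illustration. The labeling rule (1 when the close is above its SMA20, otherwise 0) is only an assumption for this sketch and may differ from the rule used earlier in the series.

```python
from statistics import mean

WINDOW = 20  # SMA20: simple moving average over the last 20 closes


def add_sma_and_target(rows):
    """Return copies of the rows with 'sma20' and 'target' keys added.

    rows: list of dicts with at least a 'close' key, oldest candle first.
    """
    labeled = []
    for i, row in enumerate(rows):
        new_row = dict(row)
        if i + 1 < WINDOW:
            # Not enough history yet to compute a 20-period average.
            new_row["sma20"] = None
            new_row["target"] = None
        else:
            closes = [r["close"] for r in rows[i + 1 - WINDOW : i + 1]]
            sma = mean(closes)
            new_row["sma20"] = sma
            # ASSUMPTION: label 1 when the close is above its SMA20, else 0.
            new_row["target"] = 1 if row["close"] > sma else 0
        labeled.append(new_row)
    return labeled


# Made-up candles, only to show the resulting columns.
candles = [
    {"open": 100 + i, "high": 102 + i, "low": 99 + i, "close": 101 + i, "volume": 1000}
    for i in range(25)
]
labeled = add_sma_and_target(candles)
```

The first 19 rows have no SMA20 yet, so in practice you would drop them before training.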
Machine Learning
So, at this step the machine can help us build a model, and the main question is: how? Let’s review how machine learning classification techniques work:
First, the machine gets a large labeled dataset. (We have already prepared it.)
Then, after we split the data into a training set and a test set, the machine learns a mapping from the features to the labels on the training set.
Finally, we test on the test dataset to find out whether the model we built fits well or not.
I have simplified the whole procedure here; in practice it is a little more complicated than what I have described.
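The three steps above can be sketched in a few lines of plain Python. This is a toy, not a real model: the "classifier" is a single threshold on one feature, the feature (for example, the distance between the close and SMA20) is hypothetical, and the sample values are made up. The split is chronological rather than shuffled, which is my assumption for time-ordered market data.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows into (train, test) without shuffling."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def predict(feature, threshold):
    """Toy classifier: predict 1 when the feature is above the threshold."""
    return 1 if feature > threshold else 0


def accuracy(rows, threshold):
    """Share of rows whose predicted label matches the true label."""
    hits = sum(1 for feature, label in rows if predict(feature, threshold) == label)
    return hits / len(rows)


def fit_threshold(train_rows):
    """'Training': pick the candidate threshold with the best training accuracy."""
    candidates = sorted({feature for feature, _ in train_rows})
    return max(candidates, key=lambda t: accuracy(train_rows, t))


# Made-up (feature, label) pairs, oldest first.
samples = [
    (-2.0, 0), (-1.5, 0), (0.5, 1), (1.0, 1), (-0.5, 0),
    (2.0, 1), (-1.0, 0), (1.5, 1), (0.3, 1), (-0.2, 0),
]

train, test = chronological_split(samples)   # step 2: split
threshold = fit_threshold(train)             # step 2: learn on the training set
test_accuracy = accuracy(test, threshold)    # step 3: check on unseen data
```

Note that a perfect score on the training set does not guarantee a good score on the test set, which is exactly why the third step exists.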
In many problems solved with machine learning, that step is the final one: once it is done, the model is ready. In finance and trading, however, it is not the end. We still have more work to do to find a model that fits. We need to do “Backtesting” on the test dataset to review the result.
Backtesting
At this point, we need to define a strategy for buying and selling. In our sample (SMA20), the strategy could be as simple as this:
- The first time you see a label [1], open a long position (Buy)
- If you are in a long position, close it as soon as you see the label [0]
If you couldn’t follow what we are talking about, check the following data. You can see when we open the long position and when we close it.
Source: https://gist.github.com/25mordad/511f39f7cb15260736248ce261008a0f
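The two rules above can be sketched as a small loop over prices and labels. This is a minimal sketch, not a full backtester: the function name `backtest_long_only` and the sample numbers are made up, and I assume that we trade at the close of the candle where the label appears, with no fees or slippage. A position that is still open at the end of the data is simply left out of the results.

```python
def backtest_long_only(closes, labels):
    """Apply the two rules: buy on the first 1 while flat, sell on the next 0.

    Returns a list of (entry_price, exit_price, return) tuples.
    ASSUMPTION: trades are filled at the close of the candle carrying the label.
    """
    trades = []
    entry_price = None  # None means we are not in a position
    for price, label in zip(closes, labels):
        if entry_price is None and label == 1:
            entry_price = price  # open a long position (Buy)
        elif entry_price is not None and label == 0:
            trades.append((entry_price, price, price / entry_price - 1))
            entry_price = None  # position closed, wait for the next 1
    return trades


# Made-up closes and labels, oldest first.
closes = [100, 102, 105, 103, 101, 104, 108, 107]
labels = [0, 1, 1, 1, 0, 1, 1, 0]

trades = backtest_long_only(closes, labels)
for entry, exit_, ret in trades:
    print(f"buy at {entry}, sell at {exit_}, return {ret:.2%}")
```

Repeated 1 labels while a position is open are ignored, which matches the rule of acting only on the first [1].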
Conclusion:
A trading model is a package of machine learning methods plus backtesting. In some cases, you can even apply some rules to your predictions to get better performance.
Also, from my point of view, a complete model contains a Long and a Short strategy together. In that case, you need more than one ML model.
For example, you could have one ML model for the Long position and another ML model for the Short position. This is where your creativity helps you build a comprehensive model.
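As one possible illustration of combining two models with a rule, here is a tiny sketch. The function name `combine_signals` and the rule itself (stay flat unless exactly one model fires) are my assumptions for this example, not a recommended strategy.

```python
def combine_signals(long_signal, short_signal):
    """Map two binary model outputs to a position: 1 long, -1 short, 0 flat.

    ASSUMPTION: when both models fire, or neither does, we stay flat.
    """
    if long_signal == 1 and short_signal == 0:
        return 1
    if short_signal == 1 and long_signal == 0:
        return -1
    return 0


# Made-up outputs of a hypothetical long model and short model, candle by candle.
long_predictions = [1, 0, 1, 0]
short_predictions = [0, 1, 1, 0]

positions = [
    combine_signals(l, s) for l, s in zip(long_predictions, short_predictions)
]
```

The resulting list of positions can then go through backtesting in the same way as the long-only labels.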