How to design a machine learning trading bot - Part 1: Data Collection


Bahman in How to

Jan 01, 2020

Trading based on machine learning is still a young field, and many people want to know more about it. In this series of articles, I’m going to show you how to design and develop an intelligent, automated system that trades in the market.

This article is the first episode in the series “How to design a machine learning trading bot”.

In this article, we are going to design a pipeline. We won’t develop a trading program yet. Once the plan is complete, we will implement the program in the next season.

Here is a table of contents for the articles we expect to publish:

Season1 Episode1: Collecting Data (We are here!)

S1E2: Analyzing Data

S1E3: Finding a pattern in analyzed data

S1E4: Building a Model based on the determined pattern

S1E5: Running an automation

S1E6: Monitoring the trade and Risk Management

S2: Development

Before you start development, it is important to have a clear idea of what you want to do and what you want to implement.

Step 1: Collecting Data


Which market do you want to trade? Is it the crypto market or Forex? Maybe it’s the stock market? Before starting anything, you should make sure you have the right data in hand. Let’s start with the cryptocurrency market as an example.


In the cryptocurrency markets, each exchange has its own data, but most of the time prices on most exchanges follow the same overall trend.

I suggest selecting one of the largest and most influential exchanges in the market, such as Binance. Then, select the pair you want to work on, such as Bitcoin/USD or Bitcoin/Euro. In our example, we choose BTC/USDT on Binance.

Now, we need two things:

First, we should start collecting live data from Binance. Second, we need the historical data of BTC/USDT on Binance. There are a lot of websites and web services that provide historical data for free.
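
As a rough illustration of what "historical data for BTC/USDT on Binance" looks like as a request, here is a minimal sketch that only builds the request URL and does not contact the exchange. It assumes Binance's public candlestick endpoint and its `symbol`, `interval`, `startTime`, `endTime` and `limit` query parameters. The helper names `to_millis` and `build_history_url`, the one-minute interval and the date range are illustrative choices, not something this series has settled on.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumption: Binance's public REST endpoint for candlestick ("kline") data.
BINANCE_KLINES_URL = "https://api.binance.com/api/v3/klines"


def to_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to Unix time in milliseconds."""
    return int(moment.timestamp() * 1000)


def build_history_url(symbol: str, start: datetime, end: datetime,
                      interval: str = "1m", limit: int = 1000) -> str:
    """Build (but do not send) a request URL for historical candles."""
    query = urlencode({
        "symbol": symbol,          # Binance writes the pair without a slash
        "interval": interval,      # candle size, here one minute
        "startTime": to_millis(start),
        "endTime": to_millis(end),
        "limit": limit,            # maximum number of candles per response
    })
    return f"{BINANCE_KLINES_URL}?{query}"


# Illustrative date range: one day of BTC/USDT history.
start = datetime(2020, 1, 1, tzinfo=timezone.utc)
end = datetime(2020, 1, 2, tzinfo=timezone.utc)
url = build_history_url("BTCUSDT", start, end)
print(url)
```

One day contains 1,440 one-minute candles, which is more than a single response of 1,000 can hold, so a real collector would probably need to request the history in several chunks.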

So, why do we need historical data, and why should we start collecting data now?

In the following articles, we will analyze the historical data and look for patterns in it. In addition, we need to start collecting data now so that we can use it in the prediction stage. Prediction is part of the “Building a Model” step in our process.
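
Since the pipeline will rely on both sources, it may help to picture how they fit together. The sketch below is a hypothetical example, not the implementation planned for the development season. It merges a downloaded historical series with freshly collected records into one timeline and reports missing minutes. The record layout (a dict with an `open_time_ms` key), the function names and the rule that the live copy wins on overlap are all assumptions made for illustration, and the sample values are made up.

```python
ONE_MINUTE_MS = 60_000


def merge_candles(historical: list, live: list) -> list:
    """Combine two lists of records into one series sorted by open time.

    If both sources contain the same minute, the live copy replaces the
    historical one (an arbitrary choice for this sketch).
    """
    by_time = {c["open_time_ms"]: c for c in historical}
    by_time.update({c["open_time_ms"]: c for c in live})
    return [by_time[t] for t in sorted(by_time)]


def find_gaps(candles: list) -> list:
    """Return (previous, next) open-time pairs that are more than a minute apart."""
    gaps = []
    for prev, curr in zip(candles, candles[1:]):
        if curr["open_time_ms"] - prev["open_time_ms"] > ONE_MINUTE_MS:
            gaps.append((prev["open_time_ms"], curr["open_time_ms"]))
    return gaps


# Made-up records: the two sources overlap at minute 1, and minute 2 is missing.
historical = [
    {"open_time_ms": 0, "close": 100.0},
    {"open_time_ms": 60_000, "close": 101.0},
]
live = [
    {"open_time_ms": 60_000, "close": 101.5},
    {"open_time_ms": 180_000, "close": 102.0},
]

series = merge_candles(historical, live)
gaps = find_gaps(series)
```

Checking for gaps matters because a collector that was offline for a few minutes leaves holes that later analysis could silently misread as normal data.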

Before we finish this article, let’s talk a little more specifically about what we expect from the data.

At a minimum, the numeric data we need is OHLCV data, and I recommend collecting it on the 1m (one-minute) timeframe.


OHLCV stands for

O: Open Price

H: High Price

L: Low Price

C: Close Price

V: Volume
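
To make these five fields concrete, here is a small sketch of how one OHLCV record for a single minute could be represented and derived from the individual trades in that minute: the open is the first price, the high and low are the extremes, the close is the last price, and the volume is the sum of the traded amounts. The `Candle` class, the `candle_from_trades` helper and the trade values are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candle:
    """One OHLCV record for a fixed time window (here, one minute)."""
    open_time_ms: int
    open: float
    high: float
    low: float
    close: float
    volume: float

    def is_consistent(self) -> bool:
        """Basic sanity check: high and low must bracket open and close."""
        return (
            self.low <= min(self.open, self.close)
            and self.high >= max(self.open, self.close)
            and self.volume >= 0
        )


def candle_from_trades(open_time_ms: int, trades: list) -> Candle:
    """Build a candle from (price, amount) pairs listed in time order."""
    if not trades:
        raise ValueError("at least one trade is required")
    prices = [price for price, _ in trades]
    return Candle(
        open_time_ms=open_time_ms,
        open=prices[0],            # first trade of the window
        high=max(prices),
        low=min(prices),
        close=prices[-1],          # last trade of the window
        volume=sum(amount for _, amount in trades),
    )


# Made-up trades within one minute: (price, amount).
trades = [(100.0, 1.0), (102.0, 0.5), (99.0, 2.0), (101.0, 0.25)]
candle = candle_from_trades(0, trades)
print(candle)
```

In practice, exchanges usually publish ready-made candles, so this derivation mainly serves to show what each letter means.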

In a later article in the next season (the Development season), we will show how to develop a program to collect 1m OHLCV data for BTC/USDT.


This post was originally published on