Applying Machine Learning To Cryptocurrency Trading

This post describes a machine-learning software project aimed at optimizing and automating investments in blockchain-based cryptocurrency markets. It outlines the problem we addressed, how the solution was developed, and the key takeaways from the project.

The project grew out of a challenge: use machine learning to train a model that gives buy/hold/sell signals for selected markets and, ideally, increases the portfolio value over time. As the trading context we chose blockchain-based cryptocurrency markets such as Ethereum, Litecoin, Stratis and many more; the research project covered about 70 markets. Cryptocurrency markets are very volatile and are not yet strongly dominated by high-frequency trading bots, so they should offer many opportunities for profitable bot-assisted trades. That, at least, was the hypothesis we set out to verify in our experiment.

What is the main reason behind using trading bots? Computers act logically and are not biased by things like hype, fear of missing out, greed etc. Unlike humans, bots are free from emotions that often drive people to make incorrect trading decisions. There are some strategies which do involve sentiment analysis of social media posts, but in our case, we decided not to take advantage of this kind of information. In a nutshell, we set out to build a bot that would help us trade in blockchain-based cryptocurrency markets more effectively and thus increase the value of our investment in the market.

Crafting a solution to meet the challenge

The project consisted of two components:

A stateful communication layer between the trading bot and the cryptocurrency exchange (we chose the Poloniex exchange). This part was implemented in Elixir, a fairly young programming language that runs on the battle-tested Erlang VM. Because we needed a tool that could handle a high volume of concurrent communication, Elixir seemed a great fit for the job.
The trading bot itself, which is the subject of this article; the bot was written in Python.

The overall data flow was designed in such a way that the Elixir component would fetch all the relevant data about the markets of interest and pass the data to the bot, which would in turn respond with a prediction (buy/hold/sell). Then the Elixir service would act upon the prediction by submitting a buy/sell order or doing nothing.
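The bot's side of this exchange can be pictured with a minimal sketch. The article does not say how the two components talk to each other, so the JSON message shape, the `handle_request` function, the `naive_predict` placeholder model and the `BTC_ETH` market name are all assumptions made for illustration only.

```python
import json

# Signal names follow the article; everything else here is hypothetical.
SIGNALS = ("buy", "hold", "sell")


def naive_predict(closes):
    """Placeholder model: buy on a rise, sell on a fall, otherwise hold."""
    if closes[-1] > closes[-2]:
        return "buy"
    if closes[-1] < closes[-2]:
        return "sell"
    return "hold"


def handle_request(raw, predict):
    """Decode a market-data message, run the model, encode the prediction."""
    request = json.loads(raw)
    signal = predict(request["closes"])
    if signal not in SIGNALS:
        raise ValueError(f"unexpected signal: {signal!r}")
    return json.dumps({"market": request["market"], "signal": signal})


# Example message as the communication layer might send it (assumed format).
message = json.dumps({"market": "BTC_ETH", "closes": [0.0500, 0.0512]})
response = handle_request(message, naive_predict)
```

In this sketch the caller would place an order only for a "buy" or "sell" response and do nothing for "hold", matching the flow described above.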

While building the solution, we chose the scikit-learn library (written in Python), as it comes with a large number of well-documented, ready-to-use data preprocessing tools and algorithms, as well as utilities for visualizing the results. This choice let us focus on the domain problem itself rather than on the technical intricacies of the implementation. Since the task was to assign each entry of our dataset to a single category (buy/hold/sell), we were dealing with a classification problem.

To train the model and verify its performance, we gathered and processed historical price data from the last few years (between 1.5 and 4 years, depending on the market data available) for about 70 different Bitcoin / cryptocurrency pairs on a popular Bitcoin exchange. This source provided us with millions of data entries, which were transformed into feature vectors. The model was trained using the data from just one market, whereas the simulations were run on the data from the remaining markets. For the simulations, we assumed a starting portfolio of 0.008 BTC ($10 at the time) and the highest possible transaction fees on a given exchange. The input dataset first needed to have the corresponding buy/hold/sell classes pre-assigned, so that the model could use them as examples to learn from. At first, the only features we extracted were the closing prices from the previous and current time periods.
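The setup described above can be sketched roughly as follows. The article does not state how the classes were pre-assigned or what the fee was, so the labeling rule (the next period's price change compared against `threshold`), the `fee` value and both function names are assumptions chosen only to make the example concrete.

```python
def build_dataset(closes, threshold=0.01):
    """Build [previous close, current close] features with buy/hold/sell labels.

    Assumed labeling rule: look one period ahead and compare the relative
    price change against the threshold.
    """
    features, labels = [], []
    for i in range(1, len(closes) - 1):
        prev_close, close, next_close = closes[i - 1], closes[i], closes[i + 1]
        change = (next_close - close) / close
        if change > threshold:
            label = "buy"
        elif change < -threshold:
            label = "sell"
        else:
            label = "hold"
        features.append([prev_close, close])
        labels.append(label)
    return features, labels


def simulate(closes, signals, start_btc=0.008, fee=0.0025):
    """Replay signals over closing prices; return the final value in BTC.

    The whole balance is moved on every trade and the fee (an assumed
    placeholder) is charged on each side.
    """
    if len(closes) != len(signals):
        raise ValueError("closes and signals must have the same length")
    btc, coins = start_btc, 0.0
    for price, signal in zip(closes, signals):
        if signal == "buy" and btc > 0:
            coins = btc * (1 - fee) / price
            btc = 0.0
        elif signal == "sell" and coins > 0:
            btc = coins * price * (1 - fee)
            coins = 0.0
    # Any coins still held are valued at the last closing price.
    return btc + coins * closes[-1]


closes = [1.0, 1.0, 1.1, 1.1, 1.0, 1.0]
features, labels = build_dataset(closes)
# Labels describe periods 1..4, so they are replayed against closes[1:5].
final_value = simulate(closes[1:5], labels)
```

Replaying the look-ahead labels themselves, as done here, only checks the mechanics; in a real experiment the signals would come from a model's predictions on a market it was not trained on.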

At first, the simulation results were not very promising; they revealed two significant problems. Firstly, the model would make incorrect decisions most of the time, which led to a steady decrease in portfolio value. Secondly, in another set of markets, the model would make too few trades over a timeframe of a few years and fail to make any significant profit. At this point, we took a few steps to improve the performance of the model. The first step was to extract additional features. We did some research on technical analysis indicators and eventually came up with a list of about 10 indicators that seemed to give the best results in similar trading challenges. In the second step, we tried different classification algorithms and tuned their parameters. We experimented with Support Vector Machines, Linear/Quadratic Discriminant Analysis, Random Forest, k-nearest neighbors and many others, yet the one that provided the best results, and became our final choice, was the AdaBoost classifier. In the end, we had a classifier that relied on recent price changes as well as technical analysis indicators.
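A minimal sketch of this kind of classifier, using scikit-learn's `AdaBoostClassifier`, might look like the following. The article does not list the indicators or the parameters used, so the two features (last price change and distance from a simple moving average), the labeling threshold, the hyperparameters and the synthetic random-walk prices are all assumptions standing in for the real data and feature set.

```python
import random

from sklearn.ensemble import AdaBoostClassifier


def random_walk(n, seed):
    """Synthetic closing prices standing in for real market history."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] * (1 + rng.uniform(-0.03, 0.03)))
    return prices


def make_samples(closes, window=5, threshold=0.01):
    """Features: last price change and price relative to its moving average.

    Labels use an assumed rule: the next period's change versus a threshold.
    """
    X, y = [], []
    for i in range(window, len(closes) - 1):
        sma = sum(closes[i - window + 1:i + 1]) / window
        last_change = closes[i] / closes[i - 1] - 1
        future_change = closes[i + 1] / closes[i] - 1
        X.append([last_change, closes[i] / sma - 1])
        if future_change > threshold:
            y.append("buy")
        elif future_change < -threshold:
            y.append("sell")
        else:
            y.append("hold")
    return X, y


# Train on one (synthetic) market and predict on another, as in the article.
X_train, y_train = make_samples(random_walk(500, seed=1))
X_test, y_test = make_samples(random_walk(500, seed=2))

model = AdaBoostClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)
signals = model.predict(X_test)
```

Because the prices here are a pure random walk, the model should not be expected to find any real edge; the point is only the shape of the pipeline: features in, buy/hold/sell signals out.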

Key takeaways

Having applied all the tweaks mentioned above, we were surprised to find that in more than half (43 of 77) of the simulations the portfolio reached at least 1 BTC at some point. That is 125 times the starting 0.008 BTC, or a return of about 12,400%, and the gains were not correlated with the performance of the given market. The results surpassed our expectations at this stage of the experiment.

Since the simulations went exceptionally well, we wanted to start testing the bot against real exchange markets as soon as possible. The initial results confirmed what we had expected: the simulations were not perfect, and some new problems surfaced.

Unfortunately, when we started testing the bot in a real environment, the Poloniex exchange was seeing rapid growth in the number of online users, which badly degraded its overall performance. In response, Poloniex engineers limited the number of API requests allowed within a given timeframe. Since our project relied heavily on the API, testing the bot there eventually became impossible and we had to back off.
Sometimes the orders placed by the bot would not be filled because the bot was too optimistic or pessimistic about the buying or selling price, something we had not accounted for in our simulations. Mostly for this reason, the bot would make many conflicting decisions, which led to a slow decrease in the portfolio value. We made a set of small tweaks to alleviate the problem, yet these corrective measures worked only to some extent.
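One simple way a simulation could account for unfilled orders is sketched below. It is not what the project did; it assumes that candle data with low and high prices is available, and the function name and fill rule are hypothetical. The rule also ignores order size and queue position, so it is optimistic about fills.

```python
def limit_order_fills(side, limit_price, next_low, next_high):
    """Return True if a limit order would have traded during the next candle.

    A buy fills only if the price dipped to the limit or below; a sell fills
    only if the price rose to the limit or above.
    """
    if side == "buy":
        return next_low <= limit_price
    if side == "sell":
        return next_high >= limit_price
    raise ValueError(f"unknown side: {side!r}")


# A buy priced below the whole next candle never trades.
too_optimistic = limit_order_fills("buy", 0.0100, next_low=0.0101, next_high=0.0105)
# The same buy priced inside the candle's range does.
realistic = limit_order_fills("buy", 0.0102, next_low=0.0101, next_high=0.0105)
```

A backtest using such a check would skip the trade whenever the function returns False, which brings simulated results closer to what happens on a live order book.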

We feel that it is still too early to say conclusively whether the project was a success. Despite the problems described, we keep testing and improving the trading bot, as it looks very promising given the early stage of its development. We have eliminated a number of defects by constantly evaluating and adjusting the bot's performance, and we still have a few more ideas for making it an even better solution.

The project was implemented and the article was written by Paweł Duda.
