Blockchain: Concepts, Issues, and Applications
Estimation of Bitcoin Price Using Long Short-Term Memory (LSTM) by Feature Selection with Genetic Algorithm
Coşkun Parim, Tuğba Güz, Erhan Çene
Bitcoin (BTC) is a well-known cryptocurrency, based on blockchain technology, that is traded in financial markets. In recent years, bitcoin has drawn more attention than other financial investment instruments because it can be traded 24 hours a day and offers high volatility and high profit potential at the price of high risk. The aim of this study is to estimate the bitcoin price using 26 variables consisting of blockchain information, macroeconomic factors, and global currency ratios, such as average block size, cost per transaction, hash rate, gold futures, and USD/EUR. Classical estimation methods may not be appropriate when there is multicollinearity among the variables and the variables are not stationary at the same order. Thus, LSTM, which is not affected by these restrictions, is used for estimation instead of classical methods.
Three machine learning based regression models are developed to estimate bitcoin prices 1 day, 7 days, and 30 days ahead. The time series dataset used in the study covers the period from 7/19/2010 to 7/19/2022. First, a genetic algorithm is used to select the relevant variables that affect bitcoin prices at the 1-day, 7-day, and 30-day horizons, respectively. For each set of variables, the dataset is divided chronologically into training and test data. Afterward, the bitcoin price is estimated with LSTM using the selected variables. The performance of the models is assessed and compared using the R², MAE, and RMSE metrics.
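The horizon setup and chronological split described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `make_horizon_pairs` and `chronological_split` are hypothetical, and the 80/20 training fraction is an assumption, since the abstract does not state the split ratio.

```python
def make_horizon_pairs(prices, features, horizon):
    """Pair the feature row observed on day t with the price on day t + horizon.

    `horizon` would be 1, 7, or 30 for the three models (hypothetical helper).
    """
    n = len(prices) - horizon  # the last `horizon` days have no future target
    X = [features[t] for t in range(n)]
    y = [prices[t + horizon] for t in range(n)]
    return X, y


def chronological_split(X, y, train_fraction=0.8):
    """Split without shuffling so the test set lies strictly after the training set.

    The default train_fraction of 0.8 is an assumption, not taken from the study.
    """
    cut = int(len(X) * train_fraction)
    return X[:cut], y[:cut], X[cut:], y[cut:]


# Toy data: ten daily prices and two made-up features per day.
prices = [float(p) for p in range(10, 20)]
features = [[p, p * 2] for p in prices]

X, y = make_horizon_pairs(prices, features, horizon=1)
X_train, y_train, X_test, y_test = chronological_split(X, y)
```

Keeping the split chronological matters for time series: shuffling would let the model train on days that come after the ones it is tested on.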
Among the constructed models, the one-day-ahead model has the highest R² (0.98) and the lowest RMSE (8204.57) and MAE (6341.62). As the estimation horizon gets longer, R² tends to decrease (0.96 for seven days ahead and 0.90 for thirty days ahead), while RMSE and MAE tend to increase. This is because estimation becomes harder further ahead in time as uncertainty grows. In addition, bitcoin prices increased suddenly in the test dataset, which made it difficult for the models to capture the rise as the horizon lengthened. This sudden increase also caused the LSTM estimates to fall below the actual values.
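For reference, the three reported metrics can be computed as in the following sketch. It assumes that R² denotes the coefficient of determination (the abstract does not define it), and the function name `regression_metrics` and the toy values are illustrative only, not data from the study.

```python
import math


def regression_metrics(actual, predicted):
    """Return R^2 (coefficient of determination), MAE, and RMSE for two sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n            # mean absolute error
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    rmse = math.sqrt(ss_res / n)                     # root mean squared error

    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    return {"r2": r2, "mae": mae, "rmse": rmse}


# Toy example: every prediction is one unit below the actual value.
metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [1.0, 3.0, 5.0, 7.0])
```

MAE and RMSE are expressed in the same unit as the price, so their magnitude depends on the price level in the test period, whereas R² is unitless.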
In conclusion, bitcoin prices 1, 7, and 30 days ahead are estimated with LSTM with low error, and the models detect changes in the bitcoin price adequately.