I developed a RUT crash prediction system that uses a random forest model to guide trading and assist decision-making.
The features include:
1. RUT OHLCV price data and some technical indicators, e.g. MACD, RSI, etc.
2. Macroeconomic data, such as nonfarm payrolls, CPI, PPI, GDP, etc.
The system downloads daily RUT OHLCV data from Yahoo Finance and, through statistical analysis, outputs the probability of a crash of more than 7% within the next 20 trading days (about 30 calendar days).
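For reference, here is a minimal sketch of how a forward-looking crash label of this kind could be built. It is not my actual code: the function name `crash_labels` is hypothetical, and it assumes "crash" means the lowest close in the next `horizon` trading days is more than `threshold` below the current close. Note that the most recent `horizon` days cannot be labeled yet, because their forward window is incomplete.

```python
from typing import List, Optional


def crash_labels(
    closes: List[float], horizon: int = 20, threshold: float = 0.07
) -> List[Optional[int]]:
    """Label day t as 1 if the lowest close over the next `horizon` days
    is more than `threshold` below closes[t], else 0.

    Days whose forward window is incomplete get None, because the outcome
    is not known yet. (Assumed crash definition, for illustration only.)
    """
    n = len(closes)
    labels: List[Optional[int]] = []
    for t in range(n):
        if t + horizon >= n:
            # Fewer than `horizon` future closes exist: no label yet.
            labels.append(None)
            continue
        future_min = min(closes[t + 1 : t + horizon + 1])
        drop = 1.0 - future_min / closes[t]
        labels.append(1 if drop > threshold else 0)
    return labels


# Toy example with a 2-day horizon instead of 20.
example = crash_labels([100.0, 98.0, 92.0, 95.0, 96.0], horizon=2)
print(example)
```

In the toy example only the first day is labeled as a crash, and the last two days have no label because their 2-day forward window has not finished.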
Each day, the code collects the data of the previous trading day and retrains the model. The prediction probability graph is as follows.
When run on February 24, 2025, the system showed a warning signal dated February 3, 2025, which means: there will be a >6% drop within the 20 trading days after Feb 3.
When run on February 26, 2025, the system showed a warning signal dated February 5, 2025, which means: there will be a >6% drop within the 20 trading days after Feb 5.
When run on February 28, 2025, the system showed a warning signal dated February 6, 2025, which means: there will be a >6% drop within the 20 trading days after Feb 6.
One possible explanation is that each time new price data arrives, the model is retrained and re-scores past dates, so its historical predictions change to stay self-consistent with the test set.
Problem: Because the model moves the date of a historical signal after new price data arrives, traders cannot rely on the system.
In other words, the model is explaining the past with new data rather than making predictions.
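To make the distinction concrete, here is a rough sketch of keeping what the system said on each day separate from what a retrained model later says about that day. This is only an illustration under my own assumptions, not something the system currently does: the class `SignalLog`, the 0.5 signal threshold, and the probabilities in the usage lines are all hypothetical.

```python
from typing import Dict, List


class SignalLog:
    """Append-only record of the probability published on each date.

    Hypothetical helper: a date's entry is written once, on the day the
    prediction is made, and later retrains cannot revise it.
    """

    def __init__(self) -> None:
        self._probs: Dict[str, float] = {}

    def record(self, as_of: str, prob: float) -> bool:
        """Store the prediction for `as_of` (ISO date string).

        Returns False and keeps the original value if the date
        already has an entry.
        """
        if as_of in self._probs:
            return False
        self._probs[as_of] = prob
        return True

    def signals(self, threshold: float = 0.5) -> List[str]:
        """Dates whose originally published probability met the threshold."""
        return [d for d, p in sorted(self._probs.items()) if p >= threshold]


# Made-up numbers, for illustration only.
log = SignalLog()
first = log.record("2025-02-03", 0.31)   # published on Feb 3
# A model retrained on Feb 24 re-scores Feb 3 much higher, but the
# log keeps the value that was actually available on Feb 3.
second = log.record("2025-02-03", 0.82)
print(first, second, log.signals())
```

With a log like this, a chart of past signals shows only values that were available in real time, so a signal date cannot move after the fact.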
For example: On February 24, 2025, the trader sees a warning signal dated February 3, 2025. By then the decline the system warned about has already happened and the downward momentum has been released, so the trader opens a long position.
Then, within a week, the model keeps changing the date of the warning signal, which confuses the trader.
How can I solve this problem?
Thanks
2025.2.24 crash prediction - signal date: 2025.2.3
2025.2.26 crash prediction - signal date: 2025.2.5
2025.2.28 crash prediction - signal date: 2025.2.6