Next day's deviation price

Red_Red · March 29, 2025, 2:09pm

Hi,
The code to predict the next day’s up and down deviation is missing. The prediction should be available on the current day: for example, running the model today, on 1st April, should produce a prediction for 2nd April, and so on. Which function should I use for this? Thanks.

Course Name: Trading with Machine Learning: Regression, Section No: 7, Unit No: 7, Unit type: Notebook

Ajay_Pawar · April 2, 2025, 2:55am

Hi RR,

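The following code shows how to predict from the last row of the data. It is only a demonstration model: it forecasts next month's average gold price, but the same last-row approach applies to a next-day target: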


from datetime import datetime

import numpy as np
import pandas as pd
import yfinance as yf
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Define date range
start_date = '2000-01-01'
end_date = datetime.now().strftime('%Y-%m-%d')

# Download data
print(f"Downloading data from {start_date} to {end_date}")
nifty = yf.download('^NSEI', start=start_date, end=end_date)
gold = yf.download('GLD', start=start_date, end=end_date)

# Handle MultiIndex columns if they exist
if isinstance(nifty.columns, pd.MultiIndex):
    nifty.columns = nifty.columns.droplevel(1)
if isinstance(gold.columns, pd.MultiIndex):
    gold.columns = gold.columns.droplevel(1)

# Calculate monthly average prices
# ('ME' is the month-end alias in pandas 2.2+; use 'M' on older versions)
nifty_monthly_avg = nifty['Close'].resample('ME').mean()
gold_monthly_avg = gold['Close'].resample('ME').mean()

# Combine data and drop any rows with missing values
data = pd.DataFrame({
    'nifty': nifty_monthly_avg,
    'gold': gold_monthly_avg
}).dropna()

# Create lag features and target variable
data['nifty_lag1'] = data['nifty'].shift(1)
data['gold_lag1'] = data['gold'].shift(1)
data['future_gold_price'] = data['gold'].shift(-1)

# Save the last row for prediction: its target is NaN (next month's price is
# not known yet), so it is dropped from training but is the row to forecast from
last_row = data[['nifty', 'nifty_lag1', 'gold_lag1', 'gold']].iloc[-1]
last_date = data.index[-1]

# Remove rows with NaN values after creating features
data_clean = data.dropna()

# Split the data into features and target
X = data_clean[['nifty', 'nifty_lag1', 'gold_lag1', 'gold']]
y = data_clean['future_gold_price']

# Split into training and testing sets (80% train, 20% test)
train_size = int(len(data_clean) * 0.8)
X_train, X_test = X.iloc[:train_size], X.iloc[train_size:]
y_train, y_test = y.iloc[:train_size], y.iloc[train_size:]

# Create pipeline with scaling and linear regression
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('linear', LinearRegression())
])

# Define parameters for grid search
parameters = {'linear__fit_intercept': [True, False]}

# Use TimeSeriesSplit for cross-validation
tscv = TimeSeriesSplit(n_splits=5)

# Perform grid search with cross-validation
model = GridSearchCV(
    pipeline, 
    parameters,
    scoring='neg_mean_squared_error', 
    cv=tscv
)
model.fit(X_train, y_train)

# Get best parameters
best_params = model.best_params_
print("Best parameters:", best_params)

# Reuse the best pipeline from the grid search (refit on X_train by default),
# so the test metrics and the final prediction come from the same scaled model
final_model = model.best_estimator_

# Make predictions on the test set
y_pred = final_model.predict(X_test)

# Calculate RMSE for the test set
rmse = np.sqrt(np.mean((y_test - y_pred) ** 2))
print(f"Test RMSE: {rmse:.2f}")

# Compare predicted vs actual values
comparison = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
print(comparison.tail())  # Show the last few predictions vs actual values

# === Final Prediction for Next Month ===
# Prepare input for the final prediction (only last row)
last_row_df = pd.DataFrame([last_row])

# Predict the future gold price
predicted_next_gold_price = final_model.predict(last_row_df)[0]

# Compute the prediction month (the month end following the last date)
prediction_date = last_date + pd.offsets.MonthEnd(1)

# Display result
print(f"\nPredicted average gold (GLD) price for {prediction_date.strftime('%Y-%m')}: ${predicted_next_gold_price:.2f}")
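
The same last-row pattern can be adapted to daily data for the up and down deviation. The sketch below is only an illustration under stated assumptions: it defines the up deviation as High - Open and the down deviation as Open - Low, uses hypothetical features (today's deviations and their 3-day averages), and runs on synthetic OHLC data in place of a download. None of these choices are taken from the course notebook.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def predict_next_day_deviations(ohlc: pd.DataFrame) -> dict:
    """Fit on rows with a known next-day target, then predict from the last row."""
    df = pd.DataFrame(index=ohlc.index)
    # Assumed definitions (not from the course notebook)
    df['up_dev'] = ohlc['High'] - ohlc['Open']
    df['down_dev'] = ohlc['Open'] - ohlc['Low']
    # Hypothetical features: today's deviations and their 3-day averages
    df['up_dev_ma3'] = df['up_dev'].rolling(3).mean()
    df['down_dev_ma3'] = df['down_dev'].rolling(3).mean()
    features = ['up_dev', 'down_dev', 'up_dev_ma3', 'down_dev_ma3']

    # Targets: tomorrow's deviations, aligned to today's row
    df['next_up_dev'] = df['up_dev'].shift(-1)
    df['next_down_dev'] = df['down_dev'].shift(-1)

    # The last row has features but no target yet: predict from this row
    latest = df[features].iloc[[-1]]
    train = df.dropna()

    predictions = {}
    for target in ['next_up_dev', 'next_down_dev']:
        model = LinearRegression().fit(train[features], train[target])
        predictions[target] = float(model.predict(latest)[0])
    return predictions


# Synthetic OHLC data so the sketch runs without a download
rng = np.random.default_rng(0)
dates = pd.bdate_range('2025-01-01', periods=60)
open_ = 100 + rng.normal(0, 1, len(dates)).cumsum()
ohlc = pd.DataFrame({
    'Open': open_,
    'High': open_ + rng.uniform(0.1, 2.0, len(dates)),
    'Low': open_ - rng.uniform(0.1, 2.0, len(dates)),
}, index=dates)

preds = predict_next_day_deviations(ohlc)
# Next business day after the last available row (ignores exchange holidays)
next_day = ohlc.index[-1] + pd.offsets.BDay(1)
print(f"Prediction for {next_day.date()}: {preds}")
```

With real data, `ohlc` would be the daily DataFrame up to and including today, so running this on 1st April gives the forecast for the next trading day.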

Thanks,
AJ

Red_Red · April 2, 2025, 4:06am

Can I have another person answer this? Thank you.