Time series indicators

Is there a library or a lesson that covers rolling calculations on time series data? I need to be able to go from, call it, 1min data and get a rolling RSI, or any calculation for that matter. So let’s say I want a rolling 30min RSI: I want row 36 to be calculated from rows 6 to 36, then row 37 from rows 7 to 37. But also a 60min or 240min, and have it be rolling. This way my aggregate study of the raw data will match a live indicator on a chart.

I am having trouble finding the engineering logic. Everything I find runs from one fixed time to another, and I am looking to take aggregate data, say in seconds, and put a rolling calculation on any given second, side by side with it. That way, if I create an instance, let’s just say from a daily pivot, call it R1 = Close, I can look and see what the 30min RSI and 60min RSI are doing in that second, within that row where R1 = Close… which can happen on any given second. Everything seems to be: download a timeframe and calculate on that timeframe’s last bar. I am working from 1sec OHLC data and want to keep every calculation relative to any given second of a row.

OHLC data in seconds …

Hi Philip Hoy,

Here’s a simple example that might help clarify the difference between rolling and resampling approaches when working with 1-minute OHLC data.

You can adjust the window size — for instance, set window=30 to get a 30-minute rolling SMA. I’ve also included a resampling approach to compare.

Try this:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime, timedelta

# Create sample 1-minute OHLC data
np.random.seed(42)
start_time = datetime(2023, 1, 1, 9, 30)
periods = 480  # 8 hours of 1-min data

# Generate timestamps
timestamps = [start_time + timedelta(minutes=i) for i in range(periods)]

# Generate price data
close_prices = np.random.normal(0, 1, periods).cumsum() + 1000  # random walk around 1000
high_prices = close_prices + np.random.uniform(0, 1, periods)
low_prices = close_prices - np.random.uniform(0, 1, periods)
open_prices = close_prices - np.random.uniform(-0.5, 0.5, periods)

# Create DataFrame
df_1min = pd.DataFrame({
    'timestamp': timestamps,
    'open': open_prices,
    'high': high_prices,
    'low': low_prices,
    'close': close_prices
})
df_1min.set_index('timestamp', inplace=True)

# Function to calculate SMA
def calculate_sma(data, window=14):
    return data.rolling(window=window).mean()

# Rolling SMA on 1-min data
df_1min['sma_1min'] = calculate_sma(df_1min['close'], window=14)

# Resample to different timeframes
timeframes = ['5min', '15min', '30min']

# Resample, compute SMA, and map back to 1-min with forward fill
for tf in timeframes:
    df_tf = df_1min.resample(tf).agg({
        'open': 'first',
        'high': 'max',
        'low': 'min',
        'close': 'last'
    })
    df_tf[f'sma_{tf}'] = calculate_sma(df_tf['close'], window=14)
    df_1min[f'sma_{tf}'] = df_tf[f'sma_{tf}'].reindex(df_1min.index, method='ffill')
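
If you print the tail of the combined frame, you can see the two behaviors side by side: sma_1min changes every row, while the resampled columns (sma_5min, sma_15min, sma_30min) hold one value per bar until the next bar closes:

print(df_1min[['close', 'sma_1min', 'sma_5min', 'sma_15min', 'sma_30min']].tail(10))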

It’ll be more useful if you take what’s here and apply the same logic to RSI:

  • Implement the same approach for RSI using 30min, 60min, or 240min windows.
  • Visualize it — compare rolling vs resampled if needed.
  • Try aligning it by second or minute as you originally wanted.

Learn it by doing. That’s the best way to get comfortable with the engineering logic you’re after.

Thanks,
Ajay

This is not the method I need. It gives repeating values per time increment: when you resample back onto the original data, the 5min SMA has the same value for 5 rows, the 15min SMA has the same value for 15 rows, and the rolling is relative to the bar boundaries rather than to any given second. That would not mimic a real live setting of loading a 5min SMA and a 15min SMA onto a visual chart, where every tick, or call it every second, both of these calculations would change and update relative to the instance and relative to the timeframe of the SMA.

So let’s say you download this as second data. At any given second, call it second 5,553 or row 5,553, both the 5min SMA and the 15min SMA would have freshly updated values, not values that repeat for the length of the calculation. This is what I am trying to accomplish: studying, second by second, any given instance against any calculation. So say I put an RSI on a 30min chart and a 60min chart and was able to download the data in seconds. Every second you would have a 30min RSI and a 60min RSI updating per second, relative to that calculation and that timeframe, in seconds. The method you have given is look-ahead bias relative to any given minute, because at minute one you have the same SMA value for the next 5 minutes, when in real time that value would be adjusting per minute. If this all makes sense.

The method you’re giving is essentially end-of-bar logic. I need real time at any given second, like loading an indicator onto a chart: it updates per tick. I need to mimic this from aggregate data in seconds, for any calculation and any timeframe.

| Asset | Date.Time | ENTRY | Computed Outcome | X240.STO.ANGLE | X240.STO.DIFF |
|--------|---------------------|--------|------------------|----------------|----------------|
| GBPUSD | 2019-01-04 15:31:40 | 1.2686 | 0.0055 | 84.99 | -4.85 |
| GBPUSD | 2019-01-09 14:37:44 | 1.2767 | -0.0035 | 87.88 | 17.96 |
| GBPUSD | 2019-01-10 17:37:20 | 1.2751 | 0.0129 | -87.53 | -18.63 |
| GBPUSD | 2019-01-11 10:55:28 | 1.2832 | 0.0056 | 89.01 | 33.85 |
| GBPUSD | 2019-01-11 09:11:22 | 1.2722 | 0.0129 | -88.39 | -14.64 |
| GBPUSD | 2019-01-14 11:57:12 | 1.2876 | -0.0035 | 82.02 | 0.26 |
| GBPUSD | 2019-01-15 15:22:48 | 1.281 | 0.0098 | -88.59 | -29.67 |
| GBPUSD | 2019-01-15 03:14:46 | 1.2902 | 0.0074 | 85.14 | 5.47 |
| GBPUSD | 2019-01-16 15:05:36 | 1.2889 | 0.0043 | 69.52 | 6.26 |
| GBPUSD | 2019-01-17 11:29:36 | 1.2887 | -0.0035 | 73.54 | 0.94 |
| GBPUSD | 2019-01-17 08:20:44 | 1.2838 | 0.0086 | -89.02 | -31.65 |
| GBPUSD | 2019-01-18 15:31:34 | 1.2894 | -0.0035 | -88.84 | -40.74 |
| GBPUSD | 2019-01-22 17:33:28 | 1.2968 | 0.0011 | 87.26 | 17.67 |
| GBPUSD | 2019-01-23 11:25:32 | 1.3021 | -0.0035 | 86.67 | 6.73 |
| GBPUSD | 2019-01-25 00:18:42 | 1.3114 | 0.0052 | 88.82 | 22.84 |

I can do this in Sierra Chart with an alert condition. I can set an alert condition and it will populate a spreadsheet with whatever the RSI, or any other calculation, was at that instance, so I can see the value relative to it. But I am building ML models from 1sec data relative to these calculations, and this method of getting data from SC is very time consuming, as I have to do it from a replay. The download is always end-of-bar data. The data above is from a signal as well. There is a way to download the values the same way per second, but it also has to be done from a replay, and it can take me 20 days to get 5 years of data. However, getting the OHLC in seconds takes about 20 minutes into a CSV file. Now I have to figure out how to compute the calculations so they mimic real-time updates per second.

The implementation approach involves using pandas’ rolling function with a window size equal to your timeframe in seconds, but applying it to your base second-level data without any resampling. For a 30-minute indicator, you’d use a rolling window of 1,800 seconds applied directly to your second-by-second close prices. The key is ensuring the rolling calculation function receives the raw price array for that window and computes the indicator from scratch each time.
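
Here’s a minimal sketch of that idea, assuming a DataFrame named df_sec with a 'close' column indexed by second; the name df_sec, and the simple gain/loss RSI (rather than Wilder’s smoothed version), are illustrative assumptions, not a fixed recipe:

import numpy as np

def rsi_from_window(prices):
    # prices is the raw numpy array for one rolling window
    deltas = np.diff(prices)
    gains = deltas[deltas > 0].sum()
    losses = -deltas[deltas < 0].sum()
    if losses == 0:
        return 100.0
    rs = gains / losses
    return 100.0 - 100.0 / (1.0 + rs)

# 30min and 60min indicators as rolling windows in seconds,
# recomputed from scratch at every second (no resampling, no repeated values)
df_sec['rsi_30min'] = df_sec['close'].rolling(window=1800).apply(rsi_from_window, raw=True)
df_sec['rsi_60min'] = df_sec['close'].rolling(window=3600).apply(rsi_from_window, raw=True)

A Python-level apply like this gets slow over years of second data, which leads to the next point.
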
For performance with large datasets spanning years, you’ll want to vectorize these operations or use numba to compile the calculation functions. The pattern is to iterate through your dataframe, and for each row, extract the appropriate window of historical data ending at that second, then calculate your indicator using only that window’s data.
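
As a hedged sketch of that loop pattern with numba, assuming the same second-level close array (the rolling_rsi name and, again, the simple gain/loss RSI are illustrative):

import numpy as np
from numba import njit

@njit(cache=True)
def rolling_rsi(close, window):
    # close: float64 array of second-level closes, window: timeframe in seconds
    n = close.shape[0]
    out = np.full(n, np.nan)
    for i in range(window - 1, n):
        # the window of closes ending at and including row i
        w = close[i - window + 1:i + 1]
        gains = 0.0
        losses = 0.0
        for j in range(1, window):
            d = w[j] - w[j - 1]
            if d > 0.0:
                gains += d
            else:
                losses -= d
        if losses == 0.0:
            out[i] = 100.0
        else:
            out[i] = 100.0 - 100.0 / (1.0 + gains / losses)
    return out

# e.g. a 30min RSI on 1sec closes, updating at every second:
# df_sec['rsi_30min'] = rolling_rsi(df_sec['close'].to_numpy(dtype=np.float64), 1800)

An incremental version that carries running gain and loss totals forward would be faster still, but the explicit window extraction above matches the pattern described here.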

This approach will give you indicators that update every second with their own values, closely matching what you’d see on live charts and eliminating the look-ahead bias you’re concerned about. Each second will have its own calculated value based only on the data available at that exact moment, which is exactly what your ML models need to train on realistic market conditions.

But please note that if the data extraction is causing delays or time issues, you might want to consider streaming the data over websockets instead of downloading it into CSVs, and running the calculations on the stream.