Stratifying in train_test_split and time-series data

In Section 3, Unit 1 of the decision tree models course in the ML track, it is said that we should use the stratify parameter of the 'train_test_split' method to preserve the ratio of label classes in the train set after splitting.



Doesn't this violate the temporal nature and order of time-series data? I also think it will lead to look-ahead bias.

Hello Mohammad,



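Thanks for pointing that out. In fact, it is not just the stratify parameter: even plain train_test_split with its default settings will lead to look-ahead bias, because it shuffles the rows before splitting (shuffle=True is the default). As a result, an observation from the end of the series can end up in the training set while one from the beginning ends up in the test set. Note also that stratify can only be used together with shuffling, so it cannot be combined with an order-preserving split. We'll rectify this and let you know. Thanks.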
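For reference, here is a minimal sketch of an order-preserving split, assuming the rows are already sorted by time. The toy arrays X and y are made up for illustration and are not from the course material. Passing shuffle=False to train_test_split and using TimeSeriesSplit are standard scikit-learn options, but this is only one possible way to address the issue.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit, train_test_split

# Hypothetical toy data: 10 observations, assumed to be sorted by time.
X = np.arange(10).reshape(-1, 1)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# shuffle=False keeps the original order: the earliest 80% of rows go to
# training and the latest 20% to testing. stratify must be left as None here.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Every training observation precedes every test observation.
assert X_train.max() < X_test.min()

# For cross-validation, TimeSeriesSplit yields expanding-window folds in
# which each test fold comes strictly after its training fold.
tscv = TimeSeriesSplit(n_splits=3)
folds = [(train_idx, test_idx) for train_idx, test_idx in tscv.split(X)]
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

With an order-preserving split the class ratio in the train set is whatever the earlier period happens to contain, so it is worth checking the label distribution in each part separately.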

 

Thanks.