In Section 3, Unit 1 of the decision tree models course in the ML track, it is said that we should use the stratify parameter of the 'train_test_split' method in order to preserve the ratio of label classes in the train set after splitting.
Doesn't this violate the temporal order of time-series data? I also think it will lead to look-ahead bias.
Hello Mohammad,
Thanks for pointing that out. In fact, it is not just the stratify parameter: even using the train_test_split method itself will lead to look-ahead bias, because by default it samples indices randomly, so an index from the end of the series may end up in training and an index from the beginning in testing. We'll rectify this and let you know. Thanks.
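In the meantime, here is a minimal sketch of a split that keeps the time order, assuming the rows are already sorted by date. The helper name chronological_split and the toy data are made up for illustration and are not taken from the course material.

```python
def chronological_split(rows, test_size=0.2):
    """Split time-ordered rows so that every test row comes after every train row.

    Hypothetical helper for illustration; assumes `rows` is sorted by time.
    """
    if not 0 < test_size < 1:
        raise ValueError("test_size must be between 0 and 1")
    # Number of rows held out at the end of the series (at least one)
    n_test = max(1, round(len(rows) * test_size))
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]


# Toy data: ten daily observations, index 0 is the oldest
days = list(range(10))
train, test = chronological_split(days, test_size=0.2)
print("train:", train)
print("test: ", test)
```

With scikit-learn, passing shuffle=False to train_test_split should give the same kind of ordered split. Note that stratify cannot be combined with shuffle=False, so the class ratio in each part is whatever the data happens to contain. For cross-validation, TimeSeriesSplit from sklearn.model_selection is built around the same idea.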
Thanks.