Linear Regression

Course Name: EPAT-Introduction to Machine Learning for Trading, Section No: 4, Linear Regression

In the Linear Regression model, in the second point, the independent variables are split into training and testing datasets.

1) Why has the instructor used the values [-20] and [20]?

2) What is meant by [:-20] and [-20:]?



The code you've shared is used for splitting a dataset into two parts for training and testing a machine learning model. Let's break down what each line of the code is doing:
 
diabetes_X_train = diabetes_X[:-20]


This line creates a new variable, diabetes_X_train, from a slice of the diabetes_X array. The slice [:-20] includes every element of diabetes_X except the last 20. In other words, it takes the first n-20 elements, where n is the total number of elements in diabetes_X. diabetes_X_train will be used to train the machine learning model.

diabetes_X_test = diabetes_X[-20:]



This line creates another variable, diabetes_X_test, from a slice of the diabetes_X array. The slice [-20:] includes only the last 20 elements of diabetes_X.



diabetes_X_test will be used to test the performance of the machine learning model that was trained on diabetes_X_train. The value 20 is used for illustration only. You can use any other value to split the data into train and test sets, as long as it is less than n (the number of elements) and keeps the two sets balanced: neither set should end up with too few or too many records.
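The split described above can be sketched with a plain Python list standing in for diabetes_X. The dummy data and the names data, test_size, train and test are assumptions made for illustration; the course code applies the same slices to the real dataset.

```python
# Stand-in for diabetes_X: 100 dummy records (hypothetical data).
data = list(range(100))
test_size = 20  # number of records held out for testing

train = data[:-test_size]  # everything except the last 20 records
test = data[-test_size:]   # only the last 20 records

print(len(train), len(test))  # 80 20
print(train[-1], test[0])     # 79 80
```

The two slices do not overlap and together cover the whole dataset. One caveat with this pattern: a test_size of 0 does not give an empty test set, because -0 equals 0, so data[:-0] is an empty list and data[-0:] is the full list.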



Slicing



Slicing in Python is a powerful feature that allows you to extract a specific subset of elements from a sequence such as a list, tuple or string. The general syntax for slicing is:

 

sequence[start:stop:step]

where sequence is the sequence you want to slice, start is the index of the first element to include in the slice, stop is the index of the first element to exclude from the slice, and step is the stride of the slice (i.e., the amount by which the index increases between included elements, so a step of 2 takes every second element). All three are optional, and a negative start or stop counts from the end of the sequence.
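As a quick illustration of step and of negative indices, which are what [:-20] and [-20:] rely on, here is a minimal sketch using a hypothetical list named letters:

```python
letters = ["a", "b", "c", "d", "e", "f"]

print(letters[1:5:2])  # ['b', 'd']  indices 1 and 3
print(letters[-2:])    # ['e', 'f']  the last 2 elements
print(letters[:-2])    # ['a', 'b', 'c', 'd']  all but the last 2
print(letters[::-1])   # ['f', 'e', 'd', 'c', 'b', 'a']  reversed copy
```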

Here are some examples of slicing in Python:

# slicing a list
my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] 

print(my_list[2:7]) # [2, 3, 4, 5, 6]

print(my_list[:5]) # [0, 1, 2, 3, 4]