Section 4, Unit 12

Alex_Castro_3hZL1 · May 25, 2020, 8:06pm

Dear friends, for the regression trees on numerical features, how are we deciding when to split a feature?

Say X = [X_1 | X_2 | … | X_n]

and the DT decides to pick X_1 based on variance reduction, how is the rule X_1 <= a decided? In other words, how is a chosen?

Many thanks

Akshay_Nautiyal_2ycOh · May 26, 2020, 10:42am

Dear Alex,

Both the attribute to split on and the split point (a) are chosen by brute force. For each candidate split point of each attribute, the variance of the two resulting child nodes is calculated and compared with the variance of the parent node to get the variance reduction. The attribute and split point with the highest reduction among all candidates are then chosen.
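
To make the brute-force search concrete, here is a minimal Python sketch. It is only an illustration and not how any particular library implements it. The names `variance` and `best_split` are hypothetical. The choice of midpoints between consecutive distinct feature values as candidate thresholds is an assumption, and so is weighting each child's variance by its share of the samples.

```python
def variance(values):
    """Population variance of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(X, y):
    """Try every feature and every candidate threshold; return the
    (feature index, threshold a, variance reduction) with the largest reduction.

    X is a list of rows (one list of feature values per sample), y the targets.
    """
    n = len(y)
    parent_var = variance(y)
    best = (None, None, 0.0)
    for j in range(len(X[0])):
        # Assumed candidate thresholds: midpoints between consecutive
        # distinct values of feature j, so neither child is ever empty.
        distinct = sorted({row[j] for row in X})
        for lo, hi in zip(distinct, distinct[1:]):
            a = (lo + hi) / 2.0
            left = [t for row, t in zip(X, y) if row[j] <= a]
            right = [t for row, t in zip(X, y) if row[j] > a]
            # Child variances weighted by the fraction of samples in each child.
            weighted = (len(left) * variance(left)
                        + len(right) * variance(right)) / n
            reduction = parent_var - weighted
            if reduction > best[2]:
                best = (j, a, reduction)
    return best


# Toy data: the target depends only on the first feature.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y = [1.0, 1.0, 5.0, 5.0]
print(best_split(X, y))
```

On this toy data the search picks the first feature with a = 2.5, because that split separates the targets perfectly and removes all of the parent variance. Real implementations typically sort each feature once and update running sums instead of recomputing the variance for every threshold, but the selection rule is the same.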