Regression Based Features
Regression-based features make up the majority of features in algorithms that use predefined features. They come in various forms, such as the errors of polynomial regressions of various degrees, or the independence of the residuals of a polynomial regression from the cause.
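As an illustration of the first kind of feature, the following is a minimal sketch that computes polynomial regression errors in both directions for a pair of variables. The function names (`polyfit_error`, `regression_features`), the choice of degrees, and the standardization step are assumptions made for this example, not the definition used by any particular algorithm.

```python
import numpy as np


def polyfit_error(x, y, degree):
    """Mean squared residual of a polynomial fit of y on x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize both variables so errors are comparable across directions
    # (an assumption of this sketch).
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return float(np.mean(residuals ** 2))


def regression_features(x, y, degrees=(1, 2, 3)):
    """Fit errors for x -> y and y -> x at several polynomial degrees."""
    feats = {}
    for d in degrees:
        feats[f"err_xy_deg{d}"] = polyfit_error(x, y, d)
        feats[f"err_yx_deg{d}"] = polyfit_error(y, x, d)
    return feats


# Synthetic pair in which x is the cause of y (illustrative data only).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 500)
y = x ** 2 + 0.05 * rng.normal(size=500)
features = regression_features(x, y)
```

On a pair like this one, the fit of the effect on the cause should be much tighter than the reverse fit, and that kind of asymmetry is what a downstream classifier can exploit. A residual-independence feature would additionally require a dependence measure between the residuals and the regressor, which this sketch leaves out.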
Features of conditional distribution variability were introduced by Fonollosa. One of them, the standard deviation of the conditional distributions (CDS), achieves good performance even when used alone. The CDS score measures the spread of the conditional distributions after normalization of the bins:

$$\mathrm{CDS}_{X \to Y} = \sqrt{\frac{1}{|\mathcal{Y}|} \sum_{y \in \mathcal{Y}} \mathrm{var}_x\left(p_{\mathrm{norm}}(y \mid x)\right)}$$

where $p_{\mathrm{norm}}(y \mid x)$ represents the normalized conditional probability and $\mathrm{var}_x$ the sample variance over $x$.
This feature has proved very useful for causality detection, as it captures distribution asymmetry; used on its own, it typically yields a score of 0.69 on the Tübingen dataset.
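The CDS score described above can be approximated with a short sketch. This is a simplified illustration, not Fonollosa's implementation. It assumes equal-frequency bins on the conditioning variable, treats "normalization" as centring each conditional sample on its own mean, and takes the score to be the square root of the per-bin variance averaged over bins. The function name `cds_score` and the parameters `n_bins` and `min_count` are hypothetical.

```python
import numpy as np


def cds_score(x, y, n_bins=10, min_count=12):
    """Simplified CDS-style score for the direction x -> y (illustrative only)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    y = (y - y.mean()) / y.std()  # put y on a common scale

    # Equal-frequency bins on the conditioning variable x (assumption).
    x_edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    x_bin = np.clip(np.searchsorted(x_edges, x, side="right") - 1, 0, n_bins - 1)

    # Shared bin edges for the centred y values; outliers are clipped to +/-3.
    y_edges = np.linspace(-3, 3, n_bins + 1)

    cond = []
    for b in range(n_bins):
        yb = y[x_bin == b]
        if yb.size < min_count:
            continue  # skip bins with too few samples for a stable histogram
        yb = yb - yb.mean()  # centre each conditional distribution
        hist, _ = np.histogram(np.clip(yb, -3, 3), bins=y_edges)
        cond.append(hist / hist.sum())

    if len(cond) < 2:
        return 0.0
    cond = np.array(cond)  # rows: x bins, columns: y bins
    # Variance across x bins for each y bin, averaged over y bins, then sqrt.
    return float(np.sqrt(np.mean(np.var(cond, axis=0))))


# Synthetic pair in which x is the cause of y (illustrative data only).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 2000)
y = x ** 3 + 0.1 * rng.normal(size=2000)
score_xy = cds_score(x, y)
score_yx = cds_score(y, x)
```

The two scores would typically be passed together to a classifier, or compared directly, with the lower variability suggesting the more plausible causal direction. Centring by the conditional mean is one plausible reading of the normalization step; other choices, such as also rescaling each conditional sample, would change the values.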