Independence tests are always included in the algorithms, either to avoid testing for causal relationships when the variables are independent, or to improve prediction accuracy, since one class in the challenge was dedicated to independent pairs.
The independence test statistics used are mainly of two types: correlation-based and kernel-based tests. First, the correlation-based tests include well-known statistics such as Pearson's correlation and Spearman's correlation, as well as tests based on mutual information. The challenges contained all types of data, including continuous data. To compute mutual information from the empirical distributions, the algorithms binned their continuous variables before computing mutual-information-based features. Here we consider U and V, obtained by binning X and Y. Examples of these features are:
- mutual information
- normalized mutual information
- adjusted mutual information.
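As an illustration of how these three features can be computed from binned variables, here is a minimal sketch in plain Python. It is not the code used by the challenge participants: the equal-width binning, the number of bins, the arithmetic-mean normalization, and all function names are assumptions chosen for the example. The adjusted variant subtracts the mutual information expected by chance under a random permutation model with fixed bin counts.

```python
import math
from collections import Counter


def bin_equal_width(values, n_bins=4):
    """Map continuous values to bin labels 0..n_bins-1 (equal-width bins, an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((x - lo) / width), n_bins - 1) for x in values]


def entropy(labels):
    """Empirical Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(u, v):
    """Empirical mutual information (in nats) between two label sequences."""
    n = len(u)
    cu, cv, cuv = Counter(u), Counter(v), Counter(zip(u, v))
    return sum((c / n) * math.log(c * n / (cu[a] * cv[b]))
               for (a, b), c in cuv.items())


def normalized_mutual_information(u, v):
    """MI divided by the mean of the two entropies (one of several possible normalizations)."""
    denom = (entropy(u) + entropy(v)) / 2
    return mutual_information(u, v) / denom if denom > 0 else 0.0


def expected_mutual_information(u, v):
    """Expected MI under random permutations with the bin counts held fixed."""
    n = len(u)
    lg = math.lgamma
    emi = 0.0
    for a in Counter(u).values():
        for b in Counter(v).values():
            for nij in range(max(1, a + b - n), min(a, b) + 1):
                # Log of the hypergeometric probability of seeing nij joint counts.
                log_p = (lg(a + 1) + lg(b + 1) + lg(n - a + 1) + lg(n - b + 1)
                         - lg(n + 1) - lg(nij + 1) - lg(a - nij + 1)
                         - lg(b - nij + 1) - lg(n - a - b + nij + 1))
                emi += (nij / n) * math.log(n * nij / (a * b)) * math.exp(log_p)
    return emi


def adjusted_mutual_information(u, v):
    """(MI - E[MI]) / (mean entropy - E[MI]); returns 0.0 when the denominator vanishes."""
    emi = expected_mutual_information(u, v)
    denom = (entropy(u) + entropy(v)) / 2 - emi
    if abs(denom) < 1e-12:
        return 0.0  # convention of this sketch, not a standard definition
    return (mutual_information(u, v) - emi) / denom


if __name__ == "__main__":
    # Toy data: Y is a noiseless quadratic function of X.
    X = [i / 10 for i in range(-20, 21)]
    Y = [x * x for x in X]
    U, V = bin_equal_width(X), bin_equal_width(Y)
    print(mutual_information(U, V))
    print(normalized_mutual_information(U, V))
    print(adjusted_mutual_information(U, V))
```

The resulting values depend on the binning: too few bins hide the dependence, while too many bins relative to the sample size inflate the raw mutual information, which is the effect the adjusted variant is meant to correct.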
Tags
Data Science