One primary limitation stems from the examples used to train the classifiers. As is widely known, the accuracy of a trained classifier depends on the quality of its training examples. In quite a few causal examples, however, the joint distributions exhibit typical features that give away the causality label, a phenomenon referred to as data leakage.
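One way to probe for this kind of leakage is to check whether a trivial summary of each pair, one unrelated to any causal mechanism, already predicts the label. The following is a minimal sketch in plain Python. The feature `unique_ratio`, the function names, the ±1 label encoding, and the toy data are all assumptions made for illustration; they are not part of the framework described here.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[Sequence[float], Sequence[float]]  # (x values, y values)


def unique_ratio(pair: Pair) -> float:
    """Hypothetical trivial feature: distinct values in x over distinct values in y."""
    x, y = pair
    return len(set(x)) / len(set(y))


def best_threshold_accuracy(
    pairs: List[Pair],
    labels: List[int],
    feature: Callable[[Pair], float],
) -> float:
    """Accuracy of the best single-threshold rule on one feature.

    Labels are assumed to be +1 (x causes y) or -1 (y causes x).
    A value close to 1.0 suggests that the feature leaks the label.
    """
    values = [feature(p) for p in pairs]
    n = len(labels)
    best = 0.0
    for t in sorted(set(values)):
        # Rule: predict +1 when the feature is <= t, otherwise -1.
        hits = sum((1 if v <= t else -1) == lab for v, lab in zip(values, labels))
        # The mirrored rule (predictions flipped) scores n - hits.
        best = max(best, hits / n, (n - hits) / n)
    return best


# Toy data (assumed): the cause always takes few distinct values.
toy_pairs: List[Pair] = [
    ([0, 0, 1, 1], [0.1, 0.2, 0.9, 1.1]),
    ([0.3, 0.5, 0.7, 0.9], [1, 1, 2, 2]),
    ([2, 2, 3, 3], [4.0, 4.1, 6.2, 6.3]),
    ([1.5, 2.5, 3.5, 4.5], [0, 0, 1, 1]),
]
toy_labels = [1, -1, 1, -1]
leak_score = best_threshold_accuracy(toy_pairs, toy_labels, unique_ratio)
```

On this toy set the threshold rule separates the two labels perfectly, which is the warning sign: a classifier trained on such data could score well without learning anything about causality.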

Another issue is the presence of biases in the classifiers' training set. For example, if causal pairs with one categorical variable and one numerical variable are always labelled categorical → numerical, the learning algorithm may learn features of the distributions that reflect this labelling convention rather than causality. Such biases hinder the generality of causality classifiers, since learning algorithms may exploit them and arrive at biased hypotheses.
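A bias of this kind can be detected before training by tabulating the labelled direction for each combination of variable types. The sketch below is only an illustration: the record layout `(type of A, type of B, label)`, the label strings `"A->B"` and `"B->A"`, the function names, and the 0.9 threshold are all assumed here.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Assumed layout: (type of variable A, type of variable B, label),
# where the label is "A->B" or "B->A".
Record = Tuple[str, str, str]


def direction_counts(records: List[Record]) -> Dict[Tuple[str, ...], Counter]:
    """Count causal directions per unordered combination of variable types.

    Directions are expressed in terms of types (e.g. "categorical->numerical"),
    so the counts do not depend on which variable was listed first.
    """
    counts: Dict[Tuple[str, ...], Counter] = defaultdict(Counter)
    for type_a, type_b, label in records:
        if label not in ("A->B", "B->A"):
            raise ValueError(f"unexpected label: {label!r}")
        cause, effect = (type_a, type_b) if label == "A->B" else (type_b, type_a)
        key = tuple(sorted((type_a, type_b)))
        counts[key][f"{cause}->{effect}"] += 1
    return dict(counts)


def biased_combinations(
    records: List[Record], min_share: float = 0.9
) -> Dict[Tuple[str, ...], Tuple[str, float]]:
    """Return mixed-type combinations where one direction holds >= min_share of labels."""
    flagged = {}
    for key, counter in direction_counts(records).items():
        if key[0] == key[1]:
            continue  # same-type pairs cannot be biased by type alone
        direction, top = counter.most_common(1)[0]
        share = top / sum(counter.values())
        if share >= min_share:
            flagged[key] = (direction, share)
    return flagged


# Toy training set (assumed): every mixed pair is labelled categorical -> numerical.
toy_records: List[Record] = [
    ("categorical", "numerical", "A->B"),
    ("numerical", "categorical", "B->A"),
    ("categorical", "numerical", "A->B"),
    ("numerical", "numerical", "A->B"),
]
flagged = biased_combinations(toy_records)
```

Any flagged combination indicates that a learner could predict the label from the variable types alone, so the training set would need rebalancing or additional examples in the opposite direction.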

University of Michigan - Ann Arbor

There are two primary limitations of the Mother Distribution framework, in which we cast the pairwise causal discovery problem as a supervised learning problem. They concern the quality and the quantity of the training data, respectively.

Limitations of the Framework

 Isabelle Guyon, Alexander Statnikov, Berna Bakir Batu (2019)
Cause Effect Pairs in Machine Learning (The Springer Series on Challenges in Machine Learning)
https://www.amazon.com/Effect-Machine-Learning-Springer-Challenges/dp/3030218090

Cause Effect Pairs in Machine Learning

Limitation of Training Data Quality

A second limitation of the data is its insufficient quantity. Whenever neural networks and deep learning are involved in the learning process, the number of examples also becomes essential.

Given the comparatively few variable pairs for which the causality label is known from prior knowledge, many authors rely on data augmentation, generating new artificial examples either from scratch or by perturbing the available examples. However, theoretical results require that causal classifiers be trained and evaluated on examples following the same Mother Distribution.
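The two augmentation routes mentioned above, generating pairs from scratch and perturbing existing ones, can be sketched as follows. This is an illustrative sketch only: the cubic mechanism, the noise levels, the ±1 label encoding, and all function names are assumptions, and swapping the two variables to flip the label is included as one further common perturbation.

```python
import random
from typing import List, Tuple

# (a, b, label): label +1 means a causes b, -1 means b causes a (assumed encoding).
Pair = Tuple[List[float], List[float], int]


def synthetic_pair(n: int, rng: random.Random) -> Pair:
    """Generate a pair from scratch with a known label (a causes b).

    The mechanism b = a**3 + noise is an arbitrary choice for illustration.
    """
    a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b = [v ** 3 + rng.gauss(0.0, 0.5) for v in a]
    return a, b, 1


def jitter(pair: Pair, scale: float, rng: random.Random) -> Pair:
    """Perturb an existing pair with small Gaussian noise, keeping its label."""
    a, b, label = pair
    return (
        [v + rng.gauss(0.0, scale) for v in a],
        [v + rng.gauss(0.0, scale) for v in b],
        label,
    )


def swap(pair: Pair) -> Pair:
    """Exchange the two variables and flip the label accordingly."""
    a, b, label = pair
    return list(b), list(a), -label


rng = random.Random(0)  # fixed seed so the sketch is reproducible
base = synthetic_pair(200, rng)
augmented = [base, jitter(base, 0.05, rng), swap(base)]
```

Note that examples produced this way follow the distribution of the chosen generator, which need not match the Mother Distribution of the pairs the classifier will later face.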

As in all machine learning problems, the simplest setting is the i.i.d. setting, in which training and test data are drawn from the same distribution. The same applies to the cause-effect pair problem: higher performance is attained when the training and test pairs are drawn from the same mother distribution.

Unfortunately, in many real-world applications, one does not know from which “mother distribution” a new incoming pair to be classified is drawn, and one does not have labeled examples of cause-effect pairs from the “mother distribution” of interest.
