1Cademy - Linguistic Features For Hate Speech Detection


Some approaches combine n-gram features with tokens enriched with part-of-speech (POS) information, but this did not significantly improve classifier performance.
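As a rough illustration of what POS-enriched n-grams might look like, the sketch below joins each token with its tag and counts unigrams and bigrams over both the plain and the enriched tokens. The input sentence, the tags, and the function names are assumptions made for this example; no particular tagger or published feature set is implied.

```python
from collections import Counter

# Hypothetical, hand-tagged input: (token, POS tag) pairs.
# In practice the tags would come from a POS tagger; none is assumed here.
tagged = [("you", "PRP"), ("are", "VBP"), ("a", "DT"), ("bozo", "NN")]


def pos_enriched_tokens(tagged_tokens):
    """Join each lowercased token with its POS tag, e.g. 'bozo_NN'."""
    return [f"{tok.lower()}_{tag}" for tok, tag in tagged_tokens]


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(tagged_tokens, max_n=2):
    """Count plain and POS-enriched n-grams for n = 1..max_n."""
    plain = [tok.lower() for tok, _ in tagged_tokens]
    enriched = pos_enriched_tokens(tagged_tokens)
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(ngrams(plain, n))
        features.update(ngrams(enriched, n))
    return features


features = ngram_features(tagged)
```

The resulting counter could be passed to any bag-of-features classifier; whether the enriched tokens help is an empirical question.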
Deep syntactic information (long-distance relations between words in a sentence) has also been used.
Dependency relationships in a sentence can be used as features, which increases the performance of classification tasks.
Dependency relations are selected manually or using Bayesian logistic regression.
Some researchers have used an offensive score, which is calculated from the number of times an offensive word occurs in a sentence or passage.
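A minimal sketch of a count-based offensive score is shown below. The word list and the function name are hypothetical and chosen only for illustration; the scoring formulas used in the literature may weight or normalize the counts differently.

```python
import re

# Hypothetical lexicon for illustration only; a real system would use a curated list.
OFFENSIVE_WORDS = {"bozo", "bozos", "idiot", "idiots"}


def offensive_score(text, lexicon=OFFENSIVE_WORDS):
    """Count how many tokens in the text appear in the offensive-word lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(1 for tok in tokens if tok in lexicon)
```

For example, `offensive_score("You bozos are idiots, you idiot!")` counts three lexicon hits, while a sentence with no listed words scores zero.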
In an approach called the Smokey system, the researcher uses specific linguistic and syntactic features to detect hate speech. In addition, praise rules and politeness rules are implemented to detect the co-occurrence of good words.
Examples of such features include imperative statements (e.g. "Get lost!", "Get a life!") and the co-occurrence of the pronoun "you" modified by a noun phrase (as in "you bozos").
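The two example rules above can be approximated with simple pattern checks, as in the sketch below. This is not Smokey's actual rule set: the word lists, function names, and the crude definition of an imperative (a listed verb at the start of a sentence that ends in "!") are all assumptions made for illustration.

```python
import re

# Hypothetical word lists for illustration; not taken from Smokey.
IMPERATIVE_STARTS = ("get", "go", "shut")
INSULT_NOUNS = {"bozo", "bozos", "idiot", "idiots"}


def _words(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def is_imperative(sentence):
    """Crude check: starts with a listed base-form verb and ends with '!'."""
    words = _words(sentence)
    return bool(words) and words[0] in IMPERATIVE_STARTS and sentence.strip().endswith("!")


def has_you_plus_noun(sentence):
    """Check for 'you' immediately followed by a listed noun, as in 'you bozos'."""
    words = _words(sentence)
    return any(a == "you" and b in INSULT_NOUNS for a, b in zip(words, words[1:]))


def rule_features(sentence):
    """Return one boolean feature per rule for a single sentence."""
    return {
        "imperative": is_imperative(sentence),
        "you_noun": has_you_plus_noun(sentence),
    }
```

A real system would rely on a parser rather than word lists to identify imperatives and noun phrases; the boolean features here only show how such rules can feed a classifier.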