Algorithms Against Abusive Online Comments

There’s a new way of fighting abusive online comments: researchers at the Georgia Institute of Technology have developed a method that gives internet communities an efficient, inexpensive way to moderate abusive content. The method is called the Bag of Communities (BoC).

BoC is a technique that uses large-scale, preexisting data from other internet communities to teach an algorithm to identify abusive behavior within a specific community. The team identified nine such communities: five rife with abusive comments (4chan among them) and four that are heavily moderated and therefore largely supportive and civil (MetaFilter among them).

Using the linguistic characteristics of these two types of communities (abusive and supportive), they built an algorithm that learns from those comments and predicts whether a new post is abusive or not.
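
The paper doesn’t ship code, but the core idea maps naturally onto a standard text-classification pipeline. Here’s a minimal sketch in Python with scikit-learn, assuming a hypothetical handful of comments labeled by the kind of community they came from; the TF-IDF features and logistic regression classifier are illustrative stand-ins, not the authors’ exact setup.

```python
# Bag-of-Communities-style sketch (illustrative, not the authors' code).
# Idea: train on comments labeled by the *type of community* they came from,
# then use that classifier to flag abusive-looking posts elsewhere.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical preexisting data: comments from abusive communities (label 1)
# and from heavily moderated, supportive communities (label 0).
comments = [
    "you are a worthless idiot",           # e.g. from an abusive board
    "get out of here, nobody wants you",
    "thanks for the thoughtful writeup",   # e.g. from a well-moderated forum
    "great point, I hadn't considered that",
]
labels = [1, 1, 0, 0]

# TF-IDF word and bigram counts stand in for the "linguistic characteristics"
# of each community type.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(comments, labels)

# Score a new post from the target community, never seen during training.
print(model.predict_proba(["nobody cares, you idiot"])[0][1])  # P(abusive)
```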

The team provides two versions of this algorithm:

A static model, which needs no training examples from the target community and achieves roughly 75% accuracy. It is meant for a new community that doesn’t yet have the resources to build its own automated detection of abusive content (both models are sketched below).

And a dynamic model, which learns from data that arrives in batches over time. It reaches 91.18% accuracy after seeing 100,000 human-moderated posts and is meant for established communities with a high volume of comments.
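
To make the static/dynamic distinction concrete, here’s a hedged sketch of the dynamic variant, again using scikit-learn rather than the authors’ actual implementation: an SGDClassifier whose partial_fit method folds in batches of human-moderated posts as they accumulate. The `update` helper, the batch data, and the feature settings are all hypothetical.

```python
# Sketch of a dynamic model (illustrative): keep updating the classifier as
# human moderators label posts in the target community over time.
# HashingVectorizer is stateless, so it works on streaming batches.
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2**18, alternate_sign=False)
clf = SGDClassifier(loss="log_loss")  # logistic loss -> probability estimates

def update(batch_comments, batch_labels):
    """Fold one batch of human-moderated posts into the model."""
    X = vectorizer.transform(batch_comments)
    clf.partial_fit(X, batch_labels, classes=np.array([0, 1]))

# Hypothetical stream of moderated batches from the target community.
update(["you absolute moron", "interesting analysis, thanks"], [1, 0])
update(["go away, loser", "could you share your sources?"], [1, 0])

# Probability that a fresh post is abusive.
print(clf.predict_proba(vectorizer.transform(["what a loser"]))[0][1])
```

In a real deployment the static model would simply ship pretrained and score posts as-is, while the dynamic version above would keep calling partial_fit on each new batch of moderator decisions.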

Both of these algorithms outperformed an in-domain model from a major internet community.
