GSoC 2017: A phrase-matching bias detector

Google Summer of Code 2017 with Red Hen Lab

In their own words:

The Distributed Little Red Hen Lab is a global laboratory and consortium for research into multimodal communication. Its main goal is theory. Its second goal is the development of new computational, statistical, and technical tools to assist research on multimodal communication.

Among the medley of fascinating explorations in multimodal communication, Red Hen specializes in communicative learning, analyzed in terms of communicative effects and communicative intent.

Our first bias detector

To get an idea of the final outcome of the project, I spent some time implementing a bare-bones Python version of the newslint project, itself based on the JavaScript joblint project that checks for biased language in job descriptions. I considered a vocabulary of phrases with more focus on partisan vocabulary for news snippets, in order to pinpoint biased language as well as sensationalist pieces.

The current implementation merely checks for certain phrases in the input text to decide sensationalism or partisanship. I scraped these terms from various sources for this initial rough experiment. The sources for partisan language include this article from The Atlantic comparing Democrat and Republican language, and this paper on “What Drives Media Slant?” by Gentzkow and Shapiro. Some terms that could constitute sensational language were found in this document of Homeland Security watch-phrases.

Sample outputs

$ python detect_bias.py "Grenfell disaster victims murdered by political decisions"

NEWS BIAS DETECTOR

Sensationalism issues: 1
Partisanship issues: 0

* Sensationalist vocabulary is used

Mentioned: disaster


$ python detect_bias.py "If Republican efforts to repeal and replace Obamacare 
are successful, one of the biggest winners would be the wealthy. The Senate's 
bill -- released this week -- differs in key ways from the House-passed version. 
But proposals eliminate the taxes imposed on high-income Americans to help 
pay for an expansion of health benefits under the Affordable Care Act. The 
legislation also would let people contribute more to certain tax-advantaged accounts."

NEWS BIAS DETECTOR

Sensationalism issues: 0
Partisanship issues: 2

* Partisan vocabulary is used

Mentioned: Obamacare, Affordable Care Act

I am building on this pipeline to take the text and do a more intelligent analysis and return results in a similar manner, to try to detect various media story slants.