Google Summer of Code 2017 with Red Hen Lab
In their own words:
The Distributed Little Red Hen Lab is a global laboratory and consortium for research into multimodal communication. Its main goal is theory. Its second goal is the development of new computational, statistical, and technical tools to assist research on multimodal communication.
Among the medley of fascinating explorations in multimodal communication, Red Hen specializes in communicative learning, analyzed in terms of communicative effects and communicative intent.
What we’ve set out to achieve
This summer, I have been selected to participate in GSoC 2017 with Red Hen, mentored by Prof. Francis Steen and his student from last year’s GSoC program, Shruti. The challenge I am presented with is to try to model “interpretive frames” computationally.
A brief foray into the field of Communication Studies demystifies several terms:
With every passing day, we can see that the representation of our world that has been conjured up by the mainstream media is far removed from the facts. The media depiction of events may disregard the facts and rely on other means to achieve the desired communicative effects in the minds of the people, depending on the communicative intent of that media house. Framing, in social science, is how individuals and societies perceive and communicate about reality. Mass media, politics, religion, advertising, and so on, have a tremendous role to play in moulding public perception. For instance, a child is prone to saying “My toy broke” instead of “I broke my toy”, side-stepping responsibility using the same syntax as “Mistakes were made”. Similarly, the Bush administration carefully moved away from its initial response to the 9/11 attacks, which had called them an act of terror or a crime. The result was the war metaphor of the “war on terror”, which implies military action and war powers for the government rather than bringing criminals to justice.
Interpretive frames can be used to manipulate facts as per convenience. In the current political debates, opposing camps are often confident about their own interpretations of facts, or the dismissal of facts as irrelevant, or the emphasis of facts as crucial. We see consistency within frames and a marked difference across frames, entirely changing the language of the debate.
An interesting study is to construct a computational model of such media representations of the world, considering features from social discourse such as environment, crime, race, and so on. In other words, we generate interpretive frames that introduce selected media biases and predispose the system to look at the world in a certain way. It would be intriguing to see what contrasting frames on the same topic reveal about the media.
The beginning
The initial discussion about interpretive frames was narrowed down to Republican versus Democratic vocabulary. Media bias is no secret: it is common knowledge that Fox News leans conservative while MSNBC leans liberal.
Having been granted access to the vast NewsScape archive of annotated news videos, I spent some time identifying controversial topics such as immigration, tax reform, healthcare, etc.
In addition, I scoured online sources to find words and phrases that were commonly associated with either conservatives or liberals, that is, partisan vocabulary. I came across several intriguing resources that nearly sidetracked me from my actual goals. Here’s a small sampler:
- A project showing the different word associations based on political stance, called Partisan Thesaurus.
- An analysis of the abortion debate, and the loaded terminology used by various media houses.
The resulting list of words and phrases allowed me to establish the initial structure and pipeline for the project. The basic version merely compares input text against the list of phrases to detect partisan or subjective language.
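To make the idea of comparing text against a phrase list concrete, here is a minimal sketch in Python. It is not the project's actual code: the lexicon entries, their labels, and the function names `find_partisan_phrases` and `lean_counts` are all hypothetical placeholders chosen for illustration, and a real list would come from the sources described above.

```python
import re
from collections import Counter

# Hypothetical sample lexicon; the phrases and labels are illustrative
# placeholders, not the list compiled for the project.
PARTISAN_PHRASES = {
    "death tax": "conservative",
    "estate tax": "liberal",
    "illegal aliens": "conservative",
    "undocumented immigrants": "liberal",
}


def find_partisan_phrases(text, lexicon=PARTISAN_PHRASES):
    """Return (phrase, label, count) for every lexicon phrase found in text."""
    # Lowercase and collapse whitespace so matching ignores case and line breaks.
    normalized = " ".join(text.lower().split())
    hits = []
    for phrase, label in lexicon.items():
        # Word boundaries keep a phrase from matching inside a longer word.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        count = len(re.findall(pattern, normalized))
        if count:
            hits.append((phrase, label, count))
    return hits


def lean_counts(hits):
    """Sum the phrase counts per label."""
    totals = Counter()
    for _, label, count in hits:
        totals[label] += count
    return totals


if __name__ == "__main__":
    sample = "Critics of the death tax clashed with supporters of the estate tax."
    matches = find_partisan_phrases(sample)
    print(matches)
    print(lean_counts(matches))
```

A plain lookup like this only flags exact phrases. It cannot tell whether a term is being used, quoted, or criticized, which is one reason to treat it as a starting point and not as a finished detector.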