GSoC 2017: Finding sentence SVO and sentiment

Google Summer of Code 2017 with Red Hen Lab

In their own words:

The Distributed Little Red Hen Lab is a global laboratory and consortium for research into multimodal communication. Its main goal is theory. Its second goal is the development of new computational, statistical, and technical tools to assist research on multimodal communication.

Among the medley of fascinating explorations in multimodal communication, Red Hen specializes in communicative learning, analyzed in terms of communicative effects and communicative intent.

Our SVO detector

I have written code for subject-verb-object (SVO) detection for each sentence from a document, based on spaCy tutorials. I am trying to resolve issues with sentences that are in passive voice. Moreover, there is a need for pronoun resolution before the SVO information can be used effectively.

SpaCy is also useful to generate the entire parse tree of a given sentence. This gives details of part-of-speech and the various noun phrases of the sentences input.

Using this SVO and parse tree information along with the sentiment analysis code at sentence level, I produce an output file for every news article document input, where I get sentence-by-sentence information in the form of SVO (or only SV, as the case may be), noun phrases, and sentiment scores (compound, negative, neutral, positive).

The sentiment score of the sentence generally reflects that of the subject towards the object, or that of the speaker towards the topics in the sentence. Feeding these inputs to the classification model will help to reveal bias.

Sample outputs

$ python svo-senti.py "Trump defends son in Paris. The President's son has 
shown astonishingly poor judgement."

Trump defends son in Paris.
SVOs:  [(u'trump', u'defends', u'son')]
Noun phrases:  Trump,  son,  Paris, 
Polarity scores:  compound: 0.0  neg: 0.0  neu: 1.0  pos: 0.0 

The President's son has shown astonishingly poor judgement.
SVOs:  [(u'son', u'shown', u'judgement')]
Noun phrases:  The President's son,  astonishingly poor judgement, 
Polarity scores:  compound: -0.4767  neg: 0.307  neu: 0.693  pos: 0.0