counterfacto

small software tool to analyze twitter and highlight counterfactual statements
git clone git://parazyd.org/counterfacto.git

commit c197f14a541519e2e261e6ff6bebbfaf742181d0
parent 96a0a3e5c2992c9c5b687762972a385773f51061
Author: parazyd <parazyd@dyne.org>
Date:   Mon,  2 Jan 2017 20:07:15 +0100

move credentials to its own file, add makefile

Diffstat:
A .gitignore   |  2 ++
A Makefile     | 11 +++++++++++
M README.md    | 57 +++++++++++++++++++++++++++++++++++++++------------------
M counterfacto | 15 ++++++++-------
4 files changed, 60 insertions(+), 25 deletions(-)

diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,2 @@
+credentials
+twokenize.pyc
diff --git a/Makefile b/Makefile
@@ -0,0 +1,11 @@
+all:
+	@echo downloading python nltk tagger
+	@python2 -c "import nltk; nltk.download('averaged_perceptron_tagger')"
+	@echo
+	@echo please create a file called credentials with the following data:
+	@echo "oatoken = 'yourOAuthToken'"
+	@echo "oasecret = 'yourOAuthSecret'"
+	@echo "conskey = 'yourConsumerKey'"
+	@echo "conssecret = 'yourConsSecret'"
+	@echo
+	@echo you can get these by creating a twitter app
diff --git a/README.md b/README.md
@@ -3,17 +3,26 @@

 Counterfactual (noun)

-Definition: the tendency to create possible alternatives to life events that have already occurred; something that is contrary to what actually happened.
+Definition: the tendency to create possible alternatives to life events
+that have already occurred; something that is contrary to what actually
+happened.

-Effects: it starts off with disappointment, then one will be able to uncover insights or knowledge that can be used to enhance future performance, leading to a better outcome in life.
+Effects: it starts off with disappointment, then one will be able to
+uncover insights or knowledge that can be used to enhance future
+performance, leading to a better outcome in life.

 ----------------------------------------------------------------------------------

-Counterfacto is a small software tool that can analyse search results on twitter to highlight counterfactual statements on certain topics.
+Counterfacto is a small software tool that can analyse search results
+on twitter to highlight counterfactual statements on certain topics.

-This tool is used by PIEnews.eu researchers for sentiment analysis about poverty and other related topics, to understand actual stories elaborated as counterfactual.
+This tool is used by PIEnews.eu researchers for sentiment analysis
+about poverty and other related topics, to understand actual stories
+elaborated as counterfactual.

-We deem such a tool as a useful experiment, considering the importance of counterfactual analysis for political sentiment assessments and focus on news stories.
+We deem such a tool as a useful experiment, considering the importance
+of counterfactual analysis for political sentiment assessments and
+focus on news stories.

 ## Dependencies

@@ -23,28 +32,30 @@ Python is required along the following packages:
 python-twitter
 python-nltk
 ```
-Then run the `python` console in a terminal and type
+Your distro may have an outdated nltk (less than 3.2) without the
+perceptron module, in that case an update from `pip` is needed:

 ```
-import nltk
-nltk.download('averaged_perceptron_tagger')
+pip install nltk --upgrade
 ```

-This will download the nltk_data folder and place it in your `$HOME`.
+After installing the necessary python modules, run `make`, which will
+then download the needed data for nltk, and tell you how to use your
+twitter credentials in counterfacto

-Your distro may have an outdated nltk (less than 3.2) without the perceptron module, in that case an update from `pip` is needed:
+## Usage

 ```
-pip install nltk --upgrade
+usage: ./counterfacto [-a account] [-f tweetfile] [-s searchterm]
 ```

 ## References

-- Learning Representations for Counterfactual Inference (2016) http://jmlr.org/proceedings/papers/v48/johansson16.pdf
+- [Learning Representations for Counterfactual Inference (2016)](http://jmlr.org/proceedings/papers/v48/johansson16.pdf)

-- Bounding and Minimizing Counterfactual Error (2016) https://arxiv.org/abs/1606.03976
+- [Bounding and Minimizing Counterfactual Error (2016)](https://arxiv.org/abs/1606.03976)

-- "Counterfactuals in the Language of Social Media: A Natural Language Processing Project in Conjunction with the World Well Being Project" (2015) http://www.seas.upenn.edu/~cse400/CSE400_2015_2016/reports/report_15.pdf
+- [Counterfactuals in the Language of Social Media: A Natural Language Processing Project in Conjunction with the World Well Being Project (2015)](http://www.seas.upenn.edu/~cse400/CSE400_2015_2016/reports/report_15.pdf)

 ## Licensing

@@ -54,10 +65,20 @@ as part of the PIEnews project
 Software written by Ivan J. <parazyd@dyne.org>
 with contributions by Denis Roio <jaromil@dyne.org>

-This source code is free software; you can redistribute it and/or modify it under the terms of the GNU Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.
+This source code is free software; you can redistribute it and/or
+modify it under the terms of the GNU Public License as published by the
+Free Software Foundation; either version 3 of the License, or (at your
+option) any later version.

-This source code is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Please refer to the GNU Public License for more details.
+This source code is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Please refer to
+the GNU Public License for more details.

-You should have received a copy of the GNU Public License along with this source code; if not, write to: Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+You should have received a copy of the GNU Public License along with
+this source code; if not, write to: Free Software Foundation, Inc.,
+Mass Ave, Cambridge, MA 02139, USA.

-This project has received funding from the European Union’s Horizon 2020 Programme for research, technological development and demonstration under grant agreement nr. 687922
+This project has received funding from the European Union’s Horizon 2020
+Programme for research, technological development and demonstration under
+grant agreement nr. 687922
diff --git a/counterfacto b/counterfacto
@@ -35,11 +35,12 @@ global tweetfile
 global taggedFile
 taggedFile = 'tagged.txt'

-## get these by creating a twitter app
-oatoken = ''
-oasecret = ''
-conskey = ''
-conssecret = ''
+try:
+    with open('credentials') as fd:
+        exec(fd.read())
+except:
+    print('no credentials file found. please create it.')
+    exit(1)

 def main():

@@ -103,7 +104,7 @@ def main():
             classify(tweetfile)

     except:
-        print("usage: counterfacto [-a account] [-f tweetfile] [-s searchterm]")
+        print("usage: " + sys.argv[0] + " [-a account] [-f tweetfile] [-s searchterm]")
         exit(1)

 ## {{{ processing functions
@@ -300,6 +301,6 @@ def classify(tweetfile):
     tweetFile.close()
     tagFile.close()

-    print("counterfactuals: " + str(count))
+    print("counterfactuals: " + str(count) + "/100")

 main()
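
Note on the credentials loader: the commit reads the `credentials` file with `exec(fd.read())`, which runs whatever Python the file contains, and the bare `except:` reports every failure as a missing file. The following is a minimal sketch of an alternative, not part of this commit, that assumes the file holds only the four `name = 'value'` lines printed by the Makefile. The names `REQUIRED_KEYS`, `parse_credentials` and `load_credentials` are hypothetical and do not exist in counterfacto.

```python
import ast

# The four names the Makefile asks the user to define (taken from the commit).
REQUIRED_KEYS = ("oatoken", "oasecret", "conskey", "conssecret")


def parse_credentials(text):
    """Parse lines of the form  name = 'value'  without executing them."""
    creds = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError("line %d: expected name = 'value'" % lineno)
        try:
            # literal_eval accepts only Python literals, so no code can run.
            value = ast.literal_eval(value.strip())
        except (ValueError, SyntaxError):
            raise ValueError("line %d: value must be a quoted string" % lineno)
        if not isinstance(value, str):
            raise ValueError("line %d: value must be a quoted string" % lineno)
        creds[name.strip()] = value
    missing = [key for key in REQUIRED_KEYS if key not in creds]
    if missing:
        raise ValueError("missing credentials: " + ", ".join(missing))
    return creds


def load_credentials(path="credentials"):
    """Read and parse the credentials file (hypothetical helper)."""
    with open(path) as fd:
        return parse_credentials(fd.read())
```

With this approach the script would look values up as `creds["oatoken"]` instead of relying on module-level names created by `exec`, and a malformed file produces a specific error rather than the generic "no credentials file found" message.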