commonshare

Python scripts for simulating Commonfare data, calculating commonshare, calculating recommendations, and visualisation

Requirements:

Python 3.x, NetworkX 2.2, Louvain community detection, dateutil (and the random names generator if running the simulation). Install with the following commands:

pip install networkx==2.2
pip install scipy python-louvain python-dateutil names

Important Contents:

For detailed information on setting up, running and deploying commonshare, please read the documentation!
python/

parsegexf.py: Main class for parsing GEXF file, which then calls makegraphs.py to calculate commonshare and output JSON files.
config.py: Contains key constants used in the simulation. Values in here can be adjusted to determine how many users are generated, the number of actions per day, and how many days the simulation runs for. It now also contains constants to allow adjustment of collusion detection.
kcore.py: Contains adjusted core_number method from the 'core.py' file of NetworkX. Additional methods have been implemented to calculate the weighted, directed core number values at particular points in time. Also contains an implemented collusion detection algorithm.
makegraphs.py: Uses the methods in kcore.py to calculate Commonshare values for each node in the graph every two weeks. Outputs JSON files, described below.
pagerank.py: Contains an implementation of the 'Personalised PageRank' algorithm used in the story recommender (details below)

Classes for simulation (in the /simulation directory):

graphclasses.py: Base classes that represent entities in the simulation
listinggenerator.py: Generates listing names by picking an adjective and a noun from requisite dictionaries
phrases.py: Generates story 'names' in the simulation by picking four random words from a dictionary
simulation.py: Run 'python simulation.py' from the python/simulation directory to generate simulated data (this gets stored in data/input/simulateddata.gexf)

data/output/

graphdata/biweekly/...: Contains graph-based JSON files representing every two weeks of Commonfare interactions, with Commonshare values calculated for each node (1.json ... X.json) Also contains a cumulative graph-based JSON file of every interaction made in Commonfare since its initiation (0.json)
userdata/...: Contains a file for every user, named <USER_ID>.json, which represents their entire interaction history
recommenderdata.gexf: Contains a cleaned version of the original GEXF, used for generating story recommendations

Docker image

A very basic Docker image is available to run the python scripts parsegexf.py and pagerank.py, the methods of which are exposed through a simple web API, as described below.

Input and output data is exchanged through the files in ./data directory which is mounted as a volume.

Building

To build this image make sure you have Docker installed in your host. It that is the case you just run:

$ docker build -t commonfare/commonshare-python .

If you now check docker images available in your host machine you would notice one named commonfare/commonshare-python.

$ docker images
...
commonfare/commonshare-python         latest              323a3b42764f        30 minutes ago      297MB
...

Running

This Docker image runs the Flask app, which exposes a simple API for running the following two Python scripts:

parsegexf.py which takes as input a file in GEXF format and produces as output a series of files in ./data/output/ directory.
pagerank.py which takes as input a story id and a user id and calculates the recommended stories for such user based on the input story.

Parameters and environment variables
The following environment variables are used as parameters and can be set when calling the docker image:

TASK - can be either parse or pagerank depending on which task you want to be performed. Default: parse
GEXF_INPUT - is the gexf input file used which will be parsed when running the parse task. Default: ./data/input/latest.gexf
PAGERANK_FILE - is the input file used when calculating the recommendations through the pagerank task. Default: ./data/output/recommenderdata.gexf
STORY_ID - input story used for the pagerank
USER_ID - input user used for the pagerank

A few examples are provided in the sections below to better clarify how to use this docker image.

The following command will start the service, connecting port 5000 of the Docker container (Flask default) to port 5000 of your machine:

$ docker run -it --rm -p 5000:5000 -v "$PWD/data":/usr/src/app/data commonfare/commonshare-python

Specify a different input file via the GEXF_INPUT environment variable.

$ docker run -it --rm -p 5000:5000 -v "$PWD/data":/usr/src/app/data -e GEXF_INPUT=./data/input/input3.gexf commonfare/commonshare-python

Docker compose

If you like docker-compose, you can build and run using

$ docker-compose build
$ docker-compose up

Testing running status

To run parsegexf.py, use the following URL...

#This will return a simple JSON object {success: true} on successful completion (note this takes a few minutes)
http://127.0.0.1:5000/parse

...and to run pagerank.py...

#This will return a JSON array of three IDs corresponding to stories that the user specified by *userid* should be recommended on reading story *storyid* If the story or user ID cannot be found, [0,0,0] will be returned instead.
http://127.0.0.1:5000/recommend/*storyid*/*userid*

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

commonshare

Important Contents:

Docker image

Building

Running

Docker compose

Testing running status

Files

README.md

Latest commit

History

README.md

File metadata and controls

commonshare

Important Contents:

Docker image

Building

Running

Docker compose

Testing running status