A Python client for Sage Bionetworks' Synapse, a collaborative compute space that allows scientists to share and analyze data together. The Python client can be used as a library for developing software that communicates with Synapse or as a command-line utility.
There is also a Synapse client for R.
For more information about the Python client, see:
For more information about interacting with Synapse, see:
The Python Synapse client has been tested on Python 2.7, 3.4 and 3.5 on Mac OS X, Ubuntu Linux and Windows.
The Python Synapse Client is on PyPI and can be installed with pip:
(sudo) pip install synapseclient[pandas,pysftp]
...or to upgrade an existing installation of the Synapse client:
(sudo) pip install --upgrade synapseclient
The dependencies on pandas and pysftp are optional. Synapse Tables integrate with pandas. The pysftp library is required for users of SFTP file storage. Both libraries require native code to be compiled or installed separately from prebuilt binaries.
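Because both extras are optional, a script may want to check which of them are present before relying on them. The following is a minimal sketch using only the standard library (Python 3.4 or later); the helper name `optional_extras` is hypothetical and not part of synapseclient.

```python
import importlib.util


def optional_extras(names=("pandas", "pysftp")):
    """Map each top-level package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in optional_extras().items():
        print("{}: {}".format(name, "installed" if present else "missing"))
```

A caller could use the result to skip table or SFTP features when the matching package is missing, rather than failing on import.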
Clone the source code repository.
git clone https://github.com/Sage-Bionetworks/synapsePythonClient.git
cd synapsePythonClient
python setup.py install
Installing the develop branch can be useful for testing or for access to the latest features, at the cost of an increased risk of bugs. Using virtualenv to create an isolated test environment is a good idea.
git clone https://github.com/Sage-Bionetworks/synapsePythonClient.git
cd synapsePythonClient
git checkout develop
python setup.py install
Replace python setup.py install with python setup.py develop to make the installation follow the head without having to reinstall.
pip can also install directly from a branch in one step:
pip install git+https://github.com/Sage-Bionetworks/synapsePythonClient.git@develop
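The isolated test environment recommended above can also be created from Python itself. This sketch uses the standard library `venv` module as a stand-in for virtualenv (a substitution made for this example, not something this README prescribes); the directory name `synapse-dev-env` and the helper `make_test_env` are hypothetical.

```python
import os
import tempfile
import venv


def make_test_env(parent_dir):
    """Create an isolated environment under parent_dir and return its path."""
    env_dir = os.path.join(parent_dir, "synapse-dev-env")  # hypothetical name
    # with_pip=False keeps creation fast; use with_pip=True to bootstrap pip.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(sorted(os.listdir(make_test_env(tmp))))
```

For real use, create the environment in a permanent location with `with_pip=True`, activate it, and then run one of the install commands from this section inside it.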
Checking out a tagged version ensures that JIRA issues are validated against the correct version of the client code. Instead of checking out the develop branch, check out the tag, for example:
git checkout v1.0.dev2
The Synapse client can be used from the shell command prompt. Valid commands include query, get, cat, add, update, delete, and onweb. A few examples are shown below.
querying for entities that are part of the Synapse Commons Repository
synapse -u [email protected] -p secret query 'select id, name from entity where parentId=="syn150935"'
querying for a test entity
The test entity is tagged with an annotation test_data whose value is "bogus". We'll use the ID of this entity in the next example.
synapse -u [email protected] -p secret query 'select id, name, parentId from entity where test_data=="bogus"'
synapse -u [email protected] -p secret get syn1528299
synapse -h
Note that a Synapse account is required.
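When driving the command-line client from a script, it can help to build the argument list in Python and leave the quoting to the standard library. This is a rough sketch that assumes Python 3.8 or later for `shlex.join`; the helper `synapse_query_command` and the account `user@example.org` are hypothetical, and nothing is actually executed.

```python
import shlex


def synapse_query_command(username, password, query):
    """Build the argument list for a `synapse query` invocation."""
    return ["synapse", "-u", username, "-p", password, "query", query]


args = synapse_query_command(
    "user@example.org",  # hypothetical account
    "secret",
    'select id, name from entity where parentId=="syn150935"',
)
# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(args))
```

The same list can be passed to `subprocess.run(args)` without a shell, which avoids quoting problems with the embedded double quotes in the query.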
The Synapse client can be used to write software that interacts with the Sage Synapse repository.
import synapseclient
syn = synapseclient.Synapse()
## log in using cached API key
syn.login('joeuser')
## retrieve a 100 by 4 matrix
matrix = syn.get('syn1901033')
## inspect its properties
print(matrix.name)
print(matrix.description)
print(matrix.path)
## load the data matrix into a dictionary with an entry for each column
with open(matrix.path, 'r') as f:
    labels = f.readline().strip().split('\t')
    data = {label: [] for label in labels}
    for line in f:
        values = [float(x) for x in line.strip().split('\t')]
        for i in range(len(labels)):
            data[labels[i]].append(values[i])
## load the data matrix into a numpy array
import numpy as np
np.loadtxt(fname=matrix.path, skiprows=1)
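The column-dictionary step above can be tried without a Synapse account. The sketch below applies the same idea to a small in-memory sample using the standard `csv` module; the sample data and the helper name `read_columns` are made up for illustration, and the real file is assumed to have the same layout (a header row followed by tab-separated numbers).

```python
import csv
import io

# Hypothetical stand-in for the file at matrix.path.
sample = "a\tb\tc\n1.0\t2.0\t3.0\n4.0\t5.0\t6.0\n"


def read_columns(handle):
    """Return {label: [float, ...]} for a tab-delimited file with a header."""
    reader = csv.reader(handle, delimiter="\t")
    labels = next(reader)
    columns = {label: [] for label in labels}
    for row in reader:
        for label, value in zip(labels, row):
            columns[label].append(float(value))
    return columns


data = read_columns(io.StringIO(sample))
print(data["b"])  # [2.0, 5.0]
```

With a downloaded entity, the same function could be called as `read_columns(open(matrix.path, newline=""))`.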
profile = syn.getUserProfile()
query_results = syn.query('select id,name from project where project.createdByPrincipalId==%s' % profile['ownerId'])
querying for entities that are part of the Synapse Commons Repository
syn.query('select id, name from entity where parentId=="syn150935"')
querying for entities that are part of TCGA pancancer that are also RNA-Seq data
syn.query('select id, name from entity where freeze=="tcga_pancancer_v4" and platform=="IlluminaHiSeq_RNASeqV2"')
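Query strings like the ones above are easy to mistype when assembled by hand. A small helper can build them from keyword arguments. This is only a sketch: `build_query` is a hypothetical function, not part of synapseclient. It always wraps values in double quotes, so it does not cover unquoted numeric comparisons, and it performs no escaping.

```python
def build_query(fields, entity_type, **conditions):
    """Assemble a query string in the style of the examples above."""
    query = "select {} from {}".format(", ".join(fields), entity_type)
    if conditions:
        clauses = ['{}=="{}"'.format(key, value) for key, value in conditions.items()]
        query += " where " + " and ".join(clauses)
    return query


q = build_query(["id", "name"], "entity", parentId="syn150935")
print(q)  # select id, name from entity where parentId=="syn150935"
```

The resulting string could then be passed to `syn.query(q)` exactly as in the examples above.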
Authentication with Synapse can be accomplished in a few different ways. One is to pass a username and password to the syn.login function.
import synapseclient
syn = synapseclient.Synapse()
syn.login('[email protected]', 'secret')
It is much more convenient to use an API key, which can be generated and cached locally by doing the following once:
syn.login('[email protected]', 'secret', rememberMe=True)
Then, in subsequent interactions, specifying a username and password is optional and only needed to log in as a different user. Calling login with no arguments uses cached credentials when they are available.
syn.login('[email protected]')
As a shortcut, creating the Synapse object and logging in can be done in one step:
import synapseclient
syn = synapseclient.login()
Caching credentials can also be done from the command line client:
synapse login -u [email protected] -p secret --rememberMe
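The login options above can be combined in a small wrapper that uses explicit credentials when they are supplied and falls back to the cached ones otherwise. The sketch below reads them from environment variables; the names `MY_SYNAPSE_USER` and `MY_SYNAPSE_PASSWORD` and the helper `login_from_env` are assumptions made for this example. It relies only on the two call shapes already shown, `syn.login(user, password, rememberMe=True)` and `syn.login()`.

```python
import os


def login_from_env(syn, environ=os.environ):
    """Log in with explicit credentials if provided, else use the cache.

    MY_SYNAPSE_USER and MY_SYNAPSE_PASSWORD are hypothetical variable
    names chosen for this sketch.
    """
    user = environ.get("MY_SYNAPSE_USER")
    password = environ.get("MY_SYNAPSE_PASSWORD")
    if user and password:
        # Same call shape as the rememberMe example above.
        return syn.login(user, password, rememberMe=True)
    # No explicit credentials: fall back to cached ones.
    return syn.login()
```

Typical use would be `syn = synapseclient.Synapse()` followed by `login_from_env(syn)`, which keeps passwords out of the script itself.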
The synapseutils package contains helper functions such as copy() and walk().
import synapseutils as synu
import synapseclient
syn = synapseclient.login()
#COPY: copies all synapse entities to a destination location
synu.copy(syn, "syn1234", destinationId = "syn2345")
#WALK: Traverses through synapse directories, behaves exactly like os.walk()
walkedPath = synu.walk(syn, "syn1234")
for dirpath, dirname, filename in walkedPath:
    print(dirpath)
    print(dirname)
    print(filename)
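For readers unfamiliar with os.walk(), which the walk example above is compared to, the following sketch shows the local equivalent on a temporary directory. It illustrates only the standard library behavior (one tuple of directory path, subdirectory names, and file names per directory); the tree built here is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny hypothetical tree: root/sub/data.txt
    os.makedirs(os.path.join(root, "sub"))
    with open(os.path.join(root, "sub", "data.txt"), "w") as f:
        f.write("example\n")

    # Each step yields the current directory, its subdirectories, its files.
    listing = [
        (os.path.relpath(dirpath, root), dirnames, filenames)
        for dirpath, dirnames, filenames in os.walk(root)
    ]

print(listing)  # [('.', ['sub'], []), ('sub', [], ['data.txt'])]
```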
© Copyright 2013-15 Sage Bionetworks
This software is licensed under the Apache License, Version 2.0.