Synapse Python Client documentation¶
+Synapse Python Client Documentation¶
Overview¶
The synapseclient
package provides an interface to Synapse, a collaborative workspace
@@ -108,24 +113,18 @@ Overview
The synapseclient package lets you communicate with the cloud-hosted Synapse service to access data and create shared data analysis projects from within Python scripts or at the interactive Python console. Other Synapse clients exist for R, Java, and the web. The Python client can also be used from the command line.
If you’re just getting started with Synapse, have a look at the Getting Started guides for Synapse and the Python client.
-Good example projects are:
--
-
- TCGA Pan-cancer (syn300013) -
- Development of a Prognostic Model for Breast Cancer Survival in an Open Challenge Environment (syn1721874) -
- Demo projects (syn1899339) -
Installation¶
The synapseclient package is available from PyPI. It can be installed or upgraded with pip:
-(sudo) pip install (--upgrade) synapseclient[pandas,pysftp]
+(sudo) pip install (--upgrade) synapseclient[pandas, pysftp]
The dependencies on pandas and pysftp are optional. The Synapse synapseclient.table
feature integrates with
@@ -150,17 +149,22 @@
Installation
+Python 2 Support¶
+The sun is setting on Python 2. Many major open source Python packages are moving to require Python 3.
+The Synapse engineering team will step down Python 2.7 support to bug fixes only and will require Python 3 for new feature releases. Starting with version 2.0 of the Synapse Python client (to be released in Q1 2019), the client will require Python 3.
+
Connecting to Synapse¶
-To use Synapse, you’ll need to register for an account. The Synapse
+To use Synapse, you’ll need to register for an account. The Synapse website can authenticate using a Google account, but you’ll need to take the extra step of creating a Synapse password to use the programmatic clients.
Once that’s done, you’ll be able to load the library, create a Synapse
object and login:
import synapseclient
syn = synapseclient.Synapse()
-syn.login('me@nowhere.com', 'secret')
+syn.login('my_username', 'my_password')
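Hard-coding a password in a script makes it easy to leak. A minimal sketch, assuming the credentials live in two environment variables; the variable names SYNAPSE_USERNAME and SYNAPSE_PASSWORD and both helper functions are hypothetical and chosen only for this illustration:

```python
import os


def read_credentials(env=os.environ):
    """Return (username, password) from the environment, or raise if missing.

    The variable names are an assumption of this sketch, not a convention
    of the Synapse client.
    """
    try:
        return env["SYNAPSE_USERNAME"], env["SYNAPSE_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError("missing environment variable: %s" % missing)


def login_from_environment():
    """Create a Synapse object and log in with credentials from the environment."""
    import synapseclient  # imported here so read_credentials() works without it

    username, password = read_credentials()
    syn = synapseclient.Synapse()
    syn.login(username, password)
    return syn
```

Calling login_from_environment() then replaces the literal username and password in the snippet above.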
For more information, see:
@@ -213,7 +217,7 @@Accessing Data
-Organizing data in a Project¶
+Organizing Data in a Project¶
You can create your own projects and upload your own data sets. Synapse stores entities in a hierarchical or tree
structure. Projects are at the top level and must be uniquely named:
import synapseclient
@@ -247,7 +251,7 @@ Organizing data in a Project
-Annotating Synapse entities¶
+Annotating Synapse Entities¶
Annotations are arbitrary metadata attached to Synapse entities, for example:
test_entity.genome_assembly = "hg19"
@@ -302,31 +306,8 @@ Evaluationssynapseclient.Synapse.getSubmissionStatus()
-
-Querying¶
-Synapse supports a SQL-like query language:
-results = syn.query('SELECT id, name FROM entity WHERE parentId=="syn1899495"')
-
-for result in results['results']:
- print(result['entity.id'], result['entity.name'])
-
-
-Querying for my projects. Finding projects owned by the current user:
-profile = syn.getUserProfile()
-results = syn.query('SELECT id, name FROM project WHERE project.createdByPrincipalId==%s' % profile['ownerId'])
-
-for result in results['results']:
- print(result['project.id'], result['project.name'])
-
-
-See:
-
-
-Access control¶
+Access Control¶
By default, data sets in Synapse are private to your user account, but they can easily be shared with specific users,
groups, or the public.
See:
@@ -336,9 +317,9 @@ Access control
-Accessing the API directly¶
+Accessing the API Directly¶
These methods enable access to the Synapse REST(ish) API taking care of details like endpoints and authentication.
-See the REST API documentation.
+See the REST API documentation.
See:
synapseclient.Synapse.restGET()
@@ -348,24 +329,26 @@ Accessing the API directly
-Synapse utilities¶
+Synapse Utilities¶
There is a companion module called synapseutils that provides higher-level functionality such as recursive copying of
content, syncing with Synapse, and additional query functionality.
See:
- synapseutils
-More information¶
-For more information see the Synapse User Guide. These API docs are browsable
-online at http://docs.synapse.org/python/.
+More Information¶
+For more information see the Synapse User Guide. These Python API docs are browsable
+online at https://python-docs.synapse.org/.
-Getting updates¶
-To get information about new versions of the client including development versions see
-synapseclient.check_for_updates() and
-synapseclient.release_notes().
+Getting Updates¶
+To get information about new versions of the client, see:
+synapseclient.check_for_updates().
+
+
+Reference¶
- Synapse Client
@@ -381,6 +364,25 @@ Getting updatesUsing the Synapse Python Client with SFTP data storage
+
+
+Release Notes¶
+
+Release Notes¶
+1.9.0 (2018-09-28)¶
+In version 1.9.0, we deprecated and removed query() and chunkedQuery(). These functions used the old query services, which do not perform well. To query for entities filtered by annotations, please use EntityViewSchema.
+We also deprecated the following functions and will remove them in Synapse Python client version 2.0.
+In the Activity object:
+- usedEntity()
+- usedURL()
+In the Synapse object:
+- getEntity()
+- loadEntity()
+- createEntity()
+- updateEntity()
+- deleteEntity()
+- downloadEntity()
+- uploadFile()
+- uploadFileHandle()
+- uploadSynapseManagedFileHandle()
+- downloadTableFile()
+Please see our documentation for more details on how to migrate your code away from these functions.
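As one possible migration path away from the removed query(), here is a hedged sketch: create a file view over a project with EntityViewSchema, then query the view with syn.tableQuery(). The view name is a placeholder, build_view_query() and create_view_and_query() are hypothetical helpers written for this example, and no escaping of the annotation value is attempted.

```python
def build_view_query(view_id, annotation, value):
    """Build a SQL string that filters a view on one annotation column.

    Hypothetical helper: the value is inserted as-is, without escaping.
    """
    return "SELECT id, name FROM %s WHERE %s = '%s'" % (view_id, annotation, value)


def create_view_and_query(syn, project_id, annotation, value):
    """Create a file view scoped to project_id and query it by annotation."""
    from synapseclient import EntityViewSchema  # needs synapseclient installed

    view = EntityViewSchema(name="my file view (placeholder)",
                            parent=project_id,
                            scopes=[project_id])
    view = syn.store(view)  # annotation columns are added by default for new views
    results = syn.tableQuery(build_view_query(view.id, annotation, value))
    return results.asDataFrame()  # requires pandas
```

In practice the view would be created once and reused; creating it on every call is done here only to keep the sketch self-contained.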
+ +Bug Fixes¶
+-
+
- SYNPY-195 - Dangerous exception handling +
- SYNPY-261 - error downloading data from synapse (python client) +
- SYNPY-694 - Uninformative error in copyWiki function +
- SYNPY-805 - Uninformative error when getting View that does not exist +
- SYNPY-819 - command-line clients need to be updated to replace the EntityView ‘viewType’ with ‘viewTypeMask’ +
Tasks¶
+ +Improvements¶
+-
+
- SYNPY-583 - Better error message for create Link object +
- SYNPY-810 - simplify docs for deleting rows +
- SYNPY-814 - fix docs links in python client __init__.py +
- SYNPY-822 - Switch to use news.rst instead of multiple release notes files +
- SYNPY-823 - Pin keyring to version 12.0.2 to use SecretStorage 2.x +
1.8.2 (2018-08-17)¶
+In this release, we performed some housekeeping on the code base. The two major changes are:
+- Making syn.move() available to move an entity to a new parent in Synapse. For example:
+
+import synapseclient
+from synapseclient import Folder
+
+syn = synapseclient.login()
+
+file = syn.get("syn123")
+folder = Folder("new folder", parent="syn456")
+folder = syn.store(folder)
+
+# moving file to the newly created folder
+syn.move(file, folder)
+
+- Exposing the ability to run the Synapse Python client in single-threaded mode. This feature is useful when running a Python script in an environment that does not support multi-threading. However, it will negatively impact upload speed. To use single-threaded mode:
+
+import synapseclient
+synapseclient.config.single_threaded = True
+
+Bug Fixes¶
+-
+
- SYNPY-535 - Synapse Table update: Connection Reset +
- SYNPY-603 - Python client and synapser cannot handle table column type LINK +
- SYNPY-688 - Recursive get (sync) broken for empty folders. +
- SYNPY-744 - KeyError when trying to download using Synapse Client 1.8.1 +
- SYNPY-750 - Error in downloadTableColumns for file view +
- SYNPY-758 - docs in Sphinx don’t show for synapseclient.table.RowSet +
- SYNPY-760 - Keyring related error on Linux +
- SYNPY-766 - as_table_columns() returns a list of columns out of order for python 3.5 and 2.7 +
- SYNPY-776 - Cannot log in to Synapse - error(54, ‘Connection reset by peer’) +
- SYNPY-795 - Not recognizable column in query result +
Features¶
+ +Tasks¶
+ +1.8.1 (2018-05-17)¶
+This release is a hotfix for a bug. Please refer to the 1.8.0 release notes for information about additional changes.
+ +1.8.0 (2018-05-07)¶
+This release has 2 major changes:
+-
+
- The client will no longer store your saved credentials in your Synapse cache (~/synapseCache/.session). The Python client now relies on keyring to store your Synapse credentials. +
- The client also now uses connection pooling, which means that all method calls that connect to Synapse should now be faster. +
The remaining changes are bug fixes and cleanup of test code.
+Below is the full list of issues addressed by this release:
+Bug Fixes¶
+-
+
- SYNPY-654 - syn.getColumns does not terminate +
- SYNPY-658 - Security vulnerability on clusters +
- SYNPY-689 - Wiki’s attachments cannot be None +
- SYNPY-692 - synapseutils.sync.generateManifest() sets contentType incorrectly +
- SYNPY-693 - synapseutils.sync.generateManifest() UnicodeEncodingError in python 2 +
Tasks¶
+ +1.7.5 (2018-01-31)¶
+v1.7.4 release was broken for new users that installed from pip. v1.7.5 has the same changes as v1.7.4 but fixes the pip installation.
+1.7.4 (2018-01-29)¶
+-
+
- This release mostly includes bugfixes and improvements for various Table classes: +
-
+
- Fixed bug where you couldn’t store a table converted to a pandas.DataFrame if it had an INTEGER column with some missing values. +
- EntityViewSchema can now automatically add all annotations within your defined scopes as columns. Just set the view’s addAnnotationColumns=True before calling syn.store(). This attribute defaults to True for all newly created EntityViewSchemas. Setting addAnnotationColumns=True on existing tables will only add annotation columns that are not already a part of your schema. +
- You can now use synapseutils.notifyMe as a decorator to notify you by email when your function has completed. You will also be notified of any Errors if they are thrown while your function runs. +
+- We also added some new features: +
-
+
- syn.findEntityId() function that allows you to find an Entity by its name and parentId. Set parentId to None to search for Projects by name. +
- The bulk upload functionality of synapseutils.syncToSynapse is available from the command line using: synapse sync. +
+
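To illustrate the decorator pattern behind synapseutils.notifyMe, here is a stdlib-only sketch. It is not the synapseutils implementation: notify_on_completion and the send callback are hypothetical names, and a real version would send a Synapse email instead of calling a local function.

```python
import functools


def notify_on_completion(send, subject="job"):
    """Decorator factory: call send(message) when the wrapped function
    finishes or raises. `send` is any callable taking one string."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                result = func(*args, **kwargs)
            except Exception as error:
                send("%s failed: %r" % (subject, error))
                raise  # re-raise so the caller still sees the error
            send("%s completed" % subject)
            return result
        return wrapper
    return decorator


messages = []


@notify_on_completion(messages.append, subject="long upload")
def long_upload():
    return "done"


long_upload()
```

Here the notification is simply appended to a list, which keeps the sketch testable without network access.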
Below is the full list of issues addressed by this release:
+Features¶
+ +Improvements¶
+-
+
- SYNPY-267 - Update Synapse tables for integer types +
- SYNPY-304 - Table objects should implement len() +
- SYNPY-416 - warning message for recursive get when a non-Project of Folder entity is passed +
- SYNPY-482 - Create a sample synapseConfig if none is present +
- SYNPY-489 - Add a boolean parameter in EntityViewSchema that will indicate whether the client should create columns based on annotations in the specified scopes +
- SYNPY-494 - Link should be able to take an entity object as the parameter and derive its id +
- SYNPY-511 - improve exception handling +
- SYNPY-512 - Remove the use of PaginatedResult’s totalNumberOfResult +
- SYNPY-539 - When creating table Schemas, enforce a limit on the number of columns that can be created. +
Bug Fixes¶
+-
+
- SYNPY-235 - can’t print Row objects with dates in them +
- SYNPY-272 - bug syn.storing rowsets containing Python datetime objects +
- SYNPY-297 - as_table_columns shouldn’t give fractional max size +
- SYNPY-404 - when we get a SynapseMd5MismatchError we should delete the downloaded file +
- SYNPY-425 - onweb doesn’t work for tables +
- SYNPY-438 - Need to change ‘submit’ not to use evaluation/id/accessRequirementUnfulfilled +
- SYNPY-496 - monitor.NotifyMe can not be used as an annotation decorator +
- SYNPY-521 - inconsistent error message when username/password is wrong on login +
- SYNPY-536 - pre-signed upload URL expired warnings using Python client sync function +
- SYNPY-555 - EntityViewSchema is missing from sphinx documentation +
- SYNPY-558 - synapseutils.sync.syncFromSynapse throws error when syncing a Table object +
- SYNPY-595 - Get recursive folders filled with Links fails +
- SYNPY-605 - Update documentation for getUserProfile to include information about refreshing and memoization +
1.7.3 (2017-12-08)¶
+Release 1.7.3 introduces fixes and quality of life changes to Tables and synapseutils:
+-
+
Changes to Tables:
++
+-
+
- You no longer have to include the etag column in your SQL query when using a tableQuery() to update File/Project Views. Just SELECT the relevant columns and etags will be resolved automatically. +
- The new PartialRowSet class allows you to only have to upload changes to individual cells of a table instead of every row that had a value changed. It is recommended to use the PartialRowSet.from_mapping() classmethod instead of the PartialRowSet constructor. +
+Changes to synapseutils:
++
+-
+
- Improved documentation +
- You can now use ~ to refer to your home directory in your manifest.tsv +
+
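The mapping passed to PartialRowSet.from_mapping() is keyed by row ID, with each value a dict of column name to new cell value. A sketch under stated assumptions: rows are held locally as plain dicts, changed_cells() and store_partial_changes() are hypothetical helpers, the query in the comment is a placeholder, and the call is assumed to take the mapping followed by the original query result.

```python
def changed_cells(old_rows, new_rows):
    """Return {row_id: {column: new_value}} for cells that differ.

    Both arguments map row_id -> {column: value}. Rows absent from
    new_rows are ignored; this sketch only handles cell updates.
    """
    changes = {}
    for row_id, new_row in new_rows.items():
        old_row = old_rows.get(row_id, {})
        diff = {col: val for col, val in new_row.items() if old_row.get(col) != val}
        if diff:
            changes[row_id] = diff
    return changes


def store_partial_changes(syn, query, changes):
    """Upload only the changed cells of the rows returned by `query`."""
    from synapseclient.table import PartialRowSet  # needs synapseclient >= 1.7.3

    query_results = syn.tableQuery(query)  # e.g. "SELECT * FROM syn123"
    partial_rowset = PartialRowSet.from_mapping(changes, query_results)
    return syn.store(partial_rowset)
```

Only the cells that differ are sent, which is the point of PartialRowSet compared with re-uploading whole rows.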
We also added improved debug logging and now use Python’s built-in logging module instead of printing directly to sys.stderr.
+Below is the full list of issues addressed by this release:
+Bug Fixes¶
+-
+
- SYNPY-419 - support object store from client +
- SYNPY-499 - metadata manifest file name spelled wrong +
- SYNPY-504 - downloadTableFile changed return type with no change in documentation or mention in release notes +
- SYNPY-508 - syncToSynapse does not work if “the file path in “used” or “executed” of the manifest.tsv uses home directory shortcut “~” +
- SYNPY-516 - synapse sync file does not work if file is a URL +
- SYNPY-525 - Download CSV file of Synapse Table - 416 error +
- SYNPY-572 - Users should only be prompted for updates if the first or second part of the version number is changed. +
Features¶
+ +1.7.1 (2017-11-17)¶
+Release 1.7 is a large bugfix release with several new features. The main ones include:
+-
+
We have expanded the synapseutils packages to add the ability to:
++
+-
+
- Bulk upload files to synapse (synapseutils.syncToSynapse). +
- Notify you via email on the progress of a function (useful for jobs like large file uploads that may take a long time to complete). +
- The syncFromSynapse function now creates a “manifest” which contains the metadata of downloaded files. (These can also be used to update metadata with the bulk upload function.) +
+File View tables can now be created from the python client using EntityViewSchema. See fileviews documentation.
+
+The python client is now able to upload to user owned S3 Buckets. Click here for instructions on linking your S3 bucket to synapse.
+
+
We’ve also made various improvements to existing features:
+-
+
- The LARGETEXT type is now supported in Tables allowing for strings up to 2Mb. +
- The --description argument when creating/updating entities from the command line client will now create a Wiki for that entity. You can also use --descriptionFile to write the contents of a markdown file as the entity’s Wiki. +
- Two member variables of the File object, file_entity.cacheDir and file_entity.files, are being DEPRECATED in favor of file_entity.path for finding the location of a downloaded File. +
- pandas DataFrames containing datetime values can now be properly converted into CSV and uploaded to Synapse. +
We also added an optional convert_to_datetime parameter to CsvFileTable.asDataFrame() that will automatically convert Synapse DATE columns into datetime objects instead of leaving them as long unix timestamps.
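For context, Synapse DATE values are stored as milliseconds since the Unix epoch. The stdlib sketch below shows the kind of conversion involved; the function name is hypothetical, UTC is assumed, and the library's own implementation may differ in its details.

```python
from datetime import datetime, timezone


def synapse_date_to_datetime(milliseconds):
    """Convert a Synapse DATE value (ms since the Unix epoch) to a UTC datetime."""
    return datetime.fromtimestamp(milliseconds / 1000.0, tz=timezone.utc)


print(synapse_date_to_datetime(1510876800000).isoformat())
```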
+Below is the full list of bugs and issues addressed by this release:
+Features¶
+-
+
- SYNPY-53 - support syn.get of external FTP links in py client +
- SYNPY-179 - Upload to user owned S3 bucket +
- SYNPY-412 - allow query-based download based on view tables from command line client +
- SYNPY-487 - Add remote monitoring of long running processes +
- SYNPY-415 - Add Docker and TableViews into Entity.py +
- SYNPY-89 - Python client: Bulk upload client/command +
- SYNPY-413 - Update table views via python client +
- SYNPY-301 - change actual file name from python client +
- SYNPY-442 - set config file path on command line +
Improvements¶
+-
+
- SYNPY-407 - support LARGETEXT in tables +
- SYNPY-360 - Duplicate file handles are removed from BulkFileDownloadRequest +
- SYNPY-187 - Move --description in command line client to create wikis +
- SYNPY-224 - When uploading to a managed external file handle (e.g. SFTP), fill in storageLocationId +
- SYNPY-315 - Default behavior for files in cache dir should be replace +
- SYNPY-381 - Remove references to “files” and “cacheDir”. +
- SYNPY-396 - Create filehandle copies in synapseutils.copy instead of downloading +
- SYNPY-403 - Use single endpoint for all downloads +
- SYNPY-435 - Convenience function for new service to get entity’s children +
- SYNPY-471 - docs aren’t generated for synapseutils +
- SYNPY-472 - References to wrong doc site +
- SYNPY-347 - Missing dtypes in table.DTYPE_2_TABLETYPE +
- SYNPY-463 - When copying filehandles we should add the files to the cache if we already downloaded them +
- SYNPY-475 - Store Tables timeout error +
Bug Fixes¶
+-
+
- SYNPY-190 - syn.login(‘asdfasdfasdf’) should fail +
- SYNPY-344 - weird cache directories +
- SYNPY-346 - ValueError: cannot insert ROW_ID, already exists in CsvTableFile constructor +
- SYNPY-351 - Versioning broken for sftp files +
- SYNPY-366 - file URLs leads to wrong path +
- SYNPY-393 - New cacheDir causes cache to be ignored(?) +
- SYNPY-409 - Python client cannot depend on parsing Amazon pre-signed URLs +
- SYNPY-418 - Integration test failure against 167 +
- SYNPY-421 - syn.getWikiHeaders has a return limit of 50 (Need to return all headers) +
- SYNPY-423 - upload rate is off or incorrect +
- SYNPY-424 - File entities don’t handle local_state correctly for setting datafilehandleid +
- SYNPY-426 - multiple tests failing because of filenameOveride +
- SYNPY-427 - test dependent on config file +
- SYNPY-428 - sync function error +
- SYNPY-431 - download ending early and not restarting from previous spot +
- SYNPY-443 - tests/integration/integration_test_Entity.py:test_get_with_downloadLocation_and_ifcollision AssertionError +
- SYNPY-461 - On Windows, command line client login credential prompt fails (python 2.7) +
- SYNPY-465 - Update tests that set permissions to also include ‘DOWNLOAD’ permission and tests that test functions using queries +
- SYNPY-468 - Command line client incompatible with cache changes +
- SYNPY-470 - default should be read, download for setPermissions +
- SYNPY-483 - integration test fails for most users +
- SYNPY-484 - URL expires after retries +
- SYNPY-486 - Error in integration tests +
- SYNPY-488 - sync tests for command line client puts file in working directory +
- SYNPY-142 - PY: Error in login with rememberMe=True +
- SYNPY-464 - synapse get syn4988808 KeyError: u’preSignedURL’ +
1.6.1 (2016-11-02)¶
+What is New¶
+In version 1.6 we introduce a new sub-module, synapseutils, that provides convenience functions for more complicated operations in Synapse, such as copying of files, wikis, and folders. In addition, we have introduced several improvements in downloading content from Synapse. As with uploads, we are now able to recover from an interrupted download and will retry on network failures.
+ +Improvements¶
+We have improved download robustness and error checking, along with extensive recovery on failed operations. This includes the ability for the client to pause operation when Synapse is updated.
+-
+
- SYNPY-270 - Synapse READ ONLY mode should cause pause in execution +
- SYNPY-308 - Add md5 checking after downloading a new file handle +
- SYNPY-309 - Add download recovery by using the ‘Range’: ‘bytes=xxx-xxx’ header +
- SYNPY-353 - Speed up downloads of fast connections +
- SYNPY-356 - Add support for version flag in synapse cat command line +
- SYNPY-357 - Remove failure message on retry in multipart_upload +
- SYNPY-380 - Add speed meter to downloads/uploads +
- SYNPY-387 - Do exponential backoff on 429 status and print explanatory error message from server +
- SYNPY-390 - Move recursive download to Python client utils +
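The exponential backoff on HTTP 429 mentioned in SYNPY-387 can be sketched in general terms as follows. This is a stdlib illustration of the technique, not the client's actual retry code: call_with_backoff, its parameters, and the (status, body) return shape are all assumptions made for this example.

```python
import time


def call_with_backoff(request, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call request() until it returns a status other than 429.

    request is any callable returning (status_code, body). The delay
    doubles after each 429: base_delay, 2*base_delay, 4*base_delay, ...
    """
    for attempt in range(max_retries + 1):
        status, body = request()
        if status != 429:
            return status, body
        if attempt < max_retries:
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("still rate limited after %d retries" % max_retries)
```

Passing the sleep function as a parameter lets a test substitute a recorder for time.sleep, so the delay schedule can be checked without actually waiting.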
Bug Fixes¶
+-
+
- SYNPY-154 - 500 Server Error when storing new version of file from command line +
- SYNPY-168 - Failure on login gives an ugly error message +
- SYNPY-253 - Error messages on upload retry inconsistent with behavior +
- SYNPY-261 - error downloading data from synapse (python client) +
- SYNPY-274 - Trying to use the client without logging in needs to give a reasonable error +
- SYNPY-331 - test_command_get_recursive_and_query occasionally fails +
- SYNPY-337 - Download error on 10 Gb file. +
- SYNPY-343 - Login failure +
- SYNPY-351 - Versioning broken for sftp files +
- SYNPY-352 - file upload max retries exceeded messages +
- SYNPY-358 - upload failure from python client (threading) +
- SYNPY-361 - file download fails midway without warning/error +
- SYNPY-362 - setAnnotations bug when given synapse ID +
- SYNPY-363 - problems using provenance during upload +
- SYNPY-382 - Python client is truncating the entity id in download csv from table +
- SYNPY-383 - Travis failing with paramiko.ssh_exception.SSHException: No hostkey +
- SYNPY-384 - resuming a download after a ChunkedEncodingError created new file with correct size +
- SYNPY-388 - Asynchronous creation of Team causes sporadic test failure +
- SYNPY-391 - downloadTableColumns() function doesn’t work when resultsAs=rowset is set for syn.tableQuery() +
- SYNPY-397 - Error in syncFromSynapse() integration test on Windows +
- SYNPY-399 - python client not compatible with newly released Pandas 0.19 +