
Running Fusera

Matt Bianchi edited this page Feb 11, 2019 · 14 revisions

Access the help with fusera help:


Usage:
  fusera [command]

Available Commands:
  help        Help about any command
  mount       Mount a running instance of Fusera to a folder.
  unmount     Unmount a running instance of Fusera.
  version     Print the version number of Fusera

Flags:
  -h, --help      help for fusera
  -s, --silent    Fusera prints nothing, most useful for using fusera in scripts.
  -v, --verbose   Fusera prints everything, most useful for troubleshooting.

Use "fusera [command] --help" for more information about a command.

The 'mount' command builds a filesystem presenting the files associated with a collection of SRA accession numbers. The 'unmount' command tears down a fusera-created filesystem, and terminates the associated fusera invocation.

$ fusera help mount
Mount a running instance of Fusera to a folder.
Usage:
  fusera mount [flags] /path/to/mountpoint
Flags:
  -a, --accession string     A list of accessions to mount or path to accession file.
                             EXAMPLES: ["SRR123,SRR456" | local/accession/file | https://<bucket>.<region>.s3.amazonaws.com/<accession/file>]
                             NOTE: If using an s3 url, the proper aws credentials need to be in place on the machine.
                             Environment Variable: [$DBGAP_ACCESSION]
      --aws-batch int        ADVANCED: Adjust the amount of accessions put in one request to the SDL API when using an AWS location.
                             Environment Variable: [$DBGAP_AWS-BATCH] (default 50)
      --aws-profile string   The desired AWS credentials profile in ~/.aws/credentials to use for instances when files require the requester (you) to pay for accessing the file.
                             Environment Variable: [$DBGAP_AWS-PROFILE]
                             NOTE: This account will be charged all cost accrued by accessing these certain files through fusera. (default "default")
      --eager                ADVANCED: Have fusera request that urls be signed by the API on start up.
                             Environment Variable: [$DBGAP_EAGER]
  -e, --endpoint string      ADVANCED: Change the endpoint used to communicate with SDL API.
                             Environment Variable: [$DBGAP_ENDPOINT] (default "https://www.ncbi.nlm.nih.gov/Traces/sdl/1/retrieve")
  -f, --filetype string      A list of the only file types to copy.
                             EXAMPLES: "cram,crai,bam,bai"
                             Environment Variable: [$DBGAP_FILETYPE]
      --gcp-batch int        ADVANCED: Adjust the amount of accessions put in one request to the SDL API when using a GCP location.
                             Environment Variable: [$DBGAP_GCP-BATCH] (default 25)
      --gcp-profile string   The desired GCP credentials profile in ~/.aws/credentials to use for instances when files require the requester (you) to pay for accessing the file.
                             Environment Variable: [$DBGAP_GCP-PROFILE]
                             NOTE: This account will be charged all cost accrued by accessing these certain files through fusera. These credentials should be in the AWS supported format that Google provides in order to work with their AWS compatible API. (default "gcp")
  -h, --help                 help for mount
  -l, --location string      Cloud provider and region where files should be located.
                             FORMAT: [cloud.region]
                             EXAMPLES: [s3.us-east-1 | gs.US]
                             NOTE: This can be auto-resolved if running on AWS or GCP.
                             Environment Variable: [$DBGAP_LOCATION]
  -t, --token string         A path to one of the various security tokens used to authorize access to accessions in dbGaP.
                             EXAMPLES: [local/token/file | https://<bucket>.<region>.s3.amazonaws.com/<token/file>]
                             NOTE: If using an s3 url, the proper aws credentials need to be in place on the machine.
                             Environment Variable: [$DBGAP_TOKEN]
Global Flags:
  -s, --silent    Fusera prints nothing, most useful for using fusera in scripts.
  -v, --verbose   Fusera prints everything, most useful for troubleshooting.

Most of the options and environment variables are intended for advanced users and debugging. The only options intended for regular use by users are for passing the token file and specifying the list of accessions.

A simple run of Fusera:

$ fusera mount --token ~/file.ngc --accession "SRR123,SRR456" --location s3.us-east-1 ~/studies

NOTE: fusera must keep running for the mount to work, so this command does not return to the shell prompt until fusera is quit (CTRL-C) or unmounted from another shell (using fusera unmount ~/studies). fusera can also be run in the background, as described below.

Tips and Tricks

Shortening the call length

For ease of use, the mount flags have equivalent environment variables ($DBGAP_TOKEN, $DBGAP_ACCESSION, $DBGAP_LOCATION, etc.), as listed in the help output above. With those variables set, a call to fusera can be as short as:

$ fusera mount ~/studies
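
As a minimal sketch of the environment-variable approach, the snippet below exports the three variables named above and then mounts with only the mountpoint as an argument. The token path, accession list, and location are the placeholder values from the earlier example. The mount_studies helper and its check for fusera on the PATH are hypothetical additions for illustration, not something fusera requires.

```shell
# Placeholder values taken from the earlier example; substitute your own.
export DBGAP_TOKEN="$HOME/file.ngc"
export DBGAP_ACCESSION="SRR123,SRR456"
export DBGAP_LOCATION="s3.us-east-1"

# Hypothetical helper: mount only when fusera is installed.
mount_studies() {
    if command -v fusera >/dev/null 2>&1; then
        # With the variables above exported, only the mountpoint is needed.
        fusera mount "$HOME/studies"
    else
        echo "fusera not found on PATH" >&2
        return 1
    fi
}
```

Calling mount_studies then behaves like the short command above. Placing the export lines in a shell profile such as ~/.bashrc would make them available in every new session.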

Running fusera on an AWS or GCP compute instance shortens the call further. When no location is given through the flag or the environment variable, fusera attempts to detect which cloud platform and region it is running in and uses the location it finds.

Running fusera in the background

If you want to run fusera in the background you can do so with shell commands. Example:

$ fusera mount ~/tmp > output.log  2>&1 &
[1] 12464
$ disown %1

Breakdown:
> output.log
This redirects stdout to a file named output.log. If you don't want the output, use > /dev/null instead.
2>&1
This redirects stderr to the same destination as stdout, so errors are also captured in output.log (or /dev/null).
&
Run this process in the background so you can continue using the shell.
[1] 12464
This is an example of what the shell prints after the whole command is entered. The number outside the brackets will most likely differ from this example. The line means that this is the first ([1]) job started in the background from this terminal and that its process ID is 12464. The number in the brackets is what gets passed to the disown command described below.
disown %1
This keeps fusera running even if the terminal is closed. The example passes %1 because 1 was the number in the brackets printed after the fusera command above. If a different number is displayed, use that number instead.

Using fusera's unmount command on the folder fusera is mounted to will kill the process, as long as nothing is using the file system at that time.
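
The background-and-unmount cycle above can be sketched as a small wrapper for use in scripts. This is only an illustration under stated assumptions: with_fusera and FUSERA_WAIT are hypothetical names, the five-second default wait is a guess at how long the mount needs to become ready, and the token and accessions are assumed to be supplied through the environment variables.

```shell
# Hypothetical wrapper: mount in the background, run a command, then unmount.
# Assumes DBGAP_TOKEN and DBGAP_ACCESSION are already exported.
with_fusera() {
    mountpoint="$1"
    shift

    # Start fusera in the background, capturing stdout and stderr in output.log.
    fusera mount "$mountpoint" > output.log 2>&1 &

    # Assumption: a fixed pause is enough for the mount to become ready.
    sleep "${FUSERA_WAIT:-5}"

    # Run the caller's command and remember its exit status.
    "$@"
    status=$?

    # Unmounting also ends the background fusera process.
    fusera unmount "$mountpoint"
    return "$status"
}
```

For example, with_fusera ~/studies ls ~/studies would list the mounted files and then unmount. The command passed in must have finished using the file system by the time it returns, since the unmount only succeeds when nothing is using it.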

Advice

The <mountpoint> must be an existing, empty directory, to which the user has read and write permissions.

It is recommended that the mountpoint be a directory owned by the user. System directories such as /mnt, /dev, and /tmp have special uses on Unix systems and should be avoided as mountpoints.
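
The mountpoint requirements above can be checked before mounting. The following is a minimal sketch, not part of fusera itself; check_mountpoint is a hypothetical helper name, and it tests only the conditions listed here: the directory exists, is readable and writable, and is empty.

```shell
# Hypothetical helper: succeed only if the directory exists, is readable
# and writable by the current user, and is empty.
check_mountpoint() {
    dir="$1"
    [ -d "$dir" ] || { echo "$dir is not an existing directory" >&2; return 1; }
    [ -r "$dir" ] && [ -w "$dir" ] || { echo "$dir is not readable and writable" >&2; return 1; }
    # ls -A lists every entry except . and .., so empty output means an empty directory.
    [ -z "$(ls -A "$dir")" ] || { echo "$dir is not empty" >&2; return 1; }
}
```

A typical use would be mkdir -p ~/studies followed by check_mountpoint ~/studies && fusera mount ~/studies, so that the mount is attempted only when the directory is suitable.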

Because of the nature of FUSE file systems, only the user who ran fusera can read the mounted files. This can be changed by enabling allow_others in a config file (reference) on the machine, but be warned that there are security implications to consider: https://github.com/libfuse/libfuse#security-implications.
