htslib protocol plugins (https/S3) cannot be loaded on macOS #1176

Closed
ccwang002 opened this issue Nov 4, 2020 · 3 comments · Fixed by #1184

Comments

@ccwang002

htslib 1.11 cannot read any files using HTTPS or S3 on macOS. I tested with the following files:

$ htsfile -vv https://github.com/brentp/cyvcf2/raw/master/cyvcf2/tests/test.vcf.gz
[D::init_add_plugin] Loaded "knetfile"
[D::init_add_plugin] Loaded "mem"
[D::init_add_plugin] Loaded "crypt4gh-needed"
[W::hfile_add_scheme_handler] Couldn't register scheme handler for s3
[W::hfile_add_scheme_handler] Couldn't register scheme handler for s3+http
[W::hfile_add_scheme_handler] Couldn't register scheme handler for s3+https
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2-test/libexec/htslib/hfile_s3.bundle"
[W::hfile_add_scheme_handler] Couldn't register scheme handler for s3w
[W::hfile_add_scheme_handler] Couldn't register scheme handler for s3w+http
[W::hfile_add_scheme_handler] Couldn't register scheme handler for s3w+https
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2-test/libexec/htslib/hfile_s3_write.bundle"
[W::hfile_add_scheme_handler] Couldn't register scheme handler for dict
[W::hfile_add_scheme_handler] Couldn't register scheme handler for file
[W::hfile_add_scheme_handler] Couldn't register scheme handler for ftp
[W::hfile_add_scheme_handler] Couldn't register scheme handler for ftps
[W::hfile_add_scheme_handler] Couldn't register scheme handler for gopher
[W::hfile_add_scheme_handler] Couldn't register scheme handler for http
[W::hfile_add_scheme_handler] Couldn't register scheme handler for https
[W::hfile_add_scheme_handler] Couldn't register scheme handler for imap
[W::hfile_add_scheme_handler] Couldn't register scheme handler for imaps
[W::hfile_add_scheme_handler] Couldn't register scheme handler for pop3
[W::hfile_add_scheme_handler] Couldn't register scheme handler for pop3s
[W::hfile_add_scheme_handler] Couldn't register scheme handler for rtsp
[W::hfile_add_scheme_handler] Couldn't register scheme handler for scp
[W::hfile_add_scheme_handler] Couldn't register scheme handler for sftp
[W::hfile_add_scheme_handler] Couldn't register scheme handler for smb
[W::hfile_add_scheme_handler] Couldn't register scheme handler for smbs
[W::hfile_add_scheme_handler] Couldn't register scheme handler for smtp
[W::hfile_add_scheme_handler] Couldn't register scheme handler for smtps
[W::hfile_add_scheme_handler] Couldn't register scheme handler for telnet
[W::hfile_add_scheme_handler] Couldn't register scheme handler for tftp
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2-test/libexec/htslib/hfile_libcurl.bundle"
[W::hfile_add_scheme_handler] Couldn't register scheme handler for gs
[W::hfile_add_scheme_handler] Couldn't register scheme handler for gs+http
[W::hfile_add_scheme_handler] Couldn't register scheme handler for gs+https
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2-test/libexec/htslib/hfile_gcs.bundle"
htsfile: can't open "https://github.com/brentp/cyvcf2/raw/master/cyvcf2/tests/test.vcf.gz": Protocol not supported

This error persists with both bioconda's htslib 1.11 and a copy I built on macOS using the following steps:

./configure --prefix=/tmp/local --enable-libcurl --with-libdeflate --enable-plugins --with-plugin-dir=/tmp/local/htslib --enable-gcs --enable-s3
make -j4
make install
/tmp/local/bin/htsfile ...

It seems that the plugin bundle is loaded after registering the protocol handler. I came across this issue while investigating brentp/cyvcf2#174, which links to htslib 1.10. Version 1.10 can read both files fine, using either the bioconda build or a self-compiled one:

$ htsfile -vv 'https://github.com/brentp/cyvcf2/raw/master/cyvcf2/tests/test.vcf.gz'
[D::init_add_plugin] Loaded "knetfile"
[D::init_add_plugin] Loaded "mem"
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2/libexec/htslib/hfile_s3.bundle"
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2/libexec/htslib/hfile_s3_write.bundle"
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2/libexec/htslib/hfile_libcurl.bundle"
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2/libexec/htslib/hfile_gcs.bundle"
https://github.com/brentp/cyvcf2/raw/master/cyvcf2/tests/test.vcf.gz:	VCF version 4.1 BGZF-compressed variant calling data

$ set -x AWS_DEFAULT_REGION us-east-1
$ htsfile -vv 's3://3kricegenome/test/test.vcf.gz'
[D::init_add_plugin] Loaded "knetfile"
[D::init_add_plugin] Loaded "mem"
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2/libexec/htslib/hfile_s3.bundle"
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2/libexec/htslib/hfile_s3_write.bundle"
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2/libexec/htslib/hfile_libcurl.bundle"
[D::init_add_plugin] Loaded "/Users/liang/miniconda3/envs/cyvcf2/libexec/htslib/hfile_gcs.bundle"
s3://3kricegenome/test/test.vcf.gz:	VCF version 4.1 BGZF-compressed variant calling data
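
To compare the 1.11 and 1.10 debug logs above at a glance, a small shell helper can tally how many plugins were reported as loaded and how many scheme registrations failed. This is only a minimal sketch and not part of HTSlib: the function name `summarize_hts_plugin_log` is made up for illustration, and it assumes the log lines keep the `[D::init_add_plugin]` and `[W::hfile_add_scheme_handler]` prefixes shown in the output above.

```shell
# Hypothetical helper (not part of HTSlib): read `htsfile -vv` diagnostics
# on stdin and count plugin loads and failed scheme registrations.
summarize_hts_plugin_log() {
    awk '
        # Debug line printed for each plugin that was loaded
        index($0, "[D::init_add_plugin] Loaded")   { loaded++ }
        # Warning line printed when a scheme handler could not be registered
        index($0, "[W::hfile_add_scheme_handler]") { failed++ }
        END { printf "loaded=%d failed=%d\n", loaded, failed }
    '
}
```

Assuming the diagnostics are written to stderr, it could be used as `htsfile -vv URL 2>&1 | summarize_hts_plugin_log`. On the affected 1.11 build the `failed` count should be nonzero, whereas the 1.10 logs above contain no such warnings.
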
@jmarshall
Member

This (also reported in bioconda/bioconda-recipes#15415 (comment)) is a macOS problem: when both the static libhts.a and the shared libhts.dylib are linked or loaded into a running program, two separate copies of the library's functions and global variables (notably hfile.c's schemes) are present, as described in this thread. htsfile has libhts.a statically linked, and since PR #1072 the plugins also bring in the dynamic libhts.dylib.

I believe this problem is limited to macOS. The proper fix needs thinking about, and may involve finally linking htsfile/tabix/samtools/etc against libhts.dylib dynamically and/or rethinking linking the plugins against libhts.

In the meantime, rebuilding the plugins with this patch should make this work on macOS:

diff --git a/Makefile b/Makefile
index 245b7a1..e0d8b18 100644
--- a/Makefile
+++ b/Makefile
@@ -310,8 +312,8 @@ hts-object-files: $(LIBHTS_OBJS)
 %.so: %.pico libhts.so
        $(CC) -shared -Wl,-E $(LDFLAGS) -o $@ $< libhts.so $(LIBS) -lpthread
 
-%.bundle: %.o libhts.dylib
-       $(CC) -bundle -Wl,-undefined,dynamic_lookup $(LDFLAGS) -o $@ $< libhts.dylib $(LIBS)
+%.bundle: %.o
+       $(CC) -bundle -Wl,-undefined,dynamic_lookup $(LDFLAGS) -o $@ $< $(LIBS)

and I will soon be proposing a similar update to the bioconda recipe to make the bioconda package work again in this respect.
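
One way to check whether a rebuilt plugin still records a dependency on libhts is to inspect the output of `otool -L` for the bundle. The helper below is a hedged sketch, not part of HTSlib: the function name `plugin_links_libhts` is hypothetical, and it assumes the dependency would appear under a name such as `libhts.dylib` or `libhts.NN.dylib`.

```shell
# Hypothetical helper (not part of HTSlib): read `otool -L` output on stdin
# and succeed if a libhts dylib appears among the listed dependencies.
plugin_links_libhts() {
    # Matches libhts.dylib as well as versioned names such as libhts.3.dylib
    grep -Eq 'libhts[.0-9]*\.dylib'
}
```

For example, `otool -L /tmp/local/htslib/hfile_libcurl.bundle | plugin_links_libhts && echo "still linked against libhts"` (using the plugin directory from the configure line earlier in this thread). If the patch applied as intended, the bundle should no longer list libhts and nothing should be printed.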

@ccwang002
Author

Thank you for identifying and taking care of the problem!

jmarshall added a commit to jmarshall/bioconda-recipes that referenced this issue Nov 5, 2020
Backport symbol name typo fix; apply simplified form of upcoming
fix for samtools/htslib#1176.

Patches applicable to HTSlib 1.11; will not be needed for future
upstream versions.
dpryan79 pushed a commit to bioconda/bioconda-recipes that referenced this issue Nov 6, 2020
@jmarshall
Member

This is resolved in the bioconda package (by way of a local patch), but not yet resolved in HTSlib itself. So let's keep this open as a reminder that I still have to make an HTSlib PR for this.

jmarshall reopened this Nov 12, 2020
jmarshall added a commit to jmarshall/htslib that referenced this issue Dec 1, 2020
PR samtools#1072 changed plugin linking so that plugins are linked back to
the dynamic libhts.so/.dylib, to facilitate use when libhts is itself
dynamically dlopen()ed with RTLD_LOCAL, e.g., by the Python runtime
which uses default dlopen() flags which on Linux means RTLD_LOCAL.

This broke plugin loading on macOS when opening plugins in an executable
in which libhts.a has been statically linked, as there were then two
copies of the library globals (notably hfile.c::schemes), one from the
executable's libhts.a and one from the plugin's libhts.NN.dylib.
(The Linux loading model does not suffer from this issue.)

The default dlopen() flag on macOS is RTLD_GLOBAL, so this can be fixed
by reverting the change (on macOS only) and depending on the symbols
supplied by a static libhts.a, a dynamically linked libhts.NN.dylib,
or a RTLD_GLOBALly dlopen()ed libhts.NN.dylib. This rebreaks the case
of dlopen()ing libhts on macOS while explicitly specifying RTLD_LOCAL,
but this is not a common case. Fixes samtools#1176.

Disable the `plugins-dlhts -l` test case on macOS. Add a test of
accessing plugins from an executable with a statically linked libhts.a
(namely, htsfile) to test/test.pl.
daviesrob pushed a commit that referenced this issue Dec 15, 2020