Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-Ordering priority in SeekableStreamFactory #1693

Merged
merged 5 commits into from
Dec 12, 2023

Conversation

lbergelson
Copy link
Member

  • Now when constructing streams it will prefer NIO plugins.
  • If no http / ftp plugin exists it will fall back to the htsjdk bulit in
  • No longer special casing local files, these will now be handled through NIO as well.
  • Deprecating SeekableStreamFactory.isFilePath() since it is no longer used and interacts poorly
    with nio filesystem providers

* Now when constructing streams it will prefer NIO plugins.
* If no http / ftp plugin exists it will fall back to the htsjdk bulit in
* No longer special casing local files, these will now be handled through NIO as well.
* Deprecating SeekableStreamFactory.isFilePath() since it is no longer used and interacts poorly
  with nio filesystem providers
@lbergelson lbergelson marked this pull request as ready for review December 11, 2023 17:42
@droazen
Copy link
Contributor

droazen commented Dec 11, 2023

Can you also fix the order in ParsingUtils.openInputStream() (used by TribbleIndexedFeatureReader.readHeader())?

Copy link
Contributor

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, modulo a few requests

* @return if the path is not being handled by a FileSystemProvider and it can be read by legacy streams
*/
public static boolean isSpecialCase(final String path){
return !new HtsPath(path).isPath() //if we have a provider for it that's what we'll use
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe break out out a hasNIOPlugin() (or similar) method for the new HtsPath(path).isPath() everywhere, for clarity?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or should you be using hasFileSystemProvider() instead?

}

//is this onee of the url types that has legacy htsjdk support built in?
private static boolean isSpecialCaseUrlType(final String path) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isSupportedByLegacyCloudReaders()?

public static SeekableStream getStreamFor(final IOPath path, Function<SeekableByteChannel, SeekableByteChannel> wrapper) throws IOException {
if(path.isPath()) {
return path.getScheme().equals("file")
? new SeekablePathStream(path.toPath()) //don't apply the wrapper to local files
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check usages of SeekablePathStream.getSource() to make sure the switch from File -> Path for local files is safe (concerned about file:// appearing unexpectedly downstream). Or to be super-safe, could stick with SeekableFileStream for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched back to SeekableFileStream although it makes me sad.

* @param wrapper a wrapper to apply to the stream allowing direct transformations on the byte stream to be applied
* @throws IOException
*/
public static SeekableStream getStreamFor(final IOPath path, Function<SeekableByteChannel, SeekableByteChannel> wrapper) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there aren't already tests covering file:// vs. local file paths (no protocol), we might want one

Copy link
Contributor

@droazen droazen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just some minor comments. Merge once tests pass and you've addressed them. Have you verified that with this patch we are using http-nio in tribble in GATK? Did you look at the SAM/BAM/CRAM path?

@@ -253,6 +253,10 @@ private URI getURIForString(final String pathString) {
tempURI = getCachedPath().toUri();
}
} catch (URISyntaxException uriException) {
//check that the uri wasn't a badly encoded absolute uri of some sort
//if you don't do this it will be treated as a badly formed file:/ url
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file://

"\nCheck that it is fully encoded.", cause);
}
if(!split[0].equals("file")){
throw new IllegalArgumentException("Malformed url " + pathString + " includes a scheme: " + split[0] + ":/ but was an invalid URI." +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:/ -> ://


//is this onee of the url types that has legacy htsjdk support built in?
public static boolean canBeHandledByLegacyUrlSupport(final String path) {
return URL_SCHEMES_WITH_LEGACY_SUPPORT.stream().anyMatch(path::startsWith);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't you be checking that the path starts with "http://" (for example) instead of "http"? What if there's a file named "http"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -40,6 +43,14 @@
public class SeekableStreamFactory{

private static final ISeekableStreamFactory DEFAULT_FACTORY;
private static final String HTTP = "http";
private static final String HTTPS = "https";
private static final String FTP = "ftp";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See below -- recommend including "://" in these constants to guard against files named after a protocol

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of adding them to the constants which are also used directly compared against getScheme the isHandledBy method now appends '://' to them.

}

@Test(dataProvider = "getSpecialCasePaths")
public void testCanBeHandledByLegacyUrlSuppoort(String path, boolean notExpected){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suppoort....this test is identical to the one just before it. Did you mean to test canBeHandledByLegacyUrlSupport() here instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Assert.assertEquals(new String(bytes, StandardCharsets.UTF_8), "hello");
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any easy way to get test coverage for the case where isBeingHandledByLegacyUrlSupport() detects that an NIO plugin exists for the legacy-supported URL? I'm guessing not, but thought I'd ask.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not an easy way, we can add an additional test suite in gradle but it's going to be a hassle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forget it then

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a manual test where I added the http-nio library to the test dependencies and isBeingHandledByLegacySupport for http / https is changes as expected.

@lbergelson lbergelson merged commit 40e3a8b into master Dec 12, 2023
4 checks passed
lbergelson added a commit that referenced this pull request Dec 13, 2023
* fixing edge cases as well as documentation and tests from the change to HtsPath in #1693
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants