-
Notifications
You must be signed in to change notification settings - Fork 243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implements collections of bundles. #1702
Changes from 3 commits
196c169
3df0b8b
3f0d79d
aa44a43
8b0e589
99cd8c2
36764e1
79ba8d9
e6cfc62
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,10 +4,12 @@ | |
import htsjdk.io.IOPath; | ||
import htsjdk.samtools.util.Log; | ||
import htsjdk.utils.ValidationUtils; | ||
import org.json.JSONArray; | ||
import org.json.JSONException; | ||
import org.json.JSONObject; | ||
|
||
import java.util.ArrayList; | ||
import java.util.Collection; | ||
import java.util.Collections; | ||
import java.util.HashSet; | ||
import java.util.List; | ||
|
@@ -30,11 +32,12 @@ public class BundleJSON { | |
public static final String JSON_PROPERTY_PRIMARY = "primary"; | ||
public static final String JSON_PROPERTY_PATH = "path"; | ||
public static final String JSON_PROPERTY_FORMAT = "format"; | ||
|
||
public static final String JSON_SCHEMA_NAME = "htsbundle"; | ||
public static final String JSON_SCHEMA_VERSION = "0.1.0"; // TODO: bump this to 1.0.0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should this be 0.2.0 now? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will update the schema in a separate PR when I move the bundle classes out of the beta package. |
||
|
||
final private static Set<String> TOP_LEVEL_PROPERTIES = Collections.unmodifiableSet( | ||
new HashSet<String>() { | ||
new HashSet<>() { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can these static declarations be migrated to use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh yeah, thats an improvement. Done. |
||
private static final long serialVersionUID = 1L; | ||
{ | ||
add(JSON_PROPERTY_SCHEMA_NAME); | ||
|
@@ -43,7 +46,7 @@ public class BundleJSON { | |
}}); | ||
|
||
/** | ||
* Serialize this bundle to a JSON string representation. All resources in the bundle must | ||
* Serialize a bundle to a JSON string representation. All resources in the bundle must | ||
* be {@link IOPathResource}s for serialization to succeed. Stream resources cannot be serialized. | ||
* | ||
* @param bundle the {@link Bundle} to serialize to JSON | ||
|
@@ -74,78 +77,176 @@ public static String toJSON(final Bundle bundle) { | |
} | ||
|
||
/** | ||
* Create a Bundle from jsonString. | ||
* | ||
* @param jsonString a valid JSON string conforming to the bundle schema | ||
* @return a {@link Bundle} created from jsonString | ||
* Convert a (non-empty) Collection of Bundles to a JSON array string representation. | ||
* @param bundles a Collection of Bundles to serialize to JSON | ||
* @return a JSON string (array) representation of the collection of bundles | ||
* @throw IllegalArgumentException if the collection is empty | ||
*/ | ||
public static Bundle toBundle(final String jsonString) { | ||
return toBundle(ValidationUtils.nonEmpty(jsonString, "resource list"), HtsPath::new); | ||
public static String toJSON(final Collection<Bundle> bundles) { | ||
if (bundles.isEmpty()) { | ||
throw new IllegalArgumentException("A bundle collection must contain at least one bundle"); | ||
} | ||
return bundles.stream() | ||
.map(BundleJSON::toJSON) | ||
.collect(Collectors.joining(",\n", "[", "]")); | ||
} | ||
|
||
/** | ||
* Create a Bundle from a jsonString. | ||
* | ||
* @param jsonString a valid JSON string conforming to the bundle schema (for compatibility, a bundle list is also | ||
* accepted, as long as it only contains a single bundle) | ||
* @return a {@link Bundle} created from jsonString | ||
*/ | ||
public static Bundle toBundle(final String jsonString) { | ||
return toBundle(ValidationUtils.nonEmpty(jsonString, "resource list"), HtsPath::new); | ||
} | ||
|
||
/** | ||
* Create a Bundle from jsonString using a custom class that implements {@link IOPath} for all resources. | ||
* (For compatibility, a bundle list string is also accepted, as long as it only contains a single bundle). | ||
* | ||
* @param jsonString a valid JSON string conforming to the bundle schema | ||
* @param ioPathConstructor a function that takes a string and returns an IOPath-derived class of type <T> | ||
* @param <T> the IOPath-derived type to use for IOPathResources | ||
* @return a newly created {@link Bundle} | ||
* @return a newly created {@link Bundle} | ||
*/ | ||
public static <T extends IOPath> Bundle toBundle( | ||
final String jsonString, | ||
final Function<String, T> ioPathConstructor) { | ||
ValidationUtils.nonEmpty(jsonString, "JSON string"); | ||
ValidationUtils.nonNull(ioPathConstructor, "IOPath-derived class constructor"); | ||
try { | ||
return toBundle(new JSONObject(jsonString), ioPathConstructor); | ||
} catch (JSONException | UnsupportedOperationException e) { | ||
// see if the user provided a collection instead of a single bundle, and if so, present it as | ||
// a Bundle as long as it only contains one Bundle | ||
try { | ||
final Collection<Bundle> bundles = toBundleCollection(jsonString, ioPathConstructor); | ||
if (bundles.size() > 1) { | ||
throw new IllegalArgumentException( | ||
String.format("A JSON string with more than one bundle was provided but only a single Bundle is allowed", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The format string is missing the template variables. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, fixed. |
||
e.getMessage(), | ||
e.getMessage())); | ||
} | ||
return bundles.stream().findFirst().get(); | ||
} catch (JSONException | UnsupportedOperationException e2) { | ||
throw new IllegalArgumentException( | ||
String.format("JSON can be interpreted neither as an individual bundle (%s) nor as a bundle collection (%s)", | ||
e2.getMessage(), | ||
e.getMessage()), | ||
e); | ||
} | ||
} | ||
} | ||
|
||
final List<BundleResource> resources = new ArrayList<>(); | ||
String primaryContentType; | ||
/** | ||
* Create a Collection<Bundle> from a jsonString, using a custom class that implements {@link IOPath} for all | ||
* resources. | ||
* @param jsonString the json string must conform to the bundle schema, and may contain an array or single object | ||
* @param ioPathConstructor constructor to use to create the backing IOPath for all resources | ||
* @return Collection<Bundle> | ||
* @param <T> IOPath-derived class to use for IOPathResources | ||
*/ | ||
public static <T extends IOPath> Collection<Bundle> toBundleCollection( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we really want this to be Collection and not something with an order? I would think List might be better? The input order is usually important in some way, changing the encounter order can often change either errors generated or floating point aggregations even if it's something that seems like it should be file order agnostic. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changed to Lists. |
||
final String jsonString, | ||
final Function<String, T> ioPathConstructor) { | ||
ValidationUtils.nonEmpty(jsonString, "json bundle string"); | ||
ValidationUtils.nonNull(ioPathConstructor, "IOPath-derived class constructor"); | ||
|
||
final List<Bundle> bundles = new ArrayList<>(); | ||
try { | ||
final JSONObject jsonDocument = new JSONObject(jsonString); | ||
if (jsonString.length() < 1) { | ||
final JSONArray jsonArray = new JSONArray(jsonString); | ||
jsonArray.forEach(element -> { | ||
if (! (element instanceof JSONObject jsonObject)) { | ||
throw new IllegalArgumentException( | ||
String.format("Bundle collections may contain only Bundle objects, found %s", | ||
element.toString())); | ||
} | ||
bundles.add(toBundle(jsonObject, ioPathConstructor)); | ||
}); | ||
} catch (JSONException | UnsupportedOperationException e) { | ||
// see if the user provided a single bundle instead of a collection, if so, wrap it up as a collection | ||
try { | ||
bundles.add(toBundle(new JSONObject(jsonString), ioPathConstructor)); | ||
} catch (JSONException | UnsupportedOperationException e2) { | ||
throw new IllegalArgumentException( | ||
String.format("JSON file parsing failed %s", jsonString)); | ||
String.format("JSON can be interpreted neither as an individual bundle (%s) nor as a bundle collection (%s)", | ||
e2.getMessage(), | ||
e.getMessage()), | ||
e); | ||
} | ||
} | ||
if (bundles.size() < 1) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nitpicky but I isEmpty might be better There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
throw new IllegalArgumentException("JSON bundle collection must contain at least one bundle"); | ||
} | ||
return bundles; | ||
} | ||
|
||
/** | ||
* Create a Collection<Bundle> from a jsonString. | ||
* | ||
* @param jsonString a JSON strings that conform to the bundle schema; may be an array or single object | ||
* @return a {@link Collection<Bundle>} created from a Collection of jsonStrings | ||
*/ | ||
public static Collection<Bundle> toBundleCollection(final String jsonString) { | ||
return toBundleCollection(jsonString, HtsPath::new); | ||
} | ||
|
||
private static <T extends IOPath> Bundle toBundle( | ||
final JSONObject jsonObject, // must be a single Bundle object | ||
final Function<String, T> ioPathConstructor) { | ||
try { | ||
// validate the schema name | ||
final String schemaName = getRequiredPropertyAsString(jsonDocument, JSON_PROPERTY_SCHEMA_NAME); | ||
final String schemaName = getRequiredPropertyAsString(jsonObject, JSON_PROPERTY_SCHEMA_NAME); | ||
if (!schemaName.equals(JSON_SCHEMA_NAME)) { | ||
throw new IllegalArgumentException( | ||
String.format("Expected bundle schema name %s but found %s", JSON_SCHEMA_NAME, schemaName)); | ||
} | ||
|
||
// validate the schema version | ||
final String schemaVersion = getRequiredPropertyAsString(jsonDocument, JSON_PROPERTY_SCHEMA_VERSION); | ||
final String schemaVersion = getRequiredPropertyAsString(jsonObject, JSON_PROPERTY_SCHEMA_VERSION); | ||
if (!schemaVersion.equals(JSON_SCHEMA_VERSION)) { | ||
throw new IllegalArgumentException(String.format("Expected bundle schema version %s but found %s", | ||
JSON_SCHEMA_VERSION, schemaVersion)); | ||
} | ||
primaryContentType = getRequiredPropertyAsString(jsonDocument, JSON_PROPERTY_PRIMARY); | ||
|
||
jsonDocument.keySet().forEach((String contentType) -> { | ||
if (! (jsonDocument.get(contentType) instanceof JSONObject jsonDoc)) { | ||
return; | ||
} | ||
|
||
if (!TOP_LEVEL_PROPERTIES.contains(contentType)) { | ||
final String format = jsonDoc.optString(JSON_PROPERTY_FORMAT, null); | ||
final IOPathResource ioPathResource = new IOPathResource( | ||
ioPathConstructor.apply(getRequiredPropertyAsString(jsonDoc, JSON_PROPERTY_PATH)), | ||
contentType, | ||
format == null ? | ||
null : | ||
jsonDoc.optString(JSON_PROPERTY_FORMAT, null)); | ||
resources.add(ioPathResource); | ||
} | ||
}); | ||
if (resources.isEmpty()) { | ||
LOG.warn("Empty resource bundle found: ", jsonString); | ||
final String primaryContentType = getRequiredPropertyAsString(jsonObject, JSON_PROPERTY_PRIMARY); | ||
final Collection<BundleResource> bundleResources = toBundleResources(jsonObject, ioPathConstructor); | ||
if (bundleResources.isEmpty()) { | ||
LOG.warn("Empty resource bundle found in: ", jsonObject.toString()); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we allow empty bundles at all? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually, we already don't. I'll remove this (the |
||
} | ||
return new Bundle(primaryContentType, bundleResources); | ||
} catch (JSONException | UnsupportedOperationException e) { | ||
throw new IllegalArgumentException(e); | ||
} | ||
} | ||
private static <T extends IOPath> IOPathResource toBundleResource( | ||
final String contentType, | ||
final JSONObject jsonObject, | ||
final Function<String, T> ioPathConstructor) { | ||
final String format = jsonObject.optString(JSON_PROPERTY_FORMAT, null); | ||
return new IOPathResource( | ||
ioPathConstructor.apply(getRequiredPropertyAsString(jsonObject, JSON_PROPERTY_PATH)), | ||
contentType, | ||
format == null ? null : format); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this ternary is unnecessary since you're not dereferencing format here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed. |
||
} | ||
private static <T extends IOPath> Collection<BundleResource> toBundleResources( | ||
final JSONObject jsonResources, | ||
final Function<String, T> ioPathConstructor) { | ||
|
||
return new Bundle(primaryContentType, resources); | ||
final List<BundleResource> bundleResources = new ArrayList<>(); // default capacity of 10 seems right | ||
jsonResources.keySet().forEach(key -> { | ||
if (!TOP_LEVEL_PROPERTIES.contains(key)) { | ||
if (jsonResources.get(key) instanceof JSONObject resourceObject) { | ||
bundleResources.add(toBundleResource(key, resourceObject, ioPathConstructor)); | ||
} else { | ||
throw new IllegalArgumentException( | ||
String.format("Bundle resources may contain only BundleResource objects, found %s", key)); | ||
} | ||
} | ||
}); | ||
return bundleResources; | ||
} | ||
|
||
private static String getRequiredPropertyAsString(JSONObject jsonDocument, String propertyName) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow what's going on here when you have a LinkedHashMap. Shouldn't serializing a linked hashmap deserialize it to the same map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. I was having problems when roundtripping these through JSON, but I can no longer reproduce the issue, so reverting to our beloved LinkedHashMap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Long Live LinkedHashMap!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I realized what the issue was here.
JSONObject
internally uses aHashMap
(implying, I think, that JSON doesn't preserve the serialized order of JSON attributes), so when you roundtrip through JSON, the iteration order from JSON is based on theHashMap
order. If we useLinkedHashMap
inBundle
, then the order after a roundtrip gets scrambled, and tests fail (but only for some cases because sometimes the roundtrip order matches and sometimes it differs).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahah, that makes sense. Seems like a weird decision to not maintain internal order, but it's good to know about. Cana we change our tests to use an order independent comparison?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an resource-order-independent equals method for use in the tests.