Releases: diana-hep/spark-root
Release 0.1.14
Updates for experimental package:
- Polishing of I/O
- Added optimization passes over the constructed intermediate schema:
  - Remove Empty Rows passes
  - Remove nulls. Comes in two versions, Soft and Hard. Hard removes every branch that is not splittable and contains a null as one of its fields. Soft just removes nulls without checking for this "branch safety".
  - Schema Pruning. Prunes as deep as Spark allows. Takes effect together with Apache Spark PR: apache/spark#16578
- All optimizations are enabled by default (except SoftRemove) and can be turned on or off with spark.sqlContext.read.option("OptimizationName", value), where value is true/false or "on"/"off".
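
The difference between the Soft and Hard null-removal passes can be sketched on a toy schema tree. This is an illustration only, not the spark-root implementation: the `Node` class, the use of `None` as a stand-in for a null type, and the `splittable` flag are hypothetical, and the behaviour of Hard shown here is one reading of the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical schema node; type=None with no children models a null field."""
    name: str
    type: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    splittable: bool = True


def is_null(node: Node) -> bool:
    return node.type is None and not node.children


def soft_remove(node: Node) -> Node:
    """Drop null fields everywhere, without checking whether the branch is splittable."""
    kept = [soft_remove(c) for c in node.children if not is_null(c)]
    return Node(node.name, node.type, kept, node.splittable)


def hard_remove(node: Node) -> Node:
    """Drop null fields, and drop any non-splittable branch that has a null field."""
    kept = []
    for child in node.children:
        if is_null(child):
            continue
        if not child.splittable and any(is_null(g) for g in child.children):
            continue  # assumed unsafe: the whole branch goes
        kept.append(hard_remove(child))
    return Node(node.name, node.type, kept, node.splittable)


def example_tree() -> Node:
    return Node("event", "struct", [
        Node("a", "int"),
        Node("b"),  # null field
        Node("c", "struct", [Node("x", "int"), Node("y")], splittable=False),
        Node("d", "struct", [Node("x", "int"), Node("y")], splittable=True),
    ])
```

On this toy tree, the soft pass keeps branch `c` with its null member stripped, while the hard pass removes `c` entirely because it is marked non-splittable.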
Updates for org.dianahep.sparkroot package:
- Default parallelism is the number of files.
v0.1.14_pre1 Release
Testing out zenodo
0.1.14 pre0 release
Preparing for 0.1.14 release:
- Various schema optimizations: Flattening out Base classes, removing empty rows
- Default parallelism is the number of input files for the non-experimental version
spark-root_2.11-0.1.4 built against root4j-0.1.3
- adding support for TClonesArray that occupy their own branch
- adding support for cases when objects derive from TObject and ignore TObject has been enabled (streamer type = -1)
spark-root_2.11-0.1.3 built against root4j-0.1.3
Substantial updates include reading of:
- STL Collections of basic types
- Nested STL Collections of basic types (e.g. vector<vector<map<int, float> > >)
  Only map and vector are supported for now.
- Composite types of basic types
- Composite types of other composites
- Composite of STL Collection of Composite
- STL Collections of Composite of basic types
- STL Collections of Composite with STL Collection of Composite as a member.
  This is not bulletproof, but it has been tested for both split and non-split branches.
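
To make the nested-collection case concrete, here is a minimal sketch of how a C++ type name such as vector<vector<map<int, float> > > could be translated into a Spark SQL type string. It assumes that vector maps to a Spark array and map to a Spark map, and that char becomes a byte; the function names and the lookup table are hypothetical and are not taken from spark-root.

```python
# Assumed mapping from C++ basic types to Spark SQL simple type names.
BASIC = {
    "bool": "boolean",
    "char": "tinyint",
    "short": "smallint",
    "int": "int",
    "float": "float",
    "double": "double",
}


def split_args(s):
    """Split template arguments at commas that are not nested inside <...>."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(s):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(s[start:i])
            start = i + 1
    args.append(s[start:])
    return args


def to_spark(cpp_type):
    """Translate a C++ type name built from vector, map and basic types."""
    t = cpp_type.replace(" ", "")
    if t in BASIC:
        return BASIC[t]
    if t.endswith(">") and "<" in t:
        cut = t.index("<")
        outer, args = t[:cut], split_args(t[cut + 1:-1])
        if outer == "vector" and len(args) == 1:
            return "array<%s>" % to_spark(args[0])
        if outer == "map" and len(args) == 2:
            return "map<%s,%s>" % (to_spark(args[0]), to_spark(args[1]))
    raise ValueError("unsupported type: " + cpp_type)


print(to_spark("vector<vector<map<int, float> > >"))
```

Anything other than vector, map and the listed basic types raises a ValueError, mirroring the note that only map and vector are supported for now.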
Limitations:
- Objects that derive from TObject
- any type of pointers
- fixed-size arrays of classes as class members: should be easy to add, but the current release doesn't support them. Such a member is a pointer, but not of varying length.
VK
spark-root_v0.1.0_alpha.4 with root4j_v0.1.2
Added support for early pruning: only the columns required by the query are used.
Pushed to Maven Central.
VK
spark-root_v0.1.0_alpha.3 with root4j_v0.1.2
dependency changes only - to be used with root4j:0.1.2
spark-root-v0.1.0-alpha.1
Builds on top of alpha.0 and adds support for C structs of basic types and fixed-dimension arrays of basic types.
First Release
- Basic Numerical Types (e.g. Int, Float, Double, Byte, Short)
- Char is represented as Byte
- Single TLeaf for a branch
- C like structs stored for a branch (e.g. where leaflist has "var1/I:var2/I") are not yet supported
- 1D and N-dimensional arrays of fixed dimensions and simple numerical types are supported
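
For readers unfamiliar with the leaflist notation mentioned above, here is a small sketch that parses a string such as "var1/I:var2/I" into field names, type names and fixed array shapes. The type codes (B, S, I, F, D) and the rule that a leaf without a code inherits the previous one (F for the first leaf) follow ROOT's leaflist convention; the function itself, the regular expression and the chosen type names are hypothetical and not part of spark-root.

```python
import re

# ROOT leaflist codes for the numerical types listed above (other codes exist).
ROOT_CODES = {"B": "Byte", "S": "Short", "I": "Int", "F": "Float", "D": "Double"}

# name, optional fixed dimensions like [2][3], optional /code
LEAF = re.compile(r"^(\w+)((?:\[\d+\])*)(?:/(\w))?$")


def parse_leaflist(leaflist):
    """Return a list of (name, type_name, shape) tuples for a leaflist string."""
    fields = []
    previous = "F"  # ROOT's default when the first leaf has no type code
    for item in leaflist.split(":"):
        match = LEAF.match(item.strip())
        if match is None:
            raise ValueError("cannot parse leaf: " + item)
        name, dims, code = match.groups()
        code = code or previous  # inherit the previous leaf's type
        previous = code
        if code not in ROOT_CODES:
            raise ValueError("unsupported type code: " + code)
        shape = [int(d) for d in re.findall(r"\[(\d+)\]", dims)]
        fields.append((name, ROOT_CODES[code], shape))
    return fields


print(parse_leaflist("var1/I:var2/I"))
```

An empty shape denotes a scalar leaf; a leaf such as arr[2][3]/D would yield the shape [2, 3].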