From 6bf9e142fc0a0002b8c475a8dec75a702dee1c39 Mon Sep 17 00:00:00 2001
From: Binh-Minh
Date: Sat, 16 Mar 2024 02:26:56 -0400
Subject: [PATCH 1/2] Added more page and improve format

---
 .../release_specifics/additional_APIs | 37 ++++++
 .../hdf5-docs/release_specifics/hdf5_1_10.md | 2 +-
 .../hdf5-docs/release_specifics/hdf5_1_12.md | 1 +
 .../hdf5-docs/release_specifics/hdf5_1_14.md | 1 -
 .../release_specifics/new_features_1_10.md | 106 ++++++++++--------
 .../release_specifics/new_features_1_12.md | 17 +--
 6 files changed, 105 insertions(+), 59 deletions(-)
 create mode 100644 documentation/hdf5-docs/release_specifics/additional_APIs

diff --git a/documentation/hdf5-docs/release_specifics/additional_APIs b/documentation/hdf5-docs/release_specifics/additional_APIs
new file mode 100644
index 00000000..391b7cd0
--- /dev/null
+++ b/documentation/hdf5-docs/release_specifics/additional_APIs
@@ -0,0 +1,37 @@
+---
+title: Additional New APIs
+redirect_from:
+ - display/HDF5/Additional+New+APIs
+---
+
+This page describes various new functions, a new struct, and new macros that are either unrelated to new features described elsewhere or have aspects that are unrelated to the feature where they are otherwise described. The page includes the following sections:
+
+Versioned Functions and Struct with Associated Macros
+Property List Encoding and Decoding Functions
+External File Prefix Functions
+Versioned Functions and Struct with Associated Macros
+Two functions in the HDF5 C interface, and a struct associated with one of those functions, have been versioned in this release. At the same time, interface compatibility macros were created for programmatic management of these functions. See API Compatibility Macros in HDF5 for a detailed discussion of versioned functions and structs and the related macros.
+
+These functions, struct, and macros are documented in the HDF5 Reference Manual:
+
+H5F_GET_INFO Macro for programmatic control of versioned functions
+H5F_GET_INFO1 Original function
+H5F_GET_INFO2 New version introduced at Release 1.10.0
+
+H5R_DEREFERENCE Macro for programmatic control of versioned functions
+H5R_DEREFERENCE1 Original function
+H5R_DEREFERENCE2 New version introduced at Release 1.10.0
+Property List Encoding and Decoding Functions
+Functions have been added in the HDF5 C interface to encode and decode property lists.
+
+These functions are documented in the HDF5 Reference Manual:
+
+H5P_ENCODE
+H5P_DECODE
+External File Prefix Functions
+Datasets that store raw data in external files can be created in HDF5. Functions have been added in the HDF5 C interface to enable such external files to be located via a relative path from the directory where the HDF5 file containing such a dataset is located:
+
+These functions are documented in the HDF5 Reference Manual:
+
+H5P_SET_EFILE_PREFIX
+H5P_GET_EFILE_PREFIX
diff --git a/documentation/hdf5-docs/release_specifics/hdf5_1_10.md b/documentation/hdf5-docs/release_specifics/hdf5_1_10.md
index 49be8c0d..99d07374 100644
--- a/documentation/hdf5-docs/release_specifics/hdf5_1_10.md
+++ b/documentation/hdf5-docs/release_specifics/hdf5_1_10.md
@@ -8,7 +8,7 @@ redirect_from:
 ### New Features in HDF5 Release 1.10
-* [Additional New APIs](/documentation/hdf5-docs/release_specifics/)
+* [Additional New APIs](/documentation/hdf5-docs/release_specifics/additional_APIs.md)
 * [Chunk Query Functionality (RFC)](https://docs.hdfgroup.org/hdf5/rfc/RFC-Chunking%20Functions-2018-06-20-v3.docx.pdf)
 * [Minimum Object Headers (RFC)](https://docs.hdfgroup.org/hdf5/rfc/RFC_Min_Obj_Headers_181231.pdf)
 * [Parallel Library Change (RFC)](https://docs.hdfgroup.org/hdf5/develop/_r_f_c.html)
diff --git a/documentation/hdf5-docs/release_specifics/hdf5_1_12.md b/documentation/hdf5-docs/release_specifics/hdf5_1_12.md
index d68a1596..ae7561f4 100644
--- a/documentation/hdf5-docs/release_specifics/hdf5_1_12.md
+++ b/documentation/hdf5-docs/release_specifics/hdf5_1_12.md
@@ -7,6 +7,7 @@ redirect_from:
 ### [Migrating from HDF5 1.10 to HDF5 1.12](/documentation/hdf5-docs/release_specifics/Migrating_from_HDF5_1.10_to_HDF5_1.14.md)
 ### [New Features in HDF5 Release 1.12](/documentation/hdf5-docs/release_specifics/new_features_1_12.md)
+
 * [H5Sencode / H5Sdecode Format Change - RFC](https://docs.hdfgroup.org/hdf5/rfc/H5Sencode_format.docx.pdf)
 * [Update to References](https://docs.hdfgroup.org/hdf5/rfc/RFC_Update_to_HDF5_References.pdf)
 * [Update to Selections](https://docs.hdfgroup.org/hdf5/rfc/selection_io_RFC_210610.pdf)
diff --git a/documentation/hdf5-docs/release_specifics/hdf5_1_14.md b/documentation/hdf5-docs/release_specifics/hdf5_1_14.md
index 958678a1..55bb6277 100644
--- a/documentation/hdf5-docs/release_specifics/hdf5_1_14.md
+++ b/documentation/hdf5-docs/release_specifics/hdf5_1_14.md
@@ -7,7 +7,6 @@ redirect_from:
 ### [Migrating from HDF5 1.12 to HDF5 1.14](/documentation/hdf5-docs/release_specifics/Migrating_from_HDF5_1.12_to_HDF5_1.14.md)
 ### [New Features in HDF5 Release 1.14](/documentation/hdf5-docs/release_specifics/new_features_1_14.md)
-
 * [H5Sencode / H5Sdecode Format Change - RFC](https://docs.hdfgroup.org/hdf5/rfc/H5Sencode_format.docx.pdf)
 * [Update to References](https://docs.hdfgroup.org/hdf5/rfc/RFC_Update_to_HDF5_References.pdf)
diff --git a/documentation/hdf5-docs/release_specifics/new_features_1_10.md b/documentation/hdf5-docs/release_specifics/new_features_1_10.md
index 10d6bed6..3683a964 100644
--- a/documentation/hdf5-docs/release_specifics/new_features_1_10.md
+++ b/documentation/hdf5-docs/release_specifics/new_features_1_10.md
@@ -6,27 +6,30 @@ redirect_from:
 HDF5 1.10 introduces several new features in the HDF5 library. These new features were added in the first three releases of HDF5-1.10.
 For a brief description of each new feature see:
-[New Features Introduced in HDF5 1.10.8
-New Features Introduced in HDF5 1.10.7
-New Features introduced in HDF5 1.10.6
-New Features introduced in HDF5 1.10.5
-New Features Introduced in HDF5 1.10.2
-New Features Introduced in HDF5 1.10.1
-New Features Introduced in HDF5 1.10.0
+* [New Features Introduced in HDF5 1.10.8](#New-Features-Introduced-in-HDF5-1.10.8)
+* [New Features Introduced in HDF5 1.10.7](#New-Features-Introduced-in-HDF5-1.10.7)
+* [New Features introduced in HDF5 1.10.6](#New-Features-Introduced-in-HDF5-1.10.6)
+* [New Features introduced in HDF5 1.10.5](#New-Features-Introduced-in-HDF5-1.10.5)
+* [New Features Introduced in HDF5 1.10.2](#New-Features-Introduced-in-HDF5-1.10.2)
+* [New Features Introduced in HDF5 1.10.1](#New-Features-Introduced-in-HDF5-1.10.1)
+* [New Features Introduced in HDF5 1.10.0](#New-Features-Introduced-in-HDF5-1.10.0)
+
+~~~
 This release includes changes in the HDF5 storage format. For detailed information on the changes, see: Changes to the File Format Specification
-PLEASE NOTE that HDF5-1.8 cannot read files created with the new features described below that are marked with *.
+PLEASE NOTE that HDF5-1.8 cannot read files created with the new features described below that are marked with \*.
 These changes come into play when one or more of the new features is used or when an application calls for use of the latest storage format (H5P_SET_LIBVER_BOUNDS). See the RFC for more details. Due to the requirements of some of the new features, the format of a 1.10.x HDF5 file is likely to be different from that of a 1.8.x HDF5 file. This means that tools and applications built to read 1.10.x files will be able to read a 1.8.x file, but tools built to read 1.8.x files may not be able to read a 1.10.x file.
 If an application built on HDF5 Release 1.10 avoids use of the new features and does not request use of the latest format, applications built on HDF5 Release 1.8.x will be able to read files the first application created. In addition, applications originally written for use with HDF5 Release 1.8.x can be linked against a suitably configured HDF5 Release 1.10.x library, thus taking advantage of performance improvements in 1.10.
+~~~
- New Features Introduced in HDF5 1.10.8
+### New Features Introduced in HDF5 1.10.8
 The following important new features and changes were introduced in HDF5-1.10.8. For complete details see the Release Notes and the Software Changes from Release to Release page.
-New Features Introduced in HDF5 1.10.7
+### New Features Introduced in HDF5 1.10.7
 The following important new features and changes were introduced in HDF5-1.10.7. For complete details see the Release Notes and the Software Changes from Release to Release page.
 Addition of AEC (open source SZip) Compression Library
@@ -43,28 +46,29 @@ Performance has continued to improve in this release. Please see the images unde
 Addition of Hyperslab Selection Functions
 Several hyperslab selection routines introduced in HDF5-1.12 were ported to 1.10. See the Software Changes from Release to Release page for details.
-New Features Introduced in HDF5 1.10.6
+### New Features Introduced in HDF5 1.10.6
 The following important new features and changes were introduced in HDF5-1.10.6. For complete details see the Release Notes and the Software Changes from Release to Release page:
-Improvements to the CMake Support
+#### Improvements to the CMake Support
 Several improvements were added to the CMake support, including:
 Support was added for VS 2019 on Windows (with CMake 3.15).
 Support was added for MinGW using a toolchain file on Linux (C only).
-Virtual File Drivers - S3 and HDFS
+#### Virtual File Drivers - S3 and HDFS
 Two Virtual File Drivers (VFDs) have been introduced in 1.10.6:
-The S3 VFD enables access to an HDF5 file via the Amazon Simple Storage Service (Amazon S3).
-The HDFS VFD enables access to an HDF5 file with the Hadoop Distributed File System (HDFS).
+* The S3 VFD enables access to an HDF5 file via the Amazon Simple Storage Service (Amazon S3).
+* The HDFS VFD enables access to an HDF5 file with the Hadoop Distributed File System (HDFS).
+
 See the Virtual File Drivers - S3 and HDFS page for more information.
-Improvement to Performance
+#### Improvement to Performance
 Performance was improved when creating a large number of small datasets.
-New Features Introduced in HDF5 1.10.5
+### New Features Introduced in HDF5 1.10.5
 The following important new features were added in HDF5-1.10.5. Please see the release announcement and Software Changes from Release to Release page for more details regarding these features:
-Minimized Dataset Object Headers (RFC)
+#### Minimized Dataset Object Headers (RFC)
 The ability to minimize dataset object headers was added to reduce the file bloat caused by extra space in the dataset object header. The file bloat can occur when creating many, very small datasets. See the Release Notes for more details regarding this issue. The following APIs were introduced to support this feature:
@@ -85,16 +89,18 @@ H5P_SET_DSET_NO_ATTRS_HINT
 Sets the flag to create minimized dataset object headers
-Parallel Library Change (RFC)
+#### Parallel Library Change (RFC)
+
 A change was added to the default behavior in parallel when reading the same dataset in its entirety (i.e. H5S_ALL dataset selection) which is being read by all the processes collectively. The dataset must be contiguous, less than 2GB, and of an atomic datatype.
 The new behavior in the HDF5 library uses an MPI_Bcast to pass the data read from the disk by the root process to the remaining processes in the MPI communicator associated with the HDF5 file. A CFD application was used to benchmark CGNS with:
-compact storage
-read-proc0-and-bcast
-These results were reported by Greg Sjaardema from Sandia National Laboratories.
+* compact storage
+* read-proc0-and-bcast
+These results were reported by Greg Sjaardema from Sandia National Laboratories.
+(image missing)
 Series 1 is the read-proc0-and-bcast solution
 Series 2 is a single MPI_Bcast
@@ -105,79 +111,81 @@ Compact 192 is also using compact storage
 Compact 384 is also using compact storage
 The last 3 “compact” curves are just three different batch jobs on 192, 384, and 552 nodes (with 36 core/node). The Series 2 and 3 curves are not related to the CGNS benchmark, but give a qualitative indication on the scaling behavior of MPI_Bcast. Both read-proc0-and-bcast and compact storage follow MPI_Bcast’s trend, which makes sense since both methods rely on MPI_Bcast. (See the RFC for better resolution.)
-OpenMPI Support
+#### OpenMPI Support
 Support for OpenMPI was added. For known problems and issues please see OpenMPI Build Issues. To better support OpenMPI, all MPI-1 API calls were replaced by MPI-2 equivalents.
-Chunk Query Functions (RFC)
+#### Chunk Query Functions (RFC)
 New functions were added to find locations, sizes and filters applied to chunks of a dataset. This functionality is useful for applications that need to read chunks directly from the file, bypassing the HDF5 library.
 H5D_GET_CHUNK_INFO
 Retrieves information about a chunk specified by the chunk index
 H5D_GET_CHUNK_INFO_BY_COORD
 Retrieves information about a chunk specified by its coordinates
 H5D_GET_NUM_CHUNKS
 Retrieves number of chunks that have nonempty intersection with a specified selection
-New Features Introduced in HDF5 1.10.2
+### New Features Introduced in HDF5 1.10.2
 Several important features and changes were added to HDF5 1.10.2. See the release announcement and blog for complete details. Following are the major new features:
-Forward Compatibility for HDF5 1.8-based Applications Accessing Files Created by HDF5 1.10.2 ( RFC )
+#### Forward Compatibility for HDF5 1.8-based Applications Accessing Files Created by HDF5 1.10.2 ( RFC )
 In HDF5 1.8.0, the H5P_SET_LIBVER_BOUNDS function was introduced for specifying the earliest ("low") and latest ("high") versions of the library to use when writing objects. With HDF5 1.10.2, new values for "low" and "high" were introduced: H5F_LIBVER_18 and H5F_LIBVER_LATEST is now mapped to H5F_LIBVER_V110. See the H5P_SET_LIBVER_BOUNDS function for details.
-Performance Optimizations for HDF5 Parallel Applications
+#### Performance Optimizations for HDF5 Parallel Applications
 Optimizations were introduced to parallel HDF5 for improving the performance of open, close and flush operations at scale.
-Using Compression with HDF5 Parallel Applications
+#### Using Compression with HDF5 Parallel Applications
 HDF5 parallel applications can now write data using compression (and other filters such as the Fletcher32 checksum filter).
-
-New Features Introduced in HDF5 1.10.1
-Metadata Cache Image ( RFC ) » Fine-tuning the Metadata Cache *
+### New Features Introduced in HDF5 1.10.1
+
+#### Metadata Cache Image ( RFC ) » Fine-tuning the Metadata Cache *
 HDF5 metadata is typically small, and scattered throughout the HDF5 file. This can affect performance, particularly on large HPC systems.
 The Metadata Cache Image feature can improve performance by writing the metadata cache in a single block on file close, and then populating the cache with the contents of this block on file open, thus avoiding the many small I/O operations that would otherwise be required on file open and close.
-Metadata Cache Evict on Close » Fine-tuning the Metadata Cache
+#### Metadata Cache Evict on Close » Fine-tuning the Metadata Cache
 The HDF5 library's metadata cache is fairly conservative about holding on to HDF5 object metadata (object headers, chunk index structures, etc.), which can cause the cache size to grow, resulting in memory pressure on an application or system. The "evict on close" property will cause all metadata for an object to be evicted from the cache as long as metadata is not referenced from any other open object.
-Paged Aggregation ( RFC ) » File Space Management *
+#### Paged Aggregation ( RFC ) » File Space Management *
 The current HDF5 file space allocation accumulates small pieces of metadata and raw data in aggregator blocks which are not page aligned and vary widely in sizes. The paged aggregation feature was implemented to provide efficient paged access of these small pieces of metadata and raw data.
-Page Buffering ( RFC )
+#### Page Buffering ( RFC )
 Small and random I/O accesses on parallel file systems result in poor performance for applications. Page buffering in conjunction with paged aggregation can improve performance by giving an application control of minimizing HDF5 I/O requests to a specific granularity and alignment.
-New Features Introduced in HDF5 1.10.0
-SWMR *
+### New Features Introduced in HDF5 1.10.0
+
+#### SWMR *
 Data acquisition and computer modeling systems often need to analyze and visualize data while it is being written. It is not unusual, for example, for an application to produce results in the middle of a run that suggest some basic parameters be changed, sensors be adjusted, or the run be scrapped entirely.
 To enable users to check on such systems, we have been developing a concurrent read/write file access pattern we call SWMR (pronounced swimmer). SWMR is short for single-writer/multiple-reader. SWMR functionality allows a writer process to add data to a file while multiple reader processes read from the file.
-Fine-tuning the Metadata Cache
+#### Fine-tuning the Metadata Cache
 The orderly operation of the metadata cache is crucial to SWMR functioning. A number of APIs have been developed to handle the requests from writer and reader processes and to give applications the control of the metadata cache they might need. However, the metadata cache APIs can be used when SWMR is not being used; so, these functions are described separately.
-Collective Metadata I/O
+#### Collective Metadata I/O
 Calls for HDF5 metadata can result in many small reads and writes. On metadata reads, collective metadata I/O can improve performance by allowing the library to perform optimizations when reading the metadata, by having one rank read the data and broadcasting it to all other ranks.
-Collective metadata I/O improves metadata write performance through the construction of an MPI derived datatype that is then written collectively in a single call.
+#### Collective metadata I/O improves metadata write performance through the construction of an MPI derived datatype that is then written collectively in a single call.
-File Space Management *
+#### File Space Management *
 Usage patterns when working with an HDF5 file sometimes result in wasted space within the file. This can also impair access times when working with the resulting files. The new file space management feature provides strategies for managing space in a file to improve performance in both of these arenas.
-Virtual Datasets (VDS) *
+#### Virtual Datasets (VDS) *
 With a growing amount of data in HDF5, the need has emerged to access data stored across multiple HDF5 files using standard HDF5 objects, such as groups and datasets, without rewriting or rearranging the data. The new virtual dataset (VDS) feature enables an application to draw on multiple datasets and files to create virtual datasets without moving or rewriting any data.
-Partial Edge Chunk Options *
+#### Partial Edge Chunk Options *
 New options for the storage and filtering of partial edge chunks in a dataset provide a tool for tuning I/O speed and file size in cases where the dataset size may not be a multiple of the chunk size.
-Additional New APIs
+#### Additional New APIs
 In addition to the features described above, several additional new functions, a new struct, and new macros have been introduced or newly versioned in this release.
-Changes to the File Format Specification
+### Changes to the File Format Specification
 The file format of the HDF5 library has been changed to support the new features in HDF5-1.10. See the HDF5 File Format Specification for complete details on the changes. This specification describes how the bytes in an HDF5 file are organized on the storage media where the file is kept. In other words, when a file is written to disk, the file will be written according to the information described in this file. The following sections have been added or changed:
-Another version of the superblock was added.
-Additional B-tree types were added to the version 2 B-trees.
-The global heap block for virtual datasets was added.
-The Data Layout Message was changed: the name was changed, and version 4 of the data layout message was added for the virtual type.
-Additional types of indexes were added for dataset chunks.
+* Another version of the superblock was added.
+* Additional B-tree types were added to the version 2 B-trees.
+* The global heap block for virtual datasets was added.
+* The Data Layout Message was changed: the name was changed, and version 4 of the data layout message was added for the virtual type.
+* Additional types of indexes were added for dataset chunks.
+
 HDF5-1.8 cannot read files created with the new features described on this page that are marked with *.
diff --git a/documentation/hdf5-docs/release_specifics/new_features_1_12.md b/documentation/hdf5-docs/release_specifics/new_features_1_12.md
index 764f2682..fcd0f53b 100644
--- a/documentation/hdf5-docs/release_specifics/new_features_1_12.md
+++ b/documentation/hdf5-docs/release_specifics/new_features_1_12.md
@@ -22,19 +22,20 @@ The Virtual Object Layer (VOL) is an abstraction layer within the HDF5 library t
-The plugins can actually store the objects in variety of ways. A plugin could, for example, have objects be distributed remotely over different platforms, provide a raw mapping of the model to the file system, or even store the data in other file formats (like native netCDF or HDF4 format). The user still gets the same data model where access is done to a single HDF5 “container”; however the plugin object driver translates from what the user sees to how the data is actually stored. Having this abstraction layer maintains the object model of HDF5 and allows better usage of new object storage file systems that are targeted for Exascale systems.
+The plugins can actually store the objects in variety of ways. A plugin could, for example, have objects be distributed remotely over different platforms, provide a raw mapping of the model to the file system, or even store the data in other file formats (like native netCDF or HDF4 format). The user still gets the same data model where access is done to a single HDF5 \"container\"; however the plugin object driver translates from what the user sees to how the data is actually stored. Having this abstraction layer maintains the object model of HDF5 and allows better usage of new object storage file systems that are targeted for Exascale systems.
-Hyperslab Performance Improvements
+### Hyperslab Performance Improvements
 In 1.12.0 the hyperslab selection code was optimized to achieve better performance. In general, performance improved by an order of a magnitude. In the case of reading a regular selection from a 20 GB dataset into a one dimensional array, performance improved by a factor of 6000. If you are interested in the benchmark we ran, please see issue HDFFV-10930 by logging into jira.hdfgroup.org with your hdfgroup.org login.
-Update to References (RFC) *
+### Update to References (RFC) *
 See the Update to References page for details on the changes in HDF5-1.12. HDF5 references were extended to support attributes, as well as object and dataset selections that reside in another HDF5 file. In order to support these features several functions were introduced:
-Create (H5R\_CREATE*) functions were added for each reference type: object, dataset region and attribute.
-A function was added to release a reference (H5R_DESTROY). This is required because a region reference no longer modifies the original file.
-Functions were added to query references (H5R_GET*).
-Other functions were added to simplify or clarify the API.
-Update to Selections
+* Create (H5R_CREATE\*) functions were added for each reference type: object, dataset region and attribute.
+* A function was added to release a reference (H5R_DESTROY). This is required because a region reference no longer modifies the original file.
+* Functions were added to query references (H5R_GET\*).
+* Other functions were added to simplify or clarify the API.
+
+### Update to Selections
 Several new H5S APIs were introduced to allow a user to more flexibly operate on two hyperslab selections. See Update to Selections for more details.
From d419654df4aa3dda1a1759b5c500f25a78e5951f Mon Sep 17 00:00:00 2001
From: bmribler <39579120+bmribler@users.noreply.github.com>
Date: Sat, 16 Mar 2024 02:39:19 -0400
Subject: [PATCH 2/2] Update release_specific_info.md

---
 .../hdf5-docs/release_specifics/release_specific_info.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/documentation/hdf5-docs/release_specifics/release_specific_info.md b/documentation/hdf5-docs/release_specifics/release_specific_info.md
index 98ec2a9d..1ef66127 100644
--- a/documentation/hdf5-docs/release_specifics/release_specific_info.md
+++ b/documentation/hdf5-docs/release_specifics/release_specific_info.md
@@ -21,7 +21,7 @@ redirect_from:
 * [Migrating from HDF5 1.8 to HDF5 1.10](documentation/hdf5-docs/release_specifics/)
 ### [HDF5 1.8](/documentation/hdf5-docs/release_specifics/hdf5_1_8.md)
-* [New Features](documentation/hdf5-docs/release_specifics/new_features_1_18.md)
+* New Features
 * [Software Changes from Release to Release](documentation/hdf5-docs/release_specifics/sw_changes_1.8)
 ### [API compatibility Macros in HDF5](documentation/hdf5-docs/release_specifics/api_comp_macros.md)