H5Pset_szip(hid_t plist, unsigned int options_mask, unsigned int pixels_per_block)
H5Pset_szip sets an SZIP compression filter, H5Z_FILTER_SZIP, for a dataset.
SZIP is a compression method designed for use with scientific data.
Before proceeding, all users should review the “Limitations” section below.
Users familiar with SZIP outside the HDF5 context may benefit from reviewing “Notes for Users Familiar with SZIP in Other Contexts” below.
In the text below, the term pixel refers to an HDF5 data element. This terminology derives from SZIP compression's use with image data, where pixel referred to an image pixel.
The SZIP bits_per_pixel value (see Notes, below) is automatically set, based on the HDF5 datatype.
SZIP can be used with atomic datatypes that have a size of 8, 16, 32, or 64 bits.
Specifically, a dataset with a datatype that is an 8-, 16-, 32-, or 64-bit signed or unsigned integer; char; or 32- or 64-bit float can be compressed with SZIP.
See Notes, below, for further discussion of the SZIP bits_per_pixel setting.
SZIP options are passed in an options mask, options_mask, as follows.
Option | Description (Mutually exclusive; select one.) |
H5_SZIP_EC_OPTION_MASK | Selects entropy coding method. |
H5_SZIP_NN_OPTION_MASK | Selects nearest neighbor coding method. |
The entropy coding (EC) method, H5_SZIP_EC_OPTION_MASK, is best suited for data that has been processed.
The EC method works best for small numbers.
The nearest neighbor (NN) coding method, H5_SZIP_NN_OPTION_MASK, preprocesses the data and then applies the EC method as above.
SZIP compresses data block by block, with a user-tunable block size.
This block size is passed in the parameter pixels_per_block and must be even and not greater than 32, with typical values being 8, 10, 16, or 32.
This parameter affects compression ratio; the more pixel values vary, the smaller this number should be to achieve better performance.
In HDF5, compression can be applied only to chunked datasets.
If pixels_per_block is greater than the total number of elements in a dataset chunk, H5Pset_szip will succeed but the subsequent call to H5Dcreate will fail; the conflict can be detected only when the property list is used.
To achieve optimal performance for SZIP compression,
it is recommended that a chunk's fastest-changing dimension
be equal to N times pixels_per_block
where N is the maximum number of blocks per scan line
allowed by the SZIP library.
In the current version of SZIP, N is set to 128.
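The constraints above can be checked in application code before the property list is ever used. The following is a minimal sketch in plain C, not part of the HDF5 API: the helper names szip_block_size_ok and szip_recommended_chunk_width and the two macros are hypothetical, and the values 32 and 128 are simply the limits quoted in the text.

```c
#include <stdbool.h>

/* Limits quoted in the text above; the macro names are hypothetical. */
#define SZIP_MAX_PIXELS_PER_BLOCK     32u
#define SZIP_MAX_BLOCKS_PER_SCANLINE 128u

/* Hypothetical helper: returns true if pixels_per_block is even, nonzero,
 * not greater than 32, and not greater than the number of elements in
 * one chunk. */
bool szip_block_size_ok(unsigned pixels_per_block,
                        unsigned long long chunk_elements)
{
    if (pixels_per_block == 0 || pixels_per_block % 2 != 0)
        return false;                       /* must be even */
    if (pixels_per_block > SZIP_MAX_PIXELS_PER_BLOCK)
        return false;                       /* must not exceed 32 */
    return pixels_per_block <= chunk_elements;
}

/* Hypothetical helper: recommended size of a chunk's fastest-changing
 * dimension, N * pixels_per_block with N = 128. */
unsigned long long szip_recommended_chunk_width(unsigned pixels_per_block)
{
    return (unsigned long long)SZIP_MAX_BLOCKS_PER_SCANLINE * pixels_per_block;
}
```

For example, with pixels_per_block set to 16, the recommended fastest-changing chunk dimension under this sketch is 2048 elements.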
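Limitations:
SZIP compression cannot be applied to compound datatypes, array datatypes, variable-length datatypes, enumerations, or any other user-defined datatypes.
If an SZIP filter is set in a dataset creation property list that is used to create a dataset with such a datatype, the call to H5Dcreate will fail; the conflict can be detected only when the property list is used.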
hid_t plist | IN: Dataset creation property list identifier. |
unsigned int options_mask | IN: A bit-mask conveying the desired SZIP options. Valid values are H5_SZIP_EC_OPTION_MASK and H5_SZIP_NN_OPTION_MASK. |
unsigned int pixels_per_block | IN: The number of pixels or data elements in each data block. |
In non-HDF5 applications, SZIP typically requires that the user application supply additional parameters:
pixels_in_object, the number of pixels in the object to be compressed
bits_per_pixel, the number of bits per pixel
pixels_per_scanline, the number of pixels per scan line
These values need not be independently supplied in the HDF5 environment as they are derived from the datatype and dataspace, which are already known.
In particular, HDF5 sets pixels_in_object to the number of elements in a chunk and bits_per_pixel to the size of the element or pixel datatype.
The following algorithm is used to set pixels_per_scanline:
If the size of a chunk's fastest-changing dimension, size, is greater than 4K, set pixels_per_scanline to 128 times pixels_per_block.
If size is less than 4K but greater than pixels_per_block, set pixels_per_scanline to the minimum of size and 128 times pixels_per_block.
If size is less than pixels_per_block but greater than the number of elements in the chunk, set pixels_per_scanline to the minimum of the number of elements in the chunk and 128 times pixels_per_block.
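The selection of pixels_per_scanline can be sketched as a small plain-C function. This is an illustration under stated assumptions, not the library's code: the name szip_pixels_per_scanline is hypothetical, "4K" is taken to mean 4096 elements, size is the chunk's fastest-changing dimension, and boundary cases (size exactly equal to 4096 or to pixels_per_block) are resolved by assumption. The three branches are: above 4096, use 128 times pixels_per_block; otherwise, if size is at least pixels_per_block, use the smaller of size and 128 times pixels_per_block; otherwise, use the smaller of the chunk's element count and 128 times pixels_per_block.

```c
/* Smaller of two values. */
static unsigned long long szip_min_ull(unsigned long long a,
                                       unsigned long long b)
{
    return a < b ? a : b;
}

/* Hypothetical sketch of how pixels_per_scanline is chosen.
 *   size           - length of the chunk's fastest-changing dimension
 *   chunk_elements - total number of elements in the chunk
 * Assumes "4K" means 4096; boundary handling is an assumption. */
unsigned long long szip_pixels_per_scanline(unsigned long long size,
                                            unsigned long long chunk_elements,
                                            unsigned pixels_per_block)
{
    const unsigned long long cap = 128ULL * pixels_per_block;

    if (size > 4096ULL)
        return cap;                              /* long scan lines */
    if (size >= pixels_per_block)
        return szip_min_ull(size, cap);          /* typical case */
    return szip_min_ull(chunk_elements, cap);    /* very narrow chunks */
}
```

For a chunk whose fastest-changing dimension is 100 elements and pixels_per_block of 16, this sketch yields a scan line of 100 pixels; for a dimension of 3000 and pixels_per_block of 8, it yields 1024.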
The HDF5 datatype may have precision that is less than the full size of the data element, e.g., an 11-bit integer can be defined using H5Tset_precision.
To a certain extent, SZIP can take advantage of the precision of the datatype to improve compression:
If the HDF5 datatype precision is 24-bit or less and the offset of the bits in the HDF5 datatype is zero (see H5Tset_offset or H5Tget_offset), the data is in the lowest N bits of the data element. In this case, the SZIP bits_per_pixel is set to the precision of the HDF5 datatype.
If the offset is not zero, the SZIP bits_per_pixel will be set to the number of bits in the full size of the data element.
If the HDF5 datatype precision is 25-bit to 32-bit, the SZIP bits_per_pixel will be set to 32.
If the HDF5 datatype precision is 33-bit to 64-bit, the SZIP bits_per_pixel will be set to 64.
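The mapping from datatype precision to bits_per_pixel can be illustrated with a short plain-C function. This is a sketch under assumptions rather than the library's implementation: the name szip_bits_per_pixel is hypothetical, and the inputs stand for the values an application would obtain from H5Tget_size, H5Tget_precision, and H5Tget_offset. The rules encoded are: a nonzero offset forces the full element size in bits; with zero offset, a precision of 24 bits or less is used as is, 25 to 32 bits becomes 32, and 33 to 64 bits becomes 64.

```c
/* Hypothetical sketch of the bits_per_pixel selection.
 *   size_bytes - full size of the data element in bytes
 *   precision  - number of significant bits in the datatype
 *   offset     - bit offset of the significant bits within the element */
unsigned szip_bits_per_pixel(unsigned size_bytes,
                             unsigned precision,
                             unsigned offset)
{
    if (offset != 0)
        return size_bytes * 8u;  /* data not in the lowest bits: full size */
    if (precision <= 24u)
        return precision;        /* SZIP can use the reduced precision */
    if (precision <= 32u)
        return 32u;              /* 25 to 32 bits */
    return 64u;                  /* 33 to 64 bits */
}
```

Under this sketch, an 11-bit integer stored in a 2-byte element with zero offset is compressed with bits_per_pixel of 11, while the same datatype with a nonzero offset falls back to 16.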
HDF5 always modifies the options mask provided by the user to set up usage of RAW_OPTION_MASK, ALLOW_K13_OPTION_MASK, and one of LSB_OPTION_MASK or MSB_OPTION_MASK, depending on endianness of the datatype.
SUBROUTINE h5pset_szip_f(prp_id, options_mask, pixels_per_block, hdferr)
  IMPLICIT NONE
  INTEGER(HID_T), INTENT(IN) :: prp_id
                                   ! Dataset creation property list identifier
  INTEGER, INTENT(IN) :: options_mask
                                   ! A bit-mask conveying the desired
                                   ! SZIP options
                                   ! Current valid values in Fortran are:
                                   !    H5_SZIP_EC_OM_F
                                   !    H5_SZIP_NN_OM_F
  INTEGER, INTENT(IN) :: pixels_per_block
                                   ! The number of pixels or data elements
                                   ! in each data block
  INTEGER, INTENT(OUT) :: hdferr   ! Error code
                                   ! 0 on success and -1 on failure
END SUBROUTINE h5pset_szip_f
Release | C |
1.6.0 | Function introduced in this release. |