Class ChunksIntEncoder

  • Direct Known Subclasses:
    EightFlagsIntEncoder, FourFlagsIntEncoder

    public abstract class ChunksIntEncoder
    extends IntEncoder
    An IntEncoder which encodes values in chunks. Implementations of this class assume that the data to encode consists of small, consecutive values, which lets them compress it better. See the two implementations, FourFlagsIntEncoder and EightFlagsIntEncoder, for details.

    Extensions of this class need to implement IntEncoder.encode(int) to build the proper indicator (flags). Once enough values have accumulated (typically the batch size), extensions can call encodeChunk() to flush the indicator, followed by the remaining values.
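
The flow described above can be illustrated with a minimal, standalone sketch. It is not the Lucene implementation: the class name SketchFourFlagsEncoder, the 2-bit flag layout, and the variable-length byte format are assumptions made for illustration. Values 1 to 3 are stored directly in the indicator; any larger value gets flag 0 and is queued to be written after the indicator.

```java
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical stand-in for a chunked flags encoder; not the Lucene class. */
class SketchFourFlagsEncoder {
    private static final int CHUNK_SIZE = 4;   // four 2-bit flags fill one indicator byte
    private final int[] encodeQueue = new int[CHUNK_SIZE]; // values written outside the indicator
    private int encodeQueueSize = 0;
    private int indicator = 0;                 // flag bits of the current chunk
    private int ordinal = 0;                   // position of the next value within the chunk
    private final OutputStream out;

    SketchFourFlagsEncoder(OutputStream out) {
        this.out = out;
    }

    /** Values must be > 0; as in the documented contract, this is not checked. */
    void encode(int value) throws IOException {
        if (value <= 3) {
            indicator |= value << (ordinal * 2);   // small value lives in its 2-bit flag
        } else {
            encodeQueue[encodeQueueSize++] = value; // flag stays 0: value follows the indicator
        }
        ordinal++;
        if (ordinal == CHUNK_SIZE) {
            encodeChunk();
        }
    }

    /** Writes the indicator first, then the queued values, then resets the chunk state. */
    void encodeChunk() throws IOException {
        out.write(indicator);
        for (int i = 0; i < encodeQueueSize; i++) {
            writeVInt(encodeQueue[i]);
        }
        indicator = 0;
        ordinal = 0;
        encodeQueueSize = 0;
    }

    /** Flushes a partially filled chunk, if any, and closes the stream. */
    void close() throws IOException {
        if (ordinal != 0) {
            encodeChunk();
        }
        out.close();
    }

    /** Assumed variable-length format: 7 payload bits per byte, high bit marks continuation. */
    private void writeVInt(int v) throws IOException {
        while ((v & ~0x7F) != 0) {
            out.write((v & 0x7F) | 0x80);
            v >>>= 7;
        }
        out.write(v);
    }
}
```

In this sketch, encoding 1, 2, 3, 100 produces two bytes: the indicator, then the single queued value. A decoder for a partially filled final chunk would need extra information to tell padding from real flag-0 entries; the sketch does not address that.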

    NOTE: flags encoders do not accept values ≤ 0 in their IntEncoder.encode(int). For performance reasons they do not check this condition; if such a value is passed, the resulting stream may be corrupt or an exception may be thrown. These encoders also perform best when there are many consecutive small values (how small depends on the encoder implementation). Otherwise, the encoder occupies one extra byte per batch of integers on top of whatever VInt8IntEncoder would have occupied. Therefore, make sure your data fits the conditions of the specific encoder.
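
The one-extra-byte-per-batch cost can be checked with a rough size estimate. The sketch below is an assumption-laden model, not Lucene code: ChunkSizeEstimate and its methods are hypothetical names, the plain encoding is assumed to use 7 payload bits per byte, and values above maxInFlag are assumed to be written unchanged after the indicator.

```java
/** Hypothetical size model comparing plain variable-length bytes with a flags encoding. */
class ChunkSizeEstimate {
    /** Bytes needed for v > 0 in a 7-bits-per-byte variable-length encoding. */
    static int vIntBytes(int v) {
        int n = 1;
        while ((v >>>= 7) != 0) {
            n++;
        }
        return n;
    }

    /** Total size when every value is written as a variable-length integer. */
    static int plainBytes(int[] values) {
        int bytes = 0;
        for (int v : values) {
            bytes += vIntBytes(v);
        }
        return bytes;
    }

    /**
     * Total size with one indicator byte per chunk; values up to maxInFlag are
     * assumed to fit in the indicator, all others are written after it.
     */
    static int flagsBytes(int[] values, int chunkSize, int maxInFlag) {
        int chunks = (values.length + chunkSize - 1) / chunkSize;
        int bytes = chunks; // one indicator byte per (possibly partial) chunk
        for (int v : values) {
            if (v > maxInFlag) {
                bytes += vIntBytes(v);
            }
        }
        return bytes;
    }
}
```

Under these assumptions, four large values such as 100, 200, 300, 400 cost one byte more with flags than without, while four small values such as 1, 2, 3, 1 shrink from four bytes to a single indicator byte.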

    For the reasons mentioned above, these encoders are usually chained with UniqueValuesIntEncoder and DGapIntEncoder in the following manner:

     IntEncoder fourFlags = 
             new SortingEncoderFilter(new UniqueValuesIntEncoder(new DGapIntEncoder(new FlagsIntEncoderImpl())));
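
To see why this chain helps, the following standalone sketch mimics the combined effect of sorting, removing duplicates, and gap (delta) encoding on a plain int array. It does not use the Lucene classes; DGapPreprocess and sortUniqueGaps are hypothetical names, and storing the first value as its gap from 0 is an assumption.

```java
import java.util.Arrays;

/** Hypothetical illustration of sort + unique + d-gap preprocessing. */
class DGapPreprocess {
    /** Returns the gaps between consecutive distinct sorted values; the first gap is measured from 0. */
    static int[] sortUniqueGaps(int[] values) {
        int[] sorted = values.clone();
        Arrays.sort(sorted);
        int[] gaps = new int[sorted.length];
        int n = 0;
        int prev = 0;
        for (int i = 0; i < sorted.length; i++) {
            if (i > 0 && sorted[i] == sorted[i - 1]) {
                continue; // skip duplicates so that no gap is 0
            }
            gaps[n++] = sorted[i] - prev;
            prev = sorted[i];
        }
        return Arrays.copyOf(gaps, n);
    }
}
```

For the input 10, 3, 3, 4, 11 the result is 3, 1, 6, 1. Because duplicates are removed before the gaps are taken, every gap is at least 1 when the inputs are positive, which matches the requirement that flags encoders never receive values ≤ 0.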
     
    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Field Detail

      • encodeQueue

        protected final int[] encodeQueue
        Holds the values which must be encoded, outside the indicator.
      • encodeQueueSize

        protected int encodeQueueSize
      • encoder

        protected final IntEncoder encoder
        Encoder used to encode values outside the indicator.
      • indicator

        protected int indicator
        Holds the flag bits of the current chunk (the indicator byte).
      • ordinal

        protected byte ordinal
        Counts the current ordinal of the encoded value.
    • Constructor Detail

      • ChunksIntEncoder

        protected ChunksIntEncoder(int chunkSize)
    • Method Detail

      • encodeChunk

        protected void encodeChunk()
                            throws IOException
        Encodes the values of the current chunk. First it writes the indicator, and then it encodes the values outside the indicator.
        Throws:
        IOException
      • close

        public void close()
                   throws IOException
        Description copied from class: IntEncoder
        Instructs the encoder to finish the encoding process. This method closes the output stream that was specified by reInit. An implementation may perform additional cleanup here to complete the encoding, such as flushing internal buffers.
        Once this method has been called, no further calls to encode should be made without first calling reInit.

        NOTE: overriding classes should make sure they either call super.close() or close the output stream themselves.

        Overrides:
        close in class IntEncoder
        Throws:
        IOException
      • reInit

        public void reInit(OutputStream out)
        Description copied from class: IntEncoder
        Reinitializes the encoder with the given OutputStream. This allows the encoder to be reused with a different stream without constructing a new object.

        NOTE: after calling IntEncoder.close(), one must call this method even if the output stream itself hasn't changed. An example case is that the output stream wraps a byte[], and the output stream itself is reset, but its instance hasn't changed. Some implementations of IntEncoder may write some metadata about themselves to the output stream, and therefore it is imperative that one calls this method before encoding any data.

        Overrides:
        reInit in class IntEncoder
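
The close/reInit contract described above can be sketched with a standalone class. This is only an illustration of the calling order: ReusableEncoderSketch is a hypothetical name, the one-byte-per-value encoding is a placeholder, and throwing IllegalStateException when encode is called without reInit is an assumption of the sketch, not documented IntEncoder behavior.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical encoder showing the reInit / encode / close lifecycle. */
class ReusableEncoderSketch {
    private OutputStream out; // null until reInit is called, and again after close

    void reInit(OutputStream out) {
        this.out = out;
    }

    void encode(int value) throws IOException {
        if (out == null) {
            // Assumption of this sketch: fail loudly instead of writing to a closed stream.
            throw new IllegalStateException("call reInit before encode");
        }
        out.write(value & 0xFF); // placeholder encoding: low byte only
    }

    void close() throws IOException {
        if (out != null) {
            out.close();
            out = null;
        }
    }

    /** Encodes two batches into the same buffer instance, calling reInit before each. */
    static int[] demo() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        ReusableEncoderSketch encoder = new ReusableEncoderSketch();

        encoder.reInit(buffer);
        encoder.encode(1);
        encoder.encode(2);
        encoder.close();
        int firstSize = buffer.size();

        buffer.reset();          // same stream instance, contents cleared
        encoder.reInit(buffer);  // still required, even though the instance is unchanged
        encoder.encode(3);
        encoder.close();
        int secondSize = buffer.size();

        return new int[] {firstSize, secondSize};
    }
}
```

ByteArrayOutputStream is used here because closing it has no effect, so the same instance can be reset and reused across batches.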