Class NumericUtils


  • public final class NumericUtils
    extends Object
    This is a helper class to generate prefix-encoded representations for numerical values and supplies converters to represent float/double values as sortable integers/longs.

    To quickly execute range queries in Apache Lucene, a range is divided recursively into multiple intervals for searching: The center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.

    This class generates terms to achieve this: First, the numerical integer values need to be converted to strings. To do so, integer values (32 bit or 64 bit) are made unsigned and their bits are converted to ASCII chars, 7 bits per char. The resulting string sorts in the same order as the original integer value. Each value is also prefixed (in the first char) by the shift value (number of bits removed) used during encoding.
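
    The encoding described above can be sketched in plain Java. This is an illustration only, not the Lucene source: the class name PrefixCodingSketch and the prefix offset 0x20 are assumptions made for the example.

```java
import java.util.Arrays;

/** Illustrative sketch of 7-bit prefix coding; not the Lucene implementation. */
final class PrefixCodingSketch {
    // Assumed offset added to the shift value to form the first char.
    static final char SHIFT_START_LONG = 0x20;

    /** Encodes val with the lowest `shift` bits removed, 7 bits per char. */
    static String longToPrefixCoded(long val, int shift) {
        if (shift < 0 || shift > 63) {
            throw new IllegalArgumentException("shift must be 0..63");
        }
        int nChars = (63 - shift) / 7 + 1;               // chars needed for the remaining bits
        char[] buffer = new char[nChars + 1];
        buffer[0] = (char) (SHIFT_START_LONG + shift);   // first char records the shift
        long sortableBits = val ^ 0x8000000000000000L;   // flip sign bit: signed order becomes unsigned order
        sortableBits >>>= shift;                         // drop the low-order bits
        for (int i = nChars; i >= 1; i--) {              // fill right to left, 7 bits at a time
            buffer[i] = (char) (sortableBits & 0x7f);
            sortableBits >>>= 7;
        }
        return new String(buffer);
    }

    /** Decodes a string produced by longToPrefixCoded; stripped bits come back as zeros. */
    static long prefixCodedToLong(String prefixCoded) {
        int shift = prefixCoded.charAt(0) - SHIFT_START_LONG;
        long bits = 0L;
        for (int i = 1; i < prefixCoded.length(); i++) {
            bits = (bits << 7) | prefixCoded.charAt(i);
        }
        return (bits << shift) ^ 0x8000000000000000L;
    }

    public static void main(String[] args) {
        long[] values = {100L, -5L, Long.MAX_VALUE, 3L, Long.MIN_VALUE};
        String[] encoded = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            encoded[i] = longToPrefixCoded(values[i], 0);
        }
        Arrays.sort(encoded);                            // plain String.compareTo order
        for (String term : encoded) {
            System.out.println(prefixCodedToLong(term)); // values appear in ascending numeric order
        }
    }
}
```

    Flipping the sign bit is what makes negative values sort before positive ones once the bits are compared as unsigned chars.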

    To also index floating point numbers, this class supplies two methods that convert them to integer values by changing their bit layout: doubleToSortableLong(double) and floatToSortableInt(float). Converting floating point numbers to integers and back loses no precision (although the integer form is only useful for comparison, not as a numeric value). Other data types like dates can easily be converted to longs or ints (e.g. date to long: Date.getTime()).

    For easy usage, the trie algorithm is implemented for indexing inside NumericTokenStream, which can index int, long, float, and double. For querying, NumericRangeQuery and NumericRangeFilter implement the query part for the same data types.

    This class can also be used to generate lexicographically sortable (according to String.compareTo(String)) representations of numeric data types for other usages (e.g. sorting).

    Since:
    2.9
    NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
    • Method Detail

      • longToPrefixCoded

        public static int longToPrefixCoded​(long val,
                                            int shift,
                                            char[] buffer)
        Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.
        Parameters:
        val - the numeric value
        shift - how many bits to strip from the right
        buffer - receives the encoded chars; must have a length of at least BUF_SIZE_LONG
        Returns:
        number of chars written to buffer
      • longToPrefixCoded

        public static String longToPrefixCoded​(long val,
                                               int shift)
      • longToPrefixCoded

        public static String longToPrefixCoded​(long val)
      • intToPrefixCoded

        public static int intToPrefixCoded​(int val,
                                           int shift,
                                           char[] buffer)
        Expert: Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream.
        Parameters:
        val - the numeric value
        shift - how many bits to strip from the right
        buffer - receives the encoded chars; must have a length of at least BUF_SIZE_INT
        Returns:
        number of chars written to buffer
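
        To illustrate what the shift parameter does, the following sketch re-implements the buffer-based encoding for int values in plain Java. It is not the Lucene source: the class name IntShiftSketch, the helper encode, the prefix offset 0x60, and the buffer size of 6 are assumptions made for the example. Values that differ only in the stripped low-order bits collapse into the same term, which is what allows the center of a range to be matched with few low-precision terms.

```java
/** Illustrative sketch of shift-based precision reduction; not the Lucene implementation. */
final class IntShiftSketch {
    // Assumed offset added to the shift value to form the first char.
    static final char SHIFT_START_INT = 0x60;
    // Assumed buffer size: 1 prefix char + at most 5 data chars (32 bits, 7 per char).
    static final int BUF_SIZE = 6;

    /** Writes the prefix-coded form of val into buffer and returns the number of chars written. */
    static int intToPrefixCoded(int val, int shift, char[] buffer) {
        if (shift < 0 || shift > 31) {
            throw new IllegalArgumentException("shift must be 0..31");
        }
        int nChars = (31 - shift) / 7 + 1;
        int len = nChars + 1;
        buffer[0] = (char) (SHIFT_START_INT + shift);     // first char records the shift
        int sortableBits = (val ^ 0x80000000) >>> shift;  // flip sign bit, then strip low bits
        for (int i = nChars; i >= 1; i--) {
            buffer[i] = (char) (sortableBits & 0x7f);
            sortableBits >>>= 7;
        }
        return len;
    }

    /** Hypothetical convenience wrapper returning the encoded term as a String. */
    static String encode(int val, int shift) {
        char[] buffer = new char[BUF_SIZE];
        int len = intToPrefixCoded(val, shift, buffer);
        return new String(buffer, 0, len);
    }

    public static void main(String[] args) {
        // 1000 and 1003 differ only in their lowest 4 bits; 1024 does not share that bucket.
        System.out.println(encode(1000, 0).equals(encode(1003, 0))); // false: full precision
        System.out.println(encode(1000, 4).equals(encode(1003, 4))); // true: same 16-value bucket
        System.out.println(encode(1000, 4).equals(encode(1024, 4))); // false: different bucket
    }
}
```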
      • intToPrefixCoded

        public static String intToPrefixCoded​(int val,
                                              int shift)
      • intToPrefixCoded

        public static String intToPrefixCoded​(int val)
      • prefixCodedToLong

        public static long prefixCodedToLong​(String prefixCoded)
      • prefixCodedToInt

        public static int prefixCodedToInt​(String prefixCoded)
      • doubleToSortableLong

        public static long doubleToSortableLong​(double val)
        Converts a double value to a sortable signed long. The value is converted by taking its IEEE 754 floating-point "double format" bit layout and swapping some bits so that the results can be compared as longs. This does not reduce precision, and the value can easily be used as a long. The sort order (including Double.NaN) is defined by Double.compareTo(java.lang.Double); NaN is greater than positive infinity.
        See Also:
        sortableLongToDouble(long)
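
        One way to obtain such an ordering is sketched below in plain Java. It is an assumption about the technique, not a copy of the Lucene source, and the class name SortableDoubleSketch is hypothetical. Positive doubles already compare correctly as longs; for negative doubles, inverting every bit except the sign bit reverses their order so that larger magnitudes sort first.

```java
/** Illustrative sketch of a sortable long encoding for doubles; not the Lucene implementation. */
final class SortableDoubleSketch {

    static long doubleToSortableLong(double val) {
        long bits = Double.doubleToLongBits(val);  // canonical NaN, so all NaNs map to one value
        if (bits < 0) {
            bits ^= 0x7fffffffffffffffL;           // negative: invert exponent and mantissa bits
        }
        return bits;
    }

    static double sortableLongToDouble(long val) {
        if (val < 0) {
            val ^= 0x7fffffffffffffffL;            // the same XOR undoes the encoding
        }
        return Double.longBitsToDouble(val);
    }

    public static void main(String[] args) {
        double[] ascending = {
            Double.NEGATIVE_INFINITY, -2.5, -1.5, -0.0, 0.0, 1.0,
            Double.POSITIVE_INFINITY, Double.NaN
        };
        for (int i = 1; i < ascending.length; i++) {
            long previous = doubleToSortableLong(ascending[i - 1]);
            long current = doubleToSortableLong(ascending[i]);
            System.out.println(ascending[i - 1] + " < " + ascending[i] + " : " + (previous < current));
        }
    }
}
```

        Because the sign bit is left untouched, the decoder can tell from the sign of the long alone whether the XOR has to be reverted.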
      • doubleToPrefixCoded

        public static String doubleToPrefixCoded​(double val)
      • sortableLongToDouble

        public static double sortableLongToDouble​(long val)
        Converts a sortable long back to a double.
        See Also:
        doubleToSortableLong(double)
      • prefixCodedToDouble

        public static double prefixCodedToDouble​(String val)
      • floatToSortableInt

        public static int floatToSortableInt​(float val)
        Converts a float value to a sortable signed int. The value is converted by taking its IEEE 754 floating-point "float format" bit layout and swapping some bits so that the results can be compared as ints. This does not reduce precision, and the value can easily be used as an int. The sort order (including Float.NaN) is defined by Float.compareTo(java.lang.Float); NaN is greater than positive infinity.
        See Also:
        sortableIntToFloat(int)
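
        As a usage illustration, the sketch below sorts floats through their int keys and decodes them again. It is written in plain Java under the assumption that the conversion inverts the non-sign bits of negative values; the class name SortableFloatSketch and the helper sortViaIntKeys are hypothetical and not part of this API.

```java
import java.util.Arrays;

/** Illustrative sketch of sorting floats via sortable int keys; not the Lucene implementation. */
final class SortableFloatSketch {

    static int floatToSortableInt(float val) {
        int bits = Float.floatToIntBits(val);  // canonical NaN
        if (bits < 0) {
            bits ^= 0x7fffffff;                // negative: invert exponent and mantissa bits
        }
        return bits;
    }

    static float sortableIntToFloat(int val) {
        if (val < 0) {
            val ^= 0x7fffffff;                 // the same XOR undoes the encoding
        }
        return Float.intBitsToFloat(val);
    }

    /** Hypothetical helper: sorts by int key, then converts the keys back to floats. */
    static float[] sortViaIntKeys(float[] input) {
        int[] keys = new int[input.length];
        for (int i = 0; i < input.length; i++) {
            keys[i] = floatToSortableInt(input[i]);
        }
        Arrays.sort(keys);                     // ordinary signed int comparison
        float[] sorted = new float[keys.length];
        for (int i = 0; i < keys.length; i++) {
            sorted[i] = sortableIntToFloat(keys[i]);
        }
        return sorted;
    }

    public static void main(String[] args) {
        float[] input = {3.5f, Float.NaN, -0.0f, 0.0f, -7.25f, Float.NEGATIVE_INFINITY};
        System.out.println(Arrays.toString(sortViaIntKeys(input)));
    }
}
```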
      • floatToPrefixCoded

        public static String floatToPrefixCoded​(float val)
      • sortableIntToFloat

        public static float sortableIntToFloat​(int val)
        Converts a sortable int back to a float.
        See Also:
        floatToSortableInt(float)
      • prefixCodedToFloat

        public static float prefixCodedToFloat​(String val)