Class IndicTokenizer

    • Method Detail

      • isTokenChar

        protected boolean isTokenChar(int c)
        Deprecated.
        Description copied from class: CharTokenizer
        Returns true iff a codepoint should be included in a token. This tokenizer emits each maximal run of adjacent codepoints that satisfy this predicate as one token. Codepoints for which the predicate is false define token boundaries and are not included in tokens.
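
        The splitting rule described above can be sketched in plain Java. This is only an illustration, not Lucene's implementation: the class CodePointSplitSketch and its split method are hypothetical names, and the sketch leaves out everything a real CharTokenizer does beyond applying the predicate (buffering, offsets, normalization, token length limits).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical helper, not part of Lucene: splits text into runs of code points
// that satisfy a predicate, mirroring the contract documented for isTokenChar(int).
class CodePointSplitSketch {

    static List<String> split(String text, IntPredicate isTokenChar) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int c = text.codePointAt(i);          // read a full code point, not a UTF-16 unit
            if (isTokenChar.test(c)) {
                current.appendCodePoint(c);       // extend the current token
            } else if (current.length() > 0) {
                tokens.add(current.toString());   // boundary reached: emit the token
                current.setLength(0);
            }
            i += Character.charCount(c);          // advance by 1 or 2 chars
        }
        if (current.length() > 0) {
            tokens.add(current.toString());       // emit a trailing token, if any
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Using "is a letter" as the predicate; space and comma act as boundaries.
        System.out.println(split("foo bar,baz", Character::isLetter)); // prints [foo, bar, baz]
    }
}
```

        Codepoints that fail the predicate never appear in the output; they only mark where one token ends and the next may begin.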

        As of Lucene 3.1 the char-based API (CharTokenizer.isTokenChar(char) and CharTokenizer.normalize(char)) has been deprecated in favor of a Unicode 4.0 compatible int-based API to support codepoints instead of UTF-16 code units. Subclasses of CharTokenizer must not override the char-based methods if a Version >= 3.1 is passed to the constructor.
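
        The difference between the two APIs shows up for characters outside the Basic Multilingual Plane, which occupy two UTF-16 code units. The following sketch uses only java.lang.Character; the class and method names are hypothetical, and U+10400 (a Deseret letter) is chosen purely as an example of a supplementary letter.

```java
// Hypothetical demo, not part of Lucene: contrasts char-based and int-based
// classification of a supplementary code point.
class CodePointVsCharSketch {

    // Classifies only the first UTF-16 code unit, as a char-based predicate would.
    static boolean charApiSeesLetter(String s) {
        return Character.isLetter(s.charAt(0));
    }

    // Classifies the first full code point, as an int-based predicate would.
    static boolean codePointApiSeesLetter(String s) {
        return Character.isLetter(s.codePointAt(0));
    }

    public static void main(String[] args) {
        String s = new String(Character.toChars(0x10400)); // one code point, two chars
        System.out.println(s.length());                       // prints 2
        System.out.println(s.codePointCount(0, s.length()));  // prints 1
        System.out.println(charApiSeesLetter(s));             // prints false (lone surrogate)
        System.out.println(codePointApiSeesLetter(s));        // prints true
    }
}
```

        A char-based predicate sees only a surrogate half and rejects it, so such a letter could never be part of a token; the int-based predicate receives the whole codepoint.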

        NOTE: This method will be marked abstract in Lucene 4.0.

        Overrides:
        isTokenChar in class CharTokenizer