Interface CollectionProcessingEngine

  • All Known Implementing Classes:
    CollectionProcessingEngine_impl

    public interface CollectionProcessingEngine
    A CollectionProcessingEngine (CPE) processes a collection of artifacts (for text analysis applications, this will be a collection of documents) and produces collection-level results.

    A CPE consists of a CollectionReader, zero or more AnalysisEngines and zero or more CasConsumers. The Collection Reader is responsible for reading artifacts from a collection and setting up the CAS. The AnalysisEngines analyze each CAS and the results are passed on to the CAS Consumers. CAS Consumers perform analysis over multiple CASes and generally produce collection-level results in some application-specific data structure.
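
    The reader, analysis and consumer stages described above can be pictured with a minimal, self-contained sketch. It does not use the UIMA API: SketchCas, SketchPipeline and the "tokenCount" key are hypothetical stand-ins invented for illustration, and a real CPE is assembled from a CpeDescription rather than wired by hand.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for a CAS: the artifact text plus results added by analysis.
class SketchCas {
    final String text;
    final Map<String, Object> annotations = new HashMap<>();
    SketchCas(String text) { this.text = text; }
}

// Hypothetical stand-in for the CPE data flow; not the UIMA implementation.
class SketchPipeline {
    // Passes every CAS from the reader through each analysis step, then to each consumer.
    static void run(Iterator<SketchCas> reader,
                    List<Consumer<SketchCas>> analysisEngines,
                    List<Consumer<SketchCas>> casConsumers) {
        while (reader.hasNext()) {
            SketchCas cas = reader.next();
            for (Consumer<SketchCas> engine : analysisEngines) engine.accept(cas);
            for (Consumer<SketchCas> consumer : casConsumers) consumer.accept(cas);
        }
    }

    // Example wiring: per-document token counts feed a collection-level total.
    static int totalTokens(List<String> documents) {
        List<SketchCas> collection = new ArrayList<>();
        for (String doc : documents) collection.add(new SketchCas(doc));

        // "Analysis engine": records a whitespace token count on each CAS.
        Consumer<SketchCas> tokenCounter = cas -> {
            String trimmed = cas.text.trim();
            int count = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
            cas.annotations.put("tokenCount", count);
        };

        // "CAS consumer": accumulates a result across all CASes.
        int[] total = {0};
        Consumer<SketchCas> summingConsumer =
            cas -> total[0] += (Integer) cas.annotations.get("tokenCount");

        run(collection.iterator(),
            Collections.singletonList(tokenCounter),
            Collections.singletonList(summingConsumer));
        return total[0];
    }
}
```

    In this sketch the per-CAS result lives on the CAS itself, while the collection-level result lives in the consumer, which mirrors the division of labor described above.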

    Processing is started by calling the process() method. Processing can be controlled via the pause(), resume(), and stop() methods.

    Listeners can register with the CPE by calling the addStatusCallbackListener(StatusCallbackListener) method. These listeners receive status callbacks during the processing. At any time, performance and progress reports are available from the getPerformanceReport() and getProgress() methods.
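
    The registration pattern can be sketched without the UIMA classes. The listener interface below is a simplified, hypothetical stand-in: UIMA's StatusCallbackListener defines its own set of callbacks, and the names entityProcessed and SketchNotifyingEngine are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical, simplified listener; not the real StatusCallbackListener interface.
interface SketchStatusListener {
    void entityProcessed(String entityId);
    void collectionProcessComplete();
}

// Hypothetical engine fragment showing listener registration and notification.
class SketchNotifyingEngine {
    // CopyOnWriteArrayList tolerates listeners being added or removed during callbacks.
    private final List<SketchStatusListener> listeners = new CopyOnWriteArrayList<>();

    void addStatusCallbackListener(SketchStatusListener listener) { listeners.add(listener); }

    void removeStatusCallbackListener(SketchStatusListener listener) { listeners.remove(listener); }

    // Notifies every registered listener per entity, then once at the end of the collection.
    void process(List<String> entityIds) {
        for (String id : entityIds) {
            for (SketchStatusListener listener : listeners) listener.entityProcessed(id);
        }
        for (SketchStatusListener listener : listeners) listener.collectionProcessComplete();
    }
}

// A listener that simply records the callbacks it receives, in order.
class SketchRecordingListener implements SketchStatusListener {
    final List<String> events = new ArrayList<>();

    @Override
    public void entityProcessed(String entityId) { events.add("entity:" + entityId); }

    @Override
    public void collectionProcessComplete() { events.add("complete"); }
}
```

    A listener registered before process() is called sees every callback; once removed, it receives nothing further.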

    A CPE implementation may choose to implement parallelization of the processing, but this is not a requirement of the architecture.

    Note that a CPE only supports processing one collection at a time. Attempting to start a new processing job while a previous processing job is running will result in an exception. Processing multiple collections simultaneously is done by instantiating and configuring multiple instances of the CPE.
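
    The lifecycle rules on this page (one job at a time, pause() only while processing, resume() only while paused, isProcessing() still true while paused) amount to a small state machine. The sketch below models them with a hypothetical SketchLifecycle class; it uses java.lang.IllegalStateException as a stand-in for the exceptions the real interface throws and does no actual processing.

```java
// Hypothetical stand-in modelling the documented lifecycle rules; not the UIMA implementation.
class SketchLifecycle {
    private enum State { IDLE, RUNNING, PAUSED }

    private State state = State.IDLE;

    // Starting a job while another one is active (running or paused) is rejected.
    synchronized void process() {
        if (state != State.IDLE) {
            throw new IllegalStateException("a processing job is already running");
        }
        state = State.RUNNING;
    }

    // Pausing requires an active job.
    synchronized void pause() {
        if (state == State.IDLE) {
            throw new IllegalStateException("no processing is currently occurring");
        }
        state = State.PAUSED;
    }

    // Resuming requires a paused job.
    synchronized void resume() {
        if (state != State.PAUSED) {
            throw new IllegalStateException("processing is not currently paused");
        }
        state = State.RUNNING;
    }

    synchronized void stop() { state = State.IDLE; }

    // A paused job still counts as processing.
    synchronized boolean isProcessing() { return state != State.IDLE; }

    synchronized boolean isPaused() { return state == State.PAUSED; }
}
```

    Under this model, handling two collections at once means creating two SketchLifecycle instances, which corresponds to the advice above to instantiate and configure multiple CPEs.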

    A CollectionProcessingEngine instance can be obtained by calling UIMAFramework.produceCollectionProcessingEngine(CpeDescription).

    • Method Detail

      • initialize

        void initialize(CpeDescription aCpeDescription,
                        Map<String,Object> aAdditionalParams)
                 throws ResourceInitializationException
        Initializes this CPE from a CpeDescription. Applications do not need to call this method; the framework calls it automatically, and it cannot be called a second time.
        Parameters:
        aCpeDescription - CPE description, generally parsed from an XML file
        aAdditionalParams - a Map containing additional parameters. May be null if there are no parameters. Each class that implements this interface can decide what additional parameters it supports.
        Throws:
        ResourceInitializationException - if a failure occurs during initialization.
        UIMA_IllegalStateException - if this method is called more than once on a single instance.
      • addStatusCallbackListener

        void addStatusCallbackListener(StatusCallbackListener aListener)
        Registers a listener to receive status callbacks.
        Parameters:
        aListener - the listener to add
      • removeStatusCallbackListener

        void removeStatusCallbackListener(StatusCallbackListener aListener)
        Unregisters a status callback listener.
        Parameters:
        aListener - the listener to remove
      • isProcessing

        boolean isProcessing()
        Determines whether this CPE is currently processing, meaning that a processing request has been submitted and has neither completed nor been stopped via stop(). If processing is paused, this method still returns true.
        Returns:
        true if and only if this CPE is currently processing.
      • pause

        void pause()
        Pauses processing. Processing can later be resumed by calling the resume() method.
        Throws:
        UIMA_IllegalStateException - if no processing is currently occurring
      • isPaused

        boolean isPaused()
        Determines whether this CPE's processing is currently paused.
        Returns:
        true if and only if this CPE's processing is currently paused.
      • resume

        void resume()
        Resumes processing that has been paused.
        Throws:
        UIMA_IllegalStateException - if processing is not currently paused
      • getPerformanceReport

        ProcessTrace getPerformanceReport()
        Gets a performance report for the processing that is currently occurring or has just completed.
        Returns:
        an object containing performance statistics
      • getProgress

        Progress[] getProgress()
        Gets a progress report for the processing that is currently occurring or has just completed.
        Returns:
        an array of Progress objects, each of which represents the progress in a different set of units (for example number of entities or bytes)
      • getCollectionReader

        BaseCollectionReader getCollectionReader()
        Gets the Collection Reader for this CPE.
        Returns:
        the collection reader
      • getCasProcessors

        CasProcessor[] getCasProcessors()
        Gets the CasProcessors in this CPE, in the order in which they will be executed.
        Returns:
        an array of CasProcessors
      • kill

        void kill()
        Forcibly kills the CPM (Collection Processing Manager) behind this CPE.