Type Parameters:
Input - the type of the input data to be processed.

public interface Extractor<Input>
| Modifier and Type | Interface and Description |
|---|---|
| static interface | Extractor.BlindExtractor |
| static interface | Extractor.ContentExtractor: This interface specializes an Extractor able to handle InputStream as input format. |
| static interface | Extractor.TagSoupDOMExtractor |
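
The nested interfaces listed above specialize the generic type parameter. As a minimal sketch of that pattern only, the following uses hypothetical stand-in types (`SketchExtractor`, `StreamExtractor`, `Utf8TextExtractor`) and a hypothetical `read` method, none of which belong to the documented API; it only illustrates how a nested sub-interface can fix `Input` to `InputStream`, as the description of Extractor.ContentExtractor suggests.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a generic extractor.
// read(...) is an assumption for illustration, NOT the documented run(...) method.
interface SketchExtractor<Input> {
    String read(Input in) throws IOException;

    // Nested interfaces are implicitly static.
    // This one fixes the type parameter Input to InputStream.
    interface StreamExtractor extends SketchExtractor<InputStream> {
    }
}

// Implementations of the specialized interface receive an InputStream directly.
class Utf8TextExtractor implements SketchExtractor.StreamExtractor {
    @Override
    public String read(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        // Copy the stream into memory until end of input.
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}

class NestedInterfaceSketch {
    // Decodes the given bytes through the stream-specialized extractor.
    static String decode(byte[] data) {
        SketchExtractor<InputStream> extractor = new Utf8TextExtractor();
        try (InputStream in = new ByteArrayInputStream(data)) {
            return extractor.read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("hello".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Because the sub-interface binds the type parameter, callers that only handle streams can depend on the narrower type without repeating the generic argument.
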
| Modifier and Type | Method and Description |
|---|---|
| ExtractorDescription | getDescription(): Returns the ExtractorDescription of this extractor. |
| void | run(ExtractionParameters extractionParameters, ExtractionContext context, Input in, ExtractionResult out): Executes the extractor. |
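
The two methods summarized above could be implemented and invoked roughly as follows. This is a sketch under stated assumptions, not the library's code: every type declared here is a local stand-in that mirrors only the names and the `run` signature shown on this page, and the `name()` accessor, the `items` list, and the `ByteCountExtractor` class are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Local stand-ins, NOT the real library types. Their members are assumptions.
interface ExtractionParameters {}
interface ExtractionContext {}
interface ExtractorDescription { String name(); }              // name() is hypothetical
class ExtractionResult { final List<String> items = new ArrayList<>(); } // hypothetical collector
class ExtractionException extends Exception {
    ExtractionException(String message) { super(message); }
}

// Mirrors the two method signatures documented on this page.
interface Extractor<Input> {
    void run(ExtractionParameters extractionParameters, ExtractionContext context,
             Input in, ExtractionResult out) throws IOException, ExtractionException;

    ExtractorDescription getDescription();
}

// Hypothetical extractor that records how many bytes the input holds.
class ByteCountExtractor implements Extractor<InputStream> {
    @Override
    public void run(ExtractionParameters extractionParameters, ExtractionContext context,
                    InputStream in, ExtractionResult out)
            throws IOException, ExtractionException {
        long count = 0;
        byte[] buffer = new byte[4096];
        int read;
        while ((read = in.read(buffer)) != -1) {
            count += read;
        }
        if (count == 0) {
            // Non-I/O failures are reported through ExtractionException.
            throw new ExtractionException("empty input");
        }
        out.items.add("bytes=" + count);
    }

    @Override
    public ExtractorDescription getDescription() {
        return () -> "byte-counter";
    }
}

class RunSketch {
    // Runs the extractor and reports either the collected items or the failure.
    static String collect(byte[] data) {
        Extractor<InputStream> extractor = new ByteCountExtractor();
        String name = extractor.getDescription().name();
        ExtractionResult out = new ExtractionResult();
        try (InputStream in = new ByteArrayInputStream(data)) {
            extractor.run(new ExtractionParameters() {}, new ExtractionContext() {}, in, out);
            return name + ": " + String.join(", ", out.items);
        } catch (ExtractionException e) {
            return name + ": extraction failed: " + e.getMessage();
        } catch (IOException e) {
            return name + ": read failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(collect(new byte[] {1, 2, 3}));
        System.out.println(collect(new byte[0]));
    }
}
```

The caller handles the two checked exceptions from the `run` signature separately, so that read failures and other extraction failures can be reported differently.
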
void run(ExtractionParameters extractionParameters, ExtractionContext context, Input in, ExtractionResult out) throws IOException, ExtractionException
Parameters:
extractionParameters - the parameters to be applied during the extraction.
context - the document context.
in - the extractor input data.
out - the collector for the extracted data.
Throws:
IOException - on error while reading from the input stream.
ExtractionException - on other errors, such as parse errors.

ExtractorDescription getDescription()
Returns:
the ExtractorDescription of this extractor.

Copyright © 2010–2019 The Apache Software Foundation. All rights reserved.