Class TesseractOCRParser

All Implemented Interfaces:
Serializable, Initializable, Parser

@Deprecated(since="2026-04-30") public class TesseractOCRParser extends AbstractExternalProcessParser implements Initializable
Deprecated.
This version of the Apache Tika library is deprecated. Use your own version of Apache Tika.
TesseractOCRParser is powered by the tesseract-ocr engine. To enable this parser, create a TesseractOCRConfig object and pass it through a ParseContext. Tesseract-ocr must be installed and on the system path; otherwise, the path to its root folder must be provided:

TesseractOCRConfig config = new TesseractOCRConfig();
// Needed only if tesseract is not on the system path
config.setTesseractPath(tesseractFolder);
parseContext.set(TesseractOCRConfig.class, config);
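
Whether the setTesseractPath call above is needed depends on whether the tesseract executable can be found on the system path. The following is a minimal sketch of that check using only the Java standard library. The class TesseractLocator and the method findFolder are hypothetical helpers written for illustration and are not part of Apache Tika; the executable names "tesseract" and "tesseract.exe" are assumptions about a typical installation.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of Apache Tika.
class TesseractLocator {

    /**
     * Scans each directory in a PATH-style string and returns the first
     * directory that contains an executable file with one of the given names.
     */
    public static Optional<Path> findFolder(String pathValue, String... executableNames) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        for (String entry : pathValue.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;
            }
            Path dir;
            try {
                dir = Paths.get(entry);
            } catch (InvalidPathException e) {
                continue; // skip malformed PATH entries
            }
            for (String name : executableNames) {
                Path candidate = dir.resolve(name);
                if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                    return Optional.of(dir);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed executable names; adjust for the local installation.
        Optional<Path> folder =
                findFolder(System.getenv("PATH"), "tesseract", "tesseract.exe");
        if (folder.isPresent()) {
            System.out.println("tesseract found in " + folder.get());
        } else {
            System.out.println("tesseract is not on the system path; "
                    + "its folder must be supplied via the config");
        }
    }
}
```

If the lookup comes back empty, the folder of a known installation would be passed to config.setTesseractPath as in the snippet above.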

  • Constructor Details

    • TesseractOCRParser

      public TesseractOCRParser()
      Deprecated.
  • Method Details