Class AbstractRecursiveParserWrapperHandler

java.lang.Object
org.xml.sax.helpers.DefaultHandler
org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
All Implemented Interfaces:
Serializable, ContentHandler, DTDHandler, EntityResolver, ErrorHandler
Direct Known Subclasses:
RecursiveParserWrapperHandler

@Deprecated(since="2026-04-30") public abstract class AbstractRecursiveParserWrapperHandler extends DefaultHandler implements Serializable
Deprecated.
This version of the Apache Tika library is deprecated. Use your own version of Apache Tika.
This is a special handler to be used only with the RecursiveParserWrapper. It allows for finer-grained processing of embedded documents than in the legacy handlers. Subclasses can choose how to process individual embedded documents.
  • Field Details

    • TIKA_CONTENT

      public static final Property TIKA_CONTENT
      Deprecated.
    • TIKA_CONTENT_HANDLER

      public static final Property TIKA_CONTENT_HANDLER
      Deprecated.
      Simple class name of the content handler
    • PARSE_TIME_MILLIS

      public static final Property PARSE_TIME_MILLIS
      Deprecated.
    • WRITE_LIMIT_REACHED

      public static final Property WRITE_LIMIT_REACHED
      Deprecated.
    • EMBEDDED_RESOURCE_LIMIT_REACHED

      public static final Property EMBEDDED_RESOURCE_LIMIT_REACHED
      Deprecated.
    • EMBEDDED_EXCEPTION

      public static final Property EMBEDDED_EXCEPTION
      Deprecated.
    • CONTAINER_EXCEPTION

      public static final Property CONTAINER_EXCEPTION
      Deprecated.
    • EMBEDDED_RESOURCE_PATH

      public static final Property EMBEDDED_RESOURCE_PATH
      Deprecated.
    • EMBEDDED_DEPTH

      public static final Property EMBEDDED_DEPTH
      Deprecated.
  • Constructor Details

    • AbstractRecursiveParserWrapperHandler

      public AbstractRecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory)
      Deprecated.
    • AbstractRecursiveParserWrapperHandler

      public AbstractRecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory, int maxEmbeddedResources, int totalWriteLimit)
      Deprecated.
  • Method Details

    • getNewContentHandler

      public ContentHandler getNewContentHandler()
      Deprecated.
    • getNewContentHandler

      public ContentHandler getNewContentHandler(OutputStream os, Charset charset)
      Deprecated.
    • startEmbeddedDocument

      public void startEmbeddedDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
      Deprecated.
      This is called before parsing each embedded document. Override this for custom behavior. If you override it, call super.startEmbeddedDocument(...) in your subclass, because this method tracks the number of embedded documents.
      Parameters:
      contentHandler - local handler to be used on this embedded document
      metadata - embedded document's metadata
      Throws:
      SAXException
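
The note about calling the superclass method can be illustrated with a small stand-alone sketch. It does not use Tika itself: `BaseHandler`, `CountingHandler`, `ForgetfulHandler`, and the `Map`-based metadata are hypothetical stand-ins, and the counting logic is an assumption about how a base class might track embedded documents, not the library's implementation. Only the `org.xml.sax` types come from the JDK.

```java
import java.util.HashMap;
import java.util.Map;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EmbeddedCountSketch {

    /** Hypothetical stand-in for the abstract handler; not Tika's code. */
    static class BaseHandler {
        private final int maxEmbeddedResources;
        private int embeddedResources = 0;

        BaseHandler(int maxEmbeddedResources) {
            this.maxEmbeddedResources = maxEmbeddedResources;
        }

        // Assumption: the base method is where embedded documents are counted.
        public void startEmbeddedDocument(ContentHandler handler, Map<String, String> metadata)
                throws SAXException {
            embeddedResources++;
        }

        public boolean hasHitMaximumEmbeddedResources() {
            // In this sketch a negative limit means "no limit".
            return maxEmbeddedResources > -1 && embeddedResources >= maxEmbeddedResources;
        }
    }

    /** Adds custom behavior and keeps the base class bookkeeping intact. */
    static class CountingHandler extends BaseHandler {
        CountingHandler(int max) {
            super(max);
        }

        @Override
        public void startEmbeddedDocument(ContentHandler handler, Map<String, String> metadata)
                throws SAXException {
            super.startEmbeddedDocument(handler, metadata); // keeps the count accurate
            metadata.put("sketch:visited", "true");         // custom behavior
        }
    }

    /** Forgets the super call, so the base counter never advances. */
    static class ForgetfulHandler extends BaseHandler {
        ForgetfulHandler(int max) {
            super(max);
        }

        @Override
        public void startEmbeddedDocument(ContentHandler handler, Map<String, String> metadata)
                throws SAXException {
            metadata.put("sketch:visited", "true");
        }
    }

    /** Simulates a parse that encounters the given number of embedded documents. */
    static boolean run(BaseHandler handler, int embeddedDocs) {
        try {
            for (int i = 0; i < embeddedDocs; i++) {
                handler.startEmbeddedDocument(new DefaultHandler(), new HashMap<>());
            }
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.hasHitMaximumEmbeddedResources();
    }

    public static void main(String[] args) {
        System.out.println(run(new CountingHandler(2), 2));  // limit is detected
        System.out.println(run(new ForgetfulHandler(2), 2)); // limit is missed
    }
}
```

In this sketch the subclass that skips the super call never reports that the limit was reached, which is the failure mode the description warns about.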
    • endEmbeddedDocument

      public void endEmbeddedDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
      Deprecated.
      This is called after parsing each embedded document. Override this for custom behavior. This is currently a no-op.
      Parameters:
      contentHandler - content handler that was used on this embedded document
      metadata - metadata for this embedded document
      Throws:
      SAXException
    • endDocument

      public void endDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
      Deprecated.
      This is called after the full parse has completed. Override this for custom behavior. If you override it, call super.endDocument(...) in your subclass, because this method records in the metadata whether the embedded resource maximum was hit.
      Parameters:
      contentHandler - content handler that was used on the main document
      metadata - metadata that was gathered for the main document
      Throws:
      SAXException
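
A subclass that collects metadata at the end of the parse would typically call the superclass method first, so that the limit flag is present in whatever it stores. The following is a minimal sketch of that ordering under stated assumptions: `BaseHandler`, `CollectingHandler`, the `LIMIT_KEY` name, and the `Map`-based metadata are all hypothetical stand-ins, and writing the flag as `"true"` or `"false"` is an illustrative choice rather than Tika's documented behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EndDocumentSketch {

    // Hypothetical metadata key, used only in this sketch.
    static final String LIMIT_KEY = "sketch:embedded_resource_limit_reached";

    /** Hypothetical stand-in for the abstract handler; not Tika's code. */
    static class BaseHandler {
        private final int maxEmbeddedResources;
        private int embeddedResources = 0;

        BaseHandler(int maxEmbeddedResources) {
            this.maxEmbeddedResources = maxEmbeddedResources;
        }

        public void startEmbeddedDocument(ContentHandler handler, Map<String, String> metadata)
                throws SAXException {
            embeddedResources++;
        }

        // Assumption: the base method writes the limit flag into the metadata.
        public void endDocument(ContentHandler handler, Map<String, String> metadata)
                throws SAXException {
            boolean hit = maxEmbeddedResources > -1 && embeddedResources >= maxEmbeddedResources;
            metadata.put(LIMIT_KEY, String.valueOf(hit));
        }
    }

    /** Stores a snapshot of the container metadata once the parse is done. */
    static class CollectingHandler extends BaseHandler {
        final List<Map<String, String>> collected = new ArrayList<>();

        CollectingHandler(int max) {
            super(max);
        }

        @Override
        public void endDocument(ContentHandler handler, Map<String, String> metadata)
                throws SAXException {
            super.endDocument(handler, metadata);        // flag is added first
            collected.add(new LinkedHashMap<>(metadata)); // snapshot includes the flag
        }
    }

    /** Simulates a parse with the given limit and number of embedded documents. */
    static Map<String, String> parse(int maxEmbedded, int embeddedDocs) {
        CollectingHandler handler = new CollectingHandler(maxEmbedded);
        Map<String, String> containerMetadata = new LinkedHashMap<>();
        try {
            for (int i = 0; i < embeddedDocs; i++) {
                handler.startEmbeddedDocument(new DefaultHandler(), new LinkedHashMap<>());
            }
            handler.endDocument(new DefaultHandler(), containerMetadata);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.collected.get(0);
    }

    public static void main(String[] args) {
        System.out.println(parse(1, 3).get(LIMIT_KEY)); // limit of 1, three embedded documents
        System.out.println(parse(5, 3).get(LIMIT_KEY)); // limit of 5, three embedded documents
    }
}
```

If the snapshot were taken before the super call, the stored copy would lack the flag even though the live metadata object would receive it afterwards.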
    • hasHitMaximumEmbeddedResources

      public boolean hasHitMaximumEmbeddedResources()
      Deprecated.
      Returns:
      whether this handler has hit the maximum embedded resources during the parse
    • getContentHandlerFactory

      public ContentHandlerFactory getContentHandlerFactory()
      Deprecated.
    • getTotalWriteLimit

      public int getTotalWriteLimit()
      Deprecated.