Class AbstractOOXMLExtractor
java.lang.Object
org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
- All Implemented Interfaces:
OOXMLExtractor
- Direct Known Subclasses:
POIXMLTextExtractorDecorator, SXSLFPowerPointExtractorDecorator, SXWPFWordExtractorDecorator, XPSExtractorDecorator, XSLFPowerPointExtractorDecorator, XSSFExcelExtractorDecorator, XWPFWordExtractorDecorator
@Deprecated(since="2026-04-30")
public abstract class AbstractOOXMLExtractor
extends Object
implements OOXMLExtractor
Deprecated.
This version of the Apache Tika library is deprecated. Use your own version of Apache Tika.
Base class for all Tika OOXML extractors.
Tika extractors decorate POI extractors so that the parsed content of
documents is returned as a sequence of XHTML SAX events. Subclasses must
implement the buildXHTML(XHTMLContentHandler) method, which populates the
XHTMLContentHandler object received as a parameter.
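The contract described above is a template method: the base class owns the document framing and each subclass supplies only the body events. The following is a minimal, JDK-only sketch of that pattern, not Tika's implementation. The names SketchExtractor, ParagraphSketchExtractor, EventLog and SketchDemo are hypothetical, and a plain SAX ContentHandler stands in for XHTMLContentHandler.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the abstract base class: it emits the document
// framing and delegates the body to subclasses.
abstract class SketchExtractor {
    private static final String XHTML = "http://www.w3.org/1999/xhtml";

    // Template method: framing events first, then the subclass-provided body.
    public final void getXHTML(ContentHandler handler) throws SAXException {
        handler.startDocument();
        start(handler, "html");
        start(handler, "body");
        buildXHTML(handler);
        end(handler, "body");
        end(handler, "html");
        handler.endDocument();
    }

    // Subclasses populate the handler with the document content.
    protected abstract void buildXHTML(ContentHandler xhtml) throws SAXException;

    protected static void start(ContentHandler h, String name) throws SAXException {
        h.startElement(XHTML, name, name, new AttributesImpl());
    }

    protected static void end(ContentHandler h, String name) throws SAXException {
        h.endElement(XHTML, name, name);
    }

    protected static void text(ContentHandler h, String s) throws SAXException {
        char[] chars = s.toCharArray();
        h.characters(chars, 0, chars.length);
    }
}

// Hypothetical subclass: writes each input string as one <p> element.
class ParagraphSketchExtractor extends SketchExtractor {
    private final String[] paragraphs;

    ParagraphSketchExtractor(String... paragraphs) {
        this.paragraphs = paragraphs;
    }

    @Override
    protected void buildXHTML(ContentHandler xhtml) throws SAXException {
        for (String p : paragraphs) {
            start(xhtml, "p");
            text(xhtml, p);
            end(xhtml, "p");
        }
    }
}

// Records the received events as a compact tag string.
class EventLog extends DefaultHandler {
    final StringBuilder log = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        log.append('<').append(localName).append('>');
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        log.append("</").append(localName).append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        log.append(ch, start, length);
    }
}

class SketchDemo {
    // Runs the sketch extractor and returns the recorded event string.
    static String render(String... paragraphs) {
        EventLog log = new EventLog();
        try {
            new ParagraphSketchExtractor(paragraphs).getXHTML(log);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return log.log.toString();
    }

    public static void main(String[] args) {
        // prints <html><body><p>Hello</p><p>World</p></body></html>
        System.out.println(render("Hello", "World"));
    }
}
```

Declaring getXHTML final in the sketch keeps the framing in one place, so a subclass cannot forget to open or close the document. Whether the real class enforces this the same way is not stated here.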
Constructor Summary
Constructors
- AbstractOOXMLExtractor(ParseContext context, POIXMLTextExtractor extractor)
Deprecated.
Method Summary
- getDocument
Deprecated. Returns the opened document.
- getMetadataExtractor
Deprecated. POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.
- void getXHTML(ContentHandler handler, Metadata metadata, ParseContext context)
Deprecated. Parses the document into a sequence of XHTML SAX events sent to the given content handler.
Constructor Details
AbstractOOXMLExtractor
Deprecated.
Method Details
getDocument
Deprecated.
Description copied from interface: OOXMLExtractor
Returns the opened document.
- Specified by:
getDocument in interface OOXMLExtractor
getMetadataExtractor
Deprecated.
Description copied from interface: OOXMLExtractor
POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.
- Specified by:
getMetadataExtractor in interface OOXMLExtractor
getXHTML
public void getXHTML(ContentHandler handler, Metadata metadata, ParseContext context) throws SAXException, XmlException, IOException, TikaException
Deprecated.
Description copied from interface: OOXMLExtractor
Parses the document into a sequence of XHTML SAX events sent to the given content handler.
- Specified by:
getXHTML in interface OOXMLExtractor
- Throws:
SAXException
XmlException
IOException
TikaException
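On the caller's side, the handler passed to getXHTML decides what happens to the XHTML SAX events. The sketch below is a JDK-only illustration of such a consumer and does not call Tika itself. ParagraphTextHandler and HandlerDemo are hypothetical names, and the JDK SAX parser replaying a fixed XHTML string stands in for the extractor as the event source.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical consumer: gathers character data and ends each paragraph
// with a newline.
class ParagraphTextHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("p".equals(localName)) {
            text.append('\n');
        }
    }

    String text() {
        return text.toString();
    }
}

class HandlerDemo {
    // Stand-in event source (assumption): the JDK SAX parser replays the
    // given XHTML string into the handler, as an extractor would.
    static String extractText(String xhtml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so localName is populated
            ParagraphTextHandler handler = new ParagraphTextHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xhtml)), handler);
            return handler.text();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<body><p>Hello</p><p>World</p></body></html>";
        System.out.print(extractText(xhtml));
    }
}
```

Because the handler only sees events, the same consumer could in principle be reused with any source that emits XHTML SAX events.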