Class AbstractOfficeParser

java.lang.Object
org.apache.tika.parser.AbstractParser
org.apache.tika.parser.microsoft.AbstractOfficeParser
All Implemented Interfaces:
Serializable, Parser
Direct Known Subclasses:
OfficeParser, OOXMLParser, Word2006MLParser

@Deprecated(since="2026-04-30") public abstract class AbstractOfficeParser extends AbstractParser
Deprecated.
This version of the Apache Tika library is deprecated. Use your own version of Apache Tika.
Intermediate layer to set OfficeParserConfig uniformly.
  • Constructor Details

    • AbstractOfficeParser

      public AbstractOfficeParser()
      Deprecated.
  • Method Details

    • configure

      public void configure(ParseContext parseContext)
      Deprecated.
      Checks to see if the user has specified an OfficeParserConfig. If so, no changes are made; if not, one is added to the context.
      Parameters:
      parseContext - the parse context to check for an OfficeParserConfig and, if none is present, to add one to
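      The check-then-add behavior described for configure can be illustrated with a minimal, self-contained sketch. The classes OfficeConfig and Context below are hypothetical stand-ins for OfficeParserConfig and ParseContext, not the Tika classes themselves, and the sketch assumes the context is keyed by class.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigureSketch {

    /** Hypothetical stand-in for OfficeParserConfig; only one flag is modeled. */
    static final class OfficeConfig {
        final boolean extractMacros;

        OfficeConfig(boolean extractMacros) {
            this.extractMacros = extractMacros;
        }
    }

    /** Hypothetical stand-in for ParseContext: a map keyed by class. */
    static final class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key) {
            return key.cast(entries.get(key));
        }
    }

    /** Adds the parser's default config only when the caller supplied none. */
    static void configure(Context context, OfficeConfig parserDefault) {
        if (context.get(OfficeConfig.class) == null) {
            context.set(OfficeConfig.class, parserDefault);
        }
    }

    public static void main(String[] args) {
        OfficeConfig parserDefault = new OfficeConfig(false);

        // No config supplied: the parser's default is added.
        Context empty = new Context();
        configure(empty, parserDefault);
        System.out.println(empty.get(OfficeConfig.class) == parserDefault); // prints true

        // Config supplied by the user: it is left untouched.
        Context preset = new Context();
        OfficeConfig userConfig = new OfficeConfig(true);
        preset.set(OfficeConfig.class, userConfig);
        configure(preset, parserDefault);
        System.out.println(preset.get(OfficeConfig.class) == userConfig); // prints true
    }
}
```

      Under this model, a config placed in the context by the caller takes precedence over values set on the parser through its setters.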
    • getIncludeDeletedContent

      public boolean getIncludeDeletedContent()
      Deprecated.
      Returns:
      whether or not to include deleted content
    • getIncludeMoveFromContent

      public boolean getIncludeMoveFromContent()
      Deprecated.
      Returns:
      whether or not to include moveFrom content
    • getUseSAXDocxExtractor

      public boolean getUseSAXDocxExtractor()
      Deprecated.
      Returns:
      whether or not to use the SAX docx extractor
    • getExtractMacros

      public boolean getExtractMacros()
      Deprecated.
      Returns:
      whether or not to extract macros
    • setIncludeDeletedContent

      @Field public void setIncludeDeletedContent(boolean includeDeletedConent)
      Deprecated.
    • setIncludeMoveFromContent

      @Field public void setIncludeMoveFromContent(boolean includeMoveFromContent)
      Deprecated.
    • setIncludeShapeBasedContent

      @Field public void setIncludeShapeBasedContent(boolean includeShapeBasedContent)
      Deprecated.
    • setUseSAXDocxExtractor

      @Field public void setUseSAXDocxExtractor(boolean useSAXDocxExtractor)
      Deprecated.
    • setUseSAXPptxExtractor

      @Field public void setUseSAXPptxExtractor(boolean useSAXPptxExtractor)
      Deprecated.
    • setExtractMacros

      @Field public void setExtractMacros(boolean extractMacros)
      Deprecated.
    • setConcatenatePhoneticRuns

      @Field public void setConcatenatePhoneticRuns(boolean concatenatePhoneticRuns)
      Deprecated.
    • setExtractAllAlternativesFromMSG

      @Field public void setExtractAllAlternativesFromMSG(boolean extractAllAlternativesFromMSG)
      Deprecated.
      Some .msg files can contain body content in HTML, RTF and/or plain text. By default, only the first non-null value is included. To extract all non-null body content, which is likely duplicative, set this value to true.
      Parameters:
      extractAllAlternativesFromMSG - whether or not to extract all alternative parts from msg files
      Since:
      1.17
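      The difference between the default and extractAllAlternativesFromMSG=true can be sketched as follows. This is an illustrative model only: the class MsgBodySketch and the method selectBodies are hypothetical, and the HTML, RTF, text priority order is an assumption taken from the order in which the description lists the formats.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MsgBodySketch {

    /**
     * Returns the body alternatives that would be extracted.
     * Assumed priority order: html, then rtf, then text.
     */
    static List<String> selectBodies(boolean extractAll, String html, String rtf, String text) {
        List<String> selected = new ArrayList<>();
        for (String body : Arrays.asList(html, rtf, text)) {
            if (body == null) {
                continue; // this alternative is absent from the message
            }
            selected.add(body);
            if (!extractAll) {
                break; // default behavior: stop at the first non-null body
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // A message with no HTML body, but with RTF and plain-text bodies.
        System.out.println(selectBodies(false, null, "rtf body", "text body")); // prints [rtf body]
        System.out.println(selectBodies(true, null, "rtf body", "text body"));  // prints [rtf body, text body]
    }
}
```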
    • getExtractAllAlternativesFromMSG

      public boolean getExtractAllAlternativesFromMSG()
      Deprecated.
    • setByteArrayMaxOverride

      @Field public void setByteArrayMaxOverride(int maxOverride)
      Deprecated.
      WARNING: this sets a static variable in POI, so the change is not limited to this parser instance. It lets users override POI's protection against allocating overly large byte arrays. Use it carefully, and please open issues on POI's Bugzilla to raise the limits for specific records.
      Parameters:
      maxOverride - the maximum byte array length to allow
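      The reason the warning stresses "static" can be seen in a small model. The class below is a hypothetical stand-in, not POI code: it assumes each record type has its own maximum length and that a positive static override replaces all of those limits at once.

```java
public class ByteArrayLimitSketch {

    private static final int NO_OVERRIDE = -1;

    // Static state: shared by every caller in the same class loader.
    private static int maxOverride = NO_OVERRIDE;

    static void setMaxOverride(int value) {
        maxOverride = value;
    }

    /** Allocates a buffer unless the requested length exceeds the effective limit. */
    static byte[] allocate(int length, int recordMax) {
        int limit = maxOverride > 0 ? maxOverride : recordMax;
        if (length < 0 || length > limit) {
            throw new IllegalStateException(
                    "Refusing to allocate " + length + " bytes; limit is " + limit);
        }
        return new byte[length];
    }

    public static void main(String[] args) {
        try {
            allocate(2_000, 1_000); // exceeds the per-record limit
        } catch (IllegalStateException e) {
            System.out.println("blocked: " + e.getMessage());
        }

        setMaxOverride(4_096); // raises the limit for every caller
        System.out.println(allocate(2_000, 1_000).length); // prints 2000

        setMaxOverride(NO_OVERRIDE); // restore the per-record limits
    }
}
```

      Because the override in this model is shared state, one component that raises it also weakens the protection for every other component running in the same process.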
    • setDateFormatOverride

      @Field public void setDateFormatOverride(String format)
      Deprecated.