Package org.apache.tika.parser.ner.regex
Class RegexNERecogniser
java.lang.Object
org.apache.tika.parser.ner.regex.RegexNERecogniser
- All Implemented Interfaces:
NERecogniser
@Deprecated(since="2026-04-30")
public class RegexNERecogniser
extends Object
implements NERecogniser
Deprecated.
This version of the Apache Tika library is deprecated. Use your own version of Apache Tika.
This class offers an implementation of
NERecogniser based on
regular expressions.
The default configuration file "ner-regex.txt" is used when the
no-argument constructor is used to instantiate this class. The regex file is
loaded via Class.getResourceAsStream(String), so the file should be
placed in the same package path as this class.
The configuration file has the following format:
ENTITY_TYPE1=REGEX1
ENTITY_TYPE2=REGEX2
For example, to extract week days from text:
WEEK_DAY=(?i)((sun)|(mon)|(tues)|(thurs)|(fri)|((sat)(ur)?))(day)?
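A minimal sketch of how text in this format could be parsed, assuming each non-blank line is split at its first '=' into an entity type and a regex. The class name RegexConfigSketch and the handling of blank or malformed lines are illustrative assumptions, not the documented behaviour of RegexNERecogniser.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Apache Tika.
class RegexConfigSketch {

    // Parses ENTITY_TYPE=REGEX lines into compiled patterns keyed by entity type.
    static Map<String, Pattern> parse(String config) {
        Map<String, Pattern> patterns = new LinkedHashMap<>();
        for (String rawLine : config.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            // Split at the first '=' only, because the regex itself may contain '='.
            int delim = line.indexOf('=');
            if (delim < 1) {
                continue; // assumption: lines without an entity type are skipped
            }
            String entityType = line.substring(0, delim).trim();
            String regex = line.substring(delim + 1).trim();
            patterns.put(entityType, Pattern.compile(regex));
        }
        return patterns;
    }

    public static void main(String[] args) {
        String config =
            "WEEK_DAY=(?i)((sun)|(mon)|(tues)|(thurs)|(fri)|((sat)(ur)?))(day)?\n";
        Map<String, Pattern> patterns = parse(config);
        System.out.println(patterns.keySet());
    }
}
```

Splitting at the first '=' rather than every '=' keeps regexes such as lookaheads intact.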
- Since:
- Nov. 7, 2015
-
Field Summary
Fields
Fields inherited from interface org.apache.tika.parser.ner.NERecogniser
DATE, LOCATION, MISCELLANEOUS, MONEY, ORGANIZATION, PERCENT, PERSON, TIME
-
Constructor Summary
Constructors
-
Method Summary
Modifier and Type, Method, Description
findMatches(String text, Pattern pattern)
Deprecated. finds matching sub groups in text
getEntityTypes()
Deprecated. gets a set of entity types whose names are recognisable by this
static RegexNERecogniser getInstance()
Deprecated.
boolean isAvailable()
Deprecated. checks if this Named Entity recogniser is available for service
recognise(String text)
Deprecated. call for name recognition action from text
-
Field Details
-
NER_REGEX_FILE
Deprecated.
- See Also:
-
entityTypes
Deprecated. -
patterns
Deprecated.
-
-
Constructor Details
-
RegexNERecogniser
public RegexNERecogniser()
Deprecated.
-
RegexNERecogniser
Deprecated.
-
-
Method Details
-
getInstance
Deprecated. -
isAvailable
public boolean isAvailable()
Deprecated.
Description copied from interface: NERecogniser
checks if this Named Entity recogniser is available for service
- Specified by:
isAvailable in interface NERecogniser
- Returns:
- true if this recogniser is ready to recognise, false otherwise
-
getEntityTypes
Deprecated.
Description copied from interface: NERecogniser
gets a set of entity types whose names are recognisable by this
- Specified by:
getEntityTypes in interface NERecogniser
- Returns:
- set of entity types/classes
-
findMatches
Deprecated.
finds matching sub groups in text
- Parameters:
text - text containing interesting sub strings
pattern - pattern to find sub strings
- Returns:
- set of sub strings if any found, or null if none found
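The contract described above can be sketched as follows, assuming each whole match (group 0) is collected and that null is returned when nothing matches, as the Returns note states. The class name FindMatchesSketch is hypothetical; this is not the Tika source.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the documented contract, not part of Apache Tika.
class FindMatchesSketch {

    // Returns the distinct substrings of text matched by pattern, or null if there are none.
    static Set<String> findMatches(String text, Pattern pattern) {
        Set<String> results = null;
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            if (results == null) {
                results = new HashSet<>(); // allocate only once a match exists
            }
            results.add(matcher.group(0)); // assumption: the whole match is kept
        }
        return results;
    }

    public static void main(String[] args) {
        Pattern weekDay = Pattern.compile(
            "(?i)((sun)|(mon)|(tues)|(thurs)|(fri)|((sat)(ur)?))(day)?");
        System.out.println(findMatches("Meet me on Monday or Friday", weekDay));
    }
}
```

Because the result may be null, callers of a method with this contract need a null check before iterating.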
-
recognise
Deprecated.
Description copied from interface: NERecogniser
call for name recognition action from text
- Specified by:
recognise in interface NERecogniser
- Parameters:
text - text that possibly contains names
- Returns:
- map of entityType -> set of names
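A rough sketch of how a map of entityType -> set of names could be built from a set of per-type patterns. The class name RecogniseSketch, the YEAR entity type, and the choice to omit entity types with no matches are assumptions made for illustration, not the behaviour of RegexNERecogniser.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of Apache Tika.
class RecogniseSketch {

    // Applies every pattern to the text and groups the matches by entity type.
    static Map<String, Set<String>> recognise(String text, Map<String, Pattern> patterns) {
        Map<String, Set<String>> result = new HashMap<>();
        for (Map.Entry<String, Pattern> entry : patterns.entrySet()) {
            Matcher matcher = entry.getValue().matcher(text);
            while (matcher.find()) {
                // assumption: an entity type appears in the map only if it matched
                result.computeIfAbsent(entry.getKey(), k -> new HashSet<>())
                      .add(matcher.group());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Pattern> patterns = new LinkedHashMap<>();
        patterns.put("WEEK_DAY", Pattern.compile(
            "(?i)((sun)|(mon)|(tues)|(thurs)|(fri)|((sat)(ur)?))(day)?"));
        patterns.put("YEAR", Pattern.compile("[0-9]{4}")); // hypothetical entity type
        System.out.println(recognise("Friday, 2015", patterns));
    }
}
```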
-