public class RDFaExtractor extends Object implements Extractor.TagSoupDOMExtractor

Nested classes/interfaces inherited from interface Extractor: Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor

| Modifier and Type | Field and Description |
|---|---|
| static String | NAME |
| static String | xsltFilename |
| Constructor and Description |
|---|
| RDFaExtractor() - Default constructor: data types are not verified and parsing does not stop at the first error. |
| RDFaExtractor(boolean verifyDataType, boolean stopAtFirstError) - Constructor that allows specifying the validation and error handling policies. |
| Modifier and Type | Method and Description |
|---|---|
| ExtractorDescription | getDescription() |
| static XSLTStylesheet | getXSLT() - Returns an XSLTStylesheet able to distill RDFa from HTML pages. |
| boolean | isStopAtFirstError() |
| boolean | isVerifyDataType() |
| void | run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, Document in, ExtractionResult out) |
| void | setStopAtFirstError(boolean stopAtFirstError) |
| void | setVerifyDataType(boolean verifyDataType) |
public static final String NAME
public static final String xsltFilename
public RDFaExtractor(boolean verifyDataType, boolean stopAtFirstError)
Parameters:
verifyDataType - if true the data types will be verified, if false they will be ignored.
stopAtFirstError - if true the parser will stop at the first parsing error, if false it will ignore non-blocking errors.
public RDFaExtractor()
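The two constructors and the accessors listed in the method summary can be illustrated with a small stand-in. This is only a sketch: `RDFaPolicySketch` is a hypothetical class written for this example, not part of Any23, and it mirrors just the documented signatures (default constructor with both policies off, two-argument constructor, getters and setters). With the Any23 library on the classpath, the real `RDFaExtractor` would presumably be used the same way.

```java
// Hypothetical stand-in mirroring the documented RDFaExtractor policy API.
// It does not parse anything; it only models the two boolean policies.
class RDFaPolicySketch {
    private boolean verifyDataType;
    private boolean stopAtFirstError;

    // Mirrors RDFaExtractor(): no data type verification, no stop at first error.
    public RDFaPolicySketch() {
        this(false, false);
    }

    // Mirrors RDFaExtractor(boolean verifyDataType, boolean stopAtFirstError).
    public RDFaPolicySketch(boolean verifyDataType, boolean stopAtFirstError) {
        this.verifyDataType = verifyDataType;
        this.stopAtFirstError = stopAtFirstError;
    }

    public boolean isVerifyDataType() { return verifyDataType; }
    public void setVerifyDataType(boolean verifyDataType) { this.verifyDataType = verifyDataType; }

    public boolean isStopAtFirstError() { return stopAtFirstError; }
    public void setStopAtFirstError(boolean stopAtFirstError) { this.stopAtFirstError = stopAtFirstError; }

    public static void main(String[] args) {
        // Lenient defaults, as described for the no-argument constructor.
        RDFaPolicySketch lenient = new RDFaPolicySketch();
        System.out.println("lenient verify=" + lenient.isVerifyDataType()
                + " stop=" + lenient.isStopAtFirstError());

        // Strict policies set at construction time.
        RDFaPolicySketch strict = new RDFaPolicySketch(true, true);
        System.out.println("strict verify=" + strict.isVerifyDataType()
                + " stop=" + strict.isStopAtFirstError());

        // Policies can also be changed after construction through the setters.
        lenient.setStopAtFirstError(true);
        System.out.println("lenient stop after setter=" + lenient.isStopAtFirstError());
    }
}
```

Choosing the lenient defaults is likely the safer starting point for messy real-world HTML, with the strict policies reserved for validation-style runs.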
public static XSLTStylesheet getXSLT()
Returns an XSLTStylesheet able to distill RDFa from HTML pages.
Returns: a non-null XSLT instance.
public boolean isVerifyDataType()
public void setVerifyDataType(boolean verifyDataType)
public boolean isStopAtFirstError()
public void setStopAtFirstError(boolean stopAtFirstError)
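getXSLT() is documented as returning a stylesheet able to distill RDFa from HTML pages, and run takes a DOM Document as input. How Any23 applies the stylesheet internally is not shown on this page. As a rough illustration of the general idea only, the sketch below applies an XSLT stylesheet to a DOM Document with the standard JAXP API (`javax.xml.transform`). The class `XsltOverDomSketch` and the toy stylesheet are assumptions made for this example; the stylesheet is not the RDFa stylesheet named by xsltFilename and handles only a flat `property` attribute.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: applies a toy XSLT to a DOM tree using plain JAXP.
class XsltOverDomSketch {
    // Toy stylesheet (NOT the real RDFa one): prints "property = text"
    // for every element that carries a property attribute.
    static final String TOY_XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:for-each select='//*[@property]'>"
        + "<xsl:value-of select='@property'/><xsl:text> = </xsl:text>"
        + "<xsl:value-of select='.'/><xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Parses well-formed markup into a DOM Document and transforms it to text.
    public static String transform(String xhtml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document dom = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));

            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(TOY_XSLT)));

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(dom), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrapped so callers need no checked exception handling in this sketch.
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><span property='dc:title'>Hello</span></body></html>";
        System.out.print(transform(page));
    }
}
```

For the sample page in `main`, the toy stylesheet should yield the single line `dc:title = Hello`. A real RDFa distiller has to resolve prefixes, subjects, and datatypes, which this sketch deliberately leaves out.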
public void run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, Document in, ExtractionResult out) throws IOException, ExtractionException
Specified by: run in interface Extractor<Document>
Throws:
IOException
ExtractionException
public ExtractorDescription getDescription()
Specified by: getDescription in interface Extractor<Document>
Returns: the ExtractorDescription of this extractor
Copyright © 2010-2013 The Apache Software Foundation. All Rights Reserved.