public class FulltextExtractorWork extends AbstractWork
Nested classes/interfaces inherited from interface Work: Work.Progress, Work.State
Modifier and Type | Field and Description |
---|---|
protected static String | ANY2TEXT_CONVERTER |
protected static String | CATEGORY |
protected List<DocumentRef> | docsToUpdate |
protected DocumentModel | document |
static String | FULLTEXT_DEFAULT_INDEX |
protected FulltextConfiguration | fulltextConfiguration |
protected static int | HTML_MAGIC_OFFSET |
static String | SYSPROP_FULLTEXT_BINARY |
static String | SYSPROP_FULLTEXT_JOBID |
static String | SYSPROP_FULLTEXT_SIMPLE |
protected static String | TEXT_HTML |
protected static String | TITLE |
protected boolean | updateBinaryText: If true, update the binary text from the document. |
protected boolean | updateSimpleText: If true, update the simple text from the document. |
protected boolean | useJobId |
Fields inherited from class AbstractWork: callerThread, completionTime, docId, docIds, FAILURE_EXCEPTION, FAILURE_MSG, id, isTree, loginContext, originatingUsername, progress, RANDOM, repositoryName, schedulePath, schedulingTime, session, startTime, state, status, suspended, suspending, WORK_FAILED_EVENT, WORK_INSTANCE
Constructor and Description |
---|
FulltextExtractorWork(String repositoryName, String docId, boolean updateSimpleText, boolean updateBinaryText, boolean useJobId) |
Modifier and Type | Method and Description |
---|---|
protected String | blobToText(Blob blob): Converts the blob to text by calling a converter. |
protected void | extractAndUpdate() |
protected void | extractAndUpdateBinaryText() |
protected void | extractAndUpdateSimpleText() |
protected void | findDocsToUpdate() |
String | getCategory(): Gets the category for this work. |
protected String | getFulltextPropertyName(String name, String indexName) |
int | getRetryCount(): Gets the number of times that this Work instance can be retried in case of concurrent update exceptions. |
String | getTitle(): Gets a human-readable name for this work instance. |
protected void | initFulltextConfiguration() |
protected String | limitStringSize(String string, int maxSize) |
protected String | removeEntities(String string) |
protected String | removeHtml(String string) |
protected String | stringToText(String string) |
void | work(): This method should implement the actual work done by the Work instance. |
Methods inherited from class AbstractWork: buildWorkFailureEventProps, cleanUp, closeSession, commitOrRollbackTransaction, equals, getCompletionTime, getDocument, getDocuments, getId, getOriginatingUsername, getPartitionKey, getProgress, getSchedulePath, getSchedulingTime, getStartTime, getStatus, getWorkInstanceState, hashCode, initSession, initSession, isDocumentTree, isSuspending, isWorkInstanceSuspended, newDocumentLocation, openSystemSession, openUserSession, run, runWorkWithTransaction, setCompletionTime, setDocument, setDocument, setDocuments, setOriginatingUsername, setProgress, setSchedulePath, setStartTime, setStatus, setWorkInstanceState, setWorkInstanceSuspending, startTransaction, suspended, toString, workFailed
Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface Work: isCoalescing, isIdempotent
public static final String SYSPROP_FULLTEXT_SIMPLE
public static final String SYSPROP_FULLTEXT_BINARY
public static final String SYSPROP_FULLTEXT_JOBID
public static final String FULLTEXT_DEFAULT_INDEX
protected static final String CATEGORY
protected static final String TITLE
protected static final String ANY2TEXT_CONVERTER
protected static final int HTML_MAGIC_OFFSET
protected static final String TEXT_HTML
protected transient FulltextConfiguration fulltextConfiguration
protected transient DocumentModel document
protected transient List<DocumentRef> docsToUpdate
protected final boolean updateSimpleText
protected final boolean updateBinaryText
protected final boolean useJobId
public FulltextExtractorWork(String repositoryName, String docId, boolean updateSimpleText, boolean updateBinaryText, boolean useJobId)
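public String getCategory()
Description copied from interface: Work
Gets the category for this work. Used to choose an execution queue.
Specified by: getCategory in interface Work
Overrides: getCategory in class AbstractWork
Returns: the category, or null for the default
public String getTitle()
Description copied from interface: Work
Gets a human-readable name for this work instance.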
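public int getRetryCount()
Description copied from class: AbstractWork
Gets the number of times that this Work instance can be retried in case of concurrent update exceptions.
Overrides: getRetryCount in class AbstractWork
See Also: AbstractWork.work()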
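The retry count described above can be read as "one initial attempt plus up to N retries". The following is a minimal sketch of that reading only; it does not reproduce how AbstractWork schedules retries. The class `RetrySketch`, the exception `ConcurrentUpdateSketchException`, and the method `runWithRetries` are hypothetical names introduced for illustration and are not part of the Nuxeo API.

```java
/** Hypothetical illustration of retry-count semantics; not Nuxeo code. */
class RetrySketch {

    /** Stand-in for a concurrent update failure (assumed name, not a Nuxeo class). */
    static class ConcurrentUpdateSketchException extends RuntimeException {
        ConcurrentUpdateSketchException(String message) {
            super(message);
        }
    }

    /**
     * Runs the task once, then retries it up to retryCount more times
     * whenever it fails with a concurrent update.
     *
     * @return the total number of attempts that were made
     */
    static int runWithRetries(Runnable task, int retryCount) {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                task.run();
                return attempts;
            } catch (ConcurrentUpdateSketchException e) {
                // Give up once the initial attempt and all retries are used.
                if (attempts > retryCount) {
                    throw e;
                }
            }
        }
    }
}
```

Under this reading, a retry count of 0 means the task runs exactly once and any concurrent update failure is propagated immediately.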
public void work()
Description copied from interface: Work
This method should implement the actual work done by the Work instance.
It should periodically update its progress through Work.setProgress(org.nuxeo.ecm.core.work.api.Work.Progress).
To allow for suspension by the WorkManager, it should periodically call Work.isSuspending(), and if true call Work.suspended() and return early with saved state data.
Clean up can be implemented by Work.cleanUp(boolean, java.lang.Exception).
Specified by: work in interface Work
Specified by: work in class AbstractWork
See Also: Work.isSuspending(), Work.suspended(), Work.cleanUp(boolean, java.lang.Exception)
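The contract described for work() (report progress, poll for suspension, save state and return early) can be sketched with a plain Java stand-in. This is an illustration under stated assumptions, not the FulltextExtractorWork implementation: `SuspendableWorkSketch` and all of its members are hypothetical and only mirror the names isSuspending() and suspended() for readability.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a suspendable work item; not a Nuxeo class. */
class SuspendableWorkSketch {
    private final List<String> items;
    private final List<String> processed = new ArrayList<>();
    private volatile boolean suspending; // set by a manager to request suspension
    private boolean suspended;           // set by the work once it has stopped early
    private int nextIndex;               // saved state: index to resume from
    private float progressPercent;

    SuspendableWorkSketch(List<String> items) {
        this.items = items;
    }

    void requestSuspend() { suspending = true; }
    void resume() { suspending = false; suspended = false; }
    boolean isSuspending() { return suspending; }
    boolean isSuspended() { return suspended; }
    void suspended() { suspended = true; }
    List<String> getProcessed() { return processed; }
    float getProgressPercent() { return progressPercent; }

    /** Processes items one by one, checking for suspension before each item. */
    void work() {
        for (int i = nextIndex; i < items.size(); i++) {
            if (isSuspending()) {
                nextIndex = i; // save state so a later call can resume here
                suspended();
                return;        // return early, as the contract asks
            }
            processed.add(items.get(i).toUpperCase());
            progressPercent = 100f * (i + 1) / items.size();
        }
        nextIndex = items.size();
    }
}
```

Checking the flag once per item keeps the suspension latency bounded by the cost of a single item; a real work implementation would choose its own checkpoint granularity.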
protected void initFulltextConfiguration()
protected void findDocsToUpdate()
protected void extractAndUpdate()
protected void extractAndUpdateSimpleText()
protected void extractAndUpdateBinaryText()
protected String stringToText(String string)
protected String removeHtml(String string)
protected String removeEntities(String string)
protected String blobToText(Blob blob)
protected String limitStringSize(String string, int maxSize)
protected String getFulltextPropertyName(String name, String indexName)
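The method names stringToText, removeHtml, removeEntities and limitStringSize suggest a text clean-up pipeline, but this page does not document their behavior. The sketch below shows one plausible shape for such a pipeline, purely as an assumption: the class `TextCleanupSketch`, its methods `stripTags`, `stripEntities`, `truncate` and `clean`, and the regular expressions used are all hypothetical and may differ from what FulltextExtractorWork actually does.

```java
/** Hypothetical text clean-up pipeline; not the Nuxeo implementation. */
class TextCleanupSketch {

    /** Replaces anything that looks like an HTML tag with a space (assumed rule). */
    static String stripTags(String s) {
        return s.replaceAll("<[^>]*>", " ");
    }

    /** Replaces character entities such as &nbsp; or &#160; with a space (assumed rule). */
    static String stripEntities(String s) {
        return s.replaceAll("&[A-Za-z0-9#]+;", " ");
    }

    /** Cuts the string to at most maxSize characters. */
    static String truncate(String s, int maxSize) {
        return s.length() > maxSize ? s.substring(0, maxSize) : s;
    }

    /** Strips markup, collapses whitespace, then applies the size limit. */
    static String clean(String html, int maxSize) {
        String text = stripEntities(stripTags(html));
        text = text.replaceAll("\\s+", " ").trim();
        return truncate(text, maxSize);
    }
}
```

For example, `TextCleanupSketch.clean("<p>Hello&nbsp;<b>world</b></p>", 100)` yields `Hello world`, and the same input with a limit of 5 yields `Hello`.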
Copyright © 2019 Nuxeo. All rights reserved.