Troubleshooting
Status of the file indexing
The file indexer has a status page that displays information about the state of the indexer:
https://<server>/tsFileIndexingService/execute
The page also contains the good word "HEALTHY", which is displayed as long as the process has not exceeded the specified timeouts.
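The good word lends itself to automated monitoring. The following is a minimal sketch, not part of the product: it downloads the status page and checks whether the text contains "HEALTHY". The function names, the 10 second HTTP timeout, and the case-sensitive match are assumptions made for illustration. The URL must be supplied by the caller with the real server name filled in.

```python
import urllib.request

# Good word shown on the status page while no timeout has been exceeded.
GOOD_WORD = "HEALTHY"


def is_healthy(page_text: str) -> bool:
    """Return True if the status page text contains the good word.

    Assumption: the match is case-sensitive and the word appears verbatim.
    """
    return GOOD_WORD in page_text


def fetch_status(url: str, timeout_seconds: float = 10.0) -> str:
    """Download the status page and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def check_indexer(url: str) -> bool:
    """Return True only if the page could be fetched and reports HEALTHY."""
    try:
        return is_healthy(fetch_status(url))
    except OSError:
        # Covers connection failures and HTTP errors raised by urllib.
        return False
```

A monitoring job could call check_indexer with the status page URL and raise an alert whenever it returns False.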
Controlling timeouts
Timeouts are specified in seconds and should be tuned to the available CPU capacity and the quality of the documents:
<Parameter name="TimeoutTesseract" value="600"/>
<Parameter name="TimeoutGhostscript" value="60"/>
Poor-quality documents in virtualized environments can easily take about a minute per page.
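As a rough aid for choosing a value, the arithmetic can be sketched as below. This assumes that the Tesseract timeout applies to a whole document rather than to a single page, which is not stated here and should be verified. The function names and the safety factor of 1.5 are likewise illustrative assumptions.

```python
import math


def estimate_timeout(pages: int,
                     seconds_per_page: float = 60.0,
                     safety_factor: float = 1.5) -> int:
    """Estimate a timeout in seconds for a document with the given page count.

    Assumption: the timeout covers the whole document, not a single page.
    The default of 60 seconds per page reflects the worst case quoted above.
    """
    if pages < 1:
        raise ValueError("pages must be at least 1")
    return math.ceil(pages * seconds_per_page * safety_factor)


def pages_covered(timeout_seconds: int, seconds_per_page: float = 60.0) -> int:
    """Return how many worst-case pages fit within a given timeout."""
    return int(timeout_seconds // seconds_per_page)
```

Under these assumptions, a timeout of 600 seconds covers about 10 worst-case pages, and a 10-page document with the safety margin applied would call for 900 seconds.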
Debugging the OCR process
By default, output from the external components is written to log files. This can be disabled by adding the following option:
<Parameter name="SuppressCommandOutput" value="0"/>
Note that the configuration file (context.xml) contains a switch that can disable file deletion on the server:
<Parameter name="DisableFileCleanup" value=""/>
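When debugging, it can help to list which of these parameters are actually set. The sketch below reads Parameter elements from an XML document using the Python standard library. The enclosing Context element in the sample and the function name are assumptions for illustration, and the real context.xml may be laid out differently. The sample values are copied from the fragments on this page, including the empty value of DisableFileCleanup.

```python
import xml.etree.ElementTree as ET

# Sample input. The <Context> wrapper is an assumption about the file layout.
SAMPLE_CONTEXT = """
<Context>
  <Parameter name="TimeoutTesseract" value="600"/>
  <Parameter name="TimeoutGhostscript" value="60"/>
  <Parameter name="SuppressCommandOutput" value="0"/>
  <Parameter name="DisableFileCleanup" value=""/>
</Context>
"""


def read_parameters(xml_text: str) -> dict[str, str]:
    """Return a name-to-value mapping for every Parameter element found."""
    root = ET.fromstring(xml_text)
    return {
        element.get("name"): element.get("value", "")
        for element in root.iter("Parameter")
        if element.get("name") is not None
    }


if __name__ == "__main__":
    # Print each parameter so the effective settings can be checked at a glance.
    for name, value in read_parameters(SAMPLE_CONTEXT).items():
        print(f"{name} = {value!r}")
```

To inspect a real installation, the text of context.xml can be read from disk and passed to read_parameters in place of the sample.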