Install TS indexing service
Install war file
cd /usr/share/tomcat7/webapps/
sudo wget https://www.tempusserva.dk/install/tsFileIndexingService.war
A few seconds later you can configure the data connection and the paths to the OCR libraries
sudo nano /usr/share/tomcat7/conf/Catalina/localhost/tsFileIndexingService.xml
(or, depending on the Linux distribution)
sudo nano /etc/tomcat7/Catalina/localhost/tsFileIndexingService.xml
Example configurations can be seen below
Restart server after changes
tstomcatrestart
Windows example configuration
<?xml version="1.0" encoding="UTF-8"?>
<Context antiJARLocking="true" path="/tsFileIndexingService">
  <Resource name="jdbc/TempusServaLive" auth="Container" type="javax.sql.DataSource"
            maxActive="80" maxIdle="30" maxWait="2000"
            removeAbandoned="true" removeAbandonedTimeout="60" logAbandoned="true"
            validationQuery="SELECT 1" validationInterval="30000" testOnBorrow="true"
            username="root" password="TempusServaFTW!"
            driverClassName="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/tslive?autoReconnect=true" />
  <Parameter name="ExecutableImageMagick" value="c:\ImageMagick\convert"/>
  <Parameter name="ExecutableGhostscript" value="c:\Program Files\gs\gs9.20\bin\gswin64c.exe"/>
  <Parameter name="ExecutableTesseract" value="c:\Program Files (x86)\Tesseract-OCR\tesseract"/>
  <Parameter name="LanguagesTesseract" value="eng+dan"/>
  <Parameter name="ElasticServerAddress" value="localhost"/>
</Context>
Linux example configuration
<?xml version="1.0" encoding="UTF-8"?>
<Context antiJARLocking="true" path="/tsFileIndexingService">
  <Resource name="jdbc/TempusServaLive" auth="Container" type="javax.sql.DataSource"
            maxActive="80" maxIdle="30" maxWait="2000"
            removeAbandoned="true" removeAbandonedTimeout="60" logAbandoned="true"
            validationQuery="SELECT 1" validationInterval="30000" testOnBorrow="true"
            username="root" password="TempusServaFTW!"
            driverClassName="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/tslive?autoReconnect=true" />
  <Parameter name="ExecutableImageMagick" value="/usr/bin/convert"/>
  <Parameter name="ExecutableGhostscript" value="/usr/bin/ghostscript"/>
  <Parameter name="ExecutableTesseract" value="/usr/bin/tesseract"/>
  <Parameter name="LanguagesTesseract" value="eng+dan"/>
  <Parameter name="ElasticServerAddress" value="localhost"/>
</Context>
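A wrong executable path in the context file is easy to miss, so it may help to check the paths before restarting. The following is a minimal sketch, not part of the Tempus Serva distribution: the function names list_executables and check_executables are hypothetical, and the script assumes each Parameter element keeps its name and value attributes on one line and in that order, as in the examples above.

```shell
#!/bin/sh
# Print the value of every <Parameter name="Executable..."> in a context file.
# Assumption: name="..." is immediately followed by value="..." on the same line.
list_executables() {
    grep -o 'name="Executable[^"]*" value="[^"]*"' "$1" |
        sed 's/.*value="\([^"]*\)"$/\1/'
}

# Report, for each configured executable, whether it exists and is executable.
# Returns 1 if at least one path is missing, 0 otherwise.
check_executables() {
    status=0
    while IFS= read -r exe; do
        # Skip the empty line produced when nothing was found.
        [ -n "$exe" ] || continue
        if [ -x "$exe" ]; then
            echo "OK $exe"
        else
            echo "MISSING $exe"
            status=1
        fi
    done <<EOF
$(list_executables "$1")
EOF
    return $status
}
```

On a Linux server this could be run as: check_executables /usr/share/tomcat7/conf/Catalina/localhost/tsFileIndexingService.xml (add sudo if the file is not readable by your user).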
Enable and test indexing in Tempus Serva
Set the following configurations to true
- fulltextIndexData
- fulltextIndexFile
Also add port 8080 to the following URL
- fulltextFileHandlerURL
Update any record in the TS installation
Check that the index is created and that there is a mapping for the solution
curl 'http://localhost:9200/tempusserva/?pretty'
Next, validate that records are found when searched for (replace * with a valid search string)
curl 'http://localhost:9200/tempusserva/_search?pretty&q=*'
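When the search string contains spaces or special characters, it has to be URL-encoded before it goes into the q parameter. A small sketch of one way to do that with curl's -G and --data-urlencode options follows. The helper name search_index and the ES_URL and CURL variables are illustrative assumptions; only the index name tempusserva and port 9200 come from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: search the "tempusserva" index for a term.
# ES_URL and CURL are illustrative variables, not Tempus Serva settings.
ES_URL=${ES_URL:-http://localhost:9200}
CURL=${CURL:-curl}

search_index() {
    # -G appends the --data-urlencode pairs to the URL as an encoded
    # query string, so the term does not need manual escaping.
    "$CURL" -s -G "$ES_URL/tempusserva/_search" \
        --data-urlencode "pretty=true" \
        --data-urlencode "q=$1"
}
```

For example, search_index "annual report" sends the same request as the curl command above, with the term encoded.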
Finally validate that the Tempus Serva wrapper also works
http://<server>/TempusServa/fulltextsearch?subtype=4&term=*
Optional OCR components
Some libraries must be installed (ghostscript is probably already installed)
sudo yum install ImageMagick
sudo yum install ghostscript
Also install tesseract
CentOS/Fedora
sudo yum install tesseract-ocr
Amazon Linux
sudo yum --enablerepo=epel --disablerepo=amzn-main install libwebp
sudo yum --enablerepo=epel --disablerepo=amzn-main install tesseract
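The example configurations set LanguagesTesseract to eng+dan, so the matching Tesseract language data presumably has to be installed as well. The sketch below compares a required language list against the output of tesseract --list-langs. The function name missing_langs is hypothetical, and the exact output format of --list-langs (a header line followed by one language per line) is assumed from common Tesseract versions.

```shell
#!/bin/sh
# Print each language from a "+"-separated list (e.g. "eng+dan") that is
# absent from a newline-separated list of installed languages.
missing_langs() {
    required=$1
    installed=$2
    echo "$required" | tr '+' '\n' | while IFS= read -r lang; do
        # -x matches whole lines only, -F treats the language code literally.
        if ! printf '%s\n' "$installed" | grep -qxF "$lang"; then
            echo "$lang"
        fi
    done
}
```

A possible invocation is: missing_langs "eng+dan" "$(tesseract --list-langs 2>&1)". Any language it prints is not installed. The 2>&1 is there because some Tesseract versions write the list to standard error.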
Afterwards change the configurations in the file indexer
sudo nano /usr/share/tomcat7/conf/Catalina/localhost/tsFileIndexingService.xml
The values should be
- ExecutableTesseract: /usr/bin/tesseract
- ExecutableImageMagick: /usr/bin/convert
- ExecutableGhostscript: /usr/bin/ghostscript
After changing the values restart the server.