Automatic Phrasing Token Filter in Lucidworks Fusion


How do you add the ability to detect and tokenize phrases in Fusion using the Automatic Phrasing Token Filter? The steps below show how to do it.

Configuration Settings:

1. Build the JAR from this GitHub repository: [https://github.com/lucidworks/auto-phrase-tokenfilter]. Copy the JAR [“auto-phrase-tokenfilter-1.0.jar”] to Solr.
Location to copy the JAR to: “/opt/fusion/<version>/apps/solr-dist/dist”

2. Update solrconfig.xml

Add the lib reference:

<lib dir="${solr.install.dir:../../../..}/dist/" regex="auto-phrase-tokenfilter.*\.jar" />
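
As a rough sketch of placement: lib directives sit as direct children of the root config element in solrconfig.xml. The comment placeholders below are assumptions standing in for whatever the existing file already contains.

```xml
<config>
  <!-- existing settings (assumed; leave them as they are) -->

  <!-- loads the auto-phrase JAR copied into the Solr dist directory in step 1 -->
  <lib dir="${solr.install.dir:../../../..}/dist/" regex="auto-phrase-tokenfilter.*\.jar" />

  <!-- remaining settings (assumed; leave them as they are) -->
</config>
```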

3. Add a new text file “autophrases.txt” to Solr Config

  • Go to System –> Solr Config
  • Enter the filename and keep the default location for the file in Path.
  • Save the changes

4. Add “AutoPhrasingTokenFilterFactory” filter to managed-schema

  • Go to System –> Solr Config –> managed-schema
  • Add the line below to the index analyzer of the specific fieldType:

<filter class="com.lucidworks.analysis.AutoPhrasingTokenFilterFactory" phrases="autophrases.txt" includeTokens="true" />

  • Add the line below to the query analyzer of the same fieldType:

<filter class="com.lucidworks.analysis.AutoPhrasingTokenFilterFactory" phrases="autophrases.txt" />
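
To show where the two filter lines might sit, here is a minimal sketch of a complete fieldType. The name "text_autophrase", the choice of tokenizer, and the lowercase filter are assumptions for illustration only; in practice the two filter lines go into whichever fieldType already backs the fields to be phrased.

```xml
<!-- hypothetical fieldType; name, tokenizer, and lowercase filter are assumed -->
<fieldType name="text_autophrase" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <!-- index side: filter line from step 4, with includeTokens="true" -->
    <filter class="com.lucidworks.analysis.AutoPhrasingTokenFilterFactory" phrases="autophrases.txt" includeTokens="true" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <!-- query side: filter line from step 4, without includeTokens -->
    <filter class="com.lucidworks.analysis.AutoPhrasingTokenFilterFactory" phrases="autophrases.txt" />
  </analyzer>
</fieldType>
```

Placing the lowercase filter before the auto-phrasing filter is a design choice in this sketch: it assumes the entries in autophrases.txt are written in lowercase, so that they match the lowercased tokens.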

Example:
Add “swim short” to autophrases.txt and save the file.

Then validate the change in Solr Admin, for example by analyzing the text “swim short” against the fieldType on the Analysis screen.

