Language Support
waynegraham edited this page Aug 16, 2012 · 1 revision
The SolrSearch plugin is configured by default to index English text, but it also includes configurations for the following languages:
- Arabic
- Bulgarian
- Catalan
- CJK bigram
- Czech
- Danish
- German
- Greek
- Spanish
- Basque
- Persian
- Finnish
- French
- Irish
- Galician
- Hindi
- Hungarian
- Armenian
- Indonesian
- Italian
- Hebrew
- Japanese (using morphological analysis)
- Latvian
- Dutch
- Norwegian
- Portuguese
- Romanian
- Russian
- Swedish
- Thai
- Turkish
To use a different indexing schema, you will need to change SolrSearch's fulltext entry in schema.xml. Find the line that reads:
<field name="fulltext" type="text_en" indexed="true" stored="false" multiValued="true"/>
Change the type attribute to the field type for the language you need, in the format 'text_ISOcode', where ISOcode is typically the language's two-letter ISO 639-1 code. For example, if you have a collection of Irish Gaelic texts, you would change the line to the following:
<field name="fulltext" type="text_ga" indexed="true" stored="false" multiValued="true"/>
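Whichever type you choose must match the name of a fieldType declared in the same schema.xml. As a rough illustration, an abridged Irish definition might look like the following. The analyzer chain shown here is an assumption modeled on Solr's stock example schema, not copied from the plugin, so check the plugin's own schema.xml for the actual definition.

```xml
<!-- Abridged, hypothetical sketch: the definition shipped with the plugin
     may include additional filters (stop words, elision, and so on). -->
<fieldType name="text_ga" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split text into word tokens -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase with Irish-specific rules -->
    <filter class="solr.IrishLowerCaseFilterFactory"/>
    <!-- Reduce words to their stems using the Snowball Irish stemmer -->
    <filter class="solr.SnowballPorterFilterFactory" language="Irish"/>
  </analyzer>
</fieldType>
```

If no fieldType with the chosen name exists, Solr will typically refuse to load the schema, so it is worth confirming the name before restarting. Documents indexed under the old field type usually need to be reindexed for the change to take effect.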
If your texts use a non-Western character set, make sure the servlet container running Solr (e.g., Tomcat or Jetty) is configured to support UTF-8. See "Configuring Tomcat to provide UTF-8 support for Solr" for a Tomcat example.
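For Tomcat, UTF-8 handling of request URIs is commonly enabled by setting the URIEncoding attribute on the HTTP connector in conf/server.xml. The fragment below is a minimal sketch rather than a copy of the referenced page. The port, timeout, and redirect port values are Tomcat's usual defaults and are assumptions about your installation.

```xml
<!-- conf/server.xml (assumed location in a standard Tomcat install) -->
<!-- URIEncoding="UTF-8" makes Tomcat decode query strings as UTF-8, -->
<!-- so non-ASCII search terms reach Solr intact. -->
<Connector port="8080"
           protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8"/>
```

Tomcat must be restarted for a connector change to take effect.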