Use the Extended DisMax query parser and support all queries. #142

Open
wants to merge 1 commit into
base: develop

@kloor kloor commented Apr 3, 2017

Currently, SolrSearch uses the standard query parser. This parser has strict syntax requirements, so it is easy to write a query that it cannot process; one example is a single unmatched double quote, as noted in #137. SolrSearch has tried to work around some of these syntax issues by replacing colons with spaces and removing square brackets from query strings. Those workarounds cause problems of their own, such as preventing users from searching against a specific field.
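To illustrate why the existing workaround blocks field searches, here is a minimal Python sketch of the behaviour described above. The function name `legacy_sanitize` is hypothetical and this is not the plugin's actual code; it only mimics "replace colons with spaces, remove square brackets".

```python
def legacy_sanitize(query):
    """Mimic the described workaround (hypothetical helper, not plugin code)."""
    # Colons become spaces, so a field prefix such as title: is lost.
    query = query.replace(":", " ")
    # Square brackets are dropped, so range syntax such as [a TO b] is lost.
    return query.replace("[", "").replace("]", "")


# The field qualifier is gone after sanitizing, so Solr sees plain terms.
print(legacy_sanitize('title:"search string"'))
```

After sanitizing, `title` is just another search term rather than a field name, which is the limitation this pull request removes.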

Solr also provides a query parser known as Extended DisMax or eDisMax. This parser is much more tolerant of non-standard syntax, automatically escaping characters as necessary:
https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser

This pull request makes all queries use the eDisMax query parser by adding {!edismax} to the start of the query string on line 142. This seems like the simplest way to improve query parsing for plugin users, but it means that changing the parser in the future requires editing that line of code.

An alternative change could be made to the solrconfig.xml file, as discussed in #139. This would require plugin users to alter or replace the file copied to their Solr installation and reload the core in Solr. However, it would allow the query parser to be changed without editing the plugin's code.
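For reference, the solrconfig.xml approach would typically look something like the fragment below. This is a sketch only: the handler name `/select` is an assumption, and the handler the plugin's bundled solrconfig.xml actually defines may differ. Setting `defType` in the handler defaults is the standard Solr way to choose the default query parser.

```xml
<!-- Sketch: handler name is assumed; adapt to the existing solrconfig.xml. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Use eDisMax for the main query unless the request overrides it. -->
    <str name="defType">edismax</str>
  </lst>
</requestHandler>
```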

Beyond enabling the eDisMax query parser, this pull request also makes changes to how query strings are formed:

  1. No special characters are removed from the query string, leaving that issue to the eDisMax parser. This allows for field matching such as title:"search string".
  2. The user's query string is wrapped in parentheses, so the entire string will be ANDed with the facet and public search terms.
  3. Plus signs are added to the facet and public search terms so that they are required when searching with eDisMax. Without plus signs, the terms would only be boosted in the search results, but non-matches would also be included. The change was handled here for facets so existing bookmarks would not be impacted.
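
The three changes above can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's code: `build_query` and `required_term` are hypothetical names, the `public:"true"` field and value are assumed, and the `*:*` fallback for an empty query is an addition for the sake of a runnable example.

```python
def required_term(field, value):
    """Return a required (+) clause with the value quoted (hypothetical helper)."""
    # Escape backslashes and double quotes so the value stays inside the quotes.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '+{}:"{}"'.format(field, escaped)


def build_query(user_query, facets=(), public_only=True):
    """Assemble an eDisMax query string (sketch, not the plugin's code)."""
    user_query = user_query.strip()
    # 1. No special characters are stripped from the user's text.
    # 2. The text is wrapped in parentheses so it forms a single clause.
    #    Assumption: fall back to match-all when the query is empty.
    clauses = ["(" + user_query + ")" if user_query else "*:*"]
    # 3. Facet and public terms get a plus sign so they are required.
    for field, value in facets:
        clauses.append(required_term(field, value))
    if public_only:
        clauses.append(required_term("public", "true"))  # assumed field name
    # The local-params prefix selects the eDisMax parser for the whole string.
    return "{!edismax}" + " ".join(clauses)


print(build_query('title:"search string"', [("collection", "Maps")]))
```

Because the plus signs are added when the query is assembled rather than stored in the facet parameters themselves, previously saved facet URLs would keep working, which matches the bookmark concern noted in item 3.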
