Solr

Diacritic

Indexing.

SolrRelevancyFAQ. Relevancy is the quality of results returned from a query, encompassing both which documents are found and their relative ranking (the order in which they are returned to the user).

Should I use the standard or dismax Query Parser? The standard Query Parser uses SolrQuerySyntax to specify the query via the q parameter; the query must be well formed or an error will be returned. It's good for specifying exact, arbitrarily complex queries. The DisMax Query Parser is more forgiving in how it parses the q parameter, which makes it useful for directly passing in a user-supplied query string. Its other parameters make it easy to search across multiple fields using disjunctions and sloppy phrase queries to return highly relevant results. For servicing user-entered queries, start by using dismax.
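As a rough illustration of the difference between the two parsers, the sketch below builds the request parameters for each style in Python. The field names (title, subject) and the phrase slop value are assumptions chosen for the example, and the helper functions are hypothetical; only the parameter names q, defType, qf, pf and ps are Solr's own.

```python
from urllib.parse import urlencode


def standard_query(term, fields):
    """Build params for the standard parser: every field is named explicitly in q."""
    # The caller is responsible for producing well-formed query syntax.
    q = " ".join(f"{field}:{term}" for field in fields)
    return urlencode({"q": q})


def dismax_query(user_input, fields, phrase_slop=2):
    """Build params for the DisMax parser: raw user input in q, fields in qf."""
    params = {
        "defType": "dismax",
        "q": user_input,          # passed through as typed by the user
        "qf": " ".join(fields),   # fields searched as a disjunction
        "pf": " ".join(fields),   # fields used for the sloppy phrase boost
        "ps": phrase_slop,        # phrase slop (illustrative value)
    }
    return urlencode(params)


if __name__ == "__main__":
    print(standard_query("superman", ["title", "subject"]))
    print(dismax_query("superman returns", ["title", "subject"]))
```

The resulting strings would be appended to a select URL of your own Solr instance; the host and core are deliberately left out because they depend on the installation.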

Solr3.1: From Solr 3.1 we recommend starting with the new Extended Dismax parser, enabled by defType=edismax.

How can I search for "superman" in both the title and subject fields? q=title:superman subject:superman

index-time boosts.

OmitNorm. On Wednesday 28 May 2008 01:37:57 Otis Gospodnetic wrote: If you have tokenized fields of variable size and you want the field length to affect the relevance score, then you do not want to omit norms. Omitting norms is good for fields where length is of no importance (e.g. gender="Male" vs. gender="Female"). Omitting norms saves you heap/RAM, one byte per doc per field without norms, I believe.

I am also toying with the hypothesis that omitting the field norm may be a good idea for title fields in languages with compound words, since such titles typically consist of only a few words. On our server we use a German language stemmer in conjunction with a compound word tokenizer, which inserts extra tokens into the stream.

With typical short titles, such as "Elterntagung mit Rekordbeteiligung", which is tokenized (before stemming) as: elterntagung eltern tagung mit rekordbeteiligung rekord beteiligung, the title ends up having 7 tokens instead of 3 or even 5, which significantly affects the field norms.

Auto-Suggest From Popular Queries Using EdgeNGrams.

Search results for lucene term frequency.

TermFreq always = 1?
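To put a number on the field-norm effect in the German title example, here is a small sketch. It assumes the classic Lucene TF-IDF length normalization of 1/sqrt(number of tokens), with no index-time boost, and it ignores the lossy one-byte encoding Lucene applies when storing norms, so the values are approximate. The function name is hypothetical.

```python
import math


def length_norm(num_tokens):
    """Approximate classic length norm: 1 / sqrt(number of tokens in the field)."""
    # Assumption: no index-time boost, and no one-byte rounding of the stored norm.
    return 1.0 / math.sqrt(num_tokens)


# Token stream from the example, after the compound word tokenizer.
tokens = "elterntagung eltern tagung mit rekordbeteiligung rekord beteiligung".split()

original_words = 3              # "Elterntagung mit Rekordbeteiligung"
with_compounds = len(tokens)    # 7 tokens after decompounding

norm_plain = length_norm(original_words)
norm_compound = length_norm(with_compounds)

if __name__ == "__main__":
    print(f"norm with {original_words} tokens: {norm_plain:.3f}")
    print(f"norm with {with_compounds} tokens: {norm_compound:.3f}")
    print(f"ratio: {norm_compound / norm_plain:.3f}")
```

Under these assumptions the decompounded title scores roughly two thirds of what the three-word title would on the norm factor alone, which is the kind of shift that omitting norms on the field would avoid.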