Re: Search substring in field

Emir Arnautovic Wed, 10 May 2017 07:59:34 -0700

Hi,

Solr works on top of a data structure called an inverted index <https://en.wikipedia.org/wiki/Inverted_index>. You can misuse it by not inverting your documents and using regex or wildcards to find matches, but that is not the way it is meant to be used, and it will be significantly slower.


Solr does support a subset of regex; the syntax for that is field:/regex/

Solr also supports wildcards: * and ?

In any case, you have to be aware that these match tokens, so you have to set up your analysis properly to make them work (at the very least, you need to lowercase if you want matching to be case insensitive).
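
To illustrate the point about tokens, here is a toy sketch in Python. It is not how Lucene is implemented; the function names and sample documents are made up for illustration. It only shows that an exact token lookup is a single dictionary hit, while a *fragment* style match has to look at every distinct token.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercased, whitespace-separated token to the ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def term_lookup(index, term):
    # Exact token match: one dictionary lookup.
    return set(index.get(term.lower(), set()))

def substring_scan(index, fragment):
    # Wildcard-style match (*fragment*): every distinct token must be checked.
    fragment = fragment.lower()
    hits = set()
    for token, doc_ids in index.items():
        if fragment in token:
            hits |= doc_ids
    return hits

# Made-up sample documents, for illustration only.
docs = {1: "Hello word", 2: "hello there", 3: "password reset"}
index = build_inverted_index(docs)
```

Note that the substring scan for "word" also picks up the document containing "password", which is usually not what you want from a search.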



On 09.05.2017 19:15, jnobre wrote:

Hello,

Thanks for your response.

I understand the concept, but I do not know which one to use in my case, nor
exactly what the difference between the analyses is.

1- At this moment I search for
"source": * "hello word" * or url =
http://XXXX:8983/solr/AWP10/select?Indent=on&q=source:*%22hello%20world%22*&wt=json

If you index source as a string (single token) you can search with wildcards, but you have to escape spaces - source:*hello\ word*

or you can use regex - source:/.*hello word.*/

If you index it as text, it will be tokenized into the tokens "hello" and "word", and then you can use a phrase query - source:"hello word" - this is the recommended way.
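
As a rough sketch, the three query forms above could be built from Python like this. The helper names are hypothetical, and the base URL is the placeholder from your message. The regex helper does not escape regex metacharacters in the text, so it is only safe for plain words.

```python
from urllib.parse import urlencode

def wildcard_query(field, text):
    # Escape spaces so the whole value is treated as a single term.
    escaped = text.replace(" ", "\\ ")
    return field + ":*" + escaped + "*"

def regex_query(field, text):
    # Assumes text contains no regex metacharacters.
    return field + ":/.*" + text + ".*/"

def phrase_query(field, text):
    # For a tokenized (text) field.
    return field + ':"' + text + '"'

def select_url(base, query):
    # urlencode takes care of quotes, colons, backslashes and spaces.
    return base + "/select?" + urlencode({"q": query, "wt": "json"})

url = select_url("http://XXXX:8983/solr/AWP10", phrase_query("source", "hello word"))
```

Letting urlencode do the escaping avoids hand-editing %22 and %20 into the URL, which is where stray characters tend to creep into the query.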


For example, one line of the answer:
    "source":
["http://www.gravatar.com/avatar/ad516503a11cd5ca435acc9bb6523536?s=32"]

The expression does not appear and even then the line is returned.

You can use debugQuery=true to see how the query is parsed - the one you sent uses match all on the default field.
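
A minimal sketch of adding that parameter from Python, reusing the query and the placeholder host from your message (debug_url is a hypothetical helper name). The parsed query is then reported in the debug section of the response.

```python
from urllib.parse import urlencode

def debug_url(base, query):
    # Same select request, with debugQuery=true appended.
    params = {"q": query, "wt": "json", "debugQuery": "true"}
    return base + "/select?" + urlencode(params)

url = debug_url("http://XXXX:8983/solr/AWP10", 'source:*"hello world"*')
```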


2 - My idea was to identify a url in the middle of a string with regex, for
example, as it does in Java:
Eur-lex.europa.eu eur-lex.europa.eu eur-lex.europa.eu Eur-lex.europa.eu
eur-lex.europa.eu
I do not know what the syntax is for entering regex in the search.

The proper way is to use analysis to split the URL into tokens and then to search for an exact match. Analysis could include:

1. replacing / with a space
2. white space tokenizer
3. removing 'www.'
4. ignoring http
...
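
To get a feel for what such a chain would produce, here is a rough Python stand-in for the steps listed above. It is only a simulation, not Solr configuration; in Solr the same thing would be done with char filters, a tokenizer and token filters in the field type. The function name is made up, the steps run in a different order than the list, and the lowercasing is an extra step I am assuming you want.

```python
import re

def url_tokens(url):
    """Rough simulation of an index-time analysis chain for URLs."""
    text = url.lower()                       # assumption: case-insensitive matching
    text = re.sub(r"^https?://", "", text)   # 4. ignore http(s)
    text = re.sub(r"^www\.", "", text)       # 3. remove a leading 'www.'
    text = text.replace("/", " ")            # 1. replace / with a space
    return text.split()                      # 2. whitespace tokenizer

tokens = url_tokens("http://www.gravatar.com/avatar/ad516503a11cd5ca435acc9bb6523536?s=32")
```

With tokens like these, a search for the host becomes an exact token match instead of a regex over the whole string.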


3- I can use the multiplication function, but I do not know the search syntax to
filter on the value it returns.

Again, if you always query the product of the same fields, you might want to create a field containing that value (e.g. field prod) and then use a range query - prod:[10 TO 20]

If you have two numeric fields (e.g. a and b) you can filter out docs using frange in a filter query:

  fq={!frange l=10 u=20}product(a,b)
If you need to return that value, you need to add it to fl:
  fl=*,prod:product(a,b)
This will return all stored fields and the product as 'prod'.
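
Put together as request parameters, a sketch in Python could look like this. The helper name is hypothetical, fq is Solr's filter query parameter, and q=*:* is my assumption for the main query; use whatever main query you actually have.

```python
from urllib.parse import urlencode

def product_filter_params(field_a, field_b, lower, upper, alias="prod"):
    func = f"product({field_a},{field_b})"
    return {
        "q": "*:*",  # assumption: match all, then filter
        "fq": f"{{!frange l={lower} u={upper}}}{func}",  # keep docs with lower <= a*b <= upper
        "fl": f"*,{alias}:{func}",  # all stored fields plus the product under the alias
    }

params = product_filter_params("a", "b", 10, 20)
query_string = urlencode(params)  # ready to append to .../select?
```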

HTH,
Emir







--
View this message in context: 
http://lucene.472066.n3.nabble.com/Search-substring-in-field-tp4333553p4334316.html
Sent from the Solr - User mailing list archive at Nabble.com.


--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/
