complex keywords, hierarchical data, Solr representation problem

2012-01-08 Thread jimmy
Hi, I'm new to Solr and already highly impressed by its possibilities and speed. Until now, I have only used a relational database (MySQL) and programmed everything in PHP or Java. Now I'm stuck and don't know how to represent my data in a Solr index. To simplify things, first I want

Re: Detecting query errors with SolrJ

2012-01-08 Thread Shawn Heisey
On 1/6/2012 3:57 PM, Michael Sokolov wrote: See SOLR-141; there are a few patches - currently all you get back is a 400 error with no actual information equivalent to what is logged in the solr exception. If I can get the HTTP code and the text that's in the error you see in the browser, that

Re: complex keywords, hierarchical data, Solr representation problem

2012-01-08 Thread Ted Dunning
Option 3 is preferable because you can use phrase queries to get interesting results, as in "color light beige" or "color light". Normalizing is bad in this kind of environment. On Sun, Jan 8, 2012 at 11:35 AM, jimmy wrote: > ... > First Table KEYWORDS: > keyword_id, keyword > 1, white horse > 2

stopwords as privacy measure

2012-01-08 Thread Michael Lissner
I have a unique use case where I have words in my corpus that users shouldn't ever be allowed to search for. My theory is that if I add these to the stopwords list, that should do the trick. I'm using the edismax parser and it seems to be working in my dev environment. Is there any risk to thi

Re: Solr Scoring question

2012-01-08 Thread Esteban Donato
Filter queries (fq) are not included in the score calculation; only the query in the q parameter is used for that purpose. That's why, although you get the same results, Lucene will use just q=*:* in your 1st query and q=tag:car in your 2nd query to calculate the scores. As you can see since both queries
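A minimal sketch of the distinction drawn in this reply, assuming the first request was q=*:* with fq=tag:car and the second was q=tag:car alone (the snippet does not show the original requests). The host, port, and the helper names `build_select_url` and `scoring_query` are hypothetical and only illustrate which parameter feeds scoring.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical Solr base URL; adjust to your own installation.
BASE = "http://localhost:8983/solr"


def build_select_url(base_url, q, fq=None):
    """Build a /select URL with a main query and an optional filter query."""
    params = [("q", q)]
    if fq is not None:
        params.append(("fq", fq))
    return base_url + "/select?" + urlencode(params)


def scoring_query(url):
    """Return the only part of the request that contributes to the score: q."""
    return parse_qs(urlsplit(url).query)["q"][0]


# Assumed first request: match everything, then restrict with a filter.
first = build_select_url(BASE, "*:*", fq="tag:car")
# Assumed second request: the same restriction expressed as the main query.
second = build_select_url(BASE, "tag:car")

print(scoring_query(first))   # the match-all query drives scoring here
print(scoring_query(second))  # the tag query drives scoring here
```

Both requests select the same documents, but only the second lets the tag match influence ranking, since the filter in the first only narrows the result set.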

Doing url search in solr is slow

2012-01-08 Thread yu shen
Hi, my Solr documents have up to 20 fields, containing data such as product name, date, url etc. The volume of documents is around 1.5m. My symptom is that a url search like [ url:*www.someurl.com* referal_url:*www.someurl.com* page_url:*www.someurl.com*] will get an extraordinarily long response ti

Re: Doing url search in solr is slow

2012-01-08 Thread Arian
I face a similar problem. Facet queries on URI fields are slower than on other field types. I don't know why. Arian

Re: Detecting query errors with SolrJ

2012-01-08 Thread Michael Sokolov
It's possible this may have changed in a recent release, and I don't know about it, but when I last checked, the only information you could get out of solrj was a SolrServerException with some very limited info - basically - an error occurred on the server, and maybe HTTP 400. When you say you

Re: stopwords as privacy measure

2012-01-08 Thread Ted Dunning
On Sun, Jan 8, 2012 at 3:33 PM, Michael Lissner < mliss...@michaeljaylissner.com> wrote: > I have a unique use case where I have words in my corpus that users > shouldn't ever be allowed to search for. My theory is that if I add these > to the stopwords list, that should do the trick. > That shou

Re: stopwords as privacy measure

2012-01-08 Thread Gora Mohanty
On Mon, Jan 9, 2012 at 5:03 AM, Michael Lissner wrote: > I have a unique use case where I have words in my corpus that users > shouldn't ever be allowed to search for. My theory is that if I add these to > the stopwords list, that should do the trick. Yes, that should work. Are you including the

Re: stopwords as privacy measure

2012-01-08 Thread Michael Lissner
I've got them configured at index and query time, so sounds like I'm all set. I'm doing anonymization of social security numbers, converting them to xxx-xx-. I don't *think* users can find a way of identifying these docs if the stopwords-based block works. Thank you both for the confirma
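A toy illustration of why the stopword list has to apply at both index and query time, as confirmed in this thread. This is a plain-Python simulation, not Solr's own stop filter; the stopword "secret", the sample documents, and the function names are all hypothetical.

```python
# Hypothetical stopword list; in Solr this would live in the stopwords file.
STOPWORDS = {"secret"}


def analyze(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


def build_index(docs):
    """Build a simple inverted index: token -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every analyzed query token."""
    tokens = analyze(query)
    if not tokens:
        # The whole query was stopwords, so nothing can match.
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


docs = {1: "secret report", 2: "public report"}
index = build_index(docs)

print(search(index, "secret"))  # the blocked term never reaches the index
print(search(index, "report"))  # ordinary terms still match both documents
```

Because the blocked term is removed during indexing, it never appears in the index, and because it is also removed from queries, a search for it cannot be used to single out the documents that contained it.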