Solr in a distributed multi-machine high-performance environment

2008-01-15 Thread Srikant Jakilinki
Hi All, Our group needs to index and search several million documents (TREC) in real time, with millisecond response times. For the moment we prefer scale-out (adding more commodity machines) over scale-up (faster disks, more RAM). This is in-turn in

Re: wildcards and German umlauts

2008-01-15 Thread Daniel Naber
On Tuesday, 15 January 2008, Alexey Shakov wrote: > Index searching works if I type a complete word (such as "übersicht"). > But there are no hits if I use wildcards (such as "über*"). > Searching with wildcards and without umlauts works as well. Maybe this describes your problem on the Lucene le

Re: Missing Content Stream

2008-01-15 Thread Ismail Siddiqui
Thanks Brian and Otis, I will definitely try SolrJ. But actually the problem is now resolved by setting the content length in the header, which I was missing: c.setRequestProperty("Content-Length", xmlText.length()+""); Now it is not throwing any error, but it is not indexing the document either. Do I have to set
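
A side note on the header set above: HTTP's Content-Length counts bytes of the encoded body, while String.length() counts UTF-16 code units, so the two differ as soon as the XML contains non-ASCII text. The following is a minimal sketch, assuming the body is sent as UTF-8; the class name, helper name, and sample document are hypothetical and not taken from the thread.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Solr or SolrJ.
class ContentLengthSketch {

    /** Number of bytes the text occupies once encoded as UTF-8. */
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Assumed sample update document containing one non-ASCII character (ü).
        String xmlText =
            "<add><doc><field name=\"id\">\u00fcbersicht</field></doc></add>";

        // The character count and the UTF-8 byte count differ by one here,
        // because ü needs two bytes in UTF-8.
        System.out.println("chars: " + xmlText.length());
        System.out.println("bytes: " + utf8Length(xmlText));
    }
}
```

If the header is set by hand, the byte count is the value that matches what is actually written to the connection.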

Re: best way to get number of documents in a Solr index

2008-01-15 Thread Maria Mosolova
Thanks a lot Brian! Maria Brian Whitman wrote: On Jan 15, 2008, at 3:47 PM, Maria Mosolova wrote: Hello, I am looking for the best way to get the number of documents in a Solr index. I'd like to do it from Java code using SolrJ. public int resultCount() { try { SolrQuery q

Re: best way to get number of documents in a Solr index

2008-01-15 Thread Ryan McKinley
Try a query with q=*:*; the 'numFound' value will count every document. Use &rows=0 to avoid returning docs (if you like). ryan Maria Mosolova wrote: Hello, I am looking for the best way to get the number of documents in a Solr index. I'd like to do it from Java code using SolrJ. Any suggestions
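
The q=*:* and rows=0 suggestion can be sketched without SolrJ, using only the JDK. This is an illustration under assumptions: the base URL, the class and method names, and the sample response string are hypothetical, and the regex extraction is a shortcut where a real client would use an XML parser or SolrJ's response objects.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Solr or SolrJ.
class DocCountSketch {

    private static final Pattern NUM_FOUND = Pattern.compile("numFound=\"(\\d+)\"");

    /** Builds a match-all select URL that asks for zero rows. */
    static String countUrl(String solrBase) {
        try {
            return solrBase + "/select?q=" + URLEncoder.encode("*:*", "UTF-8") + "&rows=0";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Pulls the numFound attribute out of an XML response body. */
    static long parseNumFound(String responseXml) {
        Matcher m = NUM_FOUND.matcher(responseXml);
        if (!m.find()) {
            throw new IllegalArgumentException("no numFound attribute in response");
        }
        return Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        // Assumed base URL of a local Solr instance.
        System.out.println(countUrl("http://localhost:8983/solr"));

        // Assumed, abbreviated response body for demonstration.
        String sample =
            "<response><result name=\"response\" numFound=\"42\" start=\"0\"/></response>";
        System.out.println(parseNumFound(sample));
    }
}
```

With rows=0 the response carries only the result header, so the count comes back without any document payload.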

Re: best way to get number of documents in a Solr index

2008-01-15 Thread Brian Whitman
On Jan 15, 2008, at 3:47 PM, Maria Mosolova wrote: Hello, I am looking for the best way to get the number of documents in a Solr index. I'd like to do it from Java code using SolrJ. public int resultCount() { try { SolrQuery q = new SolrQuery("*:*"); QueryResponse rq =

best way to get number of documents in a Solr index

2008-01-15 Thread Maria Mosolova
Hello, I am looking for the best way to get the number of documents in a Solr index. I'd like to do it from Java code using SolrJ. Any suggestions are welcome. Thank you in advance, Maria Mosolova

Re: Missing Content Stream

2008-01-15 Thread Brian Whitman
On Jan 15, 2008, at 1:50 PM, Ismail Siddiqui wrote: Hi Everyone, I am new to solr. I am trying to index xml using http post as follows Ismail, you seem to have a few spelling mistakes in your xml string. "fiehld, nadme" etc. (a) try fixing them, (b) try solrj instead, I agree w/ otis.

Re: Missing Content Stream

2008-01-15 Thread Otis Gospodnetic
Ismail, use Solrj instead, you'll be much happier. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Ismail Siddiqui <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Tuesday, January 15, 2008 1:50:25 PM Subject: Missing Content Stream Hi

Missing Content Stream

2008-01-15 Thread Ismail Siddiqui
Hi Everyone, I am new to solr. I am trying to index xml using http post as follows try{ String xmlText = ""; xmlText+=""; xmlText+="SOLR1000"; xmlText+="Solr, the Enterprise Search Server"; xmlText+="Apache Software Foundation"; xmlText+="software"; xmlText+="search

Re: highlighting marks wrong words

2008-01-15 Thread Alexey Shakov
Thank you! It works correctly with a filter query. Charlie Jackson wrote: I believe changing the "AND id: etc etc " part of the query to its own filter query will take care of your highlighting problem. In other words, try a query like this: q=(auto)&fq=id:(100 OR 1 OR 2 OR 3 OR 5 OR 6)&fl=s

Re: LNS - or - "now i know we've succeeded"

2008-01-15 Thread Otis Gospodnetic
I'm sure N stealth startups are doing this as we speak and reading this, rubbing hands :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Lance Norskog <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Monday, January 14, 2008 6:09:3

Re: XSLT to preprocess XML documents into 'update xml documents' ?

2008-01-15 Thread Ryan McKinley
I have not tried it, but check: https://issues.apache.org/jira/browse/SOLR-285 Karen Loughran wrote: Hi all, I noticed some recent discussion with regard to using XSLT to preprocess XML documents into 'update xml documents' : http://www.mail-archive.com/[EMAIL PROTECTED]/msg05927.ht

XSLT to preprocess XML documents into 'update xml documents' ?

2008-01-15 Thread Karen Loughran
Hi all, I noticed some recent discussion with regard to using XSLT to preprocess XML documents into 'update xml documents': http://www.mail-archive.com/[EMAIL PROTECTED]/msg05927.html I was wondering if there has been any update to this? It is something we would be interested in using. Th

Re: field:(-null) returns records where field was not specified

2008-01-15 Thread Karen Loughran
Thanks Chris, this is useful, we can use the query format you suggest, Karen On Tuesday 15 January 2008 01:13:14 Chris Hostetter wrote: > Several things in this thread should be clarified (note: order of > quotations munged for clarity)... > > : I had read this page. But I'm not using the "NOT"

RE: highlighting marks wrong words

2008-01-15 Thread Charlie Jackson
I believe changing the "AND id: etc etc " part of the query to its own filter query will take care of your highlighting problem. In other words, try a query like this: q=(auto)&fq=id:(100 OR 1 OR 2 OR 3 OR 5 OR 6)&fl=score&hl.fl=content&hl=true&hl.fragsize=200&hl.snippets=2&hl.simple.pre=%3Cb%3
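
The split suggested above, with the search term in q and the id restriction in fq, can be sketched as a small URL builder. This is a minimal illustration using only the JDK; the class and method names are hypothetical, and only a subset of the parameters from the thread is included.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Solr or SolrJ.
class FilterQuerySketch {

    /** URL-encodes a single parameter value as UTF-8. */
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Keeps the highlighted terms in q and moves the restriction into fq. */
    static String queryString(String mainQuery, String filterQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", mainQuery);       // terms that should be highlighted
        params.put("fq", filterQuery);    // restriction that should not be highlighted
        params.put("hl", "true");
        params.put("hl.fl", "content");
        params.put("hl.simple.pre", "<b>");
        params.put("hl.simple.post", "</b>");

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(queryString("auto", "id:(100 OR 1 OR 2 OR 3 OR 5 OR 6)"));
    }
}
```

Encoding each value separately keeps the parentheses, colons, and spaces of the id clause from being misread as URL syntax.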

highlighting marks wrong words

2008-01-15 Thread Alexey Shakov
Hi all, I have a query like this: q=(auto) AND id:(100 OR 1 OR 2 OR 3 OR 5 OR 6)&fl=score&hl.fl=content&hl=true&hl.fragsize=200&hl.snippets=2&hl.simple.pre=%3Cb%3E&hl.simple.post=%3C%2Fb%3E&start=0&rows=10 The default field is content. So I expect that only occurrences of "auto" will be marke

FunctionQuery in a custom request handler

2008-01-15 Thread evol__
I'm trying to pull off a "time bias", "article freshness" thing: boosting recent documents based on a "published_date" field. The reasonable way to do this seems to be a FunctionQuery. But all the examples I find are for expressing this through the query parser; I'd need to do this inside my cust

wildcards and German umlauts

2008-01-15 Thread Alexey Shakov
Hi all, Index searching works if I type a complete word (such as "übersicht"). But there are no hits if I use wildcards (such as "über*"). Searching with wildcards and without umlauts works as well. Can someone help me? Thanks in advance! Here is my field definition: