Re: Solrj Javabin and JSON
There is no point converting javabin to JSON. Javabin is an intermediate format; it is converted to Java objects as soon as it arrives. You just need a means to convert those Java objects to JSON.

On Sat, Oct 24, 2009 at 12:10 PM, SGE0 wrote:
>
> Hi,
>
> did anyone write a Javabin to JSON converter and is willing to share this?
>
> In our servlet we use a CommonsHttpSolrServer instance to execute a query.
>
> The problem is that it returns Javabin format and we need to send the result
> back to the browser using JSON format.
>
> And no, the browser is not allowed to directly query Lucene with the wt=json
> format.
>
> Regards,
>
> S.

--
Noble Paul | Principal Engineer | AOL | http://aol.com
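A minimal sketch of such a converter, assuming a JSON library such as org.json is on the classpath (the library choice and class name are illustrative, not part of SolrJ):

import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;

public class JsonConverter {

    // Turn the SolrJ result list (already decoded from javabin) into a JSON string.
    public static String toJson(SolrDocumentList docs) throws JSONException {
        JSONArray out = new JSONArray();
        for (SolrDocument doc : docs) {
            JSONObject obj = new JSONObject();
            for (String field : doc.getFieldNames()) {
                // multi-valued fields come back as Collections and serialize as JSON arrays
                obj.put(field, doc.getFieldValue(field));
            }
            out.put(obj);
        }
        return out.toString();
    }
}

From the servlet, response.getResults() on the QueryResponse can be fed straight in and the returned string written to the HttpServletResponse.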
Re: Where the new replication pulls the files?
On Fri, Oct 23, 2009 at 11:46 PM, Jérôme Etévé wrote:
> Hi all,
> I'm wondering where a slave pulls the files from the master on replication.
>
> Is it directly to the index/ directory or is it somewhere else before
> it's completed and gets copied to index?

It is copied to a temp dir until all the files are downloaded.

> Cheers!
>
> Jerome.
>
> --
> Jerome Eteve.
> http://www.eteve.net
> jer...@eteve.net

--
Noble Paul | Principal Engineer | AOL | http://aol.com
Re: Solrj client API and response in XML format (Solr 1.4)
hi
You don't see the point. You really don't need to use SolrJ. All you need to do is make an HTTP request with wt=json, read the output into a buffer, and send it to your client.
--Noble

On Fri, Oct 23, 2009 at 9:40 PM, SGE0 wrote:
>
> Hi All,
>
> After a day of searching I'm quite confused.
>
> I use the solrj client as follows:
>
> CommonsHttpSolrServer solr = new
>     CommonsHttpSolrServer("http://127.0.0.1:8080/apache-solr-1.4-dev/test");
> solr.setRequestWriter(new BinaryRequestWriter());
>
> ModifiableSolrParams params = new ModifiableSolrParams();
> params.set("qt", "dismax");
> params.set("indent", "on");
> params.set("version", "2.2");
> params.set("q", "test");
> params.set("start", "0");
> params.set("rows", "10");
> params.set("wt", "xml");
> params.set("hl", "on");
> QueryResponse response = solr.query(params);
>
> How can I get the query result (response) out in XML format?
>
> I know it sounds stupid but I can't seem to manage that.
>
> What do I need to do with the response object to get the response in XML
> format?
>
> I already understood I can't get the result in JSON, so my idea was to go
> from XML to JSON.
>
> Thx for your answer already!
>
> S.
>
> System.out.println("response = " + response);
> SolrDocumentList sdl = response.getResults();

--
Noble Paul | Principal Engineer | AOL | http://aol.com
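A minimal sketch of that approach (the Solr URL is adapted from the quoted code; the /select path and query are illustrative):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLEncoder;

public class JsonFetch {
    public static void main(String[] args) throws Exception {
        // Ask Solr for JSON directly; no SolrJ and no XML->JSON conversion needed.
        String q = URLEncoder.encode("test", "UTF-8");
        URL url = new URL("http://127.0.0.1:8080/apache-solr-1.4-dev/test/select?q=" + q + "&wt=json");
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));
        StringBuilder json = new StringBuilder();
        for (String line; (line = in.readLine()) != null; ) {
            json.append(line).append('\n');
        }
        in.close();
        System.out.println(json); // exactly what Solr would have sent to a browser
    }
}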
Re: Solrj Javabin and JSON
Hi Paul,

fair enough. Is this included in the Solrj package? Any examples of how to do this?

Stefan

Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> There is no point converting javabin to JSON. Javabin is an
> intermediate format; it is converted to Java objects as soon as
> it arrives. You just need a means to convert those Java objects to JSON.
>
> On Sat, Oct 24, 2009 at 12:10 PM, SGE0 wrote:
>>
>> Hi,
>>
>> did anyone write a Javabin to JSON converter and is willing to share this?
>>
>> In our servlet we use a CommonsHttpSolrServer instance to execute a query.
>>
>> The problem is that it returns Javabin format and we need to send the
>> result back to the browser using JSON format.
>>
>> And no, the browser is not allowed to directly query Lucene with the
>> wt=json format.
>>
>> Regards,
>>
>> S.
>
> --
> Noble Paul | Principal Engineer | AOL | http://aol.com
Re: Solrj client API and response in XML format (Solr 1.4)
Hi Paul,

thx again.

Can I use this technique from within a servlet?

Do I need an instance of the HttpClient to do that? I noticed I can instantiate the CommonsHttpSolrServer with an HttpClient client. I did not find any relevant examples of how to use this.

If you can help me out with this, much appreciated.

Stefan

Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> hi
> You don't see the point. You really don't need to use SolrJ. All
> you need to do is make an HTTP request with wt=json, read the output
> into a buffer, and send it to your client.
> --Noble
>
> On Fri, Oct 23, 2009 at 9:40 PM, SGE0 wrote:
>>
>> Hi All,
>>
>> After a day of searching I'm quite confused.
>>
>> I use the solrj client as follows:
>>
>> CommonsHttpSolrServer solr = new
>>     CommonsHttpSolrServer("http://127.0.0.1:8080/apache-solr-1.4-dev/test");
>> solr.setRequestWriter(new BinaryRequestWriter());
>>
>> ModifiableSolrParams params = new ModifiableSolrParams();
>> params.set("qt", "dismax");
>> params.set("indent", "on");
>> params.set("version", "2.2");
>> params.set("q", "test");
>> params.set("start", "0");
>> params.set("rows", "10");
>> params.set("wt", "xml");
>> params.set("hl", "on");
>> QueryResponse response = solr.query(params);
>>
>> How can I get the query result (response) out in XML format?
>>
>> I know it sounds stupid but I can't seem to manage that.
>>
>> What do I need to do with the response object to get the response in XML
>> format?
>>
>> I already understood I can't get the result in JSON, so my idea was to go
>> from XML to JSON.
>>
>> Thx for your answer already!
>>
>> S.
>>
>> System.out.println("response = " + response);
>> SolrDocumentList sdl = response.getResults();
>
> --
> Noble Paul | Principal Engineer | AOL | http://aol.com
Date Facet Giving Count more than actual
hi guys,

I am indexing events in Solr, where every Event contains a startDate and an endDate. On the search page, I would like to have a date facet where users can quickly browse through the dates they are interested in.

I have a field daysForFilter in each document which stores timestamps from today till endDate as yyyy-MM-ddT00:00:01Z. The reason I have kept the 01 seconds is to avoid overlap between two dates when calculating facets. My application works in the IST time zone, thus the date 2009-10-24 00:00:00 is stored in Solr as 2009-10-23 18:30:00.

I am using date faceting on this field, and the date facet query is something like this:

q=&facet=true&facet.date=daysForFilter&facet.date.start=2009-10-23T18:30:01Z&facet.date.gap=%2B1DAY&facet.date.end=2009-10-28T18:30:01Z

Ideally I should get correct date facets with the count of events occurring on each date. But for some dates I get a count higher than what actually exists in the result. For example, I get 18 documents in total for my query, and the facet count for date 2009-10-23T18:30:01Z is 11, whereas there are only 5 documents containing this field value. I have verified this in the result. Also, when I query for daysForFilter:2009-10-23T18:30:01Z, it gives me 5 results.

I am really at a loss with this problem and do not understand why it is generating such wrong facets. It would be great if anyone can guide me further.

regards,
aakash
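For reference, a minimal SolrJ sketch of the same date-facet request (the Solr URL is illustrative; the field name and dates are taken from the post above), useful for comparing the facet buckets against direct queries:

import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.ModifiableSolrParams;

public class DateFacetCheck {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        ModifiableSolrParams p = new ModifiableSolrParams();
        p.set("q", "*:*"); // the post uses an empty q; *:* keeps the sketch self-contained
        p.set("facet", "true");
        p.set("facet.date", "daysForFilter");
        p.set("facet.date.start", "2009-10-23T18:30:01Z");
        p.set("facet.date.end", "2009-10-28T18:30:01Z");
        p.set("facet.date.gap", "+1DAY");
        QueryResponse rsp = solr.query(p);
        // Raw facet block; compare each bucket count against a direct
        // q=daysForFilter:<date> query to see where the counts diverge.
        System.out.println(rsp.getResponse().get("facet_counts"));
    }
}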
Solr under tomcat - UTF-8 issue
Hoping someone can help -

Problem:
Querying for non-English phrases such as Добавить does not return any results under Tomcat but does work when using the Jetty example.

Both Tomcat and Jetty are being queried by the same custom (Flash) client and both reference the same solr/data/index.

I'm using an HTTP POST rather than an HTTP GET to do the query to Solr. I believe the problem must be in how Tomcat is configured and had hoped that -Dfile.encoding=UTF-8 would solve it - but no luck. I've stopped and started Tomcat and deleted the work directory as well.

Results are the same in both IE6 and Firefox, and I've used both Firebug and Fiddler to view the HTTP requests/responses. It is consistent - Jetty works, Tomcat does not.

Environment:
Tomcat 6 as a service on WinXP Professional 2002 SP2
Tomcat Java properties -

-Dcatalina.home=C:\Program Files\Apache Software Foundation\Tomcat 6.0
-Dcatalina.base=C:\Program Files\Apache Software Foundation\Tomcat 6.0
-Djava.endorsed.dirs=C:\Program Files\Apache Software Foundation\Tomcat 6.0\endorsed
-Djava.io.tmpdir=C:\Program Files\Apache Software Foundation\Tomcat 6.0\temp
-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
-Djava.util.logging.config.file=C:\Program Files\Apache Software Foundation\Tomcat 6.0\conf\logging.properties
-Dfile.encoding=UTF-8

Thanks in advance.
Tom Glock
Re: Solr under tomcat - UTF-8 issue
Hello

Have you set the URIEncoding attribute to UTF-8 in Tomcat's server.xml (on the connector element)? Like:

<Connector port="8080" URIEncoding="UTF-8" protocol="HTTP/1.1" redirectPort="8443"/>

Hope this helps.

Best regards

czinkos

2009/10/24 Glock, Thomas :
>
> Hoping someone can help -
>
> Problem:
> Querying for non-English phrases such as Добавить does not return any
> results under Tomcat but does work when using the Jetty example.
>
> Both Tomcat and Jetty are being queried by the same custom (Flash)
> client and both reference the same solr/data/index.
>
> I'm using an HTTP POST rather than an HTTP GET to do the query to Solr.
> I believe the problem must be in how Tomcat is configured and had hoped the
> -Dfile.encoding=UTF-8 would solve it - but no luck. I've stopped and started
> Tomcat and deleted the work directory as well.
>
> Results are the same in both IE6 and Firefox, and I've used both
> Firebug and Fiddler to view the HTTP requests/responses. It is consistent -
> Jetty works, Tomcat does not.
>
> Environment:
> Tomcat 6 as a service on WinXP Professional 2002 SP2
> Tomcat Java properties -
>
> -Dcatalina.home=C:\Program Files\Apache Software Foundation\Tomcat 6.0
> -Dcatalina.base=C:\Program Files\Apache Software Foundation\Tomcat 6.0
> -Djava.endorsed.dirs=C:\Program Files\Apache Software Foundation\Tomcat 6.0\endorsed
> -Djava.io.tmpdir=C:\Program Files\Apache Software Foundation\Tomcat 6.0\temp
> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
> -Djava.util.logging.config.file=C:\Program Files\Apache Software Foundation\Tomcat 6.0\conf\logging.properties
> -Dfile.encoding=UTF-8
>
> Thanks in advance.
> Tom Glock
RE: Solr under tomcat - UTF-8 issue
Thanks, but it's not working...

I did have the URIEncoding in place and just again moved the URIEncoding attribute to be the first attribute - ensured I saved server.xml, shut down Tomcat, deleted logs and cache, and still no luck. It's probably something very simple and I'm just missing it.

Thanks for your help.

-----Original Message-----
From: Zsolt Czinkos [mailto:czin...@gmail.com]
Sent: Saturday, October 24, 2009 11:36 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr under tomcat - UTF-8 issue

Hello

Have you set the URIEncoding attribute to UTF-8 in Tomcat's server.xml (on the connector element)? Like:

<Connector port="8080" URIEncoding="UTF-8" protocol="HTTP/1.1" redirectPort="8443"/>

Hope this helps.

Best regards

czinkos

2009/10/24 Glock, Thomas :
>
> Hoping someone can help -
>
> Problem:
> Querying for non-English phrases such as Добавить does not return any
> results under Tomcat but does work when using the Jetty example.
>
> Both Tomcat and Jetty are being queried by the same custom (Flash)
> client and both reference the same solr/data/index.
>
> I'm using an HTTP POST rather than an HTTP GET to do the query to Solr.
> I believe the problem must be in how Tomcat is configured and had hoped the
> -Dfile.encoding=UTF-8 would solve it - but no luck. I've stopped and started
> Tomcat and deleted the work directory as well.
>
> Results are the same in both IE6 and Firefox, and I've used both
> Firebug and Fiddler to view the HTTP requests/responses. It is consistent -
> Jetty works, Tomcat does not.
>
> Environment:
> Tomcat 6 as a service on WinXP Professional 2002 SP2
> Tomcat Java properties -
>
> -Dcatalina.home=C:\Program Files\Apache Software Foundation\Tomcat 6.0
> -Dcatalina.base=C:\Program Files\Apache Software Foundation\Tomcat 6.0
> -Djava.endorsed.dirs=C:\Program Files\Apache Software Foundation\Tomcat 6.0\endorsed
> -Djava.io.tmpdir=C:\Program Files\Apache Software Foundation\Tomcat 6.0\temp
> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
> -Djava.util.logging.config.file=C:\Program Files\Apache Software Foundation\Tomcat 6.0\conf\logging.properties
> -Dfile.encoding=UTF-8
>
> Thanks in advance.
> Tom Glock
RE: Too many open files
I had an extremely specific use case: about a 5000 documents-per-second update rate (small documents), where some documents can be repeatedly sent to SOLR with a different timestamp field (and the same unique document ID). Nothing breaks, just a great performance gain which was impossible with the 32MB buffer (it caused constant index merging, and 5 times more CPU on merging than on index updates).

Nothing breaks... with mergeFactor=10 I don't have ANY merge during 24 hours; segments are large (a few of 4GB-8GB, and one large "union"); I merge explicitly only at night, when I issue a "commit".

Of course, it depends on the use case; for applications such as a "Content Management System" we don't need a high ramBufferSizeMB (a few updates a day sent to SOLR)...

> -----Original Message-----
> From: Mark Miller [mailto:markrmil...@gmail.com]
> Sent: October-23-09 5:28 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Too many open files
>
> 8 GB is much larger than is well supported. It's diminishing returns over
> 40-100 and mostly a waste of RAM. Too high and things can break. It
> should be well below 2 GB at most, but I'd still recommend 40-100.
>
> Fuad Efendi wrote:
> > The reason for having a big RAM buffer is lowering the frequency of IndexWriter
> > flushes and (subsequently) lowering the frequency of index merge events, and
> > (subsequently) merging of a few larger files takes less time... especially
> > if the RAM buffer is intelligent enough (and big enough) to deal with 100
> > concurrent updates of an existing document without flushing 100 document
> > versions to disk.
> >
> > I posted a related thread here; I had 1:5 timing for update:merge (5 minutes
> > merge, and 1 minute update) with default SOLR settings (32MB buffer). I
> > increased the buffer to 8GB on the master, and it triggered a significant
> > indexing performance boost...
> >
> > -Fuad
> > http://www.linkedin.com/in/liferay
> >
> >> -----Original Message-----
> >> From: Mark Miller [mailto:markrmil...@gmail.com]
> >> Sent: October-23-09 3:03 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Too many open files
> >>
> >> I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number.
> >>
> >> Fuad Efendi wrote:
> >>> I was partially wrong; this is what Mike McCandless (Lucene in Action,
> >>> 2nd edition) explained at the Manning forum:
> >>>
> >>> mergeFactor of 1000 means you will have up to 1000 segments at each level.
> >>> A level 0 segment means it was flushed directly by IndexWriter.
> >>> After you have 1000 such segments, they are merged into a single level 1
> >>> segment. Once you have 1000 level 1 segments, they are merged into a
> >>> single level 2 segment, etc.
> >>> So, depending on how many docs you add to your index, you could have
> >>> 1000s of segments w/ mergeFactor=1000.
> >>>
> >>> http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0
> >>>
> >>> So, in case of mergeFactor=100 you may have (theoretically) 1000 segments,
> >>> 10-20 files each (depending on schema)...
> >>>
> >>> mergeFactor=10 is the default setting... ramBufferSizeMB=1024 means that you
> >>> need at least double the Java heap, but you have -Xmx1024m...
> >>>
> >>> -Fuad
> >>>
> >>>> I am getting the "too many open files" error.
> >>>>
> >>>> Usually I test on a server that has 4GB RAM and assigned 1GB for
> >>>> tomcat (set JAVA_OPTS=-Xms256m -Xmx1024m); ulimit -n is 256 for this
> >>>> server, and it has the following settings in solrconfig.xml:
> >>>>
> >>>> <useCompoundFile>true</useCompoundFile>
> >>>> <ramBufferSizeMB>1024</ramBufferSizeMB>
> >>>> <mergeFactor>100</mergeFactor>
> >>>> <maxMergeDocs>2147483647</maxMergeDocs>
> >>>> 1
> >>
> >> --
> >> - Mark
> >>
> >> http://www.lucidimagination.com
>
> --
> - Mark
>
> http://www.lucidimagination.com
Re: Solrj client API and response in XML format (Solr 1.4)
No need to use HttpClient. Use java.net.URL#openConnection(url), read the InputStream into a buffer, and that is it.

On Sat, Oct 24, 2009 at 1:53 PM, SGE0 wrote:
>
> Hi Paul,
>
> thx again.
>
> Can I use this technique from within a servlet?
>
> Do I need an instance of the HttpClient to do that?
>
> I noticed I can instantiate the CommonsHttpSolrServer with an HttpClient
> client. I did not find any relevant examples of how to use this.
>
> If you can help me out with this, much appreciated.
>
> Stefan
>
> Noble Paul നോബിള് नोब्ळ्-2 wrote:
>>
>> hi
>> You don't see the point. You really don't need to use SolrJ. All
>> you need to do is make an HTTP request with wt=json, read the output
>> into a buffer, and send it to your client.
>> --Noble
>>
>> On Fri, Oct 23, 2009 at 9:40 PM, SGE0 wrote:
>>>
>>> Hi All,
>>>
>>> After a day of searching I'm quite confused.
>>>
>>> I use the solrj client as follows:
>>>
>>> CommonsHttpSolrServer solr = new
>>>     CommonsHttpSolrServer("http://127.0.0.1:8080/apache-solr-1.4-dev/test");
>>> solr.setRequestWriter(new BinaryRequestWriter());
>>>
>>> ModifiableSolrParams params = new ModifiableSolrParams();
>>> params.set("qt", "dismax");
>>> params.set("indent", "on");
>>> params.set("version", "2.2");
>>> params.set("q", "test");
>>> params.set("start", "0");
>>> params.set("rows", "10");
>>> params.set("wt", "xml");
>>> params.set("hl", "on");
>>> QueryResponse response = solr.query(params);
>>>
>>> How can I get the query result (response) out in XML format?
>>>
>>> I know it sounds stupid but I can't seem to manage that.
>>>
>>> What do I need to do with the response object to get the response in XML
>>> format?
>>>
>>> I already understood I can't get the result in JSON, so my idea was to go
>>> from XML to JSON.
>>>
>>> Thx for your answer already!
>>>
>>> S.
>>>
>>> System.out.println("response = " + response);
>>> SolrDocumentList sdl = response.getResults();
>>
>> --
>> Noble Paul | Principal Engineer | AOL | http://aol.com

--
Noble Paul | Principal Engineer | AOL | http://aol.com
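A minimal sketch of that inside a servlet (the class name and Solr URL are illustrative; error handling is omitted):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLEncoder;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class SearchProxyServlet extends HttpServlet {

    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String q = URLEncoder.encode(req.getParameter("q"), "UTF-8");
        URL solr = new URL("http://127.0.0.1:8080/apache-solr-1.4-dev/test/select?wt=json&q=" + q);
        resp.setContentType("application/json; charset=UTF-8");
        InputStream in = solr.openConnection().getInputStream();
        OutputStream out = resp.getOutputStream();
        byte[] buf = new byte[4096];
        // Stream Solr's JSON straight through to the browser.
        for (int n; (n = in.read(buf)) != -1; ) {
            out.write(buf, 0, n);
        }
        in.close();
    }
}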
RE: Too many open files
Thanks for pointing to it, but it is so obvious:

1. The "buffer" is used as RAM storage for index updates.
2. An "int" has 2 x 2G different values (2^32).
3. We can have _up_to_ 2GB of _Documents_ (stored as key->value pairs, inverted index).

In case of the 5 fields which I have, I need 5 arrays (up to 2GB in size each) to store inverted pointers, so that there is no theoretical limit:

> Also, from the javadoc in IndexWriter:
>
>    * NOTE: because IndexWriter uses
>    * ints when managing its internal storage,
>    * the absolute maximum value for this setting is somewhat
>    * less than 2048 MB. The precise limit depends on
>    * various factors, such as how large your documents are,
>    * how many fields have norms, etc., so it's best to set
>    * this value comfortably under 2048.

Note also, I use norms etc...
RE: Too many open files
Mark,

I don't understand this; of course it is use case specific, but I haven't seen any terrible behaviour with 8GB... 32MB is extremely small for Nutch-SOLR-like applications, but it is acceptable for Liferay-SOLR...

Please note also, I have some documents with the same IDs updated many thousands of times a day, and I believe (I hope) IndexWriter flushes an "optimized" segment instead of thousands of "deletes" and a single "insert" across many small (32MB) files (especially with SOLR)...

> Hmm - came out worse than it looked. Here is a better attempt:
>
> MergeFactor: 10
>
> BUF     DOCS/S
> 32      37.40
> 80      39.91
> 120     40.74
> 512     38.25
>
> Mark Miller wrote:
> > Here is an example using the Lucene benchmark package. Indexing 64,000
> > wikipedia docs (sorry for the formatting):
> >
> > [java] Report sum by Prefix (MAddDocs) and Round (4 about 32 out of 256058)
> > [java] Operation      round  mrg  flush     runCnt  recsPerRun  rec/s  elapsedSec   avgUsedMem  avgTotalMem
> > [java] MAddDocs_8000      0   10   32.00MB       8        8000  37.40    1,711.22  124,612,472  182,689,792
> > [java] MAddDocs_8000 -    1   10   80.00MB -     8 -       8000 - 39.91 - 1,603.76 - 266,716,128 - 469,925,888
> > [java] MAddDocs_8000      2   10  120.00MB       8        8000  40.74    1,571.02  348,059,488  548,233,216
> > [java] MAddDocs_8000 -    3   10  512.00MB -     8 -       8000 - 38.25 - 1,673.05 - 746,087,808 - 926,089,216
> >
> > After about 32-40, you don't gain much, and it starts decreasing once
> > you get too high. 8GB is a terrible recommendation.
RE: Too many open files
This JavaDoc is incorrect, especially for SOLR, when you store a raw (non-tokenized, non-indexed) "text" value with a document (which almost everyone does). Try to store 1,000,000 documents with a 1000-byte non-tokenized field: you will need 1GB just for this array.

> -----Original Message-----
> From: Fuad Efendi [mailto:f...@efendi.ca]
> Sent: October-24-09 12:10 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Too many open files
>
> Thanks for pointing to it, but it is so obvious:
>
> 1. The "buffer" is used as RAM storage for index updates.
> 2. An "int" has 2 x 2G different values (2^32).
> 3. We can have _up_to_ 2GB of _Documents_ (stored as key->value pairs,
> inverted index).
>
> In case of the 5 fields which I have, I need 5 arrays (up to 2GB in size
> each) to store inverted pointers, so that there is no theoretical limit:
>
> > Also, from the javadoc in IndexWriter:
> >
> >    * NOTE: because IndexWriter uses
> >    * ints when managing its internal storage,
> >    * the absolute maximum value for this setting is somewhat
> >    * less than 2048 MB. The precise limit depends on
> >    * various factors, such as how large your documents are,
> >    * how many fields have norms, etc., so it's best to set
> >    * this value comfortably under 2048.
>
> Note also, I use norms etc...
Re: Too many open files
On Sat, Oct 24, 2009 at 12:18 PM, Fuad Efendi wrote:
>
> Mark, I don't understand this; of course it is use case specific, but I haven't
> seen any terrible behaviour with 8GB

If you had gone over 2GB of actual buffer *usage*, it would have broken... Guaranteed. We've now added a check in Lucene 2.9.1 that will throw an exception if you try to go over 2048MB. And as the javadoc says, to be on the safe side, you probably shouldn't go too near 2048 - perhaps 2000MB is a good practical limit.

-Yonik
http://www.lucidimagination.com
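For reference, a minimal Lucene 2.9 sketch of where that setting lives; the commented-out line shows the kind of value the new 2.9.1 check rejects (the demo class and the RAMDirectory are illustrative):

import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class RamBufferDemo {
    public static void main(String[] args) throws IOException {
        IndexWriter w = new IndexWriter(new RAMDirectory(),
                new StandardAnalyzer(Version.LUCENE_29),
                IndexWriter.MaxFieldLength.UNLIMITED);
        w.setRAMBufferSizeMB(100);    // within the 32-100 range recommended earlier in the thread
        // w.setRAMBufferSizeMB(4096); // over the limit; rejected by the 2.9.1 check
        w.close();
    }
}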
Re: Solr under tomcat - UTF-8 issue
Try using example/exampledocs/test_utf8.sh to narrow down whether the charset problems you're hitting are due to servlet container configuration.

-Yonik
http://www.lucidimagination.com

2009/10/24 Glock, Thomas :
>
> Thanks, but it's not working...
>
> I did have the URIEncoding in place and just again moved the URIEncoding
> attribute to be the first attribute - ensured I saved server.xml, shut down
> Tomcat, deleted logs and cache, and still no luck. It's probably something
> very simple and I'm just missing it.
>
> Thanks for your help.
>
> -----Original Message-----
> From: Zsolt Czinkos [mailto:czin...@gmail.com]
> Sent: Saturday, October 24, 2009 11:36 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr under tomcat - UTF-8 issue
>
> Hello
>
> Have you set the URIEncoding attribute to UTF-8 in Tomcat's server.xml (on
> the connector element)? Like:
>
> <Connector port="8080" URIEncoding="UTF-8" protocol="HTTP/1.1" redirectPort="8443"/>
>
> Hope this helps.
>
> Best regards
>
> czinkos
>
> 2009/10/24 Glock, Thomas :
>>
>> Hoping someone can help -
>>
>> Problem:
>> Querying for non-English phrases such as Добавить does not return any
>> results under Tomcat but does work when using the Jetty example.
>>
>> Both Tomcat and Jetty are being queried by the same custom (Flash)
>> client and both reference the same solr/data/index.
>>
>> I'm using an HTTP POST rather than an HTTP GET to do the query to Solr.
>> I believe the problem must be in how Tomcat is configured and had hoped the
>> -Dfile.encoding=UTF-8 would solve it - but no luck. I've stopped and started
>> Tomcat and deleted the work directory as well.
>>
>> Results are the same in both IE6 and Firefox, and I've used both
>> Firebug and Fiddler to view the HTTP requests/responses. It is consistent -
>> Jetty works, Tomcat does not.
>>
>> Environment:
>> Tomcat 6 as a service on WinXP Professional 2002 SP2
>> Tomcat Java properties -
>>
>> -Dcatalina.home=C:\Program Files\Apache Software Foundation\Tomcat 6.0
>> -Dcatalina.base=C:\Program Files\Apache Software Foundation\Tomcat 6.0
>> -Djava.endorsed.dirs=C:\Program Files\Apache Software Foundation\Tomcat 6.0\endorsed
>> -Djava.io.tmpdir=C:\Program Files\Apache Software Foundation\Tomcat 6.0\temp
>> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
>> -Djava.util.logging.config.file=C:\Program Files\Apache Software Foundation\Tomcat 6.0\conf\logging.properties
>> -Dfile.encoding=UTF-8
>>
>> Thanks in advance.
>> Tom Glock
Re: Too many open files
On Sat, Oct 24, 2009 at 12:25 PM, Fuad Efendi wrote:
> This JavaDoc is incorrect, especially for SOLR,

It looks correct to me... if you think it can be clarified, please propose how you would change it.

> when you store a raw (non-tokenized, non-indexed) "text" value with a
> document (which almost everyone does). Try to store 1,000,000 documents
> with a 1000-byte non-tokenized field: you will need 1GB just for this array.

Nope. You shouldn't even need 1GB of buffer space for that. The size specified is for all the things that the indexing process needs to temporarily keep in memory... stored fields are normally written to disk immediately.

-Yonik
http://www.lucidimagination.com
RE: Solr under tomcat - UTF-8 issue
Thanks - I now think it must be due to my client not sending enough (or correct) headers in the request.

Tomcat does work when using an HTTP GET but is failing the POST from my Flash client.

For example, putting this in both the Firefox and IE browser URL bars works correctly:

http://localhost:8080/hranswers/elevate?fl=*%20score&indent=on&start=0&q=%D0%94%D0%BE%D0%B1%D0%B0%D0%B2%D0%B8%D1%82%D1%8C%20%D0%BD%D0%BE%D0%B2%D1%8B%D1%85%20%D0%BA%D0%B0%D0%BD%D0%B4%D0%B8%D0%B4%D0%B0%D1%82%D0%BE%D0%B2&fq=language_cd:ru&rows=20

The POST information my client is sending looks like this, and it fails:

POST /hranswers/elevate HTTP/1.1
Accept: */*
Accept-Language: en-US
x-flash-version: 10,0,32,18
Content-Type: application/x-www-form-urlencoded
Content-Encoding: UTF-8
Content-Length: 209
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; InfoPath.1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; MS-RTC LM 8; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; UserABC123)
Host: localhost:8080
Connection: Keep-Alive
Pragma: no-cache

fq=language%5Fcd%3Aru&rows=20&start=0&fl=%2A%20score&indent=on&q=%D0%94%D0%BE%D0%B1%D0%B0%D0%B2%D0%B8%D1%82%D1%8C%20%D0%BD%D0%BE%D0%B2%D1%8B%D1%85%20%D0%BA%D0%B0%D0%BD%D0%B4%D0%B8%D0%B4%D0%B0%D1%82%D0%BE%D0%B2

I will keep digging - and let you know how it turns out.

Thanks!

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Saturday, October 24, 2009 12:43 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr under tomcat - UTF-8 issue

Try using example/exampledocs/test_utf8.sh to narrow down whether the charset problems you're hitting are due to servlet container configuration.

-Yonik
http://www.lucidimagination.com

2009/10/24 Glock, Thomas :
>
> Thanks, but it's not working...
>
> I did have the URIEncoding in place and just again moved the URIEncoding
> attribute to be the first attribute - ensured I saved server.xml, shut down
> Tomcat, deleted logs and cache, and still no luck. It's probably something
> very simple and I'm just missing it.
>
> Thanks for your help.
>
> -----Original Message-----
> From: Zsolt Czinkos [mailto:czin...@gmail.com]
> Sent: Saturday, October 24, 2009 11:36 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr under tomcat - UTF-8 issue
>
> Hello
>
> Have you set the URIEncoding attribute to UTF-8 in Tomcat's server.xml (on
> the connector element)? Like:
>
> <Connector port="8080" URIEncoding="UTF-8" protocol="HTTP/1.1" redirectPort="8443"/>
>
> Hope this helps.
>
> Best regards
>
> czinkos
>
> 2009/10/24 Glock, Thomas :
>>
>> Hoping someone can help -
>>
>> Problem:
>> Querying for non-English phrases such as Добавить does not return any
>> results under Tomcat but does work when using the Jetty example.
>>
>> Both Tomcat and Jetty are being queried by the same custom (Flash)
>> client and both reference the same solr/data/index.
>>
>> I'm using an HTTP POST rather than an HTTP GET to do the query to Solr.
>> I believe the problem must be in how Tomcat is configured and had hoped the
>> -Dfile.encoding=UTF-8 would solve it - but no luck. I've stopped and started
>> Tomcat and deleted the work directory as well.
>>
>> Results are the same in both IE6 and Firefox, and I've used both
>> Firebug and Fiddler to view the HTTP requests/responses. It is consistent -
>> Jetty works, Tomcat does not.
>>
>> Environment:
>> Tomcat 6 as a service on WinXP Professional 2002 SP2
>> Tomcat Java properties -
>>
>> -Dcatalina.home=C:\Program Files\Apache Software Foundation\Tomcat 6.0
>> -Dcatalina.base=C:\Program Files\Apache Software Foundation\Tomcat 6.0
>> -Djava.endorsed.dirs=C:\Program Files\Apache Software Foundation\Tomcat 6.0\endorsed
>> -Djava.io.tmpdir=C:\Program Files\Apache Software Foundation\Tomcat 6.0\temp
>> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
>> -Djava.util.logging.config.file=C:\Program Files\Apache Software Foundation\Tomcat 6.0\conf\logging.properties
>> -Dfile.encoding=UTF-8
>>
>> Thanks in advance.
>> Tom Glock
RE: Too many open files
Hi Yonik,

I am still using pre-2.9 Lucene (taken from the SOLR trunk two months ago).

2048 is the limit for documents, not for the array of pointers to documents. And especially with the new "uninverted" SOLR features, plus non-tokenized stored fields, we need 1GB just to store 1M values of a simple field (field size: 1000 bytes).

Maybe it would break... frankly, I started with 8GB, then for some reason I set it to 2GB (a month ago), I don't remember why... I had hardware problems and I didn't want frequent loss of the RAM buffer...

But again: why would it break? Because "int" has 2048M different values?!! This is extremely strange. My understanding is that the "buffer" stores processed data such as "term -> document_id" values in per-field arrays(!), so that 2048M is the _absolute_maximum_ in the case where your SOLR schema consists of a _single_tokenized_field_only_. What about 10 fields? What about plain text stored with the document, term vectors, "uninverted" values??? What are the reasons for putting such a check in Lucene? Array overflow?

-Fuad
http://www.linkedin.com/in/liferay

> -----Original Message-----
> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
> Sent: October-24-09 12:27 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Too many open files
>
> On Sat, Oct 24, 2009 at 12:18 PM, Fuad Efendi wrote:
> >
> > Mark, I don't understand this; of course it is use case specific, but I haven't
> > seen any terrible behaviour with 8GB
>
> If you had gone over 2GB of actual buffer *usage*, it would have
> broken... Guaranteed.
> We've now added a check in Lucene 2.9.1 that will throw an exception
> if you try to go over 2048MB.
> And as the javadoc says, to be on the safe side, you probably
> shouldn't go too near 2048 - perhaps 2000MB is a good practical limit.
>
> -Yonik
> http://www.lucidimagination.com
RE: Too many open files
> > when you store a raw (non-tokenized, non-indexed) "text" value with a
> > document (which almost everyone does). Try to store 1,000,000 documents
> > with a 1000-byte non-tokenized field: you will need 1GB just for this array.
>
> Nope. You shouldn't even need 1GB of buffer space for that.
> The size specified is for all the things that the indexing process needs
> to temporarily keep in memory... stored fields are normally
> written to disk immediately.
>
> -Yonik
> http://www.lucidimagination.com

Ok, thanks for the clarification!

What about term vectors? What about a non-trivial schema having 10 tokenized fields? The buffer will need 10 arrays (up to 2048M each) for that. My understanding is probably very naive...

-Fuad
http://www.linkedin.com/in/liferay
Re: Solr under tomcat - UTF-8 issue
Don't use POST. That is the wrong HTTP semantic for search results. Use GET. That will make it possible to cache the results, will make your HTTP logs useful, and all sorts of other good things.

wunder

On Oct 24, 2009, at 10:11 AM, Glock, Thomas wrote:

Thanks - I now think it must be due to my client not sending enough (or correct) headers in the request.

Tomcat does work when using an HTTP GET but is failing the POST from my Flash client.

For example, putting this in both the Firefox and IE browser URL bars works correctly:

http://localhost:8080/hranswers/elevate?fl=*%20score&indent=on&start=0&q=%D0%94%D0%BE%D0%B1%D0%B0%D0%B2%D0%B8%D1%82%D1%8C%20%D0%BD%D0%BE%D0%B2%D1%8B%D1%85%20%D0%BA%D0%B0%D0%BD%D0%B4%D0%B8%D0%B4%D0%B0%D1%82%D0%BE%D0%B2&fq=language_cd:ru&rows=20

The POST information my client is sending looks like this, and it fails:

POST /hranswers/elevate HTTP/1.1
Accept: */*
Accept-Language: en-US
x-flash-version: 10,0,32,18
Content-Type: application/x-www-form-urlencoded
Content-Encoding: UTF-8
Content-Length: 209
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; InfoPath.1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; MS-RTC LM 8; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; UserABC123)
Host: localhost:8080
Connection: Keep-Alive
Pragma: no-cache

fq=language%5Fcd%3Aru&rows=20&start=0&fl=%2A%20score&indent=on&q=%D0%94%D0%BE%D0%B1%D0%B0%D0%B2%D0%B8%D1%82%D1%8C%20%D0%BD%D0%BE%D0%B2%D1%8B%D1%85%20%D0%BA%D0%B0%D0%BD%D0%B4%D0%B8%D0%B4%D0%B0%D1%82%D0%BE%D0%B2

I will keep digging - and let you know how it turns out.

Thanks!

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Saturday, October 24, 2009 12:43 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr under tomcat - UTF-8 issue

Try using example/exampledocs/test_utf8.sh to narrow down whether the charset problems you're hitting are due to servlet container configuration.

-Yonik
http://www.lucidimagination.com

2009/10/24 Glock, Thomas :

Thanks, but it's not working...

I did have the URIEncoding in place and just again moved the URIEncoding attribute to be the first attribute - ensured I saved server.xml, shut down Tomcat, deleted logs and cache, and still no luck. It's probably something very simple and I'm just missing it.

Thanks for your help.

-----Original Message-----
From: Zsolt Czinkos [mailto:czin...@gmail.com]
Sent: Saturday, October 24, 2009 11:36 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr under tomcat - UTF-8 issue

Hello

Have you set the URIEncoding attribute to UTF-8 in Tomcat's server.xml (on the connector element)? Like:

<Connector port="8080" URIEncoding="UTF-8" protocol="HTTP/1.1" redirectPort="8443"/>

Hope this helps.

Best regards

czinkos

2009/10/24 Glock, Thomas :

Hoping someone can help -

Problem:
Querying for non-English phrases such as Добавить does not return any results under Tomcat but does work when using the Jetty example.

Both Tomcat and Jetty are being queried by the same custom (Flash) client and both reference the same solr/data/index.

I'm using an HTTP POST rather than an HTTP GET to do the query to Solr. I believe the problem must be in how Tomcat is configured and had hoped that -Dfile.encoding=UTF-8 would solve it - but no luck. I've stopped and started Tomcat and deleted the work directory as well.

Results are the same in both IE6 and Firefox, and I've used both Firebug and Fiddler to view the HTTP requests/responses. It is consistent - Jetty works, Tomcat does not.

Environment:
Tomcat 6 as a service on WinXP Professional 2002 SP2
Tomcat Java properties -

-Dcatalina.home=C:\Program Files\Apache Software Foundation\Tomcat 6.0
-Dcatalina.base=C:\Program Files\Apache Software Foundation\Tomcat 6.0
-Djava.endorsed.dirs=C:\Program Files\Apache Software Foundation\Tomcat 6.0\endorsed
-Djava.io.tmpdir=C:\Program Files\Apache Software Foundation\Tomcat 6.0\temp
-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
-Djava.util.logging.config.file=C:\Program Files\Apache Software Foundation\Tomcat 6.0\conf\logging.properties
-Dfile.encoding=UTF-8

Thanks in advance.
Tom Glock
RE: Solr under tomcat - UTF-8 issue
Thanks - I agree. However, my application requires results to be trimmed to users based on roles. The roles are repeating values on the documents. Users have many different role combinations, as do documents. I recognize this is going to hamper caching - but using a GET will tend to limit the size of search phrases when combined with the boolean role clause, and I am concerned about hitting URL length limits.

At any rate, I solved it thanks to Yonik's recommendation. My Flex client HTTPService by default only sets the content-type request header to "application/x-www-form-urlencoded"; what it needed to do for Tomcat is set the content-type request header to

content-type = "application/x-www-form-urlencoded; charset=UTF-8";

If you have any suggestions regarding limiting results based on user and document role permutations - I'm all ears. I've been to the Search Summit in NYC and no vendor could even seem to grasp the concept. The problem case statement is this - I have users globally who need to search for content tailored to them. Users searching for 'Holiday' don't get any value from 1 documents having the word holiday. What they need are documents authored for that population. The documents have the associated role information as metadata, and therefore users will get only the documents they have access to and that are relevant to them. That's the plan anyway!

By chance I stumbled on Solr a month or so ago and I think it's awesome. I got the book two days ago too - fantastic!

Thanks again,
Tom

-----Original Message-----
From: Walter Underwood [mailto:wun...@wunderwood.org]
Sent: Saturday, October 24, 2009 1:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr under tomcat - UTF-8 issue

Don't use POST. That is the wrong HTTP semantic for search results. Use GET. That will make it possible to cache the results, will make your HTTP logs useful, and all sorts of other good things.

wunder

On Oct 24, 2009, at 10:11 AM, Glock, Thomas wrote:

> Thanks - I now think it must be due to my client not sending enough (or
> correct) headers in the request.
>
> Tomcat does work when using an HTTP GET but is failing the POST from my
> Flash client.
>
> For example, putting this in both the Firefox and IE browser URL bars
> works correctly:
>
> http://localhost:8080/hranswers/elevate?fl=*%20score&indent=on&start=0&q=%D0%94%D0%BE%D0%B1%D0%B0%D0%B2%D0%B8%D1%82%D1%8C%20%D0%BD%D0%BE%D0%B2%D1%8B%D1%85%20%D0%BA%D0%B0%D0%BD%D0%B4%D0%B8%D0%B4%D0%B0%D1%82%D0%BE%D0%B2&fq=language_cd:ru&rows=20
>
> The POST information my client is sending looks like this, and it fails:
>
> POST /hranswers/elevate HTTP/1.1
> Accept: */*
> Accept-Language: en-US
> x-flash-version: 10,0,32,18
> Content-Type: application/x-www-form-urlencoded
> Content-Encoding: UTF-8
> Content-Length: 209
> Accept-Encoding: gzip, deflate
> User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; InfoPath.1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; MS-RTC LM 8; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; UserABC123)
> Host: localhost:8080
> Connection: Keep-Alive
> Pragma: no-cache
>
> fq=language%5Fcd%3Aru&rows=20&start=0&fl=%2A%20score&indent=on&q=%D0%94%D0%BE%D0%B1%D0%B0%D0%B2%D0%B8%D1%82%D1%8C%20%D0%BD%D0%BE%D0%B2%D1%8B%D1%85%20%D0%BA%D0%B0%D0%BD%D0%B4%D0%B8%D0%B4%D0%B0%D1%82%D0%BE%D0%B2
>
> I will keep digging - and let you know how it turns out.
>
> Thanks!
>
> -----Original Message-----
> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
> Sent: Saturday, October 24, 2009 12:43 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr under tomcat - UTF-8 issue
>
> Try using example/exampledocs/test_utf8.sh to narrow down whether the
> charset problems you're hitting are due to servlet container configuration.
>
> -Yonik
> http://www.lucidimagination.com
>
> 2009/10/24 Glock, Thomas :
>>
>> Thanks, but it's not working...
>>
>> I did have the URIEncoding in place and just again moved the URIEncoding
>> attribute to be the first attribute - ensured I saved server.xml, shut
>> down Tomcat, deleted logs and cache, and still no luck. It's probably
>> something very simple and I'm just missing it.
>>
>> Thanks for your help.
>>
>> -----Original Message-----
>> From: Zsolt Czinkos [mailto:czin...@gmail.com]
>> Sent: Saturday, October 24, 2009 11:36 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr under tomcat - UTF-8 issue
>>
>> Hello
>>
>> Have you set the URIEncoding attribute to UTF-8 in Tomcat's server.xml
>> (on the connector element)? Like:
>>
>> <Connector port="8080" URIEncoding="UTF-8" protocol="HTTP/1.1" redirectPort="8443"/>
>>
>> Hope this helps.
>>
>> Best regards
>>
>> czinkos
>>
>> 2009/10/24 Glock, Thomas :
>>>
>>> Hoping someone can help -
>>>
>>> Problem:
>>> Querying for non-English phrases such as Добавить does not return any
>>> results under Tomcat but does work when using the Jetty example.
>>>
>>> Both Tomcat and Jetty are being queried by the same custom (Flash)
>>> client and both reference the same solr/data/index.
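For comparison, the equivalent fix in a plain java.net client - the charset parameter on the Content-Type header is the part that matters (the URL and parameters mirror the thread; the class itself is illustrative):

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class PostEncodingDemo {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://localhost:8080/hranswers/elevate");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        // Without "; charset=UTF-8" the servlet container falls back to
        // ISO-8859-1 when decoding the form body.
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8");
        String body = "fq=" + URLEncoder.encode("language_cd:ru", "UTF-8")
                + "&rows=20&q=" + URLEncoder.encode("Добавить", "UTF-8");
        OutputStream out = conn.getOutputStream();
        out.write(body.getBytes("UTF-8"));
        out.close();
        System.out.println("HTTP " + conn.getResponseCode());
    }
}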
RE: Too many open files
> If you had gone over 2GB of actual buffer *usage*, it would have
> broken... Guaranteed.
> We've now added a check in Lucene 2.9.1 that will throw an exception
> if you try to go over 2048MB.
> And as the javadoc says, to be on the safe side, you probably
> shouldn't go too near 2048 - perhaps 2000MB is a good practical limit.

I browsed http://issues.apache.org/jira/browse/LUCENE-1995 and
http://search.lucidimagination.com/search/document/f29fc52348ab9b63/arrayindexoutofboundsexception_during_indexing
- it is not a proof of concept. It is a workaround. The problem still exists, and the scenario is unclear.

-Fuad
http://www.linkedin.com/in/liferay
RE: StreamingUpdateSolrServer - indexing process stops in a couple of hours
I am using Java 1.6.0_05.

To illustrate what is happening, I wrote this test program that has 10 threads adding collections of documents and one thread optimizing the index every 10 sec. I am seeing that after the first optimize there is only one thread that keeps adding documents. The other ones are locked. In the real code I ended up adding synchronized around add and optimize to avoid this.

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.embedded.JettySolrRunner;
import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
import org.apache.solr.common.SolrInputDocument;

public static void main(String[] args) {
    final JettySolrRunner jetty = new JettySolrRunner("/solr", 8983);
    try {
        jetty.start();

        // setup the server...
        String url = "http://localhost:8983/solr";
        final StreamingUpdateSolrServer server = new StreamingUpdateSolrServer(url, 2, 5) {
            @Override
            public void handleError(Throwable ex) {
                // do something...
            }
        };
        server.setConnectionTimeout(1000);
        server.setDefaultMaxConnectionsPerHost(100);
        server.setMaxTotalConnections(100);

        // 10 writer threads, each adding batches of 50 documents in a loop
        int i = 0;
        while (i++ < 10) {
            new Thread("add-thread" + i) {
                public void run() {
                    int j = 0;
                    while (true) {
                        try {
                            List<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
                            for (int n = 0; n < 50; n++) {
                                SolrInputDocument doc = new SolrInputDocument();
                                String docID = this.getName() + "_doc_" + j++;
                                doc.addField("id", docID);
                                doc.addField("content", "document_" + docID);
                                docs.add(doc);
                            }
                            server.add(docs);
                            System.out.println(this.getName() + " added " + docs.size() + " documents");
                            Thread.sleep(100);
                        } catch (Exception e) {
                            e.printStackTrace();
                            System.err.println(this.getName() + " " + e.getLocalizedMessage());
                            System.exit(0);
                        }
                    }
                }
            }.start();
        }

        // one thread optimizing the index every 10 seconds
        new Thread("optimizer-thread") {
            public void run() {
                while (true) {
                    try {
                        Thread.sleep(10000);
                        server.optimize();
                        System.out.println(this.getName() + " optimized");
                    } catch (Exception e) {
                        e.printStackTrace();
                        System.err.println("optimizer " + e.getLocalizedMessage());
                        System.exit(0);
                    }
                }
            }
        }.start();
    } catch (Exception e) {
        e.printStackTrace();
    }
}

-----Original Message-----
From: Lance Norskog [mailto:goks...@gmail.com]
Sent: Tuesday, October 13, 2009 8:59 PM
To: solr-user@lucene.apache.org
Subject: Re: StreamingUpdateSolrServer - indexing process stops in a couple of hours

Which Java release is this? There are known thread-blocking problems in Java 1.5.

Also, what sockets are used during this time? Try 'netstat -s | fgrep 8983' (or your Solr URL port #) and watch the active, TIME_WAIT, CLOSE_WAIT sockets build up. This may give a hint.

On Tue, Oct 13, 2009 at 8:47 AM, Dadasheva, Olga wrote:
> Hi,
>
> I am indexing documents using StreamingUpdateSolrServer. My 'setup'
> code is almost a copy of the junit test of the Solr trunk.
>
> try {
>     StreamingUpdateSolrServer streamingServer = new
>         StreamingUpdateSolrServer( url, 2, 5 ) {
>         @Override
>         public void handleError(Throwable ex) {
>             System.out.println(" new StreamingUpdateSolrServer error "+ex);
>             mail.send(new Date()+"StreamingUpdateSolrServer error. "+ex);
>         }
>     };
>     streamingServer.setConnectionTimeout(20*6