Re: Collapse with multiple fields

2009-10-23 Thread Thijs
I haven't had time to actually ask this on the list myself, but seeing 
this I just had to reply. I was wondering this myself.


Thijs

On 23-10-2009 5:50, R. Tan wrote:

Hi,
Is it possible to collapse the results from multiple fields?

Rih



Re: multicore query via solrJ

2009-10-23 Thread Licinio Fernández Maurelo
Since no answer has been given, I assume it's not possible. It would be great to have
a method like this:

query(SolrServer,  List)



On 20 October 2009 at 11:21, Licinio Fernández Maurelo <
licinio.fernan...@gmail.com> wrote:

> Hi there,
> is there any way to perform a multi-core query using solrj?
>
> P.S.:
>
> I know about this syntax:
> http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=
> but i'm looking for a more fancy way to do this using solrj (something like
> shards(query) )
>
> thx
>
>
>
> --
> Lici
>



-- 
Lici


Re: multicore query via solrJ

2009-10-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
You guessed it right. SolrJ cannot query multiple cores.

2009/10/23 Licinio Fernández Maurelo :
> As no answer is given, I assume it's not possible. It will be great to code
> a method like this
>
> query(SolrServer,  List)
>
>
>
> On 20 October 2009 at 11:21, Licinio Fernández Maurelo <
> licinio.fernan...@gmail.com> wrote:
>
>> Hi there,
>> is there any way to perform a multi-core query using solrj?
>>
>> P.S.:
>>
>> I know about this syntax:
>> http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=
>> but i'm looking for a more fancy way to do this using solrj (something like
>> shards(query) )
>>
>> thx
>>
>>
>>
>> --
>> Lici
>>
>
>
>
> --
> Lici
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Classloading issues with solr 1.4 and tomcat

2009-10-23 Thread Joerg Erdmenger
Hi there,
I'm having trouble getting the latest Solr from svn (I'm using trunk from
Oct. 22nd, but it didn't work with an earlier revision either) to run in
Tomcat.
I've checked it out, built it and run the tests - all fine.
I run the example conf with Jetty using the start.jar - all fine.

Now I copy the example/solr dir to someplace else, copy the war in dist to
some webapp dir, and configure a webapp in Tomcat according to
http://wiki.apache.org/solr/SolrTomcat, where I set solr/home via JNDI to
the directory just created by copying example/solr.
I then check solrconfig.xml and make sure solr.data.dir is pointing to the
correct location and that the configs are pointing to valid locations.

When I then start Tomcat, Solr fails and I get the following error:

INFO: Solr home set to '/path/to/my/solr-home/'
23.10.2009 10:17:34 org.apache.solr.core.SolrResourceLoader
createClassLoader
INFO: Reusing parent classloader
23.10.2009 10:17:34 org.apache.solr.servlet.SolrDispatchFilter init
SCHWERWIEGEND: Could not start SOLR. Check solr/home property
org.apache.solr.common.SolrException: Error loading class
'solr.FastLRUCache'
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:273)
at org.apache.solr.search.CacheConfig.getConfig(CacheConfig.java:90)
at org.apache.solr.search.CacheConfig.getConfig(CacheConfig.java:73)
at org.apache.solr.core.SolrConfig.(SolrConfig.java:128)
at org.apache.solr.core.SolrConfig.(SolrConfig.java:70)
at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
at
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
at
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
at
org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
at
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3800)
at org.apache.catalina.core.StandardContext.start(StandardContext.java:4450)
at
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526)
at
org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:630)
at
org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:556)
at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:491)
at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1206)
at
org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:314)
at
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
at org.apache.catalina.core.StandardHost.start(StandardHost.java:722)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045)
at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
at org.apache.catalina.core.StandardService.start(StandardService.java:516)
at org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
at org.apache.catalina.startup.Catalina.start(Catalina.java:583)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:288)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:413)
Caused by: java.lang.ClassNotFoundException: solr.FastLRUCache
at
org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1387)
at
org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1233)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:399)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:257)
... 33 more


If I uncomment the FastLRU section in solrconfig.xml, Solr fails to start
as well, this time with this error:

INFO: Solr home set to '/path/to/my/solr-home/'
23.10.2009 10:23:50 org.apache.solr.core.SolrResourceLoader
createClassLoader
INFO: Reusing parent classloader
23.10.2009 10:23:50 org.apache.solr.core.SolrConfig 
INFO: Loaded SolrConfig: solrconfig.xml
23.10.2009 10:23:50 org.apache.solr.core.SolrCore 
INFO: Opening new SolrCore at /path/to/my/solr-home/,
dataDir=/path/to/my/solr-home/data/
23.10.2009 10:23:50 org.apache.solr.schema.IndexSchema readSchema
INFO: Reading Solr Schema
23.10.2009 10:23:50 org.apache.solr.schema.IndexSchema readSchema
INFO: Schema name=example
23.10.2009 10:23:50 org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created string: org.apache.solr.schema.StrField
23.

SolrJ and Json

2009-10-23 Thread SGE0

Hi ,

I have the following problem:
Using CommonsHttpSolrServer (javabin format) I do a query with wt=json and
get the following response (using qresponse = solr.query(params); and then
qresponse.toString()):

{responseHeader={status=0,QTime=16,params={indent=on,start=0,q=mmm,qt=dismax,wt=[javabin,
javabin],hl=on,rows=10,version=[1,
1]}},response={numFound=0,start=0,docs=[]},highlighting={}}

Now this does not seem to be JSON format (or is it)?

Should the equals sign not be a ':' and the values surrounded with double
quotes?

The problem is that I want to pass the qresponse to a Javascript variable so
the client javascript code can then inspect the JSON response and do
whatever is needed.

What I did was:

 var str = "<%=qresponse.toString()%>"; 

but I can't seem to correctly read the str variable as a JSON object and
parse it (on the client side).

Any ideas or code snippets to show the correct way ?

Regards,

St.





-- 
View this message in context: 
http://www.nabble.com/SolrJ-and-Json-tp26022705p26022705.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Collapse with multiple fields

2009-10-23 Thread Martijn v Groningen
No, this is actually not supported at the moment. If you really need to
collapse on two different fields, you can concatenate the two fields
together in another field while indexing and then collapse on that
field.

Martijn

2009/10/23 Thijs :
> I haven't had time to actually ask this on the list my self but seeing this,
> I just had to reply. I was wondering this myself.
>
> Thijs
>
> On 23-10-2009 5:50, R. Tan wrote:
>>
>> Hi,
>> Is it possible to collapse the results from multiple fields?
>>
>> Rih
>>
>
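
A minimal SolrJ sketch of the concatenation approach described above (the field
names brand, category and brand_category are purely illustrative and assume a
matching field exists in schema.xml):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ConcatenatedCollapseField {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-1");
        doc.addField("brand", "acme");
        doc.addField("category", "widgets");
        // Build the combined value at index time; collapsing is then done on
        // brand_category instead of on the two original fields.
        doc.addField("brand_category", "acme_widgets");

        server.add(doc);
        server.commit();
    }
}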


Re: Constant Score Queries and Function Queries

2009-10-23 Thread Grant Ingersoll


On Oct 22, 2009, at 9:44 PM, Chris Hostetter wrote:



: > Why wouldn't you just query the function directly and leave out the *:* ?
:
: *:* was just a quick example, I might have other constant score queries,
: but I guess I probably could do a filter query plus the function query, too.


I guess I don't understand what your point was ... you mentioned that
using a function query with *:* didn't produce scores equal to the
function output, but that's just the nature of how a BooleanQuery works:
it aggregates the clauses.  If you *want* the scores to be a factor of
both clauses, then use a BooleanQuery; if you have a clause that you don't
want factoring into the query, use "fq".



Fair enough, I guess I was just kind of expecting a constant score  
query + a function query to result in a score of whatever the function  
query is.  This is a common trick to sort by a function, but it's easy  
enough to just ^0 the non function clause.


-Grant
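
An illustrative query of the kind discussed above (assuming the standard Lucene
query parser; the popularity field is made up). The ^0 boost keeps the match-all
clause from contributing to the score, so the embedded function query alone
drives the ranking:

q=*:*^0 _val_:"ord(popularity)"&fl=*,score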


Re: SolrJ and Json

2009-10-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
CommonsHttpSolrServer will overwrite the wt param depending on the
responseParser set. There are only two response parsers: javabin and
xml.

The qresponse.toString() is actually a String representation of a
NamedList object. It has nothing to do with JSON.

On Fri, Oct 23, 2009 at 2:11 PM, SGE0  wrote:
>
> Hi ,
>
> I have following problem:
> Using CommonsHttpSolrServer (javabin format) I do a query with wt=json and
> get following response (by using  qresponse = solr.query(params); and then
> qresponse.toString();
>
> {responseHeader={status=0,QTime=16,params={indent=on,start=0,q=mmm,qt=dismax,wt=[javabin,
> javabin],hl=on,rows=10,version=[1,
> 1]}},response={numFound=0,start=0,docs=[]},highlighting={}}
>
> Now this does not seems to be JSON format (or is it ) ?
>
> Should the equal sign not be a ':' and the values surrounded with double
> quotes ?
>
> The problem is that I want to pass the qresponse to a Javascript variable so
> the client javascript code can then inspect the JSON response and do
> whatever is needed.
>
> What I did was:
>
>  var str = "<%=qresponse.toString()%>";
>
> but I can't seem to correctly read the str variable as a JSON object and
> parse it (on the client side).
>
> Any ideas or code snippets to show the correct way ?
>
> Regards,
>
> St.
>
>
>
>
>
> --
> View this message in context: 
> http://www.nabble.com/SolrJ-and-Json-tp26022705p26022705.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: SolrJ and Json

2009-10-23 Thread SGE0

Hi, 

thx for the fast response.

So, is there a way to convert the response (javabin) to JSON ?

Regards,

S.






Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> CommonsHttpSolrServer will overwrite the wt param depending on the
> responseParser set.There are  only two response parsers. javabin and
> xml.
> 
> The qresponse.toString() actually is a String reperesentation of a
> namedList object . it has nothing to do with JSON
> 
> On Fri, Oct 23, 2009 at 2:11 PM, SGE0  wrote:
>>
>> Hi ,
>>
>> I have following problem:
>> Using CommonsHttpSolrServer (javabin format) I do a query with wt=json
>> and
>> get following response (by using  qresponse = solr.query(params); and
>> then
>> qresponse.toString();
>>
>> {responseHeader={status=0,QTime=16,params={indent=on,start=0,q=mmm,qt=dismax,wt=[javabin,
>> javabin],hl=on,rows=10,version=[1,
>> 1]}},response={numFound=0,start=0,docs=[]},highlighting={}}
>>
>> Now this does not seems to be JSON format (or is it ) ?
>>
>> Should the equal sign not be a ':' and the values surrounded with double
>> quotes ?
>>
>> The problem is that I want to pass the qresponse to a Javascript variable
>> so
>> the client javascript code can then inspect the JSON response and do
>> whatever is needed.
>>
>> What I did was:
>>
>>  var str = "<%=qresponse.toString()%>";
>>
>> but I can't seem to correctly read the str variable as a JSON object and
>> parse it (on the client side).
>>
>> Any ideas or code snippets to show the correct way ?
>>
>> Regards,
>>
>> St.
>>
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/SolrJ-and-Json-tp26022705p26022705.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> -
> Noble Paul | Principal Engineer| AOL | http://aol.com
> 
> 

-- 
View this message in context: 
http://www.nabble.com/SolrJ-and-Json-tp26022705p26025339.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: SolrJ and Json

2009-10-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
Why don't you directly hit Solr with wt=json? That will give you the
output as JSON

On Fri, Oct 23, 2009 at 5:53 PM, SGE0  wrote:
>
> Hi,
>
> thx for the fast response.
>
> So, is there a way to convert the response (javabin) to JSON ?
>
> Regards,
>
> S.
>
>
>
>
>
>
> Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
>>
>> CommonsHttpSolrServer will overwrite the wt param depending on the
>> responseParser set.There are  only two response parsers. javabin and
>> xml.
>>
>> The qresponse.toString() actually is a String reperesentation of a
>> namedList object . it has nothing to do with JSON
>>
>> On Fri, Oct 23, 2009 at 2:11 PM, SGE0  wrote:
>>>
>>> Hi ,
>>>
>>> I have following problem:
>>> Using CommonsHttpSolrServer (javabin format) I do a query with wt=json
>>> and
>>> get following response (by using  qresponse = solr.query(params); and
>>> then
>>> qresponse.toString();
>>>
>>> {responseHeader={status=0,QTime=16,params={indent=on,start=0,q=mmm,qt=dismax,wt=[javabin,
>>> javabin],hl=on,rows=10,version=[1,
>>> 1]}},response={numFound=0,start=0,docs=[]},highlighting={}}
>>>
>>> Now this does not seems to be JSON format (or is it ) ?
>>>
>>> Should the equal sign not be a ':' and the values surrounded with double
>>> quotes ?
>>>
>>> The problem is that I want to pass the qresponse to a Javascript variable
>>> so
>>> the client javascript code can then inspect the JSON response and do
>>> whatever is needed.
>>>
>>> What I did was:
>>>
>>>  var str = "<%=qresponse.toString()%>";
>>>
>>> but I can't seem to correctly read the str variable as a JSON object and
>>> parse it (on the client side).
>>>
>>> Any ideas or code snippets to show the correct way ?
>>>
>>> Regards,
>>>
>>> St.
>>>
>>>
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/SolrJ-and-Json-tp26022705p26022705.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>>
>> --
>> -
>> Noble Paul | Principal Engineer| AOL | http://aol.com
>>
>>
>
> --
> View this message in context: 
> http://www.nabble.com/SolrJ-and-Json-tp26022705p26025339.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com
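
A minimal sketch of the "hit Solr directly" approach (plain java.net, no SolrJ;
the host, port and /select handler are assumptions based on the default example
setup):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLEncoder;

public class RawJsonQuery {
    public static void main(String[] args) throws Exception {
        String q = URLEncoder.encode("mmm", "UTF-8");
        URL url = new URL("http://localhost:8983/solr/select?q=" + q + "&wt=json&indent=on");

        // Read the raw JSON response body as text; it can then be handed to the
        // client-side JavaScript as-is instead of going through SolrJ's NamedList.
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));
        StringBuilder json = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            json.append(line).append('\n');
        }
        in.close();
        System.out.println(json);
    }
}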


Re: Classloading issues with solr 1.4 and tomcat

2009-10-23 Thread Joerg Erdmenger
Forget my mail about that; there was an old 1.3 webapp interfering with the
new webapp and I didn't immediately realise that.
Sorry for the noise.

Jörg


CAS client configuration with MT4-PHP.

2009-10-23 Thread Radha C.
Hi,
 
We have a Spring-integrated CAS server running in Apache. We have an
application in MovableType4 (PHP).
Is it possible to configure the MT4 authentication module to redirect to the
external CAS server when the application receives a login request?
It would be helpful if there is any documentation available for this.
 
Thanks in advance.


QTime always a multiple of 50ms ?

2009-10-23 Thread Jérôme Etévé
Hi all,

 I'm using Solr trunk from 2009-10-12 and I noticed that the QTime
result is always a multiple of roughly 50ms, regardless of the handler
used.

For instance, for the update handler, I get :

INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=0
INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=104
INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=52
...

Is this a known issue ?

Cheers!

J.

-- 
Jerome Eteve.
http://www.eteve.net
jer...@eteve.net


help with how to search using spaces in the query for string fields...

2009-10-23 Thread Dan A. Dickey
I'm having a problem with figuring out how to search for things
that have spaces (just a single space character) in them.
For example, I have a field named "FileName" and it is of type string.
I've indexed a couple of documents, that have field FileName
equal to "File 10 10AM" and another that has FileName "File 11 11AM".
In my search query, I'm trying "Filename:(File 1*)" and I'd like it to
return both documents.  It returns none.
If I search for "Filename:(File*)" I get both of them and everything else.
I've tried lots of different ways to form the query, but the only thing
that returns any documents is the "FileName:(File*)" form.  Anything
else with an actual space in it fails.
This has got to be another simple thing that I'm missing, but haven't
figured it out yet nor stumbled upon the correct search query.
Help please!  The above scenario is an example, but I am using
the string field type.
-Dan

-- 
Dan A. Dickey | Senior Software Engineer

Savvis
10900 Hampshire Ave. S., Bloomington, MN  55438
Office: 952.852.4803 | Fax: 952.852.4951
E-mail: dan.dic...@savvis.net


number of Solr indexes per Tomcat instance

2009-10-23 Thread Erik_l

Hi,

Currently we're running 10 Solr indexes inside a single Tomcat6 instance. In
the near future we would like to add another 30-40 indexes to every Tomcat
instance we host. What are the factors we have to take into account when
planning for such deployments? Obviously we do know the sizes of the indexes,
but, for example, how much memory does Solr need given that each index is
treated as a webapp in Tomcat? Also, do you know if Tomcat has a limit on the
number of apps that can be deployed (maybe I should ask this question in a
Tomcat forum)?

Thanks
E
-- 
View this message in context: 
http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26027238.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: number of Solr indexes per Tomcat instance

2009-10-23 Thread Marc Sturlese

Are you using one single solr instance with multicore or multiple solr
instances with one index each?

Erik_l wrote:
> 
> Hi,
> 
> Currently we're running 10 Solr indexes inside a single Tomcat6 instance.
> In the near future we would like to add another 30-40 indexes to every
> Tomcat instance we host. What are the factors we have to take into account
> when planning for such deployments? Obviously we do know the sizes of the
> indexes but for example how much memory does Solr need to be allocated
> given that each index is treated as a webapp in Tomcat. Also, do you know
> if Tomcat has got a limit in number of apps that can be deployed (maybe I
> should ask this questions in a Tomcat forum). 
> 
> Thanks
> E
> 

-- 
View this message in context: 
http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26027304.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: CAS client configuration with MT4-PHP.

2009-10-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
Is this a query related to Solr?

On Fri, Oct 23, 2009 at 6:46 PM, Radha C.  wrote:
> Hi,
>
> We have CAS server of spring integrated and it is running in apache. We have
> application in MovableType4 - PHP.
> Is it possible to configure the MT4 authentication module to redirect to
> external CAS server when the application recieves login request?
> It would be helpful if there is any document available for this.
>
> Thanks in advance.
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: help with how to search using spaces in the query for string fields...

2009-10-23 Thread Dan A. Dickey
On Friday 23 October 2009 09:36:02 am AHMET ARSLAN wrote:
> 
> --- On Fri, 10/23/09, Dan A. Dickey  wrote:
> 
> > From: Dan A. Dickey 
> > Subject: help with how to search using spaces in the query for string 
> > fields...
> > To: solr-user@lucene.apache.org
> > Date: Friday, October 23, 2009, 5:12 PM
> > I'm having a problem with figuring
> > out how to search for things
> > that have spaces (just a single space character) in them.
> > For example, I have a field named "FileName" and it is of
> > type string.
> > I've indexed a couple of documents, that have field
> > FileName
> > equal to "File 10 10AM" and another that has FileName "File
> > 11 11AM".
> > In my search query, I'm trying "Filename:(File 1*)" and I'd
> > like to it
> > return both documents.  It return none.
> > If I search for "Filename:(File*)" I get both of them and
> > everything else.
> > I've tried lots of different ways to form the query, but
> > the only thing
> > that returns any documents is the "FileName:(File*)"
> > form.  Anything
> > else with an actual space in it fails.
> > This has got to be another simple thing that I'm missing,
> > but haven't
> > figured it out yet nor stumbled upon the correct search
> > query.
> > Help please!  The above scenario is an example, but I
> > am using
> > the string field type.
> > -Dan
> 
> You need to escape space. Try this one :  Filename:(File\ 1*)

Ahmet,
When I first saw your suggestion - I thought - I tried that.  It didn't work.
I went and did it again - it works beautifully!  Evidently, I *did not* try it.
You're a genius!  Thank you again!
-Dan 

-- 
Dan A. Dickey | Senior Software Engineer

Savvis
10900 Hampshire Ave. S., Bloomington, MN  55438
Office: 952.852.4803 | Fax: 952.852.4951
E-mail: dan.dic...@savvis.net


NGram query failing

2009-10-23 Thread Charlie Jackson
I have a requirement to be able to find hits within words in a free-form
id field. The field can have any type of alphanumeric data - it's as
likely it will be something like "123456" as it is to be "SUN-123-ABC".
I thought of using NGrams to accomplish the task, but I'm having a
problem. I set up a field like this

 









  



 

After indexing a field like this, the analysis page indicates my queries
should work. If I give it a sample field value of "ABC-123456-SUN" and a
query value of "45" it shows hits in several places, which is what I
expected.

 

However, when I actually query the field with something like "45" I get
no hits back. Looking at the debugQuery output, it looks like it's
taking my analyzed query text and putting it into a phrase query. So,
for a query of "45" it turns into a phrase query of :"4 5 45"
which then doesn't hit on anything in my index.

 

What am I missing to make this work?

 

- Charlie



Re: help with how to search using spaces in the query for string fields...

2009-10-23 Thread AHMET ARSLAN


--- On Fri, 10/23/09, Dan A. Dickey  wrote:

> From: Dan A. Dickey 
> Subject: help with how to search using spaces in the query for string 
> fields...
> To: solr-user@lucene.apache.org
> Date: Friday, October 23, 2009, 5:12 PM
> I'm having a problem with figuring
> out how to search for things
> that have spaces (just a single space character) in them.
> For example, I have a field named "FileName" and it is of
> type string.
> I've indexed a couple of documents, that have field
> FileName
> equal to "File 10 10AM" and another that has FileName "File
> 11 11AM".
> In my search query, I'm trying "Filename:(File 1*)" and I'd
> like to it
> return both documents.  It return none.
> If I search for "Filename:(File*)" I get both of them and
> everything else.
> I've tried lots of different ways to form the query, but
> the only thing
> that returns any documents is the "FileName:(File*)"
> form.  Anything
> else with an actual space in it fails.
> This has got to be another simple thing that I'm missing,
> but haven't
> figured it out yet nor stumbled upon the correct search
> query.
> Help please!  The above scenario is an example, but I
> am using
> the string field type.
>     -Dan

You need to escape space. Try this one :  Filename:(File\ 1*)
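
If the query string is built from Java (e.g. with SolrJ), the backslash that
escapes the space must itself be escaped in the string literal. A small sketch
(server URL and field name are illustrative):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class EscapedSpaceQuery {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        // The Lucene query FileName:(File\ 1*) written as a Java string literal.
        SolrQuery query = new SolrQuery("FileName:(File\\ 1*)");
        QueryResponse rsp = server.query(query);
        System.out.println("hits: " + rsp.getResults().getNumFound());
    }
}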





Re: number of Solr indexes per Tomcat instance

2009-10-23 Thread Erik_l

We're not using multicore. Today, one Tomcat instance hosts a number of
indexes in the form of 10 Solr indexes (10 individual war files).


Marc Sturlese wrote:
> 
> Are you using one single solr instance with multicore or multiple solr
> instances with one index each?
> 
> Erik_l wrote:
>> 
>> Hi,
>> 
>> Currently we're running 10 Solr indexes inside a single Tomcat6 instance.
>> In the near future we would like to add another 30-40 indexes to every
>> Tomcat instance we host. What are the factors we have to take into
>> account when planning for such deployments? Obviously we do know the
>> sizes of the indexes but for example how much memory does Solr need to be
>> allocated given that each index is treated as a webapp in Tomcat. Also,
>> do you know if Tomcat has got a limit in number of apps that can be
>> deployed (maybe I should ask this questions in a Tomcat forum). 
>> 
>> Thanks
>> E
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26028083.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: number of Solr indexes per Tomcat instance

2009-10-23 Thread Marc Sturlese

Probably multicore would give you better performance... I think the most
important factors to take into account are the size of the index and the
traffic you have to handle. With enough RAM you can hold 40 cores in a
single Solr instance (or even more), but depending on the traffic you have to
handle you may suffer from slow response times.

Erik_l wrote:
> 
> We're not using multicore. Today, one Tomcat instance host a number of
> indexes in form of 10 Solr indexes (10 individual war files).
> 
> 
> Marc Sturlese wrote:
>> 
>> Are you using one single solr instance with multicore or multiple solr
>> instances with one index each?
>> 
>> Erik_l wrote:
>>> 
>>> Hi,
>>> 
>>> Currently we're running 10 Solr indexes inside a single Tomcat6
>>> instance. In the near future we would like to add another 30-40 indexes
>>> to every Tomcat instance we host. What are the factors we have to take
>>> into account when planning for such deployments? Obviously we do know
>>> the sizes of the indexes but for example how much memory does Solr need
>>> to be allocated given that each index is treated as a webapp in Tomcat.
>>> Also, do you know if Tomcat has got a limit in number of apps that can
>>> be deployed (maybe I should ask this questions in a Tomcat forum). 
>>> 
>>> Thanks
>>> E
>>> 
>> 
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26028437.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: QTime always a multiple of 50ms ?

2009-10-23 Thread Andrzej Bialecki

Jérôme Etévé wrote:

Hi all,

 I'm using Solr trunk from 2009-10-12 and I noticed that the QTime
result is always a multiple of roughly 50ms, regardless of the used
handler.

For instance, for the update handler, I get :

INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=0
INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=104
INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=52
...

Is this a known issue ?


It may be an issue with System.currentTimeMillis() resolution on some 
platforms (e.g. Windows)?



--
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Result missing from query, but match shows in Field Analysis tool

2009-10-23 Thread Andrew Clegg

Hi,

I have a field in my index called related_ids, indexed and stored, with the
following field type:









Several records in my index contain the token 1cuk in the related_ids field,
but only *some* of them are returned when I query on this. e.g. if I send a
query like this:

http://localhost:8080/solr/select/?q=id:2.40.50+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids

I get a single hit for the record with id:2.40.50 . But if I try this, on a
different record with id:2.40 :

http://localhost:8080/solr/select/?q=id:2.40+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids

I get no hits. However, if I just query for id:2.40 ...

http://localhost:8080/solr/select/?q=id:2.40&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids

I can clearly see the token "1cuk" in the related_ids field.

Not only that, but if I copy and paste record 2.40's related_ids field into
the Field Analysis tool in the admin interface, and search on "1cuk", the
term 1cuk is visible in the index analyzer's term list, and highlighted! So
Field Analysis thinks that I *should* be getting a hit for this term.

Can anyone suggest how I'd go about diagnosing this? I'm kind of hitting a
brick wall here.

If it makes any difference, related_ids for the culprit record 2.40 is
large-ish but not enormous (31000 terms). Also I've tried stopping and
restarting Solr in case it was some weird caching thing.

Thanks in advance,

Andrew.

-- 
View this message in context: 
http://www.nabble.com/Result-missing-from-query%2C-but-match-shows-in-Field-Analysis-tool-tp26029040p26029040.html
Sent from the Solr - User mailing list archive at Nabble.com.



Solrj client API and response in XML format (Solr 1.4)

2009-10-23 Thread SGE0

Hi All,

After a day of searching I'm quite confused.

I use the solrj client as follows:

CommonsHttpSolrServer solr = new
CommonsHttpSolrServer("http://127.0.0.1:8080/apache-solr-1.4-dev/test");
solr.setRequestWriter(new BinaryRequestWriter());

ModifiableSolrParams params = new ModifiableSolrParams();
params.set("qt", "dismax");
params.set("indent", "on");
params.set("version", "2.2");
params.set("q", "test");
params.set("start", "0");
params.set("rows", "10");
params.set("wt", "xml");
params.set("hl", "on");
QueryResponse response = solr.query(params);


How can I get the query result (response) out in XML format?

I know it sounds stupid but I can't seem to manage that.

What do I need to do with the response object to get the response in XML
format ?

I already understood I can't get the result in JSON, so my idea was to go
from XML to JSON.

Thx for your answer already !

S.




System.out.println("response = " + response);
SolrDocumentList  sdl  =  response.getResults();
-- 
View this message in context: 
http://www.nabble.com/Solrj-client-API-and-response-in-XML-format-%28Solr-1.4%29-tp26029197p26029197.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: number of Solr indexes per Tomcat instance

2009-10-23 Thread wojtekpia

I ran into trouble running several cores (either as Solr multi-core or as
separate web apps) in a single JVM because the Java garbage collector would
freeze all cores during a collection. This may not be an issue if you're not
dealing with large amounts of memory. My solution is to run each web app in
its own JVM and Tomcat instance.

-- 
View this message in context: 
http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26029243.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Result missing from query, but match shows in Field Analysis tool

2009-10-23 Thread Erick Erickson
I'm really reaching here, but Lucene only indexes the first 10,000 terms by
default (you can up the limit). Is there a chance that you're hitting that
limit? That 1cuk is past the 10,000th term
in record 2.40?

For this to be possible, I have to assume that the FieldAnalysis
tool ignores this limit

FWIW
Erick

On Fri, Oct 23, 2009 at 12:01 PM, Andrew Clegg wrote:

>
> Hi,
>
> I have a field in my index called related_ids, indexed and stored, with the
> following field type:
>
>
> positionIncrementGap="100">
>
> pattern="\W*\s+\W*" />
>
>
>
>
> Several records in my index contain the token 1cuk in the related_ids
> field,
> but only *some* of them are returned when I query on this. e.g. if I send a
> query like this:
>
>
> http://localhost:8080/solr/select/?q=id:2.40.50+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids
>
> I get a single hit for the record with id:2.40.50 . But if I try this, on a
> different record with id:2.40 :
>
>
> http://localhost:8080/solr/select/?q=id:2.40+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids
>
> I get no hits. However, if I just query for id:2.40 ...
>
>
> http://localhost:8080/solr/select/?q=id:2.40&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids
>
> I can clearly see the token "1cuk" in the related_ids field.
>
> Not only that, but if I copy and paste record 2.40's related_ids field into
> the Field Analysis tool in the admin interface, and search on "1cuk", the
> term 1cuk is visible in the index analyzer's term list, and highlighted! So
> Field Analysis thinks that I *should* be getting a hit for this term.
>
> Can anyone suggest how I'd go about diagnosing this? I'm kind of hitting a
> brick wall here.
>
> If it makes any difference, related_ids for the culprit record 2.40 is
> large-ish but not enormous (31000 terms). Also I've tried stopping and
> restarting Solr in case it was some weird caching thing.
>
> Thanks in advance,
>
> Andrew.
>
> --
> View this message in context:
> http://www.nabble.com/Result-missing-from-query%2C-but-match-shows-in-Field-Analysis-tool-tp26029040p26029040.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
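
For reference, the limit Erick mentions is controlled by maxFieldLength in
solrconfig.xml (typically under indexDefaults and/or mainIndex). The value below
is only an example that effectively removes the limit:

<maxFieldLength>2147483647</maxFieldLength>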


Re: Result missing from query, but match shows in Field Analysis tool

2009-10-23 Thread Andrew Clegg


That's probably it! It is quite near the end of the field. I'll try upping
it and re-indexing.

Thanks :-)


Erick Erickson wrote:
> 
> I'm really reaching here, but lucene only indexes the first 10,000 terms
> by
> default (you can up the limit). Is there a chancethat you're hitting that
> limit? That 1cuk is past the 10,000th term
> in record 2.40?
> 
> For this to be possible, I have to assume that the FieldAnalysis
> tool ignores this limit
> 
> FWIW
> Erick
> 
> On Fri, Oct 23, 2009 at 12:01 PM, Andrew Clegg
> wrote:
> 
>>
>> Hi,
>>
>> I have a field in my index called related_ids, indexed and stored, with
>> the
>> following field type:
>>
>>
>>> positionIncrementGap="100">
>>
>>> pattern="\W*\s+\W*" />
>>
>>
>>
>>
>> Several records in my index contain the token 1cuk in the related_ids
>> field,
>> but only *some* of them are returned when I query on this. e.g. if I send
>> a
>> query like this:
>>
>>
>> http://localhost:8080/solr/select/?q=id:2.40.50+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids
>>
>> I get a single hit for the record with id:2.40.50 . But if I try this, on
>> a
>> different record with id:2.40 :
>>
>>
>> http://localhost:8080/solr/select/?q=id:2.40+AND+related_ids:1cuk&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids
>>
>> I get no hits. However, if I just query for id:2.40 ...
>>
>>
>> http://localhost:8080/solr/select/?q=id:2.40&version=2.2&start=0&rows=20&indent=on&fl=id,title,related_ids
>>
>> I can clearly see the token "1cuk" in the related_ids field.
>>
>> Not only that, but if I copy and paste record 2.40's related_ids field
>> into
>> the Field Analysis tool in the admin interface, and search on "1cuk", the
>> term 1cuk is visible in the index analyzer's term list, and highlighted!
>> So
>> Field Analysis thinks that I *should* be getting a hit for this term.
>>
>> Can anyone suggest how I'd go about diagnosing this? I'm kind of hitting
>> a
>> brick wall here.
>>
>> If it makes any difference, related_ids for the culprit record 2.40 is
>> large-ish but not enormous (31000 terms). Also I've tried stopping and
>> restarting Solr in case it was some weird caching thing.
>>
>> Thanks in advance,
>>
>> Andrew.
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Result-missing-from-query%2C-but-match-shows-in-Field-Analysis-tool-tp26029040p26029040.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Result-missing-from-query%2C-but-match-shows-in-Field-Analysis-tool-tp26029040p26029417.html
Sent from the Solr - User mailing list archive at Nabble.com.



Issues adding document to EmbeddedSolrServer

2009-10-23 Thread egon . o
Hi everybody,

I just started playing with Solr and think of it as a quite useful tool!
I'm using Solrj (Solr 1.3) in combination with an EmbeddedSolrServer. I 
managed to get the server running and implemented a method (following the 
Solrj Wiki) to create a document and add it to the server's index. The 
method looks like the following:

public void fillIndex() {
    SolrInputDocument doc1 = new SolrInputDocument();
    doc1.addField("id", "id1", 1.0f);
    doc1.addField("name", "doc1", 1.0f);
    doc1.addField("price", 10);

    try {
        server.add(doc1);
    } catch (Exception e) {
        e.printStackTrace();
    }

    try {
        server.commit(true, true);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

My problem now is that the method server.add() never finishes, which causes 
the whole fillIndex() method to crash. It's like it throws an exception 
which is not caught, and server.commit() is never executed.

I already used the maxTime configuration in the solrconfig.xml to commit 
new documents automatically. This looks like the following:

 
  1
  1000 


This works. But I want the explicit commit to work, as this looks like the 
way it should be done! In addition, this would give me better control over 
adding new stuff.

I assume this problem won't be the big challenge for an expert. :)
Any hints are appreciated!!

Thanks in advance.
Egon
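
For comparison, the embedded-server initialization described on the Solrj wiki
for Solr 1.3/1.4 looks roughly like the sketch below (the solr home path is
illustrative). If server was created differently, that is worth double-checking:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;

public class EmbeddedSetup {
    public static void main(String[] args) throws Exception {
        // Point at the solr home directory containing conf/ (example path).
        System.setProperty("solr.solr.home", "/path/to/solr/home");
        CoreContainer.Initializer initializer = new CoreContainer.Initializer();
        CoreContainer coreContainer = initializer.initialize();
        // "" selects the default core.
        SolrServer server = new EmbeddedSolrServer(coreContainer, "");
        System.out.println("Embedded server ready: " + server);
    }
}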


Re: number of Solr indexes per Tomcat instance

2009-10-23 Thread Erik_l

That's a really good point. I didn't think about the GCs. Obviously we don't
want to have all the indexes hanging if a full GC occurs. We're running an
8GB+ heap, so GCs are very important to us.

Thanks
Erik


wojtekpia wrote:
> 
> I ran into trouble running several cores (either as Solr multi-core or as
> separate web apps) in a single JVM because the Java garbage collector
> would freeze all cores during a collection. This may not be an issue if
> you're not dealing with large amounts of memory. My solution is to run
> each web app in its own JVM and Tomcat instance.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/number-of-Solr-indexes-per-Tomcat-instance-tp26027238p26029654.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Collapse with multiple fields

2009-10-23 Thread R. Tan
Clever, I think that would work in some cases.

On Fri, Oct 23, 2009 at 5:22 PM, Martijn v Groningen <
martijn.is.h...@gmail.com> wrote:

> No this actually not supported at the moment. If you really need to
> collapse on two different field you can concatenate the two fields
> together in another field while indexing and then collapse on that
> field.
>
> Martijn
>
> 2009/10/23 Thijs :
> > I haven't had time to actually ask this on the list my self but seeing
> this,
> > I just had to reply. I was wondering this myself.
> >
> > Thijs
> >
> > On 23-10-2009 5:50, R. Tan wrote:
> >>
> >> Hi,
> >> Is it possible to collapse the results from multiple fields?
> >>
> >> Rih
> >>
> >
>


keep index in production and snapshots in separate physical disks

2009-10-23 Thread Marc Sturlese

Is there any way to make snapinstaller install the index in
snapshot20091023124543 (for example) from another disk? I am asking this
because I would like not to optimize the index on the master (if I do that,
it takes a long time to send it via rsync when the index is big). This way I
would just have to send the new segments.
On the slave I would have 2 physical disks. Snappuller would send the
snapshot to one disk (here the index would not be optimized). Snapinstaller
would install the snapshot on the other disk, optimize it and open the new
IndexReader. The optimization should be done on the disk which contains
the "not in production" index so as not to affect search request speed.
Any idea what I should hack to reach this goal, in case it is possible?
-- 
View this message in context: 
http://www.nabble.com/keep-index-in-production-and-snapshots-in-separate-phisical-disks-tp26029666p26029666.html
Sent from the Solr - User mailing list archive at Nabble.com.



Is optimized?

2009-10-23 Thread William Pierce
Folks:

If I issue two optimize requests with no intervening changes to the index,
will the second optimize request be smart enough to not do anything?

Thanks,

Bill

Too many open files

2009-10-23 Thread Ranganathan, Sharmila
Hi,

I am getting too many open files error.

Usually I test on a server that has 4GB RAM and assigned 1GB for
tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this
server and has following setting for SolrConfig.xml

 

true

1024

100

2147483647

1

 

In my case, 200,000 documents amount to about 1024MB, and in this test I am
indexing a total of a million documents. We have high settings because we
expect to index about 10+ million records in production. It works fine
on this server.

 

When I deploy the same Solr configuration on a server with 32GB RAM, I get a
"too many open files" error. The ulimit -n is 1024 for this server. Any
idea? Is this because the 2nd server has 32GB RAM? Is the 1024 open files
limit too low? Also I don't find any documentation for this setting.
I checked the 'Solr 1.4 Enterprise Search Server' book, wiki, etc. I am
using Solr 1.3.

 

Is it a good idea to use ramBufferSizeMB vs. maxBufferedDocs? What does
ramBufferSizeMB mean? My understanding is that when the documents added to
the index (which are initially stored in memory) reach a total size of
1024MB (ramBufferSizeMB), the data is flushed to disk. Or is it when the
total memory used (by Tomcat, etc.) reaches 1024MB that the data is flushed
to disk?

 

Thanks,

Sharmila

 

 

 

 

 

 

 



Re: QTime always a multiple of 50ms ?

2009-10-23 Thread Jérôme Etévé
2009/10/23 Andrzej Bialecki :
> Jérôme Etévé wrote:
>>
>> Hi all,
>>
>>  I'm using Solr trunk from 2009-10-12 and I noticed that the QTime
>> result is always a multiple of roughly 50ms, regardless of the used
>> handler.
>>
>> For instance, for the update handler, I get :
>>
>> INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=0
>> INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=104
>> INFO: [idx1] webapp=/solr path=/update/ params={} status=0 QTime=52
>> ...
>>
>> Is this a known issue ?
>
> It may be an issue with System.currentTimeMillis() resolution on some
> platforms (e.g. Windows)?

I don't know, I'm using linux 2.6.22 and a jvm 1.6.0


-- 
Jerome Eteve.
http://www.eteve.net
jer...@eteve.net


field collapsing bug (java.lang.ArrayIndexOutOfBoundsException)

2009-10-23 Thread Joe Calderon
Seems to happen when sorting on anything besides strictly score; even
"score desc, num desc" triggers it. Using the latest nightly and the 10/14 patch.

Problem accessing /solr/core1/select. Reason:

4731592

java.lang.ArrayIndexOutOfBoundsException: 4731592
at 
org.apache.lucene.search.FieldComparator$StringOrdValComparator.copy(FieldComparator.java:660)
at 
org.apache.solr.search.NonAdjacentDocumentCollapser$DocumentComparator.compare(NonAdjacentDocumentCollapser.java:235)
at 
org.apache.solr.search.NonAdjacentDocumentCollapser$DocumentPriorityQueue.lessThan(NonAdjacentDocumentCollapser.java:173)
at 
org.apache.lucene.util.PriorityQueue.insertWithOverflow(PriorityQueue.java:158)
at 
org.apache.solr.search.NonAdjacentDocumentCollapser.doCollapsing(NonAdjacentDocumentCollapser.java:95)
at 
org.apache.solr.search.AbstractDocumentCollapser.collapse(AbstractDocumentCollapser.java:208)
at 
org.apache.solr.handler.component.CollapseComponent.doProcess(CollapseComponent.java:98)
at 
org.apache.solr.handler.component.CollapseComponent.process(CollapseComponent.java:66)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1148)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:387)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:539)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at 
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:520)


Where the new replication pulls the files?

2009-10-23 Thread Jérôme Etévé
Hi all,
  I'm wondering where a slave puts the files it pulls from the master during
replication.

Does it download directly into the index/ directory, or somewhere else until
the transfer is completed and then get copied to index/?

Cheers!

Jerome.

-- 
Jerome Eteve.
http://www.eteve.net
jer...@eteve.net


RE: Too many open files

2009-10-23 Thread Fuad Efendi
Make it 10:
<mergeFactor>10</mergeFactor>

-Fuad


> -Original Message-
> From: Ranganathan, Sharmila [mailto:sranganat...@library.rochester.edu]
> Sent: October-23-09 1:08 PM
> To: solr-user@lucene.apache.org
> Subject: Too many open files
> 
> Hi,
> 
> I am getting too many open files error.
> 
> Usually I test on a server that has 4GB RAM and assigned 1GB for
> tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this
> server and has following setting for SolrConfig.xml
> 
> 
> 
> true
> 
> 1024
> 
> 100
> 
> 2147483647
> 
> 1
> 
> 
> 
> In my case 200,000 documents is of 1024MB size and in this testing, I am
> indexing total of million documents. We have high setting because we are
> expected to index about 10+ million records in production. It works fine
> in this server.
> 
> 
> 
> When I deploy same solr configuration on a server with 32GB RAM, I get
> "too many open files" error. The ulimit -n is 1024 for this server. Any
> idea? Is this because 2nd server has 32GB RAM? Is 1024 open files limit
> too low? Also I don't find any documentation for .
> I checked Solr 'Solr 1.4 Enterprise Search Server' book, wiki, etc. I am
> using Solr 1.3.
> 
> 
> 
> Is it good idea to use ramBufferSizeMB? Vs maxBufferedDocs?  What does
> ramBufferSizeMB mean? My understanding is that when documents added to
> index which are initially stored in memory reaches size
> 1024MB(ramBufferSizeMB), it flushes data to disk. Or is it when total
> memory used(by tomcat, etc) reaches 1024, it flushed data to disk?
> 
> 
> 
> Thanks,
> 
> Sharmila
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 





RE: Too many open files

2009-10-23 Thread Fuad Efendi
> <ramBufferSizeMB>1024</ramBufferSizeMB>

OK, it will lower the frequency of buffer flushes to disk (a buffer flush happens
when the buffer reaches capacity, due to a commit, etc.); it will improve
performance. It is an internal buffer used by Lucene. It is not the total memory
of Tomcat...


> <mergeFactor>100</mergeFactor>

It will deal with 100 segments, and each segment will consist of a number of
files (equal to the number of fields) - with 20 fields you may have 2,000 files...



For many such applications, set ulimit to 65536. You never know how many
files you will need (including Tomcat log files, class files, config
files, image/css/html files, etc...)

Even with 10 Lucene segments (mergeFactor) of 10 files each (100 files),
Lucene may need many more during commit/optimize...


-Fuad


> -Original Message-
> From: Ranganathan, Sharmila [mailto:sranganat...@library.rochester.edu]
> Sent: October-23-09 1:08 PM
> To: solr-user@lucene.apache.org
> Subject: Too many open files
> 
> Hi,
> 
> I am getting too many open files error.
> 
> Usually I test on a server that has 4GB RAM and assigned 1GB for
> tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this
> server and has following setting for SolrConfig.xml
> 
> 
> 
> true
> 
> 1024
> 
> 100
> 
> 2147483647
> 
> 1
> 
> 
> 
> In my case 200,000 documents is of 1024MB size and in this testing, I am
> indexing total of million documents. We have high setting because we are
> expected to index about 10+ million records in production. It works fine
> in this server.
> 
> 
> 
> When I deploy same solr configuration on a server with 32GB RAM, I get
> "too many open files" error. The ulimit -n is 1024 for this server. Any
> idea? Is this because 2nd server has 32GB RAM? Is 1024 open files limit
> too low? Also I don't find any documentation for .
> I checked Solr 'Solr 1.4 Enterprise Search Server' book, wiki, etc. I am
> using Solr 1.3.
> 
> 
> 
> Is it good idea to use ramBufferSizeMB? Vs maxBufferedDocs?  What does
> ramBufferSizeMB mean? My understanding is that when documents added to
> index which are initially stored in memory reaches size
> 1024MB(ramBufferSizeMB), it flushes data to disk. Or is it when total
> memory used(by tomcat, etc) reaches 1024, it flushed data to disk?
> 
> 
> 
> Thanks,
> 
> Sharmila
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 





RE: Too many open files

2009-10-23 Thread Fuad Efendi
I was partially wrong; this is what Mike McCandless  (Lucene-in-Action, 2nd
edition) explained at Manning forum:

mergeFactor of 1000 means you will have up to 1000 segments at each level.
A level 0 segment means it was flushed directly by IndexWriter.
After you have 1000 such segments, they are merged into a single level 1
segment.
Once you have 1000 level 1 segments, they are merged into a single level 2
segment, etc.
So, depending on how many docs you add to your index, you'll could have
1000s of segments w/ mergeFactor=1000.

http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0


So, in the case of mergeFactor=100 you may have (theoretically) 1000 segments,
10-20 files each (depending on schema)...


mergeFactor=10 is the default setting... ramBufferSizeMB=1024 means that you
need at least double that in Java heap, but you have -Xmx1024m...


-Fuad


> 
> I am getting too many open files error.
> 
> Usually I test on a server that has 4GB RAM and assigned 1GB for
> tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this
> server and has following setting for SolrConfig.xml
> 
> 
> 
> true
> 
> 1024
> 
> 100
> 
> 2147483647
> 
> 1
> 




Re: Too many open files

2009-10-23 Thread Mark Miller
I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number.

Fuad Efendi wrote:
> I was partially wrong; this is what Mike McCandless  (Lucene-in-Action, 2nd
> edition) explained at Manning forum:
>
> mergeFactor of 1000 means you will have up to 1000 segments at each level.
> A level 0 segment means it was flushed directly by IndexWriter.
> After you have 1000 such segments, they are merged into a single level 1
> segment.
> Once you have 1000 level 1 segments, they are merged into a single level 2
> segment, etc.
> So, depending on how many docs you add to your index, you'll could have
> 1000s of segments w/ mergeFactor=1000.
>
> http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0
>
>
> So, in case of mergeFactor=100 you may have (theoretically) 1000 segments,
> 10-20 files each (depending on schema)...
>
>
> mergeFactor=10 is default setting... ramBufferSizeMB=1024 means that you
> need at least double Java heap, but you have -Xmx1024m...
>
>
> -Fuad
>
>
>   
>> I am getting too many open files error.
>>
>> Usually I test on a server that has 4GB RAM and assigned 1GB for
>> tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this
>> server and has following setting for SolrConfig.xml
>>
>>
>>
>> true
>>
>> 1024
>>
>> 100
>>
>> 2147483647
>>
>> 1
>>
>> 
>
>
>   


-- 
- Mark

http://www.lucidimagination.com
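
Putting the suggestions from this thread together, an illustrative indexDefaults
fragment for solrconfig.xml (the values are examples, not universal
recommendations):

<indexDefaults>
  <!-- compound files keep the number of files per segment down -->
  <useCompoundFile>true</useCompoundFile>
  <!-- 32-100 is the range suggested above; 1024 is rarely useful -->
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <!-- the default; higher values multiply the number of segments and files -->
  <mergeFactor>10</mergeFactor>
</indexDefaults>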





Re: Environment Timezone being considered when using SolrJ

2009-10-23 Thread Michel Bottan
Hi Hoss,

Thanks for the clarification.

I've written a unit test in order to simulate the date processing. A high-level
detail of this problem is that it occurs only when the JavaBin custom format
(&wt=javabin) is used; in this case the dates come back shifted by the
environment's UTC offset.


On Thu, Oct 22, 2009 at 11:41 PM, Chris Hostetter
wrote:

>
> : When using SolrJ I've realized document dates are being modified
> according
> : to the environment UTC timezone. The timezone is being set in the inner
> : class ISO8601CanonicalDateFormat of DateField class.
>
> The dates aren't "modified" based on UTC, they are formated in UTC before
> being written to the Lucene index so that no matter what the current
> locale is the index format is consistent.
>

yes, dates are consistent at index.


>
> : I've read some posts where people say Solr should be most locale and
> culture
> : agnostic. So, what's the purpose for that timezone processing before
>
> The use of UTC is specificly to be agnostic of where the server is
> running.  Any client, any where in the world, using any TimeZone can query
> any solr server, running in any JVM, and know that the dates it gets back
> are formated in UTC.
>
> : Code to simulate issue:
>
> I don't actaully see any "issue" being simulated in this code, can you
> elaborate on how exactly it's behaving in a way that's inconsistent with
> your expectaitons?  (making it a JUNit TestCase that uses assserts to fail
> where you are getting data you don't expect is pretty must the universal
> way to describe a bug)
>


import static org.junit.Assert.assertEquals;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.document.Field.Store;
import org.apache.solr.schema.DateField;
import org.junit.Test;

public class DateFieldTest {

    @Test
    public void shouldReturnSameDateValueWhenDateFieldIsUsedToParseDates()
            throws ParseException {

        // Given
        String originalDateString = "2010-10-10T10:10:10Z";

        // When
        Field field = new Field("field", originalDateString, Store.NO, Index.ANALYZED);
        DateField dateField = new DateField();

        SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        Date originalDateObject = dateFormat.parse(originalDateString);
        Date parsedDate = dateField.toObject(field);

        // Then
        assertEquals(originalDateObject, parsedDate);

        /* TO MAKE THE TEST PASS
         * Solr 1.3
         *
         * Comment out line 271 in org.apache.solr.schema.DateField:
         *     this.setTimeZone(CANONICAL_TZ);
         */
    }
}


>
> My guess would be that you are getting confused by the fact that
> Date.toString() uses the default locale of your JVM to generate a string,
> which is why the data getting printed out doesn't match the hardcoded
> value in your code...
>
> : System.out.println(dateField.toObject(field));
>
> but if you take any Date object you want, print it's toString(), index it,
> and then take that indexed string representation convert it back into a
> Date (using dateField.toOBject()) you should originalDate.equals(newDate).
>

I was expecting this behaviour, and I get it when performing an HTTP query with
the XMLResponseWriter. But the same does not occur when the BinaryResponseWriter
is used.



>
>
>
> -Hoss
>
>

Thanks!
Michel


RE: NGram query failing

2009-10-23 Thread Charlie Jackson

Well, I fixed my own problem in the end. For the record, this is the
schema I ended up going with:














I could have left it a trigram but went with a bigram because with this
setup, I can get queries to properly hit as long as the min/max gram
size is met. In other words, for any queries two or more characters
long, this works for me. Less than two characters and it fails. 

I don't know exactly why that is, but I'll take it anyway!
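
(The schema XML itself didn't survive the list archive, so purely as a sketch, not
the exact schema used above, a symmetric bigram setup along these lines would
behave as described; the type and field names here are made up:)

<fieldType name="ngram_id" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="2"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="free_form_id" type="ngram_id" indexed="true" stored="true"/>

With both index and query analysis producing bigrams, a query of two or more
characters is broken into the same bigrams that were indexed, while a one-character
query produces no gram at all, which is consistent with the behaviour described
above.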

- Charlie


-Original Message-
From: Charlie Jackson [mailto:charlie.jack...@cision.com] 
Sent: Friday, October 23, 2009 10:00 AM
To: solr-user@lucene.apache.org
Subject: NGram query failing

I have a requirement to be able to find hits within words in a free-form
id field. The field can have any type of alphanumeric data - it's as
likely it will be something like "123456" as it is to be "SUN-123-ABC".
I thought of using NGrams to accomplish the task, but I'm having a
problem. I set up a field like this

 









  



 

After indexing a field like this, the analysis page indicates my queries
should work. If I give it a sample field value of "ABC-123456-SUN" and a
query value of "45" it shows hits in several places, which is what I
expected.

 

However, when I actually query the field with something like "45", I get
no hits back. Looking at the debugQuery output, it looks like it's
taking my analyzed query text and putting it into a phrase query. So,
a query of "45" turns into a phrase query of "4 5 45" against the field,
which then doesn't hit on anything in my index.

 

What am I missing to make this work?

 

- Charlie



New Technical White Papers on Apache Lucene 2.9 and Solr 1.4 from Lucid Imagination

2009-10-23 Thread Tom Alt
Hi -

FYI, Lucid has just put out two white papers, one on Apache Lucene 2.9 and
one on Apache Solr 1.4:
- "What's New in Lucene 2.9" covers a range of performance improvements and
new features (per-segment indexing, trie-range numeric analysis, and more),
along with recommendations for upgrading your Lucene application to the 2.9
release. Download (reg required) at
http://www.lucidimagination.com/whitepaper/whats-new-in-lucene-2-9?sc=AP.

- "What's New in Solr 1.4" also covers its performance and feature
improvements (such as the improved Data Import Handler, Java-based replication,
rich document acquisition, and more). Download (reg required) at
http://www.lucidimagination.com/whitepaper/whats-new-in-solr-1-4?sc=AP

Tom
www.lucidimagination.com


Re: multicore query via solrJ

2009-10-23 Thread Silent Surfer
Hi Lici,

You may want to try the following snippet

---
SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");

ModifiableSolrParams params = new ModifiableSolrParams();

params.set("wt", "json"); // Can be json,standard..
params.set("rows", RowsToFetch); // Total # of rows to fetch
params.set("start", StartingRow);  // Starting record
params.set("shards", 
"localhost:8983/solr,localhost:8984/solr,localhost:8985/solr"); // Shard URL
.
.
.
params.set("q", queryStr.toString());  // User Query
QueryResponse response = solr.query(params);
SolrDocumentList docs = response.getResults();
---
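
A minimal sketch of the kind of helper method asked about earlier in the thread can
be built on the same shards parameter; the class and method names and the shard-list
format ("host:port/solr/coreN") are assumptions, not an existing SolrJ API:

import java.util.List;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.response.QueryResponse;

public class MultiCoreQuery {
    // Hypothetical convenience wrapper: fan one query out over several cores.
    public static QueryResponse query(SolrServer server, List<String> shards, SolrQuery q)
            throws SolrServerException {
        StringBuilder sb = new StringBuilder();
        for (String shard : shards) {              // e.g. "localhost:8983/solr/core0"
            if (sb.length() > 0) sb.append(',');
            sb.append(shard);
        }
        q.set("shards", sb.toString());            // same mechanism as the snippet above
        return server.query(q);
    }
}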

Thanks,
sS

--- On Fri, 10/23/09, Licinio Fernández Maurelo  
wrote:

> From: Licinio Fernández Maurelo 
> Subject: Re: multicore query via solrJ
> To: solr-user@lucene.apache.org
> Date: Friday, October 23, 2009, 7:30 AM
> As no answer is given, I assume it's
> not possible. It will be great to code
> a method like this
> 
> query(SolrServer,  List)
> 
> 
> 
> El 20 de octubre de 2009 11:21, Licinio Fernández Maurelo
> <
> licinio.fernan...@gmail.com>
> escribió:
> 
> > Hi there,
> > is there any way to perform a multi-core query using
> solrj?
> >
> > P.S.:
> >
> > I know about this syntax:
> > http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=
> > but i'm looking for a more fancy way to do this using
> solrj (something like
> > shards(query) )
> >
> > thx
> >
> >
> >
> > --
> > Lici
> >
> 
> 
> 
> -- 
> Lici
> 






RE: Too many open files

2009-10-23 Thread Fuad Efendi
The reason for having a big RAM buffer is to lower the frequency of IndexWriter
flushes and, subsequently, the frequency of index merge events; merging a few
larger files also takes less time, especially if the RAM buffer is intelligent
enough (and big enough) to absorb 100 concurrent updates of an existing document
without flushing 100 document versions to disk.

I posted a related thread here; I had a 1:5 timing for update:merge (1 minute of
updates, 5 minutes of merging) with the default SOLR settings (32MB buffer). I
increased the buffer to 8GB on the master, and it triggered a significant indexing
performance boost...
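
(For reference, and only as a sketch with the stock values mentioned in this thread,
both knobs live in the <indexDefaults> section of solrconfig.xml:)

<indexDefaults>
  <!-- 32 and 10 are the defaults discussed here, not a recommendation -->
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <mergeFactor>10</mergeFactor>
</indexDefaults>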

-Fuad
http://www.linkedin.com/in/liferay


> -Original Message-
> From: Mark Miller [mailto:markrmil...@gmail.com]
> Sent: October-23-09 3:03 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Too many open files
> 
> I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number.
> 
> Fuad Efendi wrote:
> > I was partially wrong; this is what Mike McCandless  (Lucene-in-Action,
2nd
> > edition) explained at Manning forum:
> >
> > mergeFactor of 1000 means you will have up to 1000 segments at each
level.
> > A level 0 segment means it was flushed directly by IndexWriter.
> > After you have 1000 such segments, they are merged into a single level 1
> > segment.
> > Once you have 1000 level 1 segments, they are merged into a single level
2
> > segment, etc.
> > So, depending on how many docs you add to your index, you'll could have
> > 1000s of segments w/ mergeFactor=1000.
> >
> > http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0
> >
> >
> > So, in case of mergeFactor=100 you may have (theoretically) 1000
segments,
> > 10-20 files each (depending on schema)...
> >
> >
> > mergeFactor=10 is default setting... ramBufferSizeMB=1024 means that you
> > need at least double Java heap, but you have -Xmx1024m...
> >
> >
> > -Fuad
> >
> >
> >
> >> I am getting too many open files error.
> >>
> >> Usually I test on a server that has 4GB RAM and assigned 1GB for
> >> tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this
> >> server and has following setting for SolrConfig.xml
> >>
> >>
> >>
> >> true
> >>
> >> 1024
> >>
> >> 100
> >>
> >> 2147483647
> >>
> >> 1
> >>
> >>
> >
> >
> >
> 
> 
> --
> - Mark
> 
> http://www.lucidimagination.com
> 
> 





Re: Too many open files

2009-10-23 Thread Mark Miller
8 GB is much larger than is well supported. It's diminishing returns beyond
40-100 MB and mostly a waste of RAM. Set it too high and things can break. It
should stay well below 2 GB in any case, but I'd still recommend 40-100 MB.

Fuad Efendi wrote:
> Reason of having big RAM buffer is lowering frequency of IndexWriter flushes
> and (subsequently) lowering frequency of index merge events, and
> (subsequently) merging of a few larger files takes less time... especially
> if RAM Buffer is intelligent enough (and big enough) to deal with 100
> concurrent updates of existing document without 100-times flushing to disk
> of 100 document versions.
>
> I posted here thread related; I had 1:5 timing for Update:Merge (5 minutes
> merge, and 1 minute update) with default SOLR settings (32Mb buffer). I
> increased buffer to 8Gb on Master, and it triggered significant indexing
> performance boost... 
>
> -Fuad
> http://www.linkedin.com/in/liferay
>
>
>   
>> -Original Message-
>> From: Mark Miller [mailto:markrmil...@gmail.com]
>> Sent: October-23-09 3:03 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Too many open files
>>
>> I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number.
>>
>> Fuad Efendi wrote:
>> 
>>> I was partially wrong; this is what Mike McCandless  (Lucene-in-Action,
>>>   
> 2nd
>   
>>> edition) explained at Manning forum:
>>>
>>> mergeFactor of 1000 means you will have up to 1000 segments at each
>>>   
> level.
>   
>>> A level 0 segment means it was flushed directly by IndexWriter.
>>> After you have 1000 such segments, they are merged into a single level 1
>>> segment.
>>> Once you have 1000 level 1 segments, they are merged into a single level
>>>   
> 2
>   
>>> segment, etc.
>>> So, depending on how many docs you add to your index, you'll could have
>>> 1000s of segments w/ mergeFactor=1000.
>>>
>>> http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0
>>>
>>>
>>> So, in case of mergeFactor=100 you may have (theoretically) 1000
>>>   
> segments,
>   
>>> 10-20 files each (depending on schema)...
>>>
>>>
>>> mergeFactor=10 is default setting... ramBufferSizeMB=1024 means that you
>>> need at least double Java heap, but you have -Xmx1024m...
>>>
>>>
>>> -Fuad
>>>
>>>
>>>
>>>   
 I am getting too many open files error.

 Usually I test on a server that has 4GB RAM and assigned 1GB for
 tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this
 server and has following setting for SolrConfig.xml



 true

 1024

 100

 2147483647

 1


 
>>>
>>>   
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
>>
>>
>> 
>
>
>
>   


-- 
- Mark

http://www.lucidimagination.com





Re: Too many open files

2009-10-23 Thread Mark Miller
Here is an example using the Lucene benchmark package. Indexing 64,000
wikipedia docs (sorry for the formatting):

 [java] > Report sum by Prefix (MAddDocs) and Round (4
about 32 out of 256058)
 [java] Operation round mrg  flush   runCnt  
recsPerRunrec/s  elapsedSecavgUsedMemavgTotalMem
 [java] MAddDocs_8000 0  10  32.00MB8
800037.401,711.22   124,612,472182,689,792
 [java] MAddDocs_8000 -   1  10  80.00MB -  -   8 -  -  - 8000 - 
-   39.91 -  1,603.76 - 266,716,128 -  469,925,888
 [java] MAddDocs_8000 2  10 120.00MB8
800040.741,571.02   348,059,488548,233,216
 [java] MAddDocs_8000 -   3  10 512.00MB -  -   8 -  -  - 8000 - 
-   38.25 -  1,673.05 - 746,087,808 -  926,089,216

After about 32-40 MB, you don't gain much, and throughput starts decreasing once
you get too high. 8GB is a terrible recommendation.

Also, from the javadoc in IndexWriter:

   *  NOTE: because IndexWriter uses
   * ints when managing its internal storage,
   * the absolute maximum value for this setting is somewhat
   * less than 2048 MB.  The precise limit depends on
   * various factors, such as how large your documents are,
   * how many fields have norms, etc., so it's best to set
   * this value comfortably under 2048.
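
To make the connection explicit (a sketch only, with an illustrative value): Solr's
ramBufferSizeMB setting ends up as the Lucene call below, which is where that
roughly 2048 MB ceiling comes from.

import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class RamBufferSketch {
    public static void main(String[] args) throws IOException {
        IndexWriter writer = new IndexWriter(new RAMDirectory(),
                new StandardAnalyzer(Version.LUCENE_29),
                IndexWriter.MaxFieldLength.UNLIMITED);
        writer.setRAMBufferSizeMB(64.0); // a value in the 32-100 MB range recommended here
        writer.close();
    }
}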

Mark Miller wrote:
> 8 GB is much larger than is well supported. Its diminishing returns over
> 40-100 and mostly a waste of RAM. Too high and things can break. It
> should be well below 2 GB at most, but I'd still recommend 40-100.
>
> Fuad Efendi wrote:
>   
>> Reason of having big RAM buffer is lowering frequency of IndexWriter flushes
>> and (subsequently) lowering frequency of index merge events, and
>> (subsequently) merging of a few larger files takes less time... especially
>> if RAM Buffer is intelligent enough (and big enough) to deal with 100
>> concurrent updates of existing document without 100-times flushing to disk
>> of 100 document versions.
>>
>> I posted here thread related; I had 1:5 timing for Update:Merge (5 minutes
>> merge, and 1 minute update) with default SOLR settings (32Mb buffer). I
>> increased buffer to 8Gb on Master, and it triggered significant indexing
>> performance boost... 
>>
>> -Fuad
>> http://www.linkedin.com/in/liferay
>>
>>
>>   
>> 
>>> -Original Message-
>>> From: Mark Miller [mailto:markrmil...@gmail.com]
>>> Sent: October-23-09 3:03 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Too many open files
>>>
>>> I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number.
>>>
>>> Fuad Efendi wrote:
>>> 
>>>   
 I was partially wrong; this is what Mike McCandless  (Lucene-in-Action,
   
 
>> 2nd
>>   
>> 
 edition) explained at Manning forum:

 mergeFactor of 1000 means you will have up to 1000 segments at each
   
 
>> level.
>>   
>> 
 A level 0 segment means it was flushed directly by IndexWriter.
 After you have 1000 such segments, they are merged into a single level 1
 segment.
 Once you have 1000 level 1 segments, they are merged into a single level
   
 
>> 2
>>   
>> 
 segment, etc.
 So, depending on how many docs you add to your index, you'll could have
 1000s of segments w/ mergeFactor=1000.

 http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0


 So, in case of mergeFactor=100 you may have (theoretically) 1000
   
 
>> segments,
>>   
>> 
 10-20 files each (depending on schema)...


 mergeFactor=10 is default setting... ramBufferSizeMB=1024 means that you
 need at least double Java heap, but you have -Xmx1024m...


 -Fuad



   
 
> I am getting too many open files error.
>
> Usually I test on a server that has 4GB RAM and assigned 1GB for
> tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this
> server and has following setting for SolrConfig.xml
>
>
>
> true
>
> 1024
>
> 100
>
> 2147483647
>
> 1
>
>
> 
>   
   
 
>>> --
>>> - Mark
>>>
>>> http://www.lucidimagination.com
>>>
>>>
>>> 
>>>   
>>
>>   
>> 
>
>
>   


-- 
- Mark

http://www.lucidimagination.com





Re: Too many open files

2009-10-23 Thread Mark Miller
Hmm - came out worse than it looked. Here is a better attempt:

MergeFactor: 10

BUF (MB)   DOCS/SEC
32         37.40
80         39.91
120        40.74
512        38.25

Mark Miller wrote:
> Here is an example using the Lucene benchmark package. Indexing 64,000
> wikipedia docs (sorry for the formatting):
>
>  [java] > Report sum by Prefix (MAddDocs) and Round (4
> about 32 out of 256058)
>  [java] Operation round mrg  flush   runCnt  
> recsPerRunrec/s  elapsedSecavgUsedMemavgTotalMem
>  [java] MAddDocs_8000 0  10  32.00MB8
> 800037.401,711.22   124,612,472182,689,792
>  [java] MAddDocs_8000 -   1  10  80.00MB -  -   8 -  -  - 8000 - 
> -   39.91 -  1,603.76 - 266,716,128 -  469,925,888
>  [java] MAddDocs_8000 2  10 120.00MB8
> 800040.741,571.02   348,059,488548,233,216
>  [java] MAddDocs_8000 -   3  10 512.00MB -  -   8 -  -  - 8000 - 
> -   38.25 -  1,673.05 - 746,087,808 -  926,089,216
>
> After about 32-40, you don't gain much, and it starts decreasing once
> you start getting to high. 8GB is a terrible recommendation.
>
> Also, from the javadoc in IndexWriter:
>
>*  NOTE: because IndexWriter uses
>* ints when managing its internal storage,
>* the absolute maximum value for this setting is somewhat
>* less than 2048 MB.  The precise limit depends on
>* various factors, such as how large your documents are,
>* how many fields have norms, etc., so it's best to set
>* this value comfortably under 2048.
>
> Mark Miller wrote:
>   
>> 8 GB is much larger than is well supported. Its diminishing returns over
>> 40-100 and mostly a waste of RAM. Too high and things can break. It
>> should be well below 2 GB at most, but I'd still recommend 40-100.
>>
>> Fuad Efendi wrote:
>>   
>> 
>>> Reason of having big RAM buffer is lowering frequency of IndexWriter flushes
>>> and (subsequently) lowering frequency of index merge events, and
>>> (subsequently) merging of a few larger files takes less time... especially
>>> if RAM Buffer is intelligent enough (and big enough) to deal with 100
>>> concurrent updates of existing document without 100-times flushing to disk
>>> of 100 document versions.
>>>
>>> I posted here thread related; I had 1:5 timing for Update:Merge (5 minutes
>>> merge, and 1 minute update) with default SOLR settings (32Mb buffer). I
>>> increased buffer to 8Gb on Master, and it triggered significant indexing
>>> performance boost... 
>>>
>>> -Fuad
>>> http://www.linkedin.com/in/liferay
>>>
>>>
>>>   
>>> 
>>>   
 -Original Message-
 From: Mark Miller [mailto:markrmil...@gmail.com]
 Sent: October-23-09 3:03 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Too many open files

 I wouldn't use a RAM buffer of a gig - 32-100 is generally a good number.

 Fuad Efendi wrote:
 
   
 
> I was partially wrong; this is what Mike McCandless  (Lucene-in-Action,
>   
> 
>   
>>> 2nd
>>>   
>>> 
>>>   
> edition) explained at Manning forum:
>
> mergeFactor of 1000 means you will have up to 1000 segments at each
>   
> 
>   
>>> level.
>>>   
>>> 
>>>   
> A level 0 segment means it was flushed directly by IndexWriter.
> After you have 1000 such segments, they are merged into a single level 1
> segment.
> Once you have 1000 level 1 segments, they are merged into a single level
>   
> 
>   
>>> 2
>>>   
>>> 
>>>   
> segment, etc.
> So, depending on how many docs you add to your index, you'll could have
> 1000s of segments w/ mergeFactor=1000.
>
> http://www.manning-sandbox.com/thread.jspa?threadID=33784&tstart=0
>
>
> So, in case of mergeFactor=100 you may have (theoretically) 1000
>   
> 
>   
>>> segments,
>>>   
>>> 
>>>   
> 10-20 files each (depending on schema)...
>
>
> mergeFactor=10 is default setting... ramBufferSizeMB=1024 means that you
> need at least double Java heap, but you have -Xmx1024m...
>
>
> -Fuad
>
>
>
>   
> 
>   
>> I am getting too many open files error.
>>
>> Usually I test on a server that has 4GB RAM and assigned 1GB for
>> tomcat(set JAVA_OPTS=-Xms256m -Xmx1024m), ulimit -n is 256 for this
>> server and has following setting for SolrConfig.xml
>>
>>
>>
>> true
>>
>> 1024
>>
>> 100
>>
>> 2147483647
>>
>> 1
>>
>>
>> 
>>   
>> 
>   
> 
>   
 --
 - Mark

 http://www.lucidimagination.com


 
   
 
>>>   
>>> 
>>>   
>>   
>> 
>
>
>   


-- 
-

"exceeded limit of maxWarmingSearchers=2" when posting data

2009-10-23 Thread Teruhiko Kurosaka
I'm trying to stress-test Solr (nightly build of 2009-10-12) using JMeter.
I set up JMeter to post pod_other.xml, then hd.xml, then commit.xml (which only
contains a single <commit/> line), 100 times.
Solr instance runs on a multi-core system.

Solr didn't complain when the number of test threads was 1, 2, 3 or 4.

But when I increased the number of test threads to 8, I saw this error
on the console:

SEVERE: org.apache.solr.common.SolrException: Error opening new searcher. 
exceeded limit of maxWarmingSearchers=2, try again later.


What does this mean?

Why does Solr try to warm up searchers when I'm posting documents, not
searching?

Do I need to set maxWarmingSearchers to something greater than the number of CPU
cores?

Thanks.

T. "Kuro" Kurosaka



Re: "exceeded limit of maxWarmingSearchers=2" when posting data

2009-10-23 Thread Yonik Seeley
2009/10/23 Teruhiko Kurosaka :
> I'm trying to stress-test solr (nightly build of 2009-10-12) using JMeter.
> I set up JMeter to post pod_other.xml, then hd.xml, then commit.xml (which only
> contains a single <commit/> line), 100 times.
> Solr instance runs on a multi-core system.
>
> Solr didn't complain when the number of test threads was 1, 2, 3 or 4.
>
> But when I increased the number of test threads to 8, I saw this error
> on the console:
>
> SEVERE: org.apache.solr.common.SolrException: Error opening new searcher. 
> exceeded limit of maxWarmingSearchers=2, try again later.
>
>
> What does this mean?
>
> Why does Solr try to warm up searchers when I'm posting documents, not
> searching?

A commit flushes index changes to disk and opens a new index searcher.
 The maxWarmingSearchers limit is just a protection mechanism.

> Do I need to set maxWarmingSearchers to something greater than the number of CPU
> cores?

No, that's unrelated.  Don't commit so often.
The error is also not a fatal one - the commit fails, but you won't
lose data - you just won't see it until a commit succeeds in opening a
new searcher.
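
(One way to commit less often in a load test like this is to drop the per-batch
commit.xml posts and rely on autoCommit instead; a sketch of the relevant
solrconfig.xml settings, with illustrative values:)

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>  <!-- commit after this many added docs -->
    <maxTime>60000</maxTime>  <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>

<!-- under <query>, the limit that was hit above: -->
<maxWarmingSearchers>2</maxWarmingSearchers>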

-Yonik
http://www.lucidimagination.com


Solrj Javabin and JSON

2009-10-23 Thread SGE0

Hi,

Did anyone write a javabin-to-JSON converter that they are willing to share?

In our servlet we use a CommonsHttpSolrServer instance to execute a query.

The problem is that it returns the javabin format, and we need to send the result
back to the browser in JSON format.

And no, the browser is not allowed to directly query Lucene with the wt=json
format.
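
(One minimal workaround, assuming the servlet itself is allowed to talk to Solr over
HTTP: request wt=json from Solr directly and stream the bytes back to the browser,
skipping the javabin parsing entirely. A sketch; the class and method names are made
up for illustration:)

import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLEncoder;

public class JsonQueryProxy {
    // Hypothetical helper: fetch Solr's JSON response and copy it to the client.
    public static void proxyJsonQuery(String solrBaseUrl, String userQuery, OutputStream out)
            throws Exception {
        String url = solrBaseUrl + "/select?wt=json&q=" + URLEncoder.encode(userQuery, "UTF-8");
        InputStream in = new URL(url).openStream();
        try {
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
        } finally {
            in.close();
        }
    }
}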

Regards,

S.
-- 
View this message in context: 
http://www.nabble.com/Solrj-Javabin-and-JSON-tp26036551p26036551.html
Sent from the Solr - User mailing list archive at Nabble.com.