Solr using a ridiculous amount of memory

2013-03-24 Thread John Nielsen
Hello all,

We are running a solr cluster which is now running solr-4.2.

The index is about 35GB on disk with each register between 15k and 30k.
(This is simply the size of a full xml reply of one register. I'm not sure
how to measure it otherwise.)

Our memory requirements are running amok. We have less than a quarter of
our customers running now and even though we have allocated 25GB to the JVM
already, we are still seeing daily OOM crashes. We used to just allocate
more memory to the JVM, but with the way solr is scaling, we would need
well over 100GB of memory on each node to finish the project, and that's
just not going to happen. I need to lower the memory requirements somehow.

I can see from the memory dumps we've done that the field cache is by far
the biggest sinner. Of special interest to me is the recent introduction of
DocValues which supposedly mitigates this issue by using memory outside the
JVM. Because of the lack of documentation, though, I can't seem to make it work.

We do a lot of faceting. One client facets on about 50.000 docs of approx
30k each on 5 fields. I understand that this is VERY memory intensive.

Schema with DocValues attempt at solving problem:
http://pastebin.com/Ne23NnW4
Config: http://pastebin.com/x1qykyXW

The cache is pretty well tuned. Any lower and I get evictions.

Come hell or high water, my JVM memory requirements must come down. Simply
moving some memory load outside of the JVM would be awesome! Making it not
use the field cache for anything would also (probably) work for me. I
thought about killing off my other caches, but from the dumps, they just
don't seem to use that much memory.

I am at my wits end. Any help would be sorely appreciated.

-- 
Med venlig hilsen / Best regards

*John Nielsen*
Programmer



*MCB A/S*
Enghaven 15
DK-7500 Holstebro

Kundeservice: +45 9610 2824
p...@mcb.dk
www.mcb.dk


Re: Solr sorting is not working properly on long Fields

2013-03-24 Thread ballusethuraman
Yes I did, but there is no change in the result.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-sorting-is-not-working-properly-on-long-Fields-tp4050834p4050844.html
Sent from the Solr - User mailing list archive at Nabble.com.


SOLR 4.2 SolrQuery exception

2013-03-24 Thread Sandeep Kumar Anumalla
I am using the code below and getting the following exception while using SolrQuery



Mar 24, 2013 3:08:07 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to Searcher@795e0c2b 
main{StandardDirectoryReader(segments_49:524 _4v(4.2):C299313 
_4x(4.2):C2953/1396 _4y(4.2):C2866/1470 _4z(4.2):C4263/2793 _50(4.2):C3554/761 
_51(4.2):C1126/365 _52(4.2):C650/285 _53(4.2):C500/215 _54(4.2):C1808/1593 
_55(4.2):C1593)}
Mar 24, 2013 3:08:07 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
at 
org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1586)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)

Mar 24, 2013 3:08:07 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=null path=null 
params={event=firstSearcher&q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false}
 status=500 QTime=4
Mar 24, 2013 3:08:07 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Mar 24, 2013 3:08:07 PM 
org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener 
newSearcher
INFO: Loading spell index for spellchecker: default
Mar 24, 2013 3:08:07 PM 
org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener 
newSearcher
INFO: Loading spell index for spellchecker: wordbreak
Mar 24, 2013 3:08:07 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [collection1] Registered new searcher Searcher@795e0c2b 
main{StandardDirectoryReader(segments_49:524 _4v(4.2):C299313 
_4x(4.2):C2953/1396 _4y(4.2):C2866/1470 _4z(4.2):C4263/2793 _50(4.2):C3554/761 
_51(4.2):C1126/365 _52(4.2):C650/285 _53(4.2):C500/215 _54(4.2):C1808/1593 
_55(4.2):C1593)}
Mar 24, 2013 3:08:07 PM org.apache.solr.core.CoreContainer registerCore
INFO: registering core: collection1
server value 
-org.apache.solr.client.solrj.embedded.EmbeddedSolrServer@3a32ea4
query value -q=smstext%3AEMIRATES&rows=50
Mar 24, 2013 3:08:07 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:150)
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
at SolrQueryResult.solrQuery(SolrQueryResult.java:31)
at SolrQueryResult.main(SolrQueryResult.java:65)

Mar 24, 2013 3:08:07 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=null path=/select 
params={q=smstext%3AEMIRATES&rows=50} status=500 QTime=0
org.apache.solr.client.solrj.SolrServerException: 
org.apache.solr.client.solrj.SolrServerException: java.lang.NullPointerException
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:223)
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
at SolrQueryResult.solrQuery(SolrQueryResult.java:31)
at SolrQueryResult.main(SolrQueryResult.java:65)
Caused by: org.apache.solr.client.solrj.SolrServerException: 
java.lang.NullPointerException
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:155)
... 4 more
Caused by: java.lang.NullPointerException
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:150)
... 4 more


try {
String SOLR_HOME = "/data/solr1/example/solr/";
CoreContainer coreContainer = new CoreContainer(SOLR_HOME);
CoreDescriptor discriptor = new CoreDescriptor(coreContainer,
"collection1", new File(SOLR_HOME).getAbsolutePath());

Tlog File not removed after hard commit

2013-03-24 Thread Niran Fajemisin
Hi all,

We import about 1.5 million documents on a nightly basis using DIH. During this 
time, we need to ensure that all documents make it into the index, or roll back 
on any errors; DIH takes care of this for us. We also disable autoCommit in DIH 
but instruct it to commit at the very end of the import. This is all done through 
the DIH config XML file and the command issued to the request handler.
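
For reference, this is roughly the request we issue (URL and core name here are 
illustrative; commit=true tells DIH to issue a single commit at the end of the 
import):

http://localhost:8983/solr/collection1/dataimport?command=full-import&commit=true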

We have noticed that the tlog file appears to linger around even after DIH has 
issued the hard commit. My expectation would be that after the hard commit has 
occurred, the tlog file would be removed. I'm obviously misunderstanding how 
this all works.

Can someone please help me understand how this is meant to function? Thanks!

-Niran

Re: Solr Sorting is not working properly on long Fields

2013-03-24 Thread SUJIT PAL
Hi ballusethuraman, 

I am sure you have done this already, but just to be sure, did you reindex your 
existing kilometer data after you changed the data type from string to long? If 
not, then you should.
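
For reference, a minimal sketch of what the long-typed field might look like in 
schema.xml (the type name and attribute choices here are just an assumption):

<fieldType name="tlong" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<field name="Kilometers" type="tlong" indexed="true" stored="true"/>

After reindexing with a numeric type like this, sort=Kilometers desc compares 
values numerically instead of lexicographically.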

-sujit

On Mar 23, 2013, at 11:21 PM, ballusethuraman wrote:

> Hi, I have a column named 'Kilometers' and when I try to sort on it, it does
> not work properly. The values in the 'Kilometers' column are: 17, 111, 97,
> 923, 65, 611. The values in 'Kilometers' after sorting are: 97, 923, 65, 611,
> 17, 111. The problem here is that when 97 is compared with 923, 97 is treated
> as the bigger number (because "97" sorts after "923" as a string). Initially
> the Kilometers column had string as its datatype and I thought the problem
> could be because of that, so I changed the datatype of that column to 'long'.
> Even then I couldn't see any change in the results. But when I insert values
> which all have the same number of digits, say 2, 1, 4, 5, 3, the sort works
> perfectly: 1, 2, 3, 4, 5. Datatypes that I have tried are ... Can anyone help
> me to get rid of this problem? Thanks in advance
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-Sorting-is-not-working-properly-on-long-Fields-tp4050833.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Practicality of enormous fields

2013-03-24 Thread Erick Erickson
Yeah, it is kind of weird, but certainly do-able. But the big gotcha is if you
want to _retrieve_ that field, that could take some time. If you just want
to search it, no problems that I know of. If you do want to retrieve it,
make sure lazy field loading is enabled and that you do NOT ask for this
field in results except when you really need it...
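
For reference, lazy loading is the flag below in the <query> section of 
solrconfig.xml, and keeping the field out of results just means listing only 
what you need in the fl parameter (e.g. fl=id,title rather than fl=*):

<enableLazyFieldLoading>true</enableLazyFieldLoading>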

Best
Erick


On Tue, Mar 19, 2013 at 6:33 PM, jimtronic  wrote:

> What are the likely ramifications of having a stored field with millions of
> "words"?
>
> For example, if I had an article and wanted to store the user id of every
> user who has read it, stuck into a simple whitespace-delimited field,
> what would go wrong and when?
>
> My tests lead me to believe this is not a problem, but it feels weird.
>
> Jim
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Practicality-of-enormous-fields-tp4049131.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Too many fields to Sort in Solr

2013-03-24 Thread Erick Erickson
Seems like a reasonable thing to do. Examine the debug output to ensure
that there's no short-circuiting being done as far as ConstantScoreQuery...
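
For instance, something like this (parameter values from your mail, with 
debugQuery added):

q={!boost b=numdownloads.1}*:*&fq=countryId:1&debugQuery=true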

Best
Erick


On Tue, Mar 19, 2013 at 7:05 PM, adityab  wrote:

> Hi All,
>
> I want to validate my approach with the experts, just to make sure I am not
> doing anything wrong.
>
> #Docs in Solr : 25M
> Solr Version: 4.2
>
> Our requirement is to list the top downloaded documents based on user country.
> So we have a dynamic field "numdownloads.*" which is evaluated per country as
> numdownloads.<countryId> (e.g. numdownloads.1).
>
> Now, as sorting is expensive and also uses a large amount of Java heap, I
> planned to use this field to boost results instead.
>
> Old Query
> q=*:*&fq=countryId:1&sort=numdownloads.1 desc
>
> which I changed to
>  q={!boost b=numdownloads.1}*:*&fq=countryId:1
>
> Is my approach correct? Any better alternative?
>
> thanks
> Aditya
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Too-many-fields-to-Sort-in-Solr-tp4049139.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Solr using a ridiculous amount of memory

2013-03-24 Thread Jack Krupansky
Just to get started, do you hit OOM quickly with a few expensive queries, or 
is it after a number of hours and lots of queries?


Does Java heap usage seem to be growing linearly as queries come in, or are 
there big spikes?


How complex/rich are your queries (e.g., how many terms, wildcards, faceted 
fields, sorting, etc.)?


As a baseline experiment, start a Solr server, see how much Java heap is 
used/available. Then do a couple of typical queries, and check the heap size 
again. Then do a couple more similar but different (to avoid query cache 
matches), and check the heap again. Maybe do that a few times to get a 
handle on the baseline memory required and whether there might be a leak of 
some sort. Do enough queries to hit all of the fields, facets, sorting, 
etc. that are likely to be encountered in one of your typical days that hits 
OOM - just not the volume of queries. The goal is to determine if there is 
something inherently memory intensive in your index/queries, or something 
relating to a leak based on total query volume.
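
One low-tech way to watch the heap between those query batches, assuming a JDK 
with jstat on the path (replace <pid> with the Solr process id):

jstat -gcutil <pid> 5000

That prints per-generation heap utilization and GC counts every 5 seconds, which 
makes gradual growth versus spikes easy to spot.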


-- Jack Krupansky

-Original Message- 
From: John Nielsen

Sent: Sunday, March 24, 2013 4:19 AM
To: solr-user@lucene.apache.org
Subject: Solr using a ridiculous amount of memory

Hello all,

We are running a solr cluster which is now running solr-4.2.

The index is about 35GB on disk with each register between 15k and 30k.
(This is simply the size of a full xml reply of one register. I'm not sure
how to measure it otherwise.)

Our memory requirements are running amok. We have less than a quarter of
our customers running now and even though we have allocated 25GB to the JVM
already, we are still seeing daily OOM crashes. We used to just allocate
more memory to the JVM, but with the way solr is scaling, we would need
well over 100GB of memory on each node to finish the project, and that's
just not going to happen. I need to lower the memory requirements somehow.

I can see from the memory dumps we've done that the field cache is by far
the biggest sinner. Of special interest to me is the recent introduction of
DocValues which supposedly mitigates this issue by using memory outside the
JVM. Because of the lack of documentation, though, I can't seem to make it work.

We do a lot of faceting. One client facets on about 50.000 docs of approx
30k each on 5 fields. I understand that this is VERY memory intensive.

Schema with DocValues attempt at solving problem:
http://pastebin.com/Ne23NnW4
Config: http://pastebin.com/x1qykyXW

The cache is pretty well tuned. Any lower and I get evictions.

Come hell or high water, my JVM memory requirements must come down. Simply
moving some memory load outside of the JVM would be awesome! Making it not
use the field cache for anything would also (probably) work for me. I
thought about killing off my other caches, but from the dumps, they just
don't seem to use that much memory.

I am at my wits end. Any help would be sorely appreciated.

--
Med venlig hilsen / Best regards

*John Nielsen*
Programmer



*MCB A/S*
Enghaven 15
DK-7500 Holstebro

Kundeservice: +45 9610 2824
p...@mcb.dk
www.mcb.dk 



Re: Solr using a ridiculous amount of memory

2013-03-24 Thread Robert Muir
On Sun, Mar 24, 2013 at 4:19 AM, John Nielsen  wrote:

> Schema with DocValues attempt at solving problem:
> http://pastebin.com/Ne23NnW4
> Config: http://pastebin.com/x1qykyXW
>

This schema isn't using docvalues, due to a typo in your config:
it should not be DocValues="true" but docValues="true".

Are you not getting an error? Solr should throw an exception if you
provide invalid attributes to a field. Nothing is more frustrating
than having a typo or something in your configuration and solr just
ignores it, reports no error, and "doesn't work the way you want".
I'll look into this (I already intend to add these checks to analysis
factories for the same reason).

Separately, if you really want the terms data and so on to remain on
disk, it is not enough to "just enable docvalues" for the field. The
default implementation uses the heap. So if you want that, you need to
set docValuesFormat="Disk" on the fieldtype. This will keep the
majority of the data on disk, and only some key datastructures in heap
memory. This might have significant performance impact depending upon
what you are doing so you need to test that.
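
Putting those two corrections together, the relevant schema.xml entries might 
look roughly like this (the type name here is illustrative):

<fieldType name="string_dv" class="solr.StrField" docValuesFormat="Disk"/>
<field name="manu_exact" type="string_dv" indexed="true" stored="false" docValues="true"/>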


SOLR4/lucene and JVM memory management

2013-03-24 Thread Spyros Lambrinidis
Hi,

Does anyone know how solr4/lucene and the JVM manage memory?

We have the following case.

We have a 15GB server running only SOLR4/Lucene and the JVM (no custom code)

We had allocated 2GB of memory and the JVM was using 1.9GB. At some point
something happened and we ran out of memory.

Then we increased the JVM memory to 4GB and we see that, gradually, the JVM
starts to use as much as it can. It is now using 3GB out of the 4GB
allocated.

Is that normal JVM memory usage? i.e. Does the JVM always use as much as it
can from the allocated space?

Thanks for your help


-- 
Spyros Lambrinidis
Head of Engineering & Commando of
PeoplePerHour.com
Evmolpidon 23
118 54, Gkazi
Athens, Greece
Tel: +30 210 3455480

Follow us on Facebook 
Follow us on Twitter 


RE: SOLR4/lucene and JVM memory management

2013-03-24 Thread Toke Eskildsen
Spyros Lambrinidis [spy...@peopleperhour.com]:
> Then we increased the JVM memory to 4GB and we see that gradually, JVM
> starts to use as much as it can. It is now using 3GB out of the 4GB
> allocated.

That is to be expected. When the number of garbage collections increases, the 
JVM might decide that it would be better overall to increase the size of the 
heap. Whether it will allocate up to your 4GB limit depends on how active it 
is. If you stress it, it will probably take the last GB. 

> i.e. Does the JVM always use as much as it can from the allocated space?

No, but the Oracle JVM does tend to be somewhat greedy (very subjective, I know). 
Since larger heaps mean longer (hopefully infrequent) pauses for full garbage 
collection with a "standard" setup, the consensus seems to be that it is best 
to allocate conservatively and thereby avoid over-allocation. If 2GB worked 
well for you until you hit OOM, changing to 3GB seems like a better choice than 
4GB to me. Especially since you describe the allocation up to 3GB as gradual, 
which tells me that your installation is not starved with 3GB.
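
In start-up flag terms, that would be something along the lines of (values 
illustrative, for the example Jetty start.jar):

java -Xms1g -Xmx3g -jar start.jar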

- Toke Eskildsen

RE: Solr using a ridiculous amount of memory

2013-03-24 Thread Toke Eskildsen
From: John Nielsen [j...@mcb.dk]:
> The index is about 35GB on disk with each register between 15k and 30k.
> (This is simply the size of a full xml reply of one register. I'm not sure
> how to measure it otherwise.)

> Our memory requirements are running amok. We have less than a quarter of
> our customers running now and even though we have allocated 25GB to the JVM
> already, we are still seeing daily OOM crashes.

That does sound a bit peculiar. I do not understand what you mean by "register" 
though. How many documents does your index hold?

> I can see from the memory dumps we've done that the field cache is by far
> the biggest sinner.

Do you sort on a lot of different fields?

> We do a lot of faceting. One client facets on about 50.000 docs of approx
> 30k each on 5 fields. I understand that this is VERY memory intensive.

To get a rough approximation of memory usage, we need the total number of 
documents, the average number of values for each of the 5 fields for a document 
and the number of unique values in each of the 5 fields. The rule of thumb I 
use for lower ceiling is

#documents*log2(#references) + #references*log2(#unique_values) bit

If your whole index has 10M documents, which each has 100 values for each 
field, with each field having 50M unique values, then the memory requirement 
would be more than 10M*log2(100*10M) + 100*10M*log2(50M) bit ~= 340MB/field ~= 
1.6GB for faceting on all fields. Even when we multiply that with 4 to get a 
more real-world memory requirement, it is far from the 25GB that you are 
allocating. Either you have an interestingly high number somewhere in the 
equation or something's off.

Regards,
Toke Eskildsen

Recommendation for integration test framework

2013-03-24 Thread Jan Morlock
Hi,

our solr implementation consists of several cores sometimes interacting with
each other. Using SolrTestCaseJ4 didn't work out for us. Instead we would
like to test the resulting war from outside using integration tests. We are
utilizing Apache Maven as build management tool. Therefore we are currently
thinking about using the maven failsafe plugin.
Does anybody have experiences with using it in combination with solr? Or
does somebody have a better recommendation for us?
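
The wiring we have in mind looks roughly like this in the pom.xml (plugin 
version illustrative):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <version>2.14</version>
  <executions>
    <execution>
      <goals>
        <goal>integration-test</goal>
        <goal>verify</goal>
      </goals>
    </execution>
  </executions>
</plugin>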

Thank you very much in advance
Jan



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Recommendation-for-integration-test-framework-tp4050936.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Recommendation for integration test framework

2013-03-24 Thread Furkan KAMACI
Unrelated to your question: you said "We are utilizing Apache Maven
as build management tool". I think ant + ivy are currently the build and
dependency management tools for the project, and the Maven POM is generated via
a plugin (if I am wrong, please correct me). Are there any plans to move the
project to Maven?

2013/3/25 Jan Morlock 

> Hi,
>
> our solr implementation consists of several cores sometimes interacting
> with
> each other. Using SolrTestCaseJ4 didn't work out for us. Instead we would
> like to test the resulting war from outside using integration tests. We are
> utilizing Apache Maven as build management tool. Therefore we are currently
> thinking about using the maven failsafe plugin.
> Does anybody have experiences with using it in combination with solr? Or
> does somebody have a better recommendation for us?
>
> Thank you very much in advance
> Jan
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Recommendation-for-integration-test-framework-tp4050936.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


RE: Solr using a ridiculous amount of memory

2013-03-24 Thread Toke Eskildsen
Toke Eskildsen [t...@statsbiblioteket.dk]:
> If your whole index has 10M documents, which each has 100 values
> for each field, with each field having 50M unique values, then the 
> memory requirement would be more than 
> 10M*log2(100*10M) + 100*10M*log2(50M) bit ~= 340MB/field ~=
> 1.6GB for faceting on all fields.

Whoops. Missed a 0 when calculating. The case above would actually take more 
than 15GB, probably also more than the 25GB you have allocated.
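
A quick sanity check of the corrected arithmetic, as a small self-contained 
sketch of the rule of thumb above (nothing Solr-specific, numbers from the 
example):

public class FacetMemoryEstimate {
    // lower-ceiling rule of thumb, in bits:
    // #documents*log2(#references) + #references*log2(#unique_values)
    static double log2(double x) { return Math.log(x) / Math.log(2); }

    public static void main(String[] args) {
        double docs = 10e6;          // 10M documents
        double refs = 100 * docs;    // 100 values/document -> 1G references per field
        double uniques = 50e6;       // 50M unique values per field
        double bitsPerField = docs * log2(refs) + refs * log2(uniques);
        double gbPerField = bitsPerField / 8 / (1024.0 * 1024 * 1024);
        // prints roughly 3.0 GB/field, i.e. more than 15GB for all five fields
        System.out.printf("%.1f GB/field, %.1f GB for 5 fields%n",
                gbPerField, 5 * gbPerField);
    }
}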


Anyway, I see now in your solrconfig that your main facet fields are "cat", 
"manu_exact", "content_type" and "author_s", with the 5th being maybe "price", 
"popularity" or "manufacturedate_dt"?

cat seems like category (relatively few references, few uniques), content_type 
probably has a single value/item and again few uniques. No memory problem 
there, unless you have a lot of documents (100M-range). That leaves manu_exact 
and author_s. If those are freetext fields with item descriptions or similar, 
that might explain the OOM.

Could you describe the facet fields in more detail and provide us with the 
total document count?


Quick sanity check: If you are using a Linux server, could you please verify 
that your virtual memory is set to unlimited with 'ulimit -v'?

Regards,
Toke Eskildsen


Re: Solr using a ridiculous amount of memory

2013-03-24 Thread Jack Krupansky
A step I meant to include was that after you "warm" Solr with a 
representative collection of queries that references all of the fields, 
facets, sorting, etc. that your daily load will reference, check the Java 
heap size at that point, and then set your Java heap limit a moderate amount 
higher, say 256MB more, restart, and then see what happens.


The theory is that if you have too much available heap, Java will gradually 
fill it all with garbage (no leaks implied, but maybe some leaks as well), 
and then a Java GC will be an expensive hit, and sometimes a rapid flow of 
incoming requests at that point can cause Java to freak out and even hit OOM 
even though a more graceful garbage collection would eventually free up tons 
of garbage.


So, by only allowing for a moderate amount of garbage, more frequent GCs 
will be less intensive and less likely to cause weird situations.


The other part of the theory is that it is usually better to leave tons of 
memory to the OS for efficiently caching files, rather than force Java to 
manage large amounts of memory, which it typically does not do so well.


-- Jack Krupansky

-Original Message- 
From: Jack Krupansky

Sent: Sunday, March 24, 2013 2:00 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr using a ridiculous amount of memory

Just to get started, do you hit OOM quickly with a few expensive queries, or
is it after a number of hours and lots of queries?

Does Java heap usage seem to be growing linearly as queries come in, or are
there big spikes?

How complex/rich are your queries (e.g., how many terms, wildcards, faceted
fields, sorting, etc.)?

As a baseline experiment, start a Solr server, see how much Java heap is
used/available. Then do a couple of typical queries, and check the heap size
again. Then do a couple more similar but different (to avoid query cache
matches), and check the heap again. Maybe do that a few times to get a
handle on the baseline memory required and whether there might be a leak of
some sort. Do enough queries to hit all of the fields, facets, sorting,
etc. that are likely to be encountered in one of your typical days that hits
OOM - just not the volume of queries. The goal is to determine if there is
something inherently memory intensive in your index/queries, or something
relating to a leak based on total query volume.

-- Jack Krupansky

-Original Message- 
From: John Nielsen

Sent: Sunday, March 24, 2013 4:19 AM
To: solr-user@lucene.apache.org
Subject: Solr using a ridiculous amount of memory

Hello all,

We are running a solr cluster which is now running solr-4.2.

The index is about 35GB on disk with each register between 15k and 30k.
(This is simply the size of a full xml reply of one register. I'm not sure
how to measure it otherwise.)

Our memory requirements are running amok. We have less than a quarter of
our customers running now and even though we have allocated 25GB to the JVM
already, we are still seeing daily OOM crashes. We used to just allocate
more memory to the JVM, but with the way solr is scaling, we would need
well over 100GB of memory on each node to finish the project, and that's
just not going to happen. I need to lower the memory requirements somehow.

I can see from the memory dumps we've done that the field cache is by far
the biggest sinner. Of special interest to me is the recent introduction of
DocValues which supposedly mitigates this issue by using memory outside the
JVM. Because of the lack of documentation, though, I can't seem to make it work.

We do a lot of faceting. One client facets on about 50.000 docs of approx
30k each on 5 fields. I understand that this is VERY memory intensive.

Schema with DocValues attempt at solving problem:
http://pastebin.com/Ne23NnW4
Config: http://pastebin.com/x1qykyXW

The cache is pretty well tuned. Any lower and I get evictions.

Come hell or high water, my JVM memory requirements must come down. Simply
moving some memory load outside of the JVM would be awesome! Making it not
use the field cache for anything would also (probably) work for me. I
thought about killing off my other caches, but from the dumps, they just
don't seem to use that much memory.

I am at my wits end. Any help would be sorely appreciated.

--
Med venlig hilsen / Best regards

*John Nielsen*
Programmer



*MCB A/S*
Enghaven 15
DK-7500 Holstebro

Kundeservice: +45 9610 2824
p...@mcb.dk
www.mcb.dk 



Re: Too many fields to Sort in Solr

2013-03-24 Thread adityab
Thanks Erick. In this query "q=*:*" the Lucene score is always 1.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Too-many-fields-to-Sort-in-Solr-tp4049139p4050944.html
Sent from the Solr - User mailing list archive at Nabble.com.


[ANNOUNCE] Solr wiki editing change

2013-03-24 Thread Steve Rowe
The wiki at http://wiki.apache.org/solr/ has come under attack by spammers more 
frequently of late, so the PMC has decided to lock it down in an attempt to 
reduce the work involved in tracking and removing spam.

From now on, only people who appear on 
http://wiki.apache.org/solr/ContributorsGroup will be able to 
create/modify/delete wiki pages.

Please send a request on either solr-user@lucene.apache.org or 
d...@lucene.apache.org to have your wiki username added to the 
ContributorsGroup page - this is a one-time step.

Steve

RE: SOLR 4.2 SolrQuery exception

2013-03-24 Thread Sandeep Kumar Anumalla
Hi,

I managed to resolve this issue and I am now getting results as well. But this 
time I am getting a different exception while loading the Solr CoreContainer.

Here is the Code.

String SOLR_HOME = "/data/solr1/example/solr/collection1";
CoreContainer coreContainer = new CoreContainer(SOLR_HOME);
CoreDescriptor discriptor = new CoreDescriptor(coreContainer, 
"collection1", new File(SOLR_HOME).getAbsolutePath());
SolrCore solrCore = coreContainer.create(discriptor);
coreContainer.register(solrCore, false);
File home = new File( SOLR_HOME );
File f = new File( home, "solr.xml" );
coreContainer.load( SOLR_HOME, f );
server = new EmbeddedSolrServer( coreContainer, "collection1" );
SolrQuery q = new SolrQuery();


Parameters inside Solrconfig.xml

<lockType>simple</lockType>
<unlockOnStartup>true</unlockOnStartup>


WARNING: Unable to get IndexCommit on startup
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: 
SimpleFSLock@/data/solr1/example/solr/collection1/./data/index/write.lock
   at org.apache.lucene.store.Lock.obtain(Lock.java:84)
   at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:636)
   at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
   at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
   at 
org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:192)
   at 
org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:106)
   at 
org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.java:904)
   at 
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:592)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:801)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:619)
   at 
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1021)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1051)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:634)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:679)



From: Sandeep Kumar Anumalla
Sent: 24 March, 2013 03:44 PM
To: solr-user@lucene.apache.org
Subject: SOLR 4.2 SolrQuery exception

I am using the code below and getting the following exception while using SolrQuery



Mar 24, 2013 3:08:07 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to Searcher@795e0c2b 
main{StandardDirectoryReader(segments_49:524 _4v(4.2):C299313 
_4x(4.2):C2953/1396 _4y(4.2):C2866/1470 _4z(4.2):C4263/2793 _50(4.2):C3554/761 
_51(4.2):C1126/365 _52(4.2):C650/285 _53(4.2):C500/215 _54(4.2):C1808/1593 
_55(4.2):C1593)}
Mar 24, 2013 3:08:07 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
at 
org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1586)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)

Mar 24, 2013 3:08:07 PM org.apache.solr.core.SolrCore execute
INFO: [collection1] webapp=null path=null 
params={event=firstSearcher&q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false}
 status=500 QTime=4
Mar 24, 2013 3:08:07 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Mar 24, 2013 3:08:07 PM 
org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener 
newSearcher
INFO: Loading spell index for spellchecker: default
Mar 24, 2013 3:08:07 PM 
org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener 
newSearcher
INFO: Loading spell index for spellchecker: wordbreak
Mar 24, 2013 3:08:07 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [collection1] Registered new searcher Searcher@795e0c2b 
main{StandardDirectoryReader(segmen

Using Solrj to Get termVectors

2013-03-24 Thread Rendy Bambang Junior
Hi all,

I've enabled the term vector component and stored term vectors. The results
show up fine using an HTTP request in the browser. Since I'm planning to build
a web service in Java, I need to get those values using Solrj.

I've been googling and found this solution
(http://stackoverflow.com/questions/8977852/how-to-parse-the-termvectorcomponent-response-to-which-java-object)
but it seems like some of those functions have been deprecated.

Does anybody know how to get termVectors using Solrj?
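
In case it helps, this is the kind of SolrJ code I am experimenting with (a 
sketch only: the /tvrh handler name assumes the TermVectorComponent is wired 
into a handler of that name in solrconfig.xml, and the query is illustrative):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;

public class TermVectorFetch {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrQuery q = new SolrQuery("id:123");
        q.setRequestHandler("/tvrh");   // SolrJ uses this as the request path
        q.set("tv", true);              // ask for term vectors
        q.set("tv.tf", true);           // include term frequencies
        QueryResponse rsp = server.query(q);
        // TermVectorComponent returns a nested NamedList under "termVectors":
        // document key -> field name -> term -> statistics
        NamedList<Object> tv = (NamedList<Object>) rsp.getResponse().get("termVectors");
        System.out.println(tv);
    }
}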

-- 
Regards,
Rendy Bambang Junior
Informatics Engineering '09
Bandung Institute of Technology


Re: how to get term vector information of specific word/position in field

2013-03-24 Thread vrparekh
Thanks Chris,





--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-get-term-vector-information-of-sepcific-word-position-in-field-tp4047637p4050997.html
Sent from the Solr - User mailing list archive at Nabble.com.


Multi-core and replicated Solr cloud testing. Data-directory mis-configures

2013-03-24 Thread Trevor Campbell

I have three indexes which I have set up as three separate cores, using this 
solr.xml config.

[solr.xml listing with three <core> entries; the XML markup was stripped by the 
mailing list archive]

This works just fine in standalone Solr.

I duplicated this setup on the same machine under a completely separate solr installation (solr-nodeb) and modified all 
the data directories to point to the directories in nodeb. This all worked fine.


I then connected the 2 instances together with ZooKeeper using settings "-Dbootstrap_conf=true 
-Dcollection.configName=jiraCluster -DzkRun -DnumShards=1" for the first instance and "-DzkHost=localhost:9080" for the 
second. (I'm using tomcat and ports 8080 and 8081 for the 2 Solr instances.)


Now the data directories of the second node point to the data directories in 
the first node.

I have tried many settings in the solrconfig.xml for each core but am now using 
absolute paths, e.g.
/home//solr-4.2.0-nodeb/example/multicore/jira-comment/data

previously I used
${solr.jira-comment.data.dir:/home/tcampbell/solr-4.2.0-nodeb/example/multicore/jira-comment/data}
but that had the same result.

It seems zookeeper is forcing the data directory config from the uploaded 
configuration onto all the nodes in the cluster?

How can I do testing on a single machine? Do I really need identical directory 
layouts on all machines?




Re: SOLR 4.2 SolrQuery exception

2013-03-24 Thread Gopal Patwa
Manually delete the lock file
"/data/solr1/example/solr/collection1/./data/index/write.lock",
and restart Solr.
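
Deleting the lock gets you going again. Longer term, a sketch of initializing 
the embedded server only once may help, assuming the lock comes from the core 
being instantiated twice (once via create()/register() and again via load()):

import java.io.File;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;

public class EmbeddedInit {
    public static void main(String[] args) throws Exception {
        String solrHome = "/data/solr1/example/solr";
        CoreContainer container = new CoreContainer(solrHome);
        // load() reads solr.xml and instantiates the cores; no separate
        // create()/register() calls are needed for cores listed in solr.xml
        container.load(solrHome, new File(solrHome, "solr.xml"));
        EmbeddedSolrServer server = new EmbeddedSolrServer(container, "collection1");
        // ... run queries, then release the index lock cleanly:
        container.shutdown();
    }
}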


On Sun, Mar 24, 2013 at 9:32 PM, Sandeep Kumar Anumalla <
sanuma...@etisalat.ae> wrote:

> Hi,
>
> I managed to resolve this issue and I am getting the results also. But
> this time I am getting a different exception while loading Solr Container
>
> Here is the Code.
>
> String SOLR_HOME = "/data/solr1/example/solr/collection1";
> CoreContainer coreContainer = new CoreContainer(SOLR_HOME);
> CoreDescriptor discriptor = new CoreDescriptor(coreContainer,
> "collection1", new File(SOLR_HOME).getAbsolutePath());
> SolrCore solrCore = coreContainer.create(discriptor);
> coreContainer.register(solrCore, false);
> File home = new File( SOLR_HOME );
> File f = new File( home, "solr.xml" );
> coreContainer.load( SOLR_HOME, f );
> server = new EmbeddedSolrServer( coreContainer, "collection1" );
> SolrQuery q = new SolrQuery();
>
>
> Parameters inside Solrconfig.xml
> 
> <lockType>simple</lockType>
> <unlockOnStartup>true</unlockOnStartup>
>
>
> WARNING: Unable to get IndexCommit on startup
> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
> SimpleFSLock@/data/solr1/example/solr/collection1/./data/index/write.lock
>at org.apache.lucene.store.Lock.obtain(Lock.java:84)
>at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:636)
>at
> org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
>at
> org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
>at
> org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:192)
>at
> org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:106)
>at
> org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.java:904)
>at
> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:592)
>at org.apache.solr.core.SolrCore.<init>(SolrCore.java:801)
>at org.apache.solr.core.SolrCore.<init>(SolrCore.java:619)
>at
> org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1021)
>at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1051)
>at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:634)
>at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>at java.lang.Thread.run(Thread.java:679)
>
>
>
> From: Sandeep Kumar Anumalla
> Sent: 24 March, 2013 03:44 PM
> To: solr-user@lucene.apache.org
> Subject: SOLR 4.2 SolrQuery exception
>
> I am using the code below and getting the following exception while using SolrQuery
>
>
>
> Mar 24, 2013 3:08:07 PM org.apache.solr.core.QuerySenderListener
> newSearcher
> INFO: QuerySenderListener sending requests to 
> Searcher@795e0c2b main{StandardDirectoryReader(segments_49:524 _4v(4.2):C299313
> _4x(4.2):C2953/1396 _4y(4.2):C2866/1470 _4z(4.2):C4263/2793
> _50(4.2):C3554/761 _51(4.2):C1126/365 _52(4.2):C650/285 _53(4.2):C500/215
> _54(4.2):C1808/1593 _55(4.2):C1593)}
> Mar 24, 2013 3:08:07 PM org.apache.solr.common.SolrException log
> SEVERE: java.lang.NullPointerException
> at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
> at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
> at
> org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
> at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1586)
> at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:679)
>
> Mar 24, 2013 3:08:07 PM org.apache.solr.core.SolrCore execute
> INFO: [collection1] webapp=null path=null
> params={event=firstSearcher&q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false}
> status=500 QTime=4
> Mar 24, 2013 3:08:07 PM org.apache.solr.core.QuerySenderListener
> newSearcher
> INFO: QuerySenderListener done.
> Mar 24, 2013 3:08:07 PM
> org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener
> newSearcher
> INFO: Lo