Re: NRT and commit behavior

2011-09-26 Thread Vadim Kisselmann
Tirthankar,

are you indexing 1. smaller docs or 2. books?
if 1.  your caches are too big for your memory, as Erick already said.
Try to allocate 10GB for the JVM, leave 14GB for your HDD cache, and make your
caches smaller (see the example start command below).

if 2.  read the blog posts on hathitrust.org:
http://www.hathitrust.org/blogs/large-scale-search
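For 1., a minimal example of the heap setting, assuming the standard Jetty
example start (adjust to your own container):

java -Xms10g -Xmx10g -jar start.jar

The remaining RAM then stays available to the OS for the disk cache.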

Regards
Vadim


2011/9/24 Erick Erickson 

> No. The problem is that "number of documents" isn't a reliable
> indicator of resource consumption. Consider the difference between
> indexing a twitter message and a book. I can put a LOT more docs
> of 140 chars on a single machine of size X than I can books.
>
> Unfortunately, the only way I know of is to test. Use something like
> jMeter or SolrMeter to fire enough queries at your machine to
> determine when you're over-straining resources and shard at that
> point (or get a bigger machine)...
>
> Best
> Erick
>
> On Wed, Sep 21, 2011 at 8:24 PM, Tirthankar Chatterjee
>  wrote:
> > Okay, but is there any threshold, whether index size, total docs in the
> index, or size of physical memory, at which sharding should be
> considered?
> >
> > I am trying to find the winning combination.
> > Tirthankar
> > -Original Message-
> > From: Erick Erickson [mailto:erickerick...@gmail.com]
> > Sent: Friday, September 16, 2011 7:46 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: NRT and commit behavior
> >
> > Uhm, you're putting  a lot of index into not very much memory. I really
> think you're going to have to shard your index across several machines to
> get past this problem. Simply increasing the size of your caches is still
> limited by the physical memory you're working with.
> >
> > You really have to put a profiler on the system to see what's going on.
> At that size there are too many things that it *could* be to definitively
> answer it with e-mails
> >
> > Best
> > Erick
> >
> > On Wed, Sep 14, 2011 at 7:35 AM, Tirthankar Chatterjee <
> tchatter...@commvault.com> wrote:
> >> Erick,
> >> Also, in our solrconfig we have tried increasing the
> caches; setting the autowarmCount values below to 0 helps the
> commit call return within a second, but that will slow down our searches
> >>
> >> <filterCache
> >>  class="solr.FastLRUCache"
> >>  size="16384"
> >>  initialSize="4096"
> >>  autowarmCount="4096"/>
> >>
> >>
> >>
> >> <queryResultCache
> >>  class="solr.LRUCache"
> >>  size="16384"
> >>  initialSize="4096"
> >>  autowarmCount="4096"/>
> >>
> >> <documentCache
> >>  class="solr.LRUCache"
> >>  size="512"
> >>  initialSize="512"
> >>  autowarmCount="512"/>
> >>
> >> -Original Message-
> >> From: Tirthankar Chatterjee [mailto:tchatter...@commvault.com]
> >> Sent: Wednesday, September 14, 2011 7:31 AM
> >> To: solr-user@lucene.apache.org
> >> Subject: RE: NRT and commit behavior
> >>
> >> Erick,
> >> Here are the answers to your questions:
> >> Our index is 267 GB
> >> We are not optimizing...
> >> No we have not profiled yet to check the bottleneck, but logs indicate
> opening the searchers is taking time...
> >> Nothing except SOLR
> >> Total memory is 16GB, Tomcat has 8GB allocated. Everything is 64-bit: OS,
> >> JVM and Tomcat.
> >>
> >> -Original Message-
> >> From: Erick Erickson [mailto:erickerick...@gmail.com]
> >> Sent: Sunday, September 11, 2011 11:37 AM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: NRT and commit behavior
> >>
> >> Hmm, OK. You might want to look at the non-cached filter query stuff,
> it's quite recent.
> >> The point here is that it is a filter that is applied only after all of
> the less expensive filter queries are run. One of its uses is exactly ACL
> calculations. Rather than calculate the ACL for the entire doc set, it only
> calculates access for docs that have made it past all the other elements of
> the query. See SOLR-2429 and note that it is 3.4 (currently being
> released) only.
> >>
> >> As to why your commits are taking so long, I have no idea given that you
> really haven't given us much to work with.
> >>
> >> How big is your index? Are you optimizing? Have you profiled the
> application to see what the bottleneck is (I/O, CPU, etc?). What else is
> running on your machine? It's quite surprising that it takes that long. How
> much memory are you giving the JVM? etc...
> >>
> >> You might want to review:
> >> http://wiki.apache.org/solr/UsingMailingLists
> >>
> >> Best
> >> Erick
> >>
> >>
> >> On Fri, Sep 9, 2011 at 9:41 AM, Tirthankar Chatterjee <
> tchatter...@commvault.com> wrote:
> >>> Erick,
> >>> What you said is correct. For us the searches are based on some Active
> Directory permissions which are populated in the filter query parameter. So we
> don't have any warming query concept as we cannot fire for every user ahead
> of time.
> >>>
> >>> What we do here is that when a user logs in we do an invalid query (which
> returns no results instead of '*') with the correct filter query (which is
> his permis

Re: strange performance issue with many shards on one server

2011-09-28 Thread Vadim Kisselmann
Hi Fred,
analyze the queries which take longer.
We observe our own queries and see q-time problems with queries which
are complex, contain phrases, numbers or
special characters.
if you don't know it:
http://www.hathitrust.org/blogs/large-scale-search/tuning-search-performance
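A quick way to see where the time goes is to run one of the slow queries with
debug output, e.g.:

http://localhost:8983/solr/select?q=your+slow+query&debugQuery=on

(the query value is just a placeholder). The timing section in the response
shows how much time each component (query, facet, highlight, ...) consumed.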
Regards
Vadim


2011/9/28 Frederik Kraus 

>  Hi,
>
>
> I am experiencing a strange issue doing some load tests. Our setup:
>
> - 2 server with each 24 cpu cores, 130GB of RAM
> - 10 shards per server (needed for response times) running in a single
> tomcat instance
> - each query queries all 20 shards (distributed search)
>
> - each shard holds about 1.5 mio documents (small shards are needed due to
> rather complex queries)
> - all caches are warmed / high cache hit rates (99%) etc.
>
>
> Now for some reason we cannot seem to fully utilize all CPU power (no disk
> IO), i.e. at some point increasing concurrent users doesn't increase CPU load,
> but decreases throughput and increases the response times of the individual
> queries.
>
> Also 1-2% of the queries take significantly longer: avg somewhere at 100ms
> while 1-2% take 1.5s or longer.
>
> Any ideas are greatly appreciated :)
>
> Fred.
>
>


Re: strange performance issue with many shards on one server

2011-09-28 Thread Vadim Kisselmann
Hi Fred,

ok, that's strange behavior with the same queries.
A few more questions:
- which Solr version?
- do you index during your load test? (because of index rebuilding)
- do you replicate your index?

Regards
Vadim



2011/9/28 Frederik Kraus 

> Hi Vladim,
>
> the thing is, that those exact same queries, that take longer during a load
> test, perform just fine when executed at a slower request rate and are also
> random, i.e. there is no pattern in bad/slow queries.
>
> My first thought was some kind of contention and/or connection starvation
> for the internal shard communication?
>
> Fred.
>
>
> Am Mittwoch, 28. September 2011 um 13:18 schrieb Vadim Kisselmann:
>
> > Hi Fred,
> > analyze the queries which take longer.
> > We observe our queries and see the problems with q-time with queries
> which
> > are complex, with phrase queries or queries which contains numbers or
> > special characters.
> > if you don't know it:
> >
> http://www.hathitrust.org/blogs/large-scale-search/tuning-search-performance
> > Regards
> > Vadim
> >
> >
> > 2011/9/28 Frederik Kraus  frederik.kr...@gmail.com)>
> >
> > >  Hi,
> > >
> > >
> > > I am experiencing a strange issue doing some load tests. Our setup:
> > >
> > > - 2 server with each 24 cpu cores, 130GB of RAM
> > > - 10 shards per server (needed for response times) running in a single
> > > tomcat instance
> > > - each query queries all 20 shards (distributed search)
> > >
> > > - each shard holds about 1.5 mio documents (small shards are needed due
> to
> > > rather complex queries)
> > > - all caches are warmed / high cache hit rates (99%) etc.
> > >
> > >
> > > Now for some reason we cannot seem to fully utilize all CPU power (no
> disk
> > > IO), ie. increasing concurrent users doesn't increase CPU-Load at a
> point,
> > > decreases throughput and increases the response times of the individual
> > > queries.
> > >
> > > Also 1-2% of the queries take significantly longer: avg somewhere at
> 100ms
> > > while 1-2% take 1.5s or longer.
> > >
> > > Any ideas are greatly appreciated :)
> > >
> > > Fred.
>
>


Re: Still too many files after running solr optimization

2011-09-28 Thread Vadim Kisselmann
why should the optimization reduce the number of files?
That only happens when you index docs with the same unique key.

Do you have differences between numDocs and maxDocs after the optimize?
If yes:
what does your optimize command look like?
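(For reference, a plain optimize would typically look like

http://localhost:8983/solr/update?optimize=true

or an <optimize/> message posted to the update handler.)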

Regards
Vadim



2011/9/28 Manish Bafna 

> Try to do optimize twice.
> The 2nd one will be quick and will delete lot of files.
>
> On Wed, Sep 28, 2011 at 5:26 PM, Kissue Kissue 
> wrote:
> > Hi,
> >
> > I am using solr 3.3. I noticed  that after indexing about 700, 000
> records
> > and running optimization at the end, i still have about 91 files in my
> index
> > directory. I thought that optimization was supposed to reduce the number
> of
> > files.
> >
> > My settings are the default that came with Solr (mergefactor, etc)
> >
> > Any ideas what i could be doing wrong?
> >
>


Re: Still too many files after running solr optimization

2011-09-28 Thread Vadim Kisselmann
if numDocs and maxDocs have the same number of docs, nothing will be deleted
on optimize.
You only rebuild your index.

Regards
Vadim




2011/9/28 Kissue Kissue 

> numDocs and maxDocs are same size.
>
> I was worried because when i used to use only Lucene for the same indexing,
> before optimization there are many files but after optimization i always
> end
> up with just 3 files in my index folder. Just want to find out if this was
> ok.
>
> Thanks
>
> On Wed, Sep 28, 2011 at 1:23 PM, Vadim Kisselmann <
> v.kisselm...@googlemail.com> wrote:
>
> > why should the optimization reduce the number of files?
> > It happens only when you indexing docs with same unique key.
> >
> > Have you differences in numDocs und maxDocs after optimize?
> > If yes:
> > how is your optimize command ?
> >
> > Regards
> > Vadim
> >
> >
> >
> > 2011/9/28 Manish Bafna 
> >
> > > Try to do optimize twice.
> > > The 2nd one will be quick and will delete lot of files.
> > >
> > > On Wed, Sep 28, 2011 at 5:26 PM, Kissue Kissue 
> > > wrote:
> > > > Hi,
> > > >
> > > > I am using solr 3.3. I noticed  that after indexing about 700, 000
> > > records
> > > > and running optimization at the end, i still have about 91 files in
> my
> > > index
> > > > directory. I thought that optimization was supposed to reduce the
> > number
> > > of
> > > > files.
> > > >
> > > > My settings are the default that came with Solr (mergefactor, etc)
> > > >
> > > > Any ideas what i could be doing wrong?
> > > >
> > >
> >
>


Re: strange performance issue with many shards on one server

2011-09-28 Thread Vadim Kisselmann
:1987)
> at
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
> at
> java.util.concurrent.ExecutorCompletionService.take(ExecutorCompletionService.java:164)
> at
> org.apache.solr.handler.component.HttpCommComponent.takeCompletedOrError(SearchHandler.java:469)
> at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:271)
> at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
> at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:554)
> at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
> at
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
> at
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
> at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
> at java.lang.Thread.run(Thread.java:662)
>
>
>
>
>
>
> Am Mittwoch, 28. September 2011 um 13:53 schrieb Frederik Kraus:
>
> >
> >
> > Am Mittwoch, 28. September 2011 um 13:41 schrieb Vadim Kisselmann:
> >
> > > Hi Fred,
> > >
> > > ok, it's a strange behavior with same queries.
> > > Another questions:
> > > -which solr version?
> >
> > 3.3 (might the NIOFSDirectory from 3.4 help?)
> >
> > > -do you indexing during your load test? (because of index rebuilt)
> > nope
> >
> > > -do you replicate your index?
> >
> > nope
> > >
> > > Regards
> > > Vadim
> > >
> > >
> > >
> > > 2011/9/28 Frederik Kraus  frederik.kr...@gmail.com)>
> > >
> > > > Hi Vladim,
> > > >
> > > > the thing is, that those exact same queries, that take longer during
> a load
> > > > test, perform just fine when executed at a slower request rate and
> are also
> > > > random, i.e. there is no pattern in bad/slow queries.
> > > >
> > > > My first thought was some kind of contention and/or connection
> starvation
> > > > for the internal shard communication?
> > > >
> > > > Fred.
> > > >
> > > >
> > > > Am Mittwoch, 28. September 2011 um 13:18 schrieb Vadim Kisselmann:
> > > >
> > > > > Hi Fred,
> > > > > analyze the queries which take longer.
> > > > > We observe our queries and see the problems with q-time with
> queries
> > > > which
> > > > > are complex, with phrase queries or queries which contains numbers
> or
> > > > > special characters.
> > > > > if you don't know it:
> > > >
> http://www.hathitrust.org/blogs/large-scale-search/tuning-search-performance
> > > > > Regards
> > > > > Vadim
> > > > >
> > > > >
> > > > > 2011/9/28 Frederik Kraus 
> > > > >
> > > > > >  Hi,
> > > > > >
> > > > > >
> > > > > > I am experiencing a strange issue doing some load tests. Our
> setup:
> > > > > >
> > > > > > - 2 server with each 24 cpu cores, 130GB of RAM
> > > > > > - 10 shards per server (needed for response times) running in a
> single
> > > > > > tomcat instance
> > > > > > - each query queries all 20 shards (distributed search)
> > > > > >
> > > > > > - each shard holds about 1.5 mio documents (small shards are
> needed due
> > > > to
> > > > > > rather complex queries)
> > > > > > - all caches are warmed / high cache hit rates (99%) etc.
> > > > > >
> > > > > >
> > > > > > Now for some reason we cannot seem to fully utilize all CPU power
> (no
> > > > disk
> > > > > > IO), ie. increasing concurrent users doesn't increase CPU-Load at
> a
> > > > point,
> > > > > > decreases throughput and increases the response times of the
> individual
> > > > > > queries.
> > > > > >
> > > > > > Also 1-2% of the queries take significantly longer: avg somewhere
> at
> > > > 100ms
> > > > > > while 1-2% take 1.5s or longer.
> > > > > >
> > > > > > Any ideas are greatly appreciated :)
> > > > > >
> > > > > > Fred.
>
>


Re: Still too many files after running solr optimization

2011-09-28 Thread Vadim Kisselmann
2011/9/28 Manish Bafna 

> >>Will it not merge the index?
>

yes


> >>While merging on windows, the old index files dont get deleted.
> >>(Windows has an issue where the file opened for reading cannot be
> >>deleted)
> >>
> >>So, if you call optimize again, it will delete the older index files.
>
> no.
during optimize you only delete docs which are flagged as deleted, no
matter how old they are.
if your numDocs and maxDocs have the same number of docs, you only rebuild
and merge your index, but you delete nothing.

Regards




> On Wed, Sep 28, 2011 at 6:43 PM, Vadim Kisselmann
>  wrote:
> > if numDocs und maxDocs have the same mumber of docs nothing will be
> deleted
> > on optimize.
> > You only rebuild your index.
> >
> > Regards
> > Vadim
> >
> >
> >
> >
> > 2011/9/28 Kissue Kissue 
> >
> >> numDocs and maxDocs are same size.
> >>
> >> I was worried because when i used to use only Lucene for the same
> indexing,
> >> before optimization there are many files but after optimization i always
> >> end
> >> up with just 3 files in my index filder. Just want to find out if this
> was
> >> ok.
> >>
> >> Thanks
> >>
> >> On Wed, Sep 28, 2011 at 1:23 PM, Vadim Kisselmann <
> >> v.kisselm...@googlemail.com> wrote:
> >>
> >> > why should the optimization reduce the number of files?
> >> > It happens only when you indexing docs with same unique key.
> >> >
> >> > Have you differences in numDocs und maxDocs after optimize?
> >> > If yes:
> >> > how is your optimize command ?
> >> >
> >> > Regards
> >> > Vadim
> >> >
> >> >
> >> >
> >> > 2011/9/28 Manish Bafna 
> >> >
> >> > > Try to do optimize twice.
> >> > > The 2nd one will be quick and will delete lot of files.
> >> > >
> >> > > On Wed, Sep 28, 2011 at 5:26 PM, Kissue Kissue  >
> >> > > wrote:
> >> > > > Hi,
> >> > > >
> >> > > > I am using solr 3.3. I noticed  that after indexing about 700, 000
> >> > > records
> >> > > > and running optimization at the end, i still have about 91 files
> in
> >> my
> >> > > index
> >> > > > directory. I thought that optimization was supposed to reduce the
> >> > number
> >> > > of
> >> > > > files.
> >> > > >
> >> > > > My settings are the default that came with Solr (mergefactor, etc)
> >> > > >
> >> > > > Any ideas what i could be doing wrong?
> >> > > >
> >> > >
> >> >
> >>
> >
>


Re: Still too many files after running solr optimization

2011-09-28 Thread Vadim Kisselmann
we had an understanding problem:)

docs are the docs in the index.
files are the files in the index directory (index segments).

during the optimization you don't delete docs unless they are flagged as
deleted.
but you merge your index and delete the files in your index directory, that's
right.

after a second optimize the files which were still open for reading are
deleted.

Regards



2011/9/28 Manish Bafna 

> We tested it so many times.
> 1st time we optimize, the new index file is created (merged one), but
> the existing index files are not deleted (because they might be still
> open for reading)
> 2nd time optimize, other than the new index file, all else gets deleted.
>
> This is happening specifically on Windows.
>
> On Wed, Sep 28, 2011 at 8:23 PM, Vadim Kisselmann
>  wrote:
> > 2011/9/28 Manish Bafna 
> >
> >> >>Will it not merge the index?
> >>
> >
> > yes
> >
> >
> >> >>While merging on windows, the old index files dont get deleted.
> >> >>(Windows has an issue where the file opened for reading cannot be
> >> >>deleted)
> >> >>
> >> >>So, if you call optimize again, it will delete the older index files.
> >>
> >> no.
> > during optimize you only delete docs, which are flagged as deleted. no
> > matter how old they are.
> > if your numDocs and maxDocs have the same number of Docs, you only
> rebuild
> > and merge your index, but you delete nothing.
> >
> > Regards
> >
> >
> >
> >
> >> On Wed, Sep 28, 2011 at 6:43 PM, Vadim Kisselmann
> >>  wrote:
> >> > if numDocs und maxDocs have the same mumber of docs nothing will be
> >> deleted
> >> > on optimize.
> >> > You only rebuild your index.
> >> >
> >> > Regards
> >> > Vadim
> >> >
> >> >
> >> >
> >> >
> >> > 2011/9/28 Kissue Kissue 
> >> >
> >> >> numDocs and maxDocs are same size.
> >> >>
> >> >> I was worried because when i used to use only Lucene for the same
> >> indexing,
> >> >> before optimization there are many files but after optimization i
> always
> >> >> end
> >> >> up with just 3 files in my index filder. Just want to find out if
> this
> >> was
> >> >> ok.
> >> >>
> >> >> Thanks
> >> >>
> >> >> On Wed, Sep 28, 2011 at 1:23 PM, Vadim Kisselmann <
> >> >> v.kisselm...@googlemail.com> wrote:
> >> >>
> >> >> > why should the optimization reduce the number of files?
> >> >> > It happens only when you indexing docs with same unique key.
> >> >> >
> >> >> > Have you differences in numDocs und maxDocs after optimize?
> >> >> > If yes:
> >> >> > how is your optimize command ?
> >> >> >
> >> >> > Regards
> >> >> > Vadim
> >> >> >
> >> >> >
> >> >> >
> >> >> > 2011/9/28 Manish Bafna 
> >> >> >
> >> >> > > Try to do optimize twice.
> >> >> > > The 2nd one will be quick and will delete lot of files.
> >> >> > >
> >> >> > > On Wed, Sep 28, 2011 at 5:26 PM, Kissue Kissue <
> kissue...@gmail.com
> >> >
> >> >> > > wrote:
> >> >> > > > Hi,
> >> >> > > >
> >> >> > > > I am using solr 3.3. I noticed  that after indexing about 700,
> 000
> >> >> > > records
> >> >> > > > and running optimization at the end, i still have about 91
> files
> >> in
> >> >> my
> >> >> > > index
> >> >> > > > directory. I thought that optimization was supposed to reduce
> the
> >> >> > number
> >> >> > > of
> >> >> > > > files.
> >> >> > > >
> >> >> > > > My settings are the default that came with Solr (mergefactor,
> etc)
> >> >> > > >
> >> >> > > > Any ideas what i could be doing wrong?
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >> >
> >>
> >
>


Strange result behavior with wildcard-queries

2011-10-13 Thread Vadim Kisselmann
Hello folks,
my wildcard-search shows strange behavior.
Sometimes i have results, sometimes not.

I use the last nightly build(Solr 4.0, Build #1643)

I use these filters and tokenizers for "index":
WhitespaceTokenizer
WordDelimiterFilter
LowerCaseFilter
RemoveDuplicatesTokenFilter
ReversedWildcardFilter

This one for "query":
WhitespaceTokenizer
WordDelimiterFilter
LowerCaseFilter
RemoveDuplicatesTokenFilter
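In schema.xml terms this is roughly the following (the field type name is
just an example):

<fieldType name="text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    <filter class="solr.ReversedWildcardFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>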

This is my search-text:
http://www.redakteur.eu/?p=89025

And my words:

Schnelle DNA-Analyse führt zu taktischen Vorsprüngen im Gefechtseinsatz
*DNA-Analyse AND Gefechtseinsatz*

Behavior and Query:

DNA-Analyse AND Gefechtseinsatz (1 MATCH)
DNA-Analyse AND Gefechts* (1 MATCH)
DNA-Analyse AND ge* (1 MATCH)

DNA* (0 MATCHES)
dna* (6 MATCHES)

Gefechts* (0 MATCHES)
gefecht* (6 MATCHES)

*seinsatz (1 MATCH)
*NA-Analyse (0 MATCHES)
*na-Analyse (0 MATCHES)
*na-analyse (0 MATCHES)

Which filters can help me to match this one result with a standard wildcard
query and a reversed wildcard query?

Thanks and regards
Vadim


Re: Strange result behavior with wildcard-queries

2011-10-13 Thread Vadim Kisselmann
Hi Erick,

thanks for your quick response.
I've analyzed it and i've already thought the same.

These are the JIRA issues:
https://issues.apache.org/jira/browse/SOLR-219
https://issues.apache.org/jira/browse/SOLR-2438
Both are still open.

I think i'll wait 1-2 months. Then i'll write a custom component:)

Regards
Vadim




2011/10/13 Erick Erickson 

> Wildcard queries are NOT analyzed. So the fact that your
> queries that are identical except for case produce different
> result sets is expected behavior. I believe there's a JIRA to
> allow limited analysis of wildcard queries, but I confess I
> don't know what the status of it is.
>
> You'll have to do whatever normalization you need to do at
> the app level before you pass the query on to Solr or write a
> custom component to deal with this case I think.
>
> Best
> Erick
>
> On Thu, Oct 13, 2011 at 10:26 AM, Vadim Kisselmann
>  wrote:
> > Hello folks,
> > my wildcard-search shows strange behavior.
> > Sometimes i have results, sometimes not.
> >
> > I use the last nightly build(Solr 4.0, Build #1643)
> >
> > I use this filters and tokenizers to "index":
> > WhitespaceTokenizer
> > WoldDelimiterFilter
> > LowerCaseFilter
> > RemoveDuplicateTokenFilter
> > ReversedWildcardFilter
> >
> > This one for "query":
> > WhitespaceTokenizer
> > WoldDelimiterFilter
> > LowerCaseFilter
> > RemoDuplicatesTokenFilter
> >
> > This is my search-text:
> > http://www.redakteur.eu/?p=89025
> >
> > And my words:
> >
> > Schnelle DNA-Analyse führt zu taktischen Vorsprüngen im Gefechtseinsatz
> > *DNA-Analyse AND Gefechtseinsatz*
> >
> > Behavior and Query:
> >
> > DNA-Analyse AND Gefechtseinsatz (1 MATCH)
> > DNA-Analyse AND Gefechts* (1 MATCH)
> > DNA-Analyse AND ge* (1 MATCH)
> >
> > DNA* (0 MATCHES)
> > dna* (6 MATCHES)
> >
> > Gefechts* (0 MATCHES)
> > gefecht* (6 MATCHES)
> >
> > *seinsatz (1 MATCH)
> > *NA-Analyse (0 MATCHES)
> > *na-Analyse (0 MATCHES)
> > *na-analyse (0 MATCHES)
> >
> > Which filters can help me to match this one result with
> standard-wildquery
> > und reversed-wildquery?
> >
> > Thanks and regards
> > Vadim
> >
>


Morelikethis understanding question

2011-10-14 Thread Vadim Kisselmann
Hello folks,
i have a question about the MLT.

For example my query:

localhost:8983/solr/mlt/?q=gefechtseinsatz+AND+dna&mlt=true&mlt.fl=text&mlt.count=0&mlt.boost=true&mlt.mindf=5&mlt.mintf=5&mlt.minwl=4

*I have 1 query result and 13 MLT docs. The MLT result corresponds to
half of my index.*
In my case i want *just those docs which have at least half of the words
from my query-result document*; they should be very similar.
How should i set my parameters to achieve this?

Thanks and Regards
Vadim


Re: millions of records problem

2011-10-17 Thread Vadim Kisselmann
Hi,
a number of relevant questions have already been asked.
i have another one:
what type of docs do you have? Do you add new docs every day, or is it
a stable number of docs (500 million)?
What about replication?

Regards Vadim


2011/10/17 Otis Gospodnetic 

> Hi Jesús,
>
> Others have already asked a number of relevant question.  If I had to
> guess, I'd guess this is simply a disk IO issue, but of course there may be
> room for improvement without getting more RAM or SSDs, so tell us more about
> your queries, about disk IO you are seeing, etc.
>
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
> >
> >From: Jesús Martín García 
> >To: solr-user@lucene.apache.org
> >Sent: Monday, October 17, 2011 6:19 AM
> >Subject: millions of records problem
> >
> >Hi,
> >
> >I've got 500 million documents in Solr, each with the same number
> of fields and similar width. The version of Solr which I use is 1.4.1 with
> lucene 2.9.3.
> >
> >I don't have the option to use shards so the whole index has to be in a
> machine...
> >
> >The size of the index is about 50GB and the RAM is 8GB. Everything is
> working but the searches are very slow, although I tried different
> configurations of the solrconfig.xml such as:
> >
> >- configure a first searcher with the most used searches
> >- configure the caches (query, filter and document) with great numbers...
> >
> >but everything is still slow, so do you have any ideas to speed up
> the searches without the penalty of using much more RAM?
> >
> >Thanks in advance,
> >
> >Jesús
> >
> >-- ...
> >  __
> >/   /   Jesús Martín García
> >C E / S / C A   Tècnic de Projectes
> >  /__ / Centre de Serveis Científics i Acadèmics de Catalunya
> >
> >Gran Capità, 2-4 (Edifici Nexus) · 08034 Barcelona
> >T. 93 551 6213 · F. 93 205 6979 · jmar...@cesca.cat
> >...
> >
> >
> >
> >
>


LUCENE-2208 (SOLR-1883) Bug with HTMLStripCharFilter, given patch in next nightly build?

2011-10-20 Thread Vadim Kisselmann
Hello folks,

i have big problems with InvalidTokenOffsetsExceptions with highlighting.
Looks like a bug in HTMLStripCharFilter.

H.Wang added a patch in LUCENE-2208, but nobody has had time to look at it.
Could one of the committers please take a look at this patch and commit
it, or is this problem more complicated than i think? :)
Thanks guys...

Best Regards
Vadim


Re: LUCENE-2208 (SOLR-1883) Bug with HTMLStripCharFilter, given patch in next nightly build?

2011-10-21 Thread Vadim Kisselmann
UPDATE:
i checked out the latest trunk-version and patched this with the patch from
LUCENE-2208.
This patch does not seem to work, or i have done something wrong.

My old log snippets:

Http - 500 Internal Server Error
Error: Carrot2 clustering failed

And this was caused by:
Http - 500 Internal Server Error
Error: org.apache.lucene.search.highlight.InvalidTokenOffsetsException:
Token the exceeds length of provided text sized 41

Best Regards
Vadim





2011/10/20 Vadim Kisselmann 

> Hello folks,
>
> i have big problems with InvalidTokenOffsetExceptions with highlighting.
> Looks like a bug in HTMLStripCharFilter.
>
> H.Wang added a patch in LUCENE-2208, but nobody have time to look at this.
> Could someone of the committers please take a look at this patch and commit
> it or is this problem more complicated as i think? :)
> Thanks guys...
>
> Best Regards
> Vadim
>
>
>


how to : multicore setup with same config files

2011-10-31 Thread Vadim Kisselmann
Hi folks,

i have hit a small roadblock in the configuration of a multicore setup.
i use the latest solr version (4.0) from trunk and the example (with jetty).
single core is running without problems.

We assume that i have this structure:

/solr-trunk/solr/example/multicore/

   solr.xml

   core0/

   core1/


/solr-data/

  /conf/

schema.xml

solrconfig.xml

  /data/

core0/

  index

core1/

  index


I want to share the config files (same instanceDir but different dataDir).

How can i configure this so that it works (solrconfig.xml, solr.xml)?

Do i need the directories for core0/core1 in solr-trunk/...?


I found issues in Jira with old patches which unfortunately don't work.


Thanks and Regards

Vadim


Re: how to : multicore setup with same config files

2011-10-31 Thread Vadim Kisselmann
it works.
it was one misplaced backslash in my config;)
sharing the config/schema files is not a problem.
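for reference, a minimal solr.xml sketch of the shared setup (paths follow
the example layout from my first mail):

<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="/solr-data/" dataDir="/solr-data/data/core0"/>
    <core name="core1" instanceDir="/solr-data/" dataDir="/solr-data/data/core1"/>
  </cores>
</solr>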
regards vadim


2011/10/31 Vadim Kisselmann 

> Hi folks,
>
> i have a small blockade in the configuration of an multicore setup.
> i use the latest solr version (4.0) from trunk and the example (with
> jetty).
> single core is running without problems.
>
> We assume that i have this structure:
>
> /solr-trunk/solr/example/multicore/
>
>solr.xml
>
>core0/
>
>core1/
>
>
> /solr-data/
>
>   /conf/
>
> schema.xml
>
> solrconfig.xml
>
>   /data/
>
> core0/
>
>   index
>
> core1/
>
>   index
>
>
> I want so share the config-files(same instanceDir but different docDir)
>
> How can i configure this so that it works(solrconfig.xml, solr.xml)?
>
> Do i need the directories for core0/core1 in solr-trunk/...?
>
>
> I found issues in Jira with old patches which unfortunately doesn't work.
>
>
> Thanks and Regards
>
> Vadim
>
>
>
>
>
>


shard indexing

2011-11-02 Thread Vadim Kisselmann
Hello folks,
i have a problem with shard indexing.

with a single core i use this update command:
http://localhost:8983/solr/update .

now i have 2 shards, we can call them core0 / core1:
http://localhost:8983/solr/core0/update .


can i adjust anything so that i can index the same way as with a single core,
without the core name?

thanks and regards
vadim


Re: shard indexing

2011-11-02 Thread Vadim Kisselmann
Hello Jan,

thanks for your quick response.

It's quite difficult to explain:
We want to create new shards on the fly every month and switch the default
shard to the newest one.
We always want to index to the newest shard with the same update URL,
i.e. http://localhost:8983/solr/update (content stream).

Is our idea possible to implement?

Thanks in advance.
Regards

Vadim





2011/11/2 Jan Høydahl 

> Hi,
>
> The only difference is the core name in the URL, which should be easy
> enough to handle from your indexing client code. I don't really understand
> the reason behind your request. How would you control which core to index
> your document to if you did not specify it in the URL?
>
> You could name ONE of your cores as ".", meaning it would be the "default"
> core living at /solr/update, perhaps that is what you're looking for?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 2. nov. 2011, at 10:00, Vadim Kisselmann wrote:
>
> > Hello folks,
> > i have an problem with shard indexing.
> >
> > with an single core i use this update command:
> > http://localhost:8983/solr/update .
> >
> > now i have 2 shards, we can call them core0 / core1
> > http://localhost:8983/solr/core0/update .
> >
> >
> > can i adjust anything to indexing in the same way like with a single core
> > without core-name?
> >
> > thanks and regards
> > vadim
>
>


Re: shard indexing

2011-11-02 Thread Vadim Kisselmann
Hello Yury,

thanks for your response.
This is exactly my plan. But "defaultCoreName" is buggy. When i use it
(defaultCoreName="core_november"), the default core will be deleted.
I think this here was the issue:
https://issues.apache.org/jira/browse/SOLR-2127
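For reference, the relevant part of our solr.xml looked roughly like this
(core name only as an example):

<cores adminPath="/admin/cores" defaultCoreName="core_november">
  ...
</cores>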

Do you use this feature and did it work?

Thanks and Regards
Vadim




2011/11/2 Yury Kats 

> There's a "defaultCore" parameter in solr.xml that let's you specify what
> core should be used when none is specified in the URL. You can change that
> every time you create a new core.
>
>
>
> >
> >From: Vadim Kisselmann 
> >To: solr-user@lucene.apache.org
> >Sent: Wednesday, November 2, 2011 6:16 AM
> >Subject: Re: shard indexing
> >
> >Hello Jan,
> >
> >thanks for your quick response.
> >
> >It's quite difficult to explain:
> >We want to create new shards on the fly every month and switch the default
> >shard to the newest one.
> >We always want to index to the newest shard with the same update query
> >like  http://localhost:8983/solr/update.(content stream)
> >
> >Is our idea possible to implement?
> >
> >Thanks in advance.
> >Regards
> >
> >Vadim
> >
> >
> >
> >
> >
> >2011/11/2 Jan Høydahl 
> >
> >> Hi,
> >>
> >> The only difference is the core name in the URL, which should be easy
> >> enough to handle from your indexing client code. I don't really
> understand
> >> the reason behind your request. How would you control which core to
> index
> >> your document to if you did not specify it in the URL?
> >>
> >> You could name ONE of your cores as ".", meaning it would be the
> "default"
> >> core living at /solr/update, perhaps that is what you're looking for?
> >>
> >> --
> >> Jan Høydahl, search solution architect
> >> Cominvent AS - www.cominvent.com
> >> Solr Training - www.solrtraining.com
> >>
> >> On 2. nov. 2011, at 10:00, Vadim Kisselmann wrote:
> >>
> >> > Hello folks,
> >> > i have an problem with shard indexing.
> >> >
> >> > with an single core i use this update command:
> >> > http://localhost:8983/solr/update .
> >> >
> >> > now i have 2 shards, we can call them core0 / core1
> >> > http://localhost:8983/solr/core0/update .
> >> >
> >> >
> >> > can i adjust anything to indexing in the same way like with a single
> core
> >> > without core-name?
> >> >
> >> > thanks and regards
> >> > vadim
> >>
> >>
> >
> >
> >
>


Re: shard indexing

2011-11-02 Thread Vadim Kisselmann
Hello Jan,

personally i think the same (switch the URL in my indexing code), but my
requirement is to use the same update URL.
Thanks for the suggestion with this trick. Great idea which could work in
my case, i'll test it.
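For reference, the swap itself would then be a single CoreAdmin call, e.g.
(core names only as examples):

http://localhost:8983/solr/admin/cores?action=SWAP&core=current&other=november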

Regards
Vadim



2011/11/2 Jan Høydahl 

> Personally I think it is better to be explicit about where you index, so
> that when you create a new shard "december", you also switch the URL for
> your indexing code.
>
> I suppose one trick you could use is to have a core called "current",
> which now would be for november, and once you get to december, you create a
> "november" core, and do a SWAP between "current"<->"november". Then your
> new core would now be "current" and you don't need to change URLs on the
> index client side.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 2. nov. 2011, at 11:16, Vadim Kisselmann wrote:
>
> > Hello Jan,
> >
> > thanks for your quick response.
> >
> > It's quite difficult to explain:
> > We want to create new shards on the fly every month and switch the
> default
> > shard to the newest one.
> > We always want to index to the newest shard with the same update query
> > like  http://localhost:8983/solr/update.(content stream)
> >
> > Is our idea possible to implement?
> >
> > Thanks in advance.
> > Regards
> >
> > Vadim
> >
> >
> >
> >
> >
> > 2011/11/2 Jan Høydahl 
> >
> >> Hi,
> >>
> >> The only difference is the core name in the URL, which should be easy
> >> enough to handle from your indexing client code. I don't really
> understand
> >> the reason behind your request. How would you control which core to
> index
> >> your document to if you did not specify it in the URL?
> >>
> >> You could name ONE of your cores as ".", meaning it would be the
> "default"
> >> core living at /solr/update, perhaps that is what you're looking for?
> >>
> >> --
> >> Jan Høydahl, search solution architect
> >> Cominvent AS - www.cominvent.com
> >> Solr Training - www.solrtraining.com
> >>
> >> On 2. nov. 2011, at 10:00, Vadim Kisselmann wrote:
> >>
> >>> Hello folks,
> >>> i have an problem with shard indexing.
> >>>
> >>> with an single core i use this update command:
> >>> http://localhost:8983/solr/update .
> >>>
> >>> now i have 2 shards, we can call them core0 / core1
> >>> http://localhost:8983/solr/core0/update .
> >>>
> >>>
> >>> can i adjust anything to indexing in the same way like with a single
> core
> >>> without core-name?
> >>>
> >>> thanks and regards
> >>> vadim
> >>
> >>
>
>


Similar documents and advantages / disadvantages of MLT / Deduplication

2011-11-07 Thread Vadim Kisselmann
Hello folks,

i have questions about MLT and Deduplication and what would be the best
choice in my case.

Case:

I index 1000 docs, 5 of them are 95% the same (for example: copy pasted
blog articles from different sources, with slight changes (author name,
etc..)).
But they have differences.
*Now i would like to see 1 doc in my result set and the other 4 should be marked
as similar.*

With *MLT*:
text
  5
  50
  3
  5000
  true
  text
   

With this config i get about 500 similar docs for this 1 doc, unfortunately
far too many.


*Deduplication*:
I now index these docs with a signature, using TextProfileSignature.


   
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature_t</str>
    <bool name="overwriteDupes">false</bool>
    <str name="fields">text</str>
    <str name="signatureClass">solr.processor.TextProfileSignature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

How can i compare the created signatures?
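One idea i had: since every doc stores its signature in signature_t, i could
facet on that field and look at values with a count > 1, e.g.

http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=signature_t&facet.mincount=2

but i'm not sure this is the intended way.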


I only want to see the 5 similar docs, nothing else.
Which of these two approaches is relevant for me? Can i tune the MLT for my
requirement? Or should i use dedupe?

Thanks and Regards
Vadim


Re: InvalidTokenOffsetsException when using MappingCharFilterFactory, DictionaryCompoundWordTokenFilterFactory and Highlighting

2011-11-11 Thread Vadim Kisselmann
Hi Edwin, Chris

it´s an old bug. I have big problems too with OffsetExceptions when i use
Highlighting, or Carrot.
It looks like a problem with HTMLStripCharFilter.
Patch doesn´t work.

https://issues.apache.org/jira/browse/LUCENE-2208

Regards
Vadim



2011/11/11 Edwin Steiner 

> I just entered a bug: https://issues.apache.org/jira/browse/SOLR-2891
>
> Thanks & regards, Edwin
>
> On Nov 7, 2011, at 8:47 PM, Chris Hostetter wrote:
>
> >
> > : finally I want to use Solr highlighting. But there seems to be a
> problem
> > : if I combine the char filter and the compound word filter in
> combination
> > : with highlighting (an
> > : org.apache.lucene.search.highlight.InvalidTokenOffsetsException is
> > : raised).
> >
> > Definitely sounds like a bug somwhere in dealing with the offsets.
> >
> > can you please file a Jira, and include all of the data you have provided
> > here?  it would also be helpful to know what the analysis tool says about
> > the various attributes of your tokens at each stage of the analysis?
> >
> > : SEVERE: org.apache.solr.common.SolrException:
> org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token fall
> exceeds length of provided text sized 12
> > : at
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByHighlighter(DefaultSolrHighlighter.java:469)
> > : at
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:378)
> > : at
> org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:116)
> > : at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
> > : at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> > : at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
> > : at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
> > : at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
> > : at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
> > : at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
> > : at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
> > : at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
> > : at
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:462)
> > : at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:164)
> > : at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
> > : at
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:851)
> > : at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
> > : at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:405)
> > : at
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:278)
> > : at
> org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:515)
> > : at
> org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:302)
> > : at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > : at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > : at java.lang.Thread.run(Thread.java:680)
> > : Caused by:
> org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token fall
> exceeds length of provided text sized 12
> > : at
> org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:228)
> > : at
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByHighlighter(DefaultSolrHighlighter.java:462)
> > : ... 23 more
> >
> >
> > -Hoss
>
>


Re: how to : multicore setup with same config files

2011-11-23 Thread Vadim Kisselmann
Hi,
yes, see http://wiki.apache.org/solr/DistributedSearch
Regards
Vadim


2011/11/2 Val Minyaylo 

> Have you tried to query multiple cores at same time?
>
>
> On 10/31/2011 8:30 AM, Vadim Kisselmann wrote:
>
>> it works.
>> it was one wrong placed backslash in my config;)
>> sharing the config/schema files is not a problem.
>> regards vadim
>>
>>
>> 2011/10/31 Vadim 
>> Kisselmann
>> >
>>
>>  Hi folks,
>>>
>>> i have a small blockade in the configuration of an multicore setup.
>>> i use the latest solr version (4.0) from trunk and the example (with
>>> jetty).
>>> single core is running without problems.
>>>
>>> We assume that i have this structure:
>>>
>>> /solr-trunk/solr/example/**multicore/
>>>
>>>solr.xml
>>>
>>>core0/
>>>
>>>core1/
>>>
>>>
>>> /solr-data/
>>>
>>>   /conf/
>>>
>>> schema.xml
>>>
>>> solrconfig.xml
>>>
>>>   /data/
>>>
>>> core0/
>>>
>>>   index
>>>
>>> core1/
>>>
>>>   index
>>>
>>>
>>> I want so share the config-files(same instanceDir but different docDir)
>>>
>>> How can i configure this so that it works(solrconfig.xml, solr.xml)?
>>>
>>> Do i need the directories for core0/core1 in solr-trunk/...?
>>>
>>>
>>> I found issues in Jira with old patches which unfortunately doesn't work.
>>>
>>>
>>> Thanks and Regards
>>>
>>> Vadim
>>>
>>>
>>>
>>>
>>>
>>>
>>>


Weird docs-id clustering output in Solr 1.4.1

2011-11-29 Thread Vadim Kisselmann
Hi folks,
i've installed the clustering component in solr 1.4.1 and it works, but not
really:)

You can see that the doc ids are corrupt.



Euro-Krise

½Íџ
¾౥ͽ
¿)ై
ˆ࡯׸


my fields:





and my config-snippets:
<str name="carrot.title">title</str>
<str name="carrot.url">id</str>
<str name="carrot.snippet">text</str>

i changed my config snippets (carrot.url=id, url, title..) but the
result is the same.
anyone an idea?

best regards and thanks
vadim


Re: Weird docs-id clustering output in Solr 1.4.1

2011-11-29 Thread Vadim Kisselmann
Hello Staszek,

thanks for testing:)
i think the same (serialization issue ->int to string).
This config works fine with solr 4.0 in my test cluster, i think with 3,5
too, without problems.
But my actual live system works on solr 1.4.1. i can only change my
solrconfig.xml and integrate new packages...
i check the possibility to upgrade from 1.4.1 to 3.5 with the same index
(without reinidex) with luceneMatchVersion 2.9.
i hope it works...

Thanks and regards
Vadim


2011/11/29 Stanislaw Osinski 

> Hi,
>
> It looks like some serialization issue related to writing integer ids to
> the output. I've just tried a similar configuration on Solr 3.5 and the
> integer identifiers looked fine. Can you try the same configuration on Solr
> 3.5?
>
> Thanks,
>
> Staszek
>
> On Tue, Nov 29, 2011 at 12:03, Vadim Kisselmann <
> v.kisselm...@googlemail.com
> > wrote:
>
> > Hi folks,
> > i've installed the clustering component in solr 1.4.1 and it works, but
> not
> > really:)
> >
> > You can see what the doc id is corrupt.
> >
> > 
> > 
> > Euro-Krise
> > 
> > ½Íџ
> > ¾౥ͽ
> > ¿)ై
> > ˆ࡯׸
> > 
> >
> > my fields:
> >  required="true"/>
> >  > required="true"/>
> >  > required="true"/>
> >  > multiValued="true" compressed="true"/>
> >
> > and my config-snippets:
> > title
> >  id
> >  
> >  text
> >
> > i changed my config snippets (carrot.url=id, url, title..) but the
> > result is the same.
> > anyone an idea?
> >
> > best regards and thanks
> > vadim
> >
>


Re: Weird docs-id clustering output in Solr 1.4.1

2011-11-29 Thread Vadim Kisselmann
Hi,
the quick and dirty way sounds good:)
It would be great if you can send me a patch for 1.4.1.


By the way, i tested Solr 3.5 with my 1.4.1 test index.
I can search and optimize, but clustering doesn't work (java.lang.Integer
cannot be cast to java.lang.String).
My uniqueKey for my docs is the "id" (sint).
This was the error message:


Problem accessing /solr/select/. Reason:

   Carrot2 clustering failed

org.apache.solr.common.SolrException: Carrot2 clustering failed
   at
org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:217)
   at
org.apache.solr.handler.clustering.ClusteringComponent.process(ClusteringComponent.java:91)
   at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
   at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
   at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
   at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
   at
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
   at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
   at
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
   at
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
   at
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
   at
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
   at
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
   at org.mortbay.jetty.Server.handle(Server.java:326)
   at
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
   at
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
   at
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
   at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast
to java.lang.String
   at
org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.getDocuments(CarrotClusteringEngine.java:364)
   at
org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:201)
   ... 23 more

In this case it's better for me to upgrade/patch the 1.4.1 version.

Best regards
Vadim




2011/11/29 Stanislaw Osinski 

> >
> > But my actual live system works on solr 1.4.1. i can only change my
> > solrconfig.xml and integrate new packages...
> > i check the possibility to upgrade from 1.4.1 to 3.5 with the same index
> > (without reinidex) with luceneMatchVersion 2.9.
> > i hope it works...
> >
>
> Another option would be to check out Solr 1.4.1 source code, fix the issue
> and recompile the clustering component. The quick and dirty way would be to
> convert all identifiers to strings in the clustering component, before the
> they are returned for serialization (I can send you a patch that does
> this). The proper way would be to fix the root cause of the problem, but
> I'd need to dig deeper into the code to find this.
>
> Staszek
>


Re: Weird docs-id clustering output in Solr 1.4.1

2011-12-01 Thread Vadim Kisselmann
Hi Stanislaw,
have you already had time to create a patch?
If not, can you please tell me which lines in which class in the source code
are relevant?
Thanks and regards
Vadim Kisselmann



2011/11/29 Vadim Kisselmann 

> Hi,
> the quick and dirty way sound good:)
> It would be great if you can send me a patch for 1.4.1.
>
>
> By the way, i tested Solr. 3.5 with my 1.4.1 test index.
> I can search and optimize, but clustering doesn't work (java.lang.Integer
> cannot be cast to java.lang.String)
> My uniqieKey for my docs it the "id"(sint).
> These here was the error message:
>
>
> Problem accessing /solr/select/. Reason:
>
>Carrot2 clustering failed
>
> org.apache.solr.common.SolrException: Carrot2 clustering failed
>at
> org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:217)
>at
> org.apache.solr.handler.clustering.ClusteringComponent.process(ClusteringComponent.java:91)
>at
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
>at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
>at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
>at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
>at
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>at
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>at
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>at
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>at
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>at
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>at
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>at
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>at org.mortbay.jetty.Server.handle(Server.java:326)
>at
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>at
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
>at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
>at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>at
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
>at
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast
> to java.lang.String
>at
> org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.getDocuments(CarrotClusteringEngine.java:364)
>at
> org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:201)
>... 23 more
>
> It this case it's better for me to upgrade/patch the 1.4.1 version.
>
> Best regards
> Vadim
>
>
>
>
> 2011/11/29 Stanislaw Osinski 
>
>> >
>> > But my actual live system works on solr 1.4.1. i can only change my
>> > solrconfig.xml and integrate new packages...
>> > i check the possibility to upgrade from 1.4.1 to 3.5 with the same index
>> > (without reinidex) with luceneMatchVersion 2.9.
>> > i hope it works...
>> >
>>
>> Another option would be to check out Solr 1.4.1 source code, fix the issue
>> and recompile the clustering component. The quick and dirty way would be
>> to
>> convert all identifiers to strings in the clustering component, before the
>> they are returned for serialization (I can send you a patch that does
>> this). The proper way would be to fix the root cause of the problem, but
>> I'd need to dig deeper into the code to find this.
>>
>> Staszek
>>
>
>


Re: Error in New Solr version

2011-12-01 Thread Vadim Kisselmann
Hi,
comment out the lines with the collapse component in your solrconfig.xml if
you don't need it.
Otherwise, you're missing the right jars for this component, or the paths to
these jars in your solrconfig.xml are wrong.
regards
vadim



2011/12/1 Pawan Darira 

> Hi
>
> I am migrating from Solr 1.4 to Solr 3.2. I am getting below error in my
> logs
>
> org.apache.solr.common.SolrException: Error loading class
> 'org.apache.solr.handler.component.CollapseComponent
>
> Could not found satisfactory solution on google. please help
>
> thanks
> Pawan
>


Re: Weird docs-id clustering output in Solr 1.4.1

2011-12-01 Thread Vadim Kisselmann
Hi Stanislaw,

unfortunately it doesn't work.
I changed line 216 with the new "toString()" part and rebuilt the
source.
Still the same behavior, and no errors (because of the changes).
Is there another line to change?

Thanks and regards
Vadim



2011/12/1 Stanislaw Osinski 

> Hi Vadim,
>
> I've had limited connectivity, so I couldn't check out the complete 1.4.1
> code and test the changes. Here's what you can try:
>
> In this file:
>
>
> http://svn.apache.org/viewvc/lucene/solr/tags/release-1.4.1/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/carrot2/CarrotClusteringEngine.java?revision=957515&view=markup
>
> around line 216 you will see:
>
> for (Document doc : docs) {
>  docList.add(doc.getField("solrId"));
> }
>
> You need to change this to:
>
> for (Document doc : docs) {
>  docList.add(doc.getField("solrId").toString());
> }
>
> Let me know if this did the trick.
>
> Cheers,
>
> S.
>
> On Thu, Dec 1, 2011 at 10:43, Vadim Kisselmann
> wrote:
>
> > Hi Stanislaw,
> > did you already have time to create a patch?
> > If not, can you tell me please which lines in which class in source code
> > are relevant?
> > Thanks and regards
> > Vadim Kisselmann
> >
> >
> >
> > 2011/11/29 Vadim Kisselmann 
> >
> > > Hi,
> > > the quick and dirty way sound good:)
> > > It would be great if you can send me a patch for 1.4.1.
> > >
> > >
> > > By the way, i tested Solr. 3.5 with my 1.4.1 test index.
> > > I can search and optimize, but clustering doesn't work
> (java.lang.Integer
> > > cannot be cast to java.lang.String)
> > > My uniqieKey for my docs it the "id"(sint).
> > > These here was the error message:
> > >
> > >
> > > Problem accessing /solr/select/. Reason:
> > >
> > >Carrot2 clustering failed
> > >
> > > org.apache.solr.common.SolrException: Carrot2 clustering failed
> > >at
> > >
> >
> org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:217)
> > >at
> > >
> >
> org.apache.solr.handler.clustering.ClusteringComponent.process(ClusteringComponent.java:91)
> > >at
> > >
> >
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
> > >at
> > >
> >
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> > >at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
> > >at
> > >
> >
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
> > >at
> > >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
> > >at
> > >
> >
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> > >at
> > >
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> > >at
> > >
> >
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> > >at
> > >
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> > >at
> > >
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> > >at
> > org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> > >at
> > >
> >
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> > >at
> > >
> >
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
> > >at
> > >
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> > >at org.mortbay.jetty.Server.handle(Server.java:326)
> > >at
> > > org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> > >at
> > >
> >
> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> > >at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> > >at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> > >at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> > >at
> > >
> >
> org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
> > >at
> > >
> >
> org.mortbay.thread

Size of fields from one document (monitoring, debugging)

2012-01-18 Thread Vadim Kisselmann
Hello folks,

is it possible to find out the size (in KB) of specific fields of
one document? Perhaps with Luke or Lucid Gaze?
My case:
docs in my old index (Solr 1.4) have sizes of 3-4KB each.
In my new index(Solr 4.0 trunk) there are about 15KB per doc.
I changed only 2 things in my schema.xml. I added the
ReversedWildcardFilterFactory(indexing) and one field (LatLonType,
stored and indexed).
My content is more or less the same.
I would like to debug this to refactor my schema.xml.

The newest Luke Version(3.5) doesn't work with Solr 4.0 from trunk, so
i can't test it.

Cheers
Vadim


Re: Size of index to use shard

2012-01-24 Thread Vadim Kisselmann
Hi,
it depends on your hardware.
Read this:
http://www.derivante.com/2009/05/05/solr-performance-benchmarks-single-vs-multi-core-index-shards/
Think about your cache-config (few updates, big caches) and a good
HW-infrastructure.
In my case i can handle a 250GB index with 100mil. docs on an i7
machine with RAID10 and 24GB RAM => q-times under 1 sec.
Regards
Vadim



2012/1/24 Anderson vasconcelos :
> Hi
> Is there some index size (or number of docs) at which it becomes necessary
> to break the index into shards?
> I have an index with a size of 100GB. This index grows by 10GB per year.
> (I don't have information on how many docs it holds) and the docs will never
> be deleted.  Thinking 30 years ahead, the index will be about 400GB
> in size.
>
> I think it is not required to break it into shards, because i do not consider
> this a "large index". Am I correct? What is a real "large
> index"?
>
>
> Thanks


Re: Size of index to use shard

2012-01-24 Thread Vadim Kisselmann
@Erick
thanks:)
i agree with your opinion.
my load tests show the same.

@Dmitry
my docs are small too, i think about 3-15KB per doc.
i update my index all the time and i have an average of 20-50 requests
per minute (20% facet queries, 80% large boolean queries with
wildcard/fuzzy). How many docs at a time => depends on the chosen
filters, from 10 up to all 100Mio.
I work with very small caches (strangely, but if my index is under
100GB i need larger caches, over 100GB smaller caches..)
My JVM has 6GB, 18GB for I/O.
With only a few updates a day i would configure very big caches, like Tim
Burton (see HathiTrust's blog)

Regards Vadim



2012/1/24 Anderson vasconcelos :
> Thanks for the explanation Erick :)
>
> 2012/1/24, Erick Erickson :
>> Talking about "index size" can be very misleading. Take
>> a look at http://lucene.apache.org/java/3_5_0/fileformats.html#file-names.
>> Note that the *.fdt and *.fdx files are used to for stored fields, i.e.
>> the verbatim copy of data put in the index when you specify
>> stored="true". These files have virtually no impact on search
>> speed.
>>
>> So, if your *.fdx and *.fdt files are 90G out of a 100G index
>> it is a much different thing than if these files are 10G out of
>> a 100G index.
>>
>> And this doesn't even mention the peculiarities of your query mix.
>> Nor does it say a thing about whether your cheapest alternative
>> is to add more memory.
>>
>> Anderson's method is about the only reliable one, you just have
>> to test with your index and real queries. At some point, you'll
>> find your tipping point, typically when you come under memory
>> pressure. And it's a balancing act between how much memory
>> you allocate to the JVM and how much you leave for the op
>> system.
>>
>> Bottom line: No hard and fast numbers. And you should periodically
>> re-test the empirical numbers you *do* arrive at...
>>
>> Best
>> Erick
>>
>> On Tue, Jan 24, 2012 at 5:31 AM, Anderson vasconcelos
>>  wrote:
>>> Apparently, not so easy to determine when to break the content into
>>> pieces. I'll investigate further about the amount of documents, the
>>> size of each document and what kind of search is being used. It seems,
>>> I will have to do a load test to identify the cutoff point to begin
>>> using the strategy of shards.
>>>
>>> Thanks
>>>
>>> 2012/1/24, Dmitry Kan :
>>>> Hi,
>>>>
>>>> The article you gave mentions 13GB of index size. It is quite small index
>>>> from our perspective. We have noticed, that at least solr 3.4 has some
>>>> sort
>>>> of "choking" point with respect to growing index size. It just becomes
>>>> substantially slower than what we need (a query on avg taking more than
>>>> 3-4
>>>> seconds) once index size crosses a magic level (about 80GB following our
>>>> practical observations). We try to keep our indices at around 60-70GB for
>>>> fast searches and above 100GB for slow ones. We also route majority of
>>>> user
>>>> queries to fast indices. Yes, caching may help, but not necessarily we
>>>> can
>>>> afford adding more RAM for bigger indices. BTW, our documents are very
>>>> small, thus in 100GB index we can have around 200 mil. documents. It
>>>> would
>>>> be interesting to see, how you manage to ensure q-times under 1 sec with
>>>> an
>>>> index of 250GB? How many documents / facets do you ask max. at a time?
>>>> FYI,
>>>> we ask for a thousand of facets in one go.
>>>>
>>>> Regards,
>>>> Dmitry
>>>>
>>>> On Tue, Jan 24, 2012 at 10:30 AM, Vadim Kisselmann <
>>>> v.kisselm...@googlemail.com> wrote:
>>>>
>>>>> Hi,
>>>>> it depends from your hardware.
>>>>> Read this:
>>>>>
>>>>> http://www.derivante.com/2009/05/05/solr-performance-benchmarks-single-vs-multi-core-index-shards/
>>>>> Think about your cache-config (few updates, big caches) and a good
>>>>> HW-infrastructure.
>>>>> In my case i can handle a 250GB index with 100mil. docs on a I7
>>>>> machine with RAID10 and 24GB RAM => q-times under 1 sec.
>>>>> Regards
>>>>> Vadim
>>>>>
>>>>>
>>>>>
>>>>> 2012/1/24 Anderson vasconcelos :
>>>>> > Hi
>>>>> > Has some size of index (or number of docs) that is necessary to break
>>>>> > the index in shards?
>>>>> > I have a index with 100GB of size. This index increase 10GB per year.
>>>>> > (I don't have information how many docs they have) and the docs never
>>>>> > will be deleted.  Thinking in 30 years, the index will be with 400GB
>>>>> > of size.
>>>>> >
>>>>> > I think  is not required to break in shard, because i not consider
>>>>> > this like a "large index". Am I correct? What's is a real "large
>>>>> > index"
>>>>> >
>>>>> >
>>>>> > Thanks
>>>>>
>>>>
>>


decreasing of maxFieldLength in solrconfig.xml doesn't work

2012-01-26 Thread Vadim Kisselmann
Hello Folks,
i want to decrease the max. number of terms for my fields to 500.
I thought the maxFieldLength parameter in solrconfig.xml was
intended for this.
In my case it doesn't work.

The half of my text fields includes longer text(about 1 words).
With 100 docs in my index i had a segment size of 1140KB for indexed
data and 270KB for stored data (.fdx, .fdt).
After changing maxFieldLength from the default 10000 to
500,
deleting the index folder, restarting Tomcat and reindexing, i see the same
segment sizes (1140KB for indexed and 270KB for stored data).

Please tell me if I made an error in reasoning.

Regards
Vadim


Re: decreasing of maxFieldLength in solrconfig.xml doesn't work

2012-01-26 Thread Vadim Kisselmann
P.S.:
i use Solr 4.0 from trunk.
Is maxFieldLength deprecated in Solr 4.0 ?
If so, do i have an alternative to decrease the number of terms during indexing?
Regards
Vadim



2012/1/26 Vadim Kisselmann :
> Hello Folks,
> i want to decrease the max. number of terms for my fields to 500.
> I thought what the maxFieldLength parameter in solrconfig.xml is
> intended for this.
> In my case it doesn't work.
>
> The half of my text fields includes longer text(about 1 words).
> With 100 docs in my index i had an segment size of 1140KB for indexed
> data and 270KB for stored data (.fdx, .fdt).
> After a change from default 1 to
> 500,
> delete(index folder), restarting Tomcat and reindex, i see the same
> segment sizes (1140KB for indexed and 270KB for stored data).
>
> Please tell me if I made an error in reasoning.
>
> Regards
> Vadim


Re: decreasing of maxFieldLength in solrconfig.xml doesn't work

2012-01-26 Thread Vadim Kisselmann
Sean, Ahmet,
thanks for response:)

I use Solr 4.0 from trunk.
In my solrconfig.xml there is only one maxFieldLength param.
I think it is deprecated in Solr versions 3.5+...

But LimitTokenCountFilterFactory works in my case :)
Thanks!
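For reference, this is roughly how it is wired into a field type in
schema.xml (only a sketch; the field type name and analyzer chain are
assumptions, the limit of 500 matches my goal from above):

<fieldType name="text_limited" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="500"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>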

Regards
Vadim



2012/1/26 Ahmet Arslan :
>> i want to decrease the max. number of terms for my fields to
>> 500.
>> I thought what the maxFieldLength parameter in
>> solrconfig.xml is
>> intended for this.
>> In my case it doesn't work.
>>
>> The half of my text fields includes longer text(about 1
>> words).
>> With 100 docs in my index i had an segment size of 1140KB
>> for indexed
>> data and 270KB for stored data (.fdx, .fdt).
>> After a change from default
>> 1 to
>> 500,
>> delete(index folder), restarting Tomcat and reindex, i see
>> the same
>> segment sizes (1140KB for indexed and 270KB for stored
>> data).
>>
>> Please tell me if I made an error in reasoning.
>
> What version of solr are you using?
>
> Could it be 
> http://lucene.apache.org/solr/api/org/apache/solr/analysis/LimitTokenCountFilterFactory.html?
>
> http://lucene.apache.org/java/3_5_0/api/core/org/apache/lucene/analysis/LimitTokenCountFilter.html


Re: Solr 3.5.0 can't find Carrot classes

2012-01-27 Thread Vadim Kisselmann
Hi Christopher,
if all needed jars are really included, the only remaining cause is wrong
paths in your solrconfig.xml.
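For the clustering contrib the lib directives usually look roughly like
this (paths are relative to the core's instanceDir and depend on where
your Solr 3.5.0 distribution lives, so treat them as a sketch):

<lib dir="../../contrib/clustering/lib/" regex=".*\.jar" />
<lib dir="../../dist/" regex="apache-solr-clustering-\d.*\.jar" />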
Regards
Vadim



2012/1/26 Stanislaw Osinski :
> Hi,
>
> Can you paste the logs from the second run?
>
> Thanks,
>
> Staszek
>
> On Wed, Jan 25, 2012 at 00:12, Christopher J. Bottaro > wrote:
>
>> On Tuesday, January 24, 2012 at 3:07 PM, Christopher J. Bottaro wrote:
>> > SEVERE: java.lang.NoClassDefFoundError:
>> org/carrot2/core/ControllerFactory
>> >         at
>> org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.(CarrotClusteringEngine.java:102)
>> >         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>> Method)
>> >         at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown
>> Source)
>> >         at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
>> >         at java.lang.reflect.Constructor.newInstance(Unknown Source)
>> >         at java.lang.Class.newInstance0(Unknown Source)
>> >         at java.lang.Class.newInstance(Unknown Source)
>> >
>> > …
>> >
>> > I'm starting Solr with -Dsolr.clustering.enabled=true and I can see that
>> the Carrot jars in contrib are getting loaded.
>> >
>> > Full log file is here:
>> http://onespot-development.s3.amazonaws.com/solr.log
>> >
>> > Any ideas?  Thanks for the help.
>> >
>> Ok, got a little further.  Seems that Solr doesn't like it if you include
>> jars more than once (I had a lib dir and also <lib> directives in the
>> solrconfig which ended up loading the same jars twice).
>>
>> But now I'm getting these errors:  java.lang.NoClassDefFoundError:
>> org/apache/solr/handler/clustering/SearchClusteringEngine
>>
>> Any help?  Thanks.


Re: AutoSoftcommit option solr 4.0

2012-11-26 Thread Vadim Kisselmann
Hi Shaveta,
simple: index a doc and search for it ;)
A soft commit is what enables near-real-time search (NRT). It can take a couple
of seconds until the new doc becomes visible,
but it should be there.
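For reference, the usual shape of these settings in solrconfig.xml looks
roughly like this (the times below are illustrative assumptions, not your
exact values):

<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>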
Best regards
Vadim


2012/11/26 Shaveta_Chawla :
> I have migrated solr 3.6 to solr 4.0. I have implemented solr4.0's auto
> commit option by adding
> 
>  1000
>
> 
>6
>false
>  
>  these lines in solrconfig.xml.
>
> I am doing these changes on my local machine. I know what autosoftcommit
> features does but how can i check that the autocommit feature is working ok?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/AutoSoftcommit-option-solr-4-0-tp4022302.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Edismax, Filter Query and Highlighting

2012-01-31 Thread Vadim Kisselmann
Hi,

i have problems with edismax, filter queries and highlighting.

First of all: can edismax deal with filter queries?

My case:
Edismax is my default requestHandler.
My query in SolrAdminGUI: (roomba OR irobot) AND language:de

You can see, that my q is "roomba OR irobot" and my fq is
"language:de"(language is a field in schema.xml)
 With this params i turn highlighting on: &hl=true&hl.fl=text,title,url

In the result shown below you can see that highlighting matched on
"de" in the url (the last field).


Erste Erfahrung mit unserem Roomba
Roboter Staubsauger

 Erste Erfahrung mit unserem Roomba Roboter Staubsauger
 Tags: Haushaltshilfe, Roboter
http://www.blog-gedanken.de/produkte/erste-erfahrung-mit-unserem-roomba-roboter-staubsauger/

in catalina.out i can see the following query:
path=/select/ 
params={hl=true&version=2.2&indent=on&rows=10&start=0&q=(roomba+OR+irobot)+AND+language:de}
hits=1 status=0 QTime=65

language:de is a filter, and shouldn't be highlighted.
Do i have a thinking error, or is my query wrong? Or is it an edismax problem?

Best Regards
Vadim


Re: Edismax, Filter Query and Highlighting

2012-01-31 Thread Vadim Kisselmann
Hi Ahmet,

thanks for the quick response :)
I've also discovered this failure.
What puzzles me is that the query itself works.
For example: query = language:de
I get results which only have language:de.
The fq also works and i get only the "de" result in my field "language".
I can't understand the behavior. It seems like the fq works, but in
the end my fq params are converted to q params.

Regards
Vadim



2012/1/31 Ahmet Arslan :
>> in calalina.out i can see the following query:
>> path=/select/
>> params={hl=true&version=2.2&indent=on&rows=10&start=0&q=(roomba+OR+irobot)+AND+language:de}
>> hits=1 status=0 QTime=65
>>
>> language:de is a filter, and shouldn't be highlighted.
>> Do i have a thinking error, or is my query wrong? Or is it
>> an edismax problem?
>
> In your example, language:de is a part of query. Use &fq= instead.
> q=(roomba OR irobot)&fq=language:de
>


Re: Edismax, Filter Query and Highlighting

2012-01-31 Thread Vadim Kisselmann
Hi Erick,
thanks for your response:)

Here is my query:
(roomba OR irobot) AND language:de AND
url:"http://www.blog-gedanken.de/produkte/erste-erfahrung-mit-unserem-roomba-roboter-staubsauger/";
Url and language are fields in my schema.xml

With &hl=true&hl.fl=text,url i see this, but i only want to see "roomba"
or "irobot" highlighted:
http://www.blog-gedanken.de/produkte/erste-erfahrung-mit-unserem-roomba-roboter-staubsauger/

you see, the whole url is highlighted.

with debugQuery=on:

(roomba OR irobot) AND
language:de AND
url:"http://www.blog-gedanken.de/produkte/erste-erfahrung-mit-unserem-roomba-roboter-staubsauger/";
(roomba OR irobot) AND language:de AND
url:"http://www.blog-gedanken.de/produkte/erste-erfahrung-mit-unserem-roomba-roboter-staubsauger/";
(+(+(DisjunctionMaxQuery((title:roomba)~0.01)
DisjunctionMaxQuery((title:irobot)~0.01)) +language:de
+PhraseQuery(url:"http www blog gedanken de produkte erste erfahrung
mit unserem roomba roboter staubsauger"))
DisjunctionMaxQuery((text:"roomba irobot"~100)~0.01))/no_coord
+(+((title:roomba)~0.01
(title:irobot)~0.01) +language:de +url:"http www blog gedanken de
produkte erste erfahrung mit unserem roomba roboter staubsauger")
(text:"roomba irobot"~100)~0.01

26.130154 = (MATCH) sum of:
  26.130154 = (MATCH) sum of:
0.30008852 = (MATCH) product of:
  0.60017705 = (MATCH) sum of:
0.60017705 = (MATCH) weight(title:roomba in 199491)
[DefaultSimilarity], result of:
  0.60017705 = score(doc=199491,freq=1.0 = termFreq=1
), product of:
0.119503364 = queryWeight, product of:
  13.392695 = idf(docFreq=19, maxDocs=4820692)
  0.008923026 = queryNorm
5.0222607 = fieldWeight in 199491, product of:
  1.0 = tf(freq=1.0), with freq of:
1.0 = termFreq=1
  13.392695 = idf(docFreq=19, maxDocs=4820692)
  0.375 = fieldNorm(doc=199491)
  0.5 = coord(1/2)
0.08084078 = (MATCH) weight(language:de in 199491)
[DefaultSimilarity], result of:
  0.08084078 = score(doc=199491,freq=1.0 = termFreq=1
), product of:
0.026857855 = queryWeight, product of:
  3.0099492 = idf(docFreq=645950, maxDocs=4820692)
  0.008923026 = queryNorm
3.0099492 = fieldWeight in 199491, product of:
  1.0 = tf(freq=1.0), with freq of:
1.0 = termFreq=1
  3.0099492 = idf(docFreq=645950, maxDocs=4820692)
  1.0 = fieldNorm(doc=199491)
25.749224 = (MATCH) weight(url:"http www blog gedanken de produkte
erste erfahrung mit unserem roomba roboter staubsauger" in 199491)
[DefaultSimilarity], result of:
  25.749224 = score(doc=199491,freq=1.0 = phraseFreq=1.0
), product of:
0.9586678 = queryWeight, product of:
  107.43752 = idf(), sum of:
1.0006605 = idf(docFreq=4817508, maxDocs=4820692)
1.4342768 = idf(docFreq=3122520, maxDocs=4820692)
4.5387235 = idf(docFreq=140042, maxDocs=4820692)
10.954706 = idf(docFreq=228, maxDocs=4820692)
3.1167865 = idf(docFreq=580497, maxDocs=4820692)
9.476681 = idf(docFreq=1003, maxDocs=4820692)
9.195494 = idf(docFreq=1329, maxDocs=4820692)
11.576243 = idf(docFreq=122, maxDocs=4820692)
6.3489246 = idf(docFreq=22913, maxDocs=4820692)
12.31089 = idf(docFreq=58, maxDocs=4820692)
13.392695 = idf(docFreq=19, maxDocs=4820692)
11.229373 = idf(docFreq=173, maxDocs=4820692)
12.862067 = idf(docFreq=33, maxDocs=4820692)
  0.008923026 = queryNorm
26.85938 = fieldWeight in 199491, product of:
  1.0 = tf(freq=1.0), with freq of:
1.0 = phraseFreq=1.0
  107.43752 = idf(), sum of:
1.0006605 = idf(docFreq=4817508, maxDocs=4820692)
1.4342768 = idf(docFreq=3122520, maxDocs=4820692)
4.5387235 = idf(docFreq=140042, maxDocs=4820692)
10.954706 = idf(docFreq=228, maxDocs=4820692)
3.1167865 = idf(docFreq=580497, maxDocs=4820692)
9.476681 = idf(docFreq=1003, maxDocs=4820692)
9.195494 = idf(docFreq=1329, maxDocs=4820692)
11.576243 = idf(docFreq=122, maxDocs=4820692)
6.3489246 = idf(docFreq=22913, maxDocs=4820692)
12.31089 = idf(docFreq=58, maxDocs=4820692)
13.392695 = idf(docFreq=19, maxDocs=4820692)
11.229373 = idf(docFreq=173, maxDocs=4820692)
12.862067 = idf(docFreq=33, maxDocs=4820692)
  0.25 = fieldNorm(doc=199491)
QParser: ExtendedDismaxQParser

I hope you can read it:)

Best Regards
Vadim





2012/1/31 Erick Erickson :
> Seeing the results with &debugQuery=on would help.
>
> No, fq does NOT get translated into q params, it's a
> completely separate mechanism so I&#x

Re: Edismax, Filter Query and Highlighting

2012-01-31 Thread Vadim Kisselmann
Hi Erick,

> I didn't read your first post carefully enough, I was keying
> on the words "filter query". Your query does not have
> any filter queries! I thought you were talking
> about &fq=language:de type clauses, which is what
> I was responding to.

no problem, i understand:)

> Solr/Lucene have no way of
> interpreting an extended "q" clause and saying
> "this part is a query and should be highlighted and
> this part isn't".
>
> Try the &fq option maybe?

I thought so, unfortunately.
&fq will be the only option. I should rebuild my application :)

Best Regards
Vadim


Re: Edismax, Filter Query and Highlighting

2012-01-31 Thread Vadim Kisselmann
Hmm, i don´t know, but i can test it tomorrow at work.
i´m not sure about the right syntax with hl.q (?),
but i'll report back :)




2012/1/31 Ahmet Arslan :
>> > Try the &fq option maybe?
>>
>> I thought so, unfortunately.
>> &fq will be the only option. I should rebuild my
>> application :)
>
> Could hl.q help? http://wiki.apache.org/solr/HighlightingParameters#hl.q


Re: Edismax, Filter Query and Highlighting

2012-02-01 Thread Vadim Kisselmann
hl.q works:)
But i have to attach the hl.q to my standard query.
In bigger queries it would be a pain to find out which terms i need in my hl.q.
My plan: an own query parser in solr, which loops through q, identifies
filter terms (in my case language:de), strips them, and appends
the rest as hl.q to the standard query. Sounds like a plan? :)
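The manually built form of such a request would look roughly like this
(illustrative):

q=(roomba OR irobot) AND language:de&hl=true&hl.fl=text,title,url&hl.q=(roomba OR irobot)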
Best Regards
Vadim





2012/2/1 Koji Sekiguchi :
> (12/02/01 4:28), Vadim Kisselmann wrote:
>>
>> Hmm, i don´t know, but i can test it tomorrow at work.
>> i´m not sure about the right syntax with hl.q. (?)
>> but i report :)
>
>
> hl.q can accept same syntax of q, including local params.
>
> koji
> --
> http://www.rondhuit.com/en/


How to reindex about 10Mio. docs

2012-02-08 Thread Vadim Kisselmann
Hello folks,

i want to reindex about 10Mio. docs from one Solr (1.4.1) to another
Solr (1.4.1).
I changed my schema.xml (field types sint to slong), so standard
replication would fail.
What is the fastest and smartest way to manage this?
This here sounds great (EntityProcessor):
http://www.searchworkings.org/blog/-/blogs/importing-data-from-another-solr
But would it work with Solr 1.4.1?

Best Regards
Vadim


Re: How to reindex about 10Mio. docs

2012-02-08 Thread Vadim Kisselmann
Hi Ahmet,
thanks for the quick response:)
I've already thought the same...
And it will be a pain to export and import this huge doc-set as CSV.
Is there another solution?
Regards
Vadim


2012/2/8 Ahmet Arslan :
>> i want to reindex about 10Mio. Docs. from one Solr(1.4.1) to
>> another
>> Solr(1.4.1).
>> I changed my schema.xml (field types sing to slong),
>> standard
>> replication would fail.
>> what is the fastest and smartest way to manage this?
>> this here sound great (EntityProcessor):
>> http://www.searchworkings.org/blog/-/blogs/importing-data-from-another-solr
>> But would it work with Solr 1.4.1?
>
> SolrEntityProcessor is not available in 1.4.1. I would dump stored fields 
> into comma separated file, and use http://wiki.apache.org/solr/UpdateCSV to 
> feed into new solr instance.


Re: How to reindex about 10Mio. docs

2012-02-08 Thread Vadim Kisselmann
Another problem appeared ;)
how can i export my docs in csv-format?
In Solr 3.1+ i can use the query-param &wt=csv, but in Solr 1.4.1?
Best Regards
Vadim


2012/2/8 Vadim Kisselmann :
> Hi Ahmet,
> thanks for quick response:)
> I've already thought the same...
> And it will be a pain to export and import this huge doc-set as CSV.
> Do i have an another solution?
> Regards
> Vadim
>
>
> 2012/2/8 Ahmet Arslan :
>>> i want to reindex about 10Mio. Docs. from one Solr(1.4.1) to
>>> another
>>> Solr(1.4.1).
>>> I changed my schema.xml (field types sing to slong),
>>> standard
>>> replication would fail.
>>> what is the fastest and smartest way to manage this?
>>> this here sound great (EntityProcessor):
>>> http://www.searchworkings.org/blog/-/blogs/importing-data-from-another-solr
>>> But would it work with Solr 1.4.1?
>>
>> SolrEntityProcessor is not available in 1.4.1. I would dump stored fields 
>> into comma separated file, and use http://wiki.apache.org/solr/UpdateCSV to 
>> feed into new solr instance.


Re: How to reindex about 10Mio. docs

2012-02-09 Thread Vadim Kisselmann
Hi Otis,
thanks for your response:)
We found a solution yesterday. It works with a ruby script, curl and saxon/xslt.
The performance is great. We moved all the docs in 5 batches to
avoid overloading our machines.
Best regards
Vadim



2012/2/8 Otis Gospodnetic :
> Vadim,
>
> Would using xslt output help?
>
> Otis
> 
> Performance Monitoring SaaS for Solr - 
> http://sematext.com/spm/solr-performance-monitoring/index.html
>
>
>
>>________
>> From: Vadim Kisselmann 
>>To: solr-user@lucene.apache.org
>>Sent: Wednesday, February 8, 2012 7:09 AM
>>Subject: Re: How to reindex about 10Mio. docs
>>
>>Another problem appeared ;)
>>how can i export my docs in csv-format?
>>In Solr 3.1+ i can use the query-param &wt=csv, but in Solr 1.4.1?
>>Best Regards
>>Vadim
>>
>>
>>2012/2/8 Vadim Kisselmann :
>>> Hi Ahmet,
>>> thanks for quick response:)
>>> I've already thought the same...
>>> And it will be a pain to export and import this huge doc-set as CSV.
>>> Do i have an another solution?
>>> Regards
>>> Vadim
>>>
>>>
>>> 2012/2/8 Ahmet Arslan :
>>>>> i want to reindex about 10Mio. Docs. from one Solr(1.4.1) to
>>>>> another
>>>>> Solr(1.4.1).
>>>>> I changed my schema.xml (field types sing to slong),
>>>>> standard
>>>>> replication would fail.
>>>>> what is the fastest and smartest way to manage this?
>>>>> this here sound great (EntityProcessor):
>>>>> http://www.searchworkings.org/blog/-/blogs/importing-data-from-another-solr
>>>>> But would it work with Solr 1.4.1?
>>>>
>>>> SolrEntityProcessor is not available in 1.4.1. I would dump stored fields 
>>>> into comma separated file, and use http://wiki.apache.org/solr/UpdateCSV 
>>>> to feed into new solr instance.
>>
>>
>>


Custom Query Component: parameters are not appended to query

2012-02-17 Thread Vadim Kisselmann
Hello folks,

I built a simple custom component for the "hl.q" query.
My case was to inject hl.q params on the fly, because filter-like
field clauses which were in my
standard query got highlighted too: Solr/Lucene have no way of
interpreting an extended "q" clause and saying "this part is a query and
should be highlighted and
this part isn't".
If it works, the community can have it :)

Facts:  q=roomba AND irobot AND language:de

My component extends SearchComponent. I use the ResponseBuilder to get
all needed params
like field names from the schema, q params, etc.



My component is called first (verified via debugging and debugQuery) from my
SearchHandler:

<arr name="first-components">
  <str>highlightQuery</str>
</arr>


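The component itself is declared as a searchComponent; the class/package
name here is only an assumption, not my real one:

<searchComponent name="highlightQuery"
                 class="org.example.HighlightQueryComponent"/>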

Important Clippings from Sourcecode:

public class HighlightQueryComponent extends SearchComponent {

…….
…….

public void process(ResponseBuilder rb) throws IOException {

   if(rb.doHighlights){

List terms = new ArrayList(0);
  SolrQueryRequest req = rb.req;
IndexSchema schema = req.getSchema();
Map fields = schema.getFields();
  SolrParams params = req.getParams();
…..
….
…magic
…
….
Query hlq = new TermQuery(new Term("text", hlQuery.toString()));
rb.setHighlightQuery(hlq);   // hlq => text:(roomba AND irobot)



Problem:
In the last step my query is adjusted (the hlq value from debugging is
"text:(roomba AND irobot)"). It looks fine, the magic in the process()
method works.
But nothing happens. If I continue to debug, the next components are called,
but my query is the same, without changes.
Either setHighlightQuery doesn't work, or my params are overridden in
the following components.
What can it be?

Best Regards
Vadim


Re: maxClauseCount Exception

2012-02-28 Thread Vadim Kisselmann
Set maxBooleanClauses in your solrconfig.xml higher; the default is 1024.
Your query exceeds this limit.
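For example, in the <query> section (pick a limit that fits your queries;
4096 is just an illustrative value):

<maxBooleanClauses>4096</maxBooleanClauses>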
Regards
Vadim



2012/2/22 Darren Govoni 

> Hi,
>  I am suddenly getting a maxClauseCount exception for no reason. I am
> using Solr 3.5. I have only 206 documents in my index.
>
> Any ideas? This is wierd.
>
> QUERY PARAMS: [hl, hl.snippets, hl.simple.pre, hl.simple.post, fl,
> hl.mergeContiguous, hl.usePhraseHighlighter, hl.requireFieldMatch,
> echoParams, hl.fl, q, rows, start]|#]
>
>
> [#|2012-02-22T13:40:13.129-0500|INFO|glassfish3.1.1|
> org.apache.solr.core.SolrCore|_ThreadID=22;_ThreadName=Thread-2;|[]
> webapp=/solr3 path=/select
> params={hl=true&hl.snippets=4&hl.simple.pre=&fl=*,score&hl.mergeContiguous=true&hl.usePhraseHighlighter=true&hl.requireFieldMatch=true&echoParams=all&hl.fl=text_t&q={!lucene+q.op%3DOR+df%3Dtext_t}+(+kind_s:doc+OR+kind_s:xml)+AND+(type_s:[*+TO+*])+AND+(usergroup_sm:admin)&rows=20&start=0&wt=javabin&version=2}
> hits=204 status=500 QTime=166 |#]
>
>
> [#|2012-02-22T13:40:13.131-0500|SEVERE|glassfish3.1.1|
> org.apache.solr.servlet.SolrDispatchFilter|
> _ThreadID=22;_ThreadName=Thread-2;|org.apache.lucene.search.BooleanQuery
> $TooManyClauses: maxClauseCount is set to 1024
>at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:136)
>at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:127)
>at org.apache.lucene.search.ScoringRewrite
> $1.addClause(ScoringRewrite.java:51)
>at org.apache.lucene.search.ScoringRewrite
> $1.addClause(ScoringRewrite.java:41)
>at org.apache.lucene.search.ScoringRewrite
> $3.collect(ScoringRewrite.java:95)
>at
>
> org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:38)
>at
> org.apache.lucene.search.ScoringRewrite.rewrite(ScoringRewrite.java:93)
>at
> org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:304)
>at
>
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:158)
>at
>
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:98)
>at
>
> org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:385)
>at
>
> org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:217)
>at
> org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:185)
>at
>
> org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:205)
>at
>
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlightingByHighlighter(DefaultSolrHighlighter.java:490)
>at
>
> org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:401)
>at
>
> org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:131)
>at org.apache.so
>
>


Apache Lucene Eurocon 2012

2012-03-06 Thread Vadim Kisselmann
Hi folks,

where and when is the next Eurocon scheduled?
I read something about Denmark and autumn 2012 (i don't know where exactly *g*).

Best regards and thanks
Vadim


Re: Apache Lucene Eurocon 2012

2012-03-08 Thread Vadim Kisselmann
Hi Chris,

thanks for your response.Ok, we will wait :)

Best Regards
Vadim




2012/3/8 Chris Hostetter 

>
> : where and when is the next Eurocon scheduled?
> : I read something about denmark and autumn 2012(i don't know where *g*).
>
> I do not know where, but sometime in the fall is probably the correct time
> frame.  I beleive the details will be announced at Lucene Revolution...
>
>http://lucenerevolution.org/
>
> (that's what happened last year)
>
> -Hoss
>


Solr 4.0 and tomcat, error in new admin UI

2012-03-15 Thread Vadim Kisselmann
Hi folks,

i comment this issue : https://issues.apache.org/jira/browse/SOLR-3238 ,
but i want to ask here if anyone have the same problem.


I use Solr 4.0 from trunk(latest) with tomcat6.

I get an error in New Admin UI:

This interface requires that you activate the admin request handlers,
add the following configuration to your solrconfig.xml:



Admin request Handlers are definitely activated in my solrconfig.

A problem with tomcat?
It works with embedded jetty, but i should use tomcat.

Best Regards
Vadim


SolrCloud with Tomcat and external Zookeeper, does it work?

2012-03-21 Thread Vadim Kisselmann
Hello folks,

i read the SolrCloud Wiki and Bruno Dumon's blog entry with his "First
Exploration of SolrCloud".
Examples and a first setup with embedded Jetty and ZK WORKS without problems.

I tried to setup my own configuration with Tomcat and an external
Zookeeper(my Master-ZK), but it doesn't work really.

My setup:
- latest Solr version from trunk
- Tomcat 6
- external ZK
- Target: 1 Server, 1 Tomcat, 1 Solr instance, 2 collections with
different config/schema

What i tried:
--
1. After checkout i build solr(ant run-example), it works.
---
2. I send my config/schema files to external ZK with Jetty:
java -Djetty.port=8080 -Dbootstrap_confdir=/root/solrCloud/conf/
-Dcollection.configName=conf1 -DzkHost=master-zk:2181 -jar start.jar
it works, too.
---
3. I create my ("empty, without cores")solr.xml, like Bruno:
http://www.ngdata.com/site/blog/57-ng.html#disqus_thread
---
4. I started my Tomcat, and get the first error:
in UI: This interface requires that you activate the admin request
handlers, add the following configuration to your solrconfig.xml:


Admin request Handlers are definitely activated in my solrconfig.

I get this error only with the latest trunk versions, with r1292064
from February not. Sometimes it works with the new version, sometimes
not and i get this error.

--
5. Ok, it works after a few restarts; then i changed my JAVA_OPTS for
Tomcat and added this: "-DzkHost=master-zk:2181"
Next error:
The web application [/solr2] appears to have started a thread
named [main-SendThread(master-zk:2181)] but has failed to stop it.
This is very likely to create a memory leak.
Exception in thread "Thread-2" java.lang.NullPointerException
at org.apache.solr.cloud.Overseer$CloudStateUpdater.amILeader(Overseer.java:179)
at org.apache.solr.cloud.Overseer$CloudStateUpdater.run(Overseer.java:104)
at java.lang.Thread.run(Thread.java:662)
15.03.2012 13:25:17 org.apache.catalina.loader.WebappClassLoader loadClass
INFO: Illegal access: this web application instance has been stopped
already. Could not load org.apache.zookeeper.server.ZooTrace. The
eventual following stack trace is caused by an error thrown for
debugging purposes as well as to attempt to terminate the thread which
caused the illegal access, and has no functional impact.
java.lang.IllegalStateException
at 
org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1531)
at 
org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1491)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1196)
15.03.2012 13:25:17 org.apache.coyote.http11.Http11Protocol destroy

-
6. Ok, assuming the first steps work, i want to create new
cores and my 2 collections. My requests with the CoreAdminHandler are ok,
my solr.xml looks like this:

<solr>
  <cores hostContext="solr">
    <core name="shard1_data" collection="col1" shard="shard1" instanceDir="xxx/" />
    <core name="shard2_data" collection="col2" shard="shard2" instanceDir="xx2/" />
  </cores>
</solr>

Now i get the following exception: "...couldn't find conf name for
collection1..."
I don't have a collection1. Why this exception?

---
You can see, there are too many exceptions and possibly
configuration problems with Tomcat and an external ZK.
Has anyone set up an "identical" configuration and does it work?
Does anyone detect mistakes in my configuration steps?

Best regards
Vadim


Re: whethere solr 3.3 index file is compatable with solr 4.0

2012-03-21 Thread Vadim Kisselmann
you have to re-index your data.

best regards
vadim


2012/3/21 syed kather :
> Team
>
> I have indexed my data with solr 3.3 version , As I need to use
> hierarchical facets features from solr 4.0 .
> Can I use the existing data with Solr 4.0 version or should need to
> re-index the data with new version?
>
>
>
>            Thanks and Regards,
>        S SYED ABDUL KATHER


Localize the largest fields (content) in index

2012-03-28 Thread Vadim Kisselmann
Hello folks,

i work with Solr 4.0 r1292064 from trunk.
My index grows fast, with 10Mio. docs i get an index size of 150GB
(25% stored, 75% indexed).
I want to find out which fields (content) are too large, to decide on countermeasures.

How can i locate/discover the largest fields in my index?
Luke (latest from trunk) doesn't work
with my Solr version. I built the Lucene/Solr .jars and tried to feed Luke
with these, but i get many errors
and can't build it.

What other options do i have?

Thanks and best regards
Vadim


Re: SolrCloud with Tomcat and external Zookeeper, does it work?

2012-03-28 Thread Vadim Kisselmann
Hi Jerry,
thanks for your response:)
This thread("SolrCloud new...") is new for me, thanks!
How far are you with your setup? Which problems/errors du you have?
Best regards
Vadim




2012/3/27 jerry.min...@gmail.com :
> Hi Vadim,
>
> I too am experimenting with SolrCloud and need help with setting it up
> using Tomcat as the java servlet container.
> While searching for help on this question, I found another thread in
> the solr-mailing-list that is helpful.
> In case you haven't seen this thread that I found, please search the
> solr-mailing-list for: "SolrCloud new"
> You can also view it at nabble using this link:
> http://lucene.472066.n3.nabble.com/SolrCloud-new-td1528872.html
>
> Best,
> Jerry M.
>
>
>
>
> On Wed, Mar 21, 2012 at 5:51 AM, Vadim Kisselmann
>  wrote:
>>
>> Hello folks,
>>
>> i read the SolrCloud Wiki and Bruno Dumon's blog entry with his "First
>> Exploration of SolrCloud".
>> Examples and a first setup with embedded Jetty and ZK WORKS without problems.
>>
>> I tried to setup my own configuration with Tomcat and an external
>> Zookeeper(my Master-ZK), but it doesn't work really.
>>
>> My setup:
>> - latest Solr version from trunk
>> - Tomcat 6
>> - external ZK
>> - Target: 1 Server, 1 Tomcat, 1 Solr instance, 2 collections with
>> different config/schema
>>
>> What i tried:
>> --
>> 1. After checkout i build solr(ant run-example), it works.
>> ---
>> 2. I send my config/schema files to external ZK with Jetty:
>> java -Djetty.port=8080 -Dbootstrap_confdir=/root/solrCloud/conf/
>> -Dcollection.configName=conf1 -DzkHost=master-zk:2181 -jar start.jar
>> it works, too.
>> ---
>> 3. I create my ("empty, without cores")solr.xml, like Bruno:
>> http://www.ngdata.com/site/blog/57-ng.html#disqus_thread
>> ---
>> 4. I started my Tomcat, and get the first error:
>> in UI: This interface requires that you activate the admin request
>> handlers, add the following configuration to your solrconfig.xml:
>> 
>> 
>> Admin request Handlers are definitely activated in my solrconfig.
>>
>> I get this error only with the latest trunk versions, with r1292064
>> from February not. Sometimes it works with the new version, sometimes
>> not and i get this error.
>>
>> --
>> 5. Ok , it it works, after few restarts, i changed my JAVA_OPTS for
>> Tomcat and added this: "-DzkHost=master-zk:2181"
>> Next Error:
>> This The web application [/solr2] appears to have started a thread
>> named [main-SendThread(master-zk:2181)] but has failed to stop it.
>> This is very likely to create a memory leak.
>> Exception in thread "Thread-2" java.lang.NullPointerException
>> at 
>> org.apache.solr.cloud.Overseer$CloudStateUpdater.amILeader(Overseer.java:179)
>> at org.apache.solr.cloud.Overseer$CloudStateUpdater.run(Overseer.java:104)
>> at java.lang.Thread.run(Thread.java:662)
>> 15.03.2012 13:25:17 org.apache.catalina.loader.WebappClassLoader loadClass
>> INFO: Illegal access: this web application instance has been stopped
>> already. Could not load org.apache.zookeeper.server.ZooTrace. The
>> eventual following stack trace is caused by an error thrown for
>> debugging purposes as well as to attempt to terminate the thread which
>> caused the illegal access, and has no functional impact.
>> java.lang.IllegalStateException
>> at 
>> org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1531)
>> at 
>> org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1491)
>> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1196)
>> 15.03.2012 13:25:17 org.apache.coyote.http11.Http11Protocol destroy
>>
>> -
>> 6. Ok, we assume, that the first steps works, and i would create new
>> cores and my 2 collections. My requests with CoreAdminHandler are ok,
>> my solr.xml looks like this:
>> 
>> 
>>  > hostContext="solr">
>>    >       name="shard1_data"
>>       collection="col1"
>>       shard="shard1"
>>       instanceDir="xxx/" />
>>  >       name="shard2_data"
>>       collection="col2"
>>       shard="shard2"
>>       instanceDir="xx2/" />
>>  
>> 
>>
>> Now i get the following exception: "...couldn't find conf name for
>> collection1..."
>> I don't have an collection 1. Why this exception?
>>
>> ---
>> You can see, there are too many exceptions and eventually
>> configuration problems with Tomcat and an external ZK.
>> Has anyone set up an "identical" configuration and does it work?
>> Does anyone detect mistakes in my configuration steps?
>>
>> Best regards
>> Vadim


Re: Localize the largest fields (content) in index

2012-03-29 Thread Vadim Kisselmann
Hi Erick,
thanks:)
The admin UI gives me the counts, so i can identify fields with large
numbers of unique terms.
I know this wiki page, but i read it one more time.
List of my file extensions with size in GB(Index size ~150GB):
tvf 90GB
fdt 30GB
tim 18GB
prx 15GB
frq 12GB
tip 200MB
tvx 150MB

tvf is my biggest file extension.
Wiki :This file contains, for each field that has a term vector
stored, a list of the terms, their frequencies and, optionally,
position and offset information.

Hmm, i use termVectors on my biggest fields because of MLT and Highlighting.
But i think i should test my performance without termVectors. Good Idea? :)
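If the test works out, the big text field would then simply be declared
without the termVector attributes (field and type names here are only
placeholders for my real ones):

<field name="text" type="text" indexed="true" stored="true" />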

What do you think about my file extension sizes?

Best regards
Vadim




2012/3/29 Erick Erickson :
> The admin UI (schema browser) will give you the counts of unique terms
> in your fields, which is where I'd start.
>
> I suspect you've already seen this page, but if not:
> http://lucene.apache.org/java/3_5_0/fileformats.html#file-names
> the .fdt and .fdx file extensions are where data goes when
> you set 'stored="true" '. These files don't affect search speed,
> they just contain the verbatim copy of the data.
>
> The relative sizes of the various files above should give
> you a hint as to what's using the most space, but it'll be a bit
> of a hunt for you to pinpoint what's actually up. TermVectors
> and norms are often sources of using up space.
>
> Best
> Erick
>
> On Wed, Mar 28, 2012 at 10:55 AM, Vadim Kisselmann
>  wrote:
>> Hello folks,
>>
>> i work with Solr 4.0 r1292064 from trunk.
>> My index grows fast, with 10Mio. docs i get an index size of 150GB
>> (25% stored, 75% indexed).
>> I want to find out, which fields(content) are too large, to consider 
>> measures.
>>
>> How can i localize/discover the largest fields in my index?
>> Luke(latest from trunk) doesn't work
>> with my Solr version. I build Lucene/Solr .jars and tried to feed Luke
>> this these, but i get many errors
>> and can't build it.
>>
>> What other options do i have?
>>
>> Thanks and best regards
>> Vadim


Re: Localize the largest fields (content) in index

2012-03-29 Thread Vadim Kisselmann
Yes, i think so, too :)
MLT doesn´t need termVectors really, but it´s faster with them. I
found out that
MLT works better on the title field in my case, instead of big text fields.

Sharding is in planning, but my setup with SolrCloud, ZK and Tomcat
doesn´t work,
see here: 
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201203.mbox/%3CCA+GXEZE3LCTtgXFzn9uEdRxMymGF=z0ujb9s8b0qkipafn6...@mail.gmail.com%3E
I want to split my huge index (the 150GB index in this case is my test index) and
use SolrCloud,
but it´s not runnable with Tomcat at this time.

Best regards
Vadim


2012/3/29 Erick Erickson :
> Yeah, it's worth a try. The term vectors aren't entirely necessary for
> highlighting,
> although they do make things more efficient.
>
> As far as MLT, does MLT really need such a big field?
>
> But you may be on your way to sharding your index if you remove this info
> and testing shows problems
>
> Best
> Erick
>
> On Thu, Mar 29, 2012 at 9:32 AM, Vadim Kisselmann
>  wrote:
>> Hi Erick,
>> thanks:)
>> The admin UI give me the counts, so i can identify fields with big
>> bulks of unique terms.
>> I known this wiki-page, but i read it one more time.
>> List of my file extensions with size in GB(Index size ~150GB):
>> tvf 90GB
>> fdt 30GB
>> tim 18GB
>> prx 15GB
>> frq 12GB
>> tip 200MB
>> tvx 150MB
>>
>> tvf is my biggest file extension.
>> Wiki :This file contains, for each field that has a term vector
>> stored, a list of the terms, their frequencies and, optionally,
>> position and offest information.
>>
>> Hmm, i use termVectors on my biggest fields because of MLT and Highlighting.
>> But i think i should test my performance without termVectors. Good Idea? :)
>>
>> What do you think about my file extension sizes?
>>
>> Best regards
>> Vadim
>>
>>
>>
>>
>> 2012/3/29 Erick Erickson :
>>> The admin UI (schema browser) will give you the counts of unique terms
>>> in your fields, which is where I'd start.
>>>
>>> I suspect you've already seen this page, but if not:
>>> http://lucene.apache.org/java/3_5_0/fileformats.html#file-names
>>> the .fdt and .fdx file extensions are where data goes when
>>> you set 'stored="true" '. These files don't affect search speed,
>>> they just contain the verbatim copy of the data.
>>>
>>> The relative sizes of the various files above should give
>>> you a hint as to what's using the most space, but it'll be a bit
>>> of a hunt for you to pinpoint what's actually up. TermVectors
>>> and norms are often sources of using up space.
>>>
>>> Best
>>> Erick
>>>
>>> On Wed, Mar 28, 2012 at 10:55 AM, Vadim Kisselmann
>>>  wrote:
>>>> Hello folks,
>>>>
>>>> i work with Solr 4.0 r1292064 from trunk.
>>>> My index grows fast, with 10Mio. docs i get an index size of 150GB
>>>> (25% stored, 75% indexed).
>>>> I want to find out, which fields(content) are too large, to consider 
>>>> measures.
>>>>
>>>> How can i localize/discover the largest fields in my index?
>>>> Luke(latest from trunk) doesn't work
>>>> with my Solr version. I build Lucene/Solr .jars and tried to feed Luke
>>>> this these, but i get many errors
>>>> and can't build it.
>>>>
>>>> What other options do i have?
>>>>
>>>> Thanks and best regards
>>>> Vadim


Weird query results with edismax and boolean operator +

2012-04-27 Thread Vadim Kisselmann
Hi folks,

i use solr 4.0 from trunk, and edismax as standard query handler.
In my schema i defined this:  

I have this simple problem:

 nascar +author:serg* (3500 matches)

 +nascar +author:serg* (1 match)

 nascar author:serg* (5200 matches)

 nascar  AND author:serg* (1 match)

I think i understand the query syntax, but this behavior confuses me.
Why these differences in match counts?

By the way, every match contains at least one of my terms,
but not always both.

Best regards
Vadim


Re: Master config

2012-04-27 Thread Vadim Kisselmann
hi,
when only the slaves are used for search, why not: more RAM for the OS.
I keep the default settings on my master, because when my slaves are
busy with client queries,
i can test a few things on the master.

best regards
vadim



2012/4/27 Jamel ESSOUSSI :
> Hi,
>
> I use two Solr slaves and one Solr master. Is it a good idea to disable all
> the caches on the master?
>
> Best Regards
>
> -- Jamel ESSOUSSI
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Master-config-tp3943648p3943648.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Weird query results with edismax and boolean operator +

2012-04-30 Thread Vadim Kisselmann
Hi Jan,
thanks for your response!

My "qf" parameter for edismax is: "title". My
"defaultSearchField=text" in schema.xml.
In my app i generate a query with "qf=title,text", so i think the
default parameters in config/schema should be overridden, right?

I possibly found 2 reasons for this behavior.
1. The "mm" parameter in solrconfig.xml for edismax is 0. 0 stands for
"OR", but it should be an "AND" => 100%.
2. I suppose that my app does not override my "default qf".
I'll test it today and report back, with my parsed query and all params.
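For reference, the defaults i mean would sit in the request handler
configuration roughly like this (handler name and layout are assumptions):

<requestHandler name="/select" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">title text</str>
    <str name="mm">100%</str>
  </lst>
</requestHandler>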

Best regards
Vadim




2012/4/29 Jan Høydahl :
> Hi,
>
> What is your "qf" parameter?
> Can you run the three queries with debugQuery=true&echoParams=all and attach 
> parsed query and all params? It will probably explain what is happening.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 27. apr. 2012, at 11:21, Vadim Kisselmann wrote:
>
>> Hi folks,
>>
>> i use solr 4.0 from trunk, and edismax as standard query handler.
>> In my schema i defined this:  
>>
>> I have this simple problem:
>>
>> nascar +author:serg* (3500 matches)
>>
>> +nascar +author:serg* (1 match)
>>
>> nascar author:serg* (5200 matches)
>>
>> nascar  AND author:serg* (1 match)
>>
>> I think i understand the query syntax, but this behavior confused me.
>> Why this match-differences?
>>
>> By the way, i get in all matches at least one of my terms.
>> But not always both.
>>
>> Best regards
>> Vadim
>


Re: Weird query results with edismax and boolean operator +

2012-04-30 Thread Vadim Kisselmann
I tested it.
With default "qf=title text" in solrconfig and "mm=100%"
i get the same result(1) for "nascar AND author:serg*" and "+nascar
+author:serg*", great.
With "nascar +author:serg*" i get 3500 matches, in this case the
mm-parameter seems not to work.

Here are my debug params for "nascar AND author:serg*":

nascar AND author:serg*
(+(+DisjunctionMaxQuery((text:nascar |
title:nascar)~0.01) +author:serg*))/no_coord
+(+(text:nascar | title:nascar)~0.01
+author:serg*)
8.235954 = (MATCH) sum of:
  8.10929 = (MATCH) max plus 0.01 times others of:
8.031613 = (MATCH) weight(text:nascar in 0) [DefaultSimilarity], result of:
  8.031613 = score(doc=0,freq=2.0 = termFreq=2.0
), product of:
0.84814763 = queryWeight, product of:
  6.6960144 = idf(docFreq=27, maxDocs=8335)
  0.12666455 = queryNorm
9.469594 = fieldWeight in 0, product of:
  1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
  6.6960144 = idf(docFreq=27, maxDocs=8335)
  1.0 = fieldNorm(doc=0)
7.7676363 = (MATCH) weight(title:nascar in 0) [DefaultSimilarity],
result of:
  7.7676363 = score(doc=0,freq=1.0 = termFreq=1.0
), product of:
0.9919093 = queryWeight, product of:
  7.830994 = idf(docFreq=8, maxDocs=8335)
  0.12666455 = queryNorm
7.830994 = fieldWeight in 0, product of:
  1.0 = tf(freq=1.0), with freq of:
1.0 = termFreq=1.0
  7.830994 = idf(docFreq=8, maxDocs=8335)
  1.0 = fieldNorm(doc=0)
  0.12666455 = (MATCH) ConstantScore(author:serg*), product of:
1.0 = boost
0.12666455 = queryNorm



And here for  "nascar +author:serg*":
nascar +author:serg*
(+(DisjunctionMaxQuery((text:nascar |
title:nascar)~0.01) +author:serg*))/no_coord
+((text:nascar | title:nascar)~0.01
+author:serg*)
8.235954 = (MATCH) sum of:
  8.10929 = (MATCH) max plus 0.01 times others of:
8.031613 = (MATCH) weight(text:nascar in 0) [DefaultSimilarity], result of:
  8.031613 = score(doc=0,freq=2.0 = termFreq=2.0
), product of:
0.84814763 = queryWeight, product of:
  6.6960144 = idf(docFreq=27, maxDocs=8335)
  0.12666455 = queryNorm
9.469594 = fieldWeight in 0, product of:
  1.4142135 = tf(freq=2.0), with freq of:
2.0 = termFreq=2.0
  6.6960144 = idf(docFreq=27, maxDocs=8335)
  1.0 = fieldNorm(doc=0)
7.7676363 = (MATCH) weight(title:nascar in 0) [DefaultSimilarity],
result of:
  7.7676363 = score(doc=0,freq=1.0 = termFreq=1.0
), product of:
0.9919093 = queryWeight, product of:
  7.830994 = idf(docFreq=8, maxDocs=8335)
  0.12666455 = queryNorm
7.830994 = fieldWeight in 0, product of:
  1.0 = tf(freq=1.0), with freq of:
1.0 = termFreq=1.0
  7.830994 = idf(docFreq=8, maxDocs=8335)
  1.0 = fieldNorm(doc=0)
  0.12666455 = (MATCH) ConstantScore(author:serg*), product of:
1.0 = boost
0.12666455 = queryNorm


0.063332275 = (MATCH) product of:
  0.12666455 = (MATCH) sum of:
0.12666455 = (MATCH) ConstantScore(author:serg*), product of:
  1.0 = boost
  0.12666455 = queryNorm
  0.5 = coord(1/2)



You can see, that for first doc in "nascar +author:serg*" all
query-params match, but in the second doc only
"ConstantScore(author:serg*)".
But with an "mm=100%" all query-params should match.
http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/
http://lucene.apache.org/solr/api/org/apache/solr/util/doc-files/min-should-match.html

Best regards
Vadim



2012/4/30 Vadim Kisselmann :
> Hi Jan,
> thanks for your response!
>
> My "qf" parameter for edismax is: "title". My
> "defaultSearchField=text" in schema.xml.
> In my app i generate a query with "qf=title,text", so i think the
> default parameters in config/schema should bei overridden, right?
>
> I found eventually 2 reasons for this behavior.
> 1. "mm"-parameter in solrconfig.xml for edismax is 0. 0 stands for
> "OR", but it should be an "AND" => 100%.
> 2. I suppose that my app does not override my "default-qf".
> I test it today and report, with my parsed query and all params.
>
> Best regards
> Vadim
>
>
>
>
> 2012/4/29 Jan Høydahl :
>> Hi,
>>
>> What is your "qf" parameter?
>> Can you run the three queries with debugQuery=true&echoParams=all and attach 
>> parsed query and all params? It will probably explain what is happening.
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> Solr Training - www.solrtraining.com
>>
>> On 27. apr. 2012, at 11:21, Vadim Kisselmann wrote:
>>
>>> Hi folks,
>>>
>>&g

Re: Poll: What do you use for Solr performance monitoring?

2012-05-31 Thread Vadim Kisselmann
Hi Otis,
done :) Till now we use Graphite, Ganglia and Zabbix. For our JVM
monitoring JStatsD.
Best regards
Vadim


2012/5/31 Otis Gospodnetic :
> Hi,
>
> Super quick poll:  What do you use for Solr performance monitoring?
> Vote here: 
> http://blog.sematext.com/2012/05/30/poll-what-do-you-use-for-solr-performance-monitoring/
>
>
> I'm collecting data for my Berlin Buzzwords talk that will touch on Solr, so 
> your votes will be greatly appreciated!
>
> Thanks,
> Otis


Solr 1.4, slaves hang after replication from an just optimized master

2012-06-19 Thread Vadim Kisselmann
Hi folks,

i have to look after an old live system with solr 1.4.
When i optimize a bigger index of roughly 200GB (after the optimize
it shrinks to 100GB) and my slaves
replicate the newest version after(!) the optimize, they all hang at
100% in replication and suddenly have index sizes of circa 300GB.
After a couple of seconds i have to restart my Tomcat, because the
slaves are no longer able to respond to queries...
Ironically, they have the same number of segments as the master, i can't
see errors in my logfile and the server load is normal.
What's wrong here? :)
Normal HTTP replication is used; these params are set on the master:
<str name="replicateAfter">commit</str>
<str name="replicateAfter">startup</str>
<str name="replicateAfter">optimize</str>

Any ideas?

Best regards
Vadim


Re: Solr 1.4, slaves hang after replication from an just optimized master

2012-06-19 Thread Vadim Kisselmann
Forgot to mention:
After the Tomcat restart, the slaves still have an index of 300GB.
After a manual replication command in the UI, it is back to 100GB like the
master within a couple of seconds and all is ok.



2012/6/19 Vadim Kisselmann :
> Hi folks,
>
> i have to look for an old live system with solr 1.4.
> When i optimize an bigger index with round about 200GB(after optimize
> and cut, 100GB) and my slaves
> replicate the newest version after(!) optimize, they hang(all) with
> 100% in replication and they have at once circa 300GB index sizes.
> After a couple of seconds i have to restart my Tomcat, because the
> slaves are no longer be able to response on queries...
> Ironically, they have the same number of segments like master, i can't
> see errors in my logfile and the server load is normal.
> What's wrong here? :)
> Normal HTTP Replication is used, this params are set on master:
>    commit
>    startup
>    optimize
>
> Any ideas?
>
> Best regards
> Vadim


Replication slows down massively during high load

2011-03-16 Thread Vadim Kisselmann
Hi everyone,

I have Solr running on one master and two slaves (load balanced) via
Solr 1.4.1 native replication.

If the load is low, both slaves replicate with around 100MB/s from master.

But when I use Solrmeter (100-400 queries/min) for load tests (over
the load balancer), the replication slows down to an unacceptable
speed, around 100KB/s (at least that's what the replication page on
/solr/admin says).

Going to a slave directly without load balancer yields the same result
for the slave under test:

Slave 1 gets hammered with Solrmeter and the replication slows down to 100KB/s.
At the same time, Slave 2 with only 20-50 queries/min without the load
test has no problems. It replicates with 100MB/s and the index version
is 5-10 versions ahead of Slave 1.

The replications stays in the 100KB/s range even after the load test
is over until the application server is restarted. The same issue
comes up under both Tomcat and Jetty.

The setup looks like this:

- Same hardware for all servers: Physical machines with quad core
CPUs, 24GB RAM (JVM starts up with -XX:+UseConcMarkSweepGC -Xms10G
-Xmx10G)
- Index size is about 100GB with 40M docs
- Master commits every 10 min/10k docs
- Slaves poll every minute (a rough sketch of the slave config is below)
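
The slave-side replication handler looks roughly like this (the masterUrl is
a placeholder; the pollInterval matches the one-minute polling mentioned above):

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <str name="masterUrl">http://master-host:8080/solr/replication</str>
      <str name="pollInterval">00:01:00</str>
    </lst>
  </requestHandler>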

I checked this:

- Changed network interface; same behavior
- Increased thread pool size from 200 to 500 and queue size from 100
to 500 in Tomcat; same behavior
- Both disk and network I/O are not bottlenecked. Disk I/O went down
to almost zero after every query in the load test got cached. Network
isn't doing much and can put through almost a GBit/s with iPerf
(network throughput tester) while Solrmeter is running.

Any ideas what could be wrong?


Best Regards
Vadim


Re: Replication slows down massively during high load

2011-03-17 Thread Vadim Kisselmann
Hello Shawn,

Primary assumption:  You have a 64-bit OS and a 64-bit JVM.

>Jepp, it's running 64-bit Linux with 64-bit JVM

It sounds to me like you're I/O bound, because your machine cannot
keep enough of your index in RAM.  Relative to your 100GB index, you
only have a maximum of 14GB of RAM available to the OS disk cache,
since Java's heap size is 10GB.

The load test seems to be more CPU bound than I/O bound. 
All cores are fully busy and iostat says that there isn't 
much more disk I/O going on than without load test. The 
index is on a RAID10 array with four disks.

How much disk space do all of the index files that end in "x" take up?
 I would venture a guess that it's significantly more than 14GB.  On
Linux, you could do this command to tally it quickly:

# du -hc *x

>>>27G total

# du -hc `ls | egrep -v "tvf|fdt"`

>>>51G total

If you installed enough RAM so the disk cache can be much larger than
the total size of those files ending in "x", you'd probably stop
having these performance issues.  Realizing that this is a
Alternatively, you could take steps to reduce the size of your index,
or perhaps add more machines to go distributed.

>>Unfortunately, this doesn't seem to be the problem. 
>>The queries themselves are running fine. The problem 
>>is that the replication is crawling when there are 
>>many queries going on and that the replication speed 
>>stays low even after the load is gone.



Cheers
Vadim


Re: Replication slows down massively during high load

2011-03-17 Thread Vadim Kisselmann
On Mar 17, 2011, at 3:19 PM, Shawn Heisey wrote:

On 3/17/2011 3:43 AM, Vadim Kisselmann wrote:
Unfortunately, this doesn't seem to be the problem. The queries
themselves are running fine. The problem is that the replication is
crawling when there are many queries going on and that the replication
speed stays low even after the load is gone.

If you run "iostat 5" what are typical values on each iteration for
the various CPU states while you're doing load testing and replication
at the same time?  In particular, %iowait is important.



CPU stats from top (iostat doesn't seem to show CPU load correctly):

90.1%us,  4.5%sy,  0.0%ni,  5.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st

Seems like I/O is not the bottleneck here.

Another interesting thing: when Solr starts its replication under heavy
load, it tries to download the whole index from the master.

From /solr/admin/replication/index.jsp:

Current Replication Status

Start Time: Thu Mar 17 15:57:20 CET 2011
Files Downloaded: 9 / 163
Downloaded: 83,04 MB / 97,75 GB [0.0%]
Downloading File: _d5x.nrm, Downloaded: 86,82 KB / 86,82 KB [100.0%]
Time Elapsed: 419s, Estimated Time Remaining: 504635s, Speed: 202,94 
KB/s


Re: Replication slows down massively during high load

2011-03-17 Thread Vadim Kisselmann
Hi Bill,

> You could always rsync the index dir and reload (old scripts).

I used them previously but was getting problems with them. The
application querying the Solr doesn't cause enough load on it to
trigger the issue. Yet.

> But this is still something we should investigate.

Indeed :-)

> See if the Nic is configured right? Routing? Speed of transfer?

Network doesn't seem to be the problem. Testing with iperf from slave
to master yields a full gigabit, even while Solrmeter is hammering the
server.

> Bill Bell

Vadim


Unbuffered entity enclosing request can not be repeated & Invalid chunk header

2011-08-04 Thread Vadim Kisselmann
Hello folks,

I use Solr 1.4.1 and every 2 to 6 hours I see indexing errors in my log
files.

on the client side:
2011-08-04 12:01:18,966 ERROR [Worker-242] IndexServiceImpl - Indexing
failed with SolrServerException.
Details: org.apache.commons.httpclient.ProtocolException: Unbuffered entity
enclosing request can not be repeated.:
Stacktrace: 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:469)
.
.
on the server side:
INFO: [] webapp=/solr path=/update params={wt=javabin&version=1} status=0
QTime=3
04.08.2011 12:01:18 org.apache.solr.update.processor.LogUpdateProcessor
finish
INFO: {} 0 0
04.08.2011 12:01:18 org.apache.solr.common.SolrException log
SCHWERWIEGEND: org.apache.solr.common.SolrException: java.io.IOException:
Invalid chunk header
.
.
.
I'm indexing ONE document per call, 15-20 documents per second, 24/7.
What might be the problem?

best regards
vadim


Last successful build of Solr 4.0 and Near Realtime Search

2011-08-12 Thread Vadim Kisselmann
Hi folks,

I'm writing here again (besides Jira: SOLR-2565); maybe someone here can help:


I tested the nightly build #1595 with a new patch (2565), but NRT doesn't
work in my case.

I index 10 docs/sec, and it takes 1-30 sec. to see the results.
Same behavior when I update an existing document.

My addedDate is a timestamp (default="NOW"). In the worst case I can tell from it that
a document I indexed has already been in my index for more than 30
seconds, but I still can't see it in search results.

My Settings:

1000
6



1
1000
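
With tags, the settings above correspond roughly to the following in
solrconfig.xml (the concrete numbers here are only an example, not
necessarily exactly mine):

  <autoCommit>
    <maxDocs>1000</maxDocs>
    <maxTime>60000</maxTime>  <!-- ms -->
  </autoCommit>
  <autoSoftCommit>
    <maxTime>1000</maxTime>   <!-- ms -->
  </autoSoftCommit>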


Are my settings wrong, or do you need more details?
Should I use the coldSearcher (default=false)? Or set maxWarmingSearchers
higher than 2?
UPDATE:
If I only use autoSoftCommit and comment out autoCommit, it works.
But I should use the "hard" autoCommit as well, right?
Mark said yes, because only with hard commits are my docs in stable storage:
http://www.lucidimagination.com/blog/2011/07/11/benchmarking-the-new-solr-
‘near-realtime’-improvements/

Regards
Vadim


Re: Unbuffered entity enclosing request can not be repeated & Invalid chunk header

2011-08-12 Thread Vadim Kisselmann
Hi Markus,

thanks for your answer.
I'm using Solr 4.0 and Jetty now and will observe the behavior and my error logs
next week.
Tomcat can be a reason, we will see; I'll report back.

I'm indexing WITHOUT batches, one doc after another. But I would try out
batch indexing as well as
retrying faulty docs.
If you index one batch and one doc in the batch is corrupt, what happens
with the other 249 docs (total 250/batch)? Are they indexed and
updated when you retry the batch, or does the complete batch fail?

Regards
Vadim




2011/8/11 Markus Jelsma 

> Hi,
>
> We  see these errors too once on a while but there is real answer on the
> mailing list here except one user suspecting Tomcat is responsible
> (connection
> time outs).
>
> Another user proposed to limit the number of documents per batch but that,
> of
> course, increases the number of connections made. We do only 250 docs/batch
> to
> limit RAM usage on the client and start to see these errors very
> occasionally.
> There may be a coincidence.. or not.
>
> Anyway, it's really hard to reproduce if not impossible. It happens when
> connecting directly as well when connecting through a proxy.
>
> What you can do is simply retry the batch and it usually works out fine. At
> least you don't loose a batch in the process. We retry all failures at
> least a
> couple of times before giving up an indexing job.
>
> Cheers,
>
> > Hello folks,
> >
> > i use solr 1.4.1 and every 2 to 6 hours i have indexing errors in my log
> > files.
> >
> > on the client side:
> > 2011-08-04 12:01:18,966 ERROR [Worker-242] IndexServiceImpl - Indexing
> > failed with SolrServerException.
> > Details: org.apache.commons.httpclient.ProtocolException: Unbuffered
> entity
> > enclosing request can not be repeated.:
> > Stacktrace:
> >
> org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHtt
> > pSolrServer.java:469) .
> > .
> > on the server side:
> > INFO: [] webapp=/solr path=/update params={wt=javabin&version=1} status=0
> > QTime=3
> > 04.08.2011 12:01:18 org.apache.solr.update.processor.LogUpdateProcessor
> > finish
> > INFO: {} 0 0
> > 04.08.2011 12:01:18 org.apache.solr.common.SolrException log
> > SCHWERWIEGEND: org.apache.solr.common.SolrException: java.io.IOException:
> > Invalid chunk header
> > .
> > .
> > .
> > i`m indexing ONE document per call, 15-20 documents per second, 24/7.
> > what may be the problem?
> >
> > best regards
> > vadim
>


Re: Dismax Question

2012-07-02 Thread Vadim Kisselmann
In your schema.xml you can set the default query parser operator, in
your case <solrQueryParser defaultOperator="AND"/>, but it's
deprecated.
When you use edismax, read this: http://drupal.org/node/1559394 .
The mm parameter is the answer here.
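
A sketch of an edismax handler with mm forced to 100% (field names are just
placeholders):

  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">edismax</str>
      <str name="qf">title text</str>
      <!-- mm=100% makes all split terms required, similar to q.op=AND -->
      <str name="mm">100%</str>
    </lst>
  </requestHandler>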

Best regards
Vadim





2012/7/2 Steve Fatula :
> Let's say a user types in:
>
> DualHead2Go
>
>
> The way solr is working, it splits this into:
>
> Dual Head 2 Go
>
> And searches the index for various fields, and finds records where any ONE of 
> them matches.
>
> Now, if I simply type the search terms Dual Head 2 Go, it finds records where 
> ALL of them match. This is because we set q.op to AND.
>
> Recently, we went from Solr 3.4 to 3.6, and, 3.4 used to work ok, 3.6 seems to 
> behave differently, or, perhaps we mucked something up.
>
> So, my question is how do we get Solr search to work with AND when it is 
> splitting words? The splitting part is good, the bad part is that it is 
> searching for any one of those split words.
>
> Steve


Re: Trunk error in Tomcat

2012-07-03 Thread Vadim Kisselmann
same problem here:

https://mail.google.com/mail/u/0/?ui=2&view=btop&ver=18zqbez0n5t35&q=tomcat%20v.kisselmann&qs=true&search=query&th=13615cfb9a5064bd&qt=kisselmann.1.tomcat.1.tomcat's.1.v.1&cvid=3


https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230056#comment-13230056

I use an older solr-trunk version from February/March; it works. With
newer versions from trunk I get the same error: "This interface
requires that you activate the admin request handlers..."

regards
vadim



2012/7/3 Briggs Thompson :
> Also, I forgot to include this before, but there is a client side error
> which is a failed 404 request to the below URL.
>
> http://localhost:8983/solr/null/admin/system?wt=json
>
> On Tue, Jul 3, 2012 at 8:45 AM, Briggs Thompson > wrote:
>
>> Thanks Erik. If anyone else has any ideas about the NoSuchFieldError issue
>> please let me know. Thanks!
>>
>> -Briggs
>>
>>
>> On Mon, Jul 2, 2012 at 6:27 PM, Erik Hatcher wrote:
>>
>>> Interestingly, I just logged the issue of it not showing the right error
>>> in the UI here: 
>>>
>>> As for your specific issue, not sure, but the error should at least also
>>> show in the admin view.
>>>
>>> Erik
>>>
>>>
>>> On Jul 2, 2012, at 18:59 , Briggs Thompson wrote:
>>>
>>> > Hi All,
>>> >
>>> > I just grabbed the latest version of trunk and am having a hard time
>>> > getting it running properly in tomcat. It does work fine in Jetty. The
>>> > admin screen gives the following error:
>>> > This interface requires that you activate the admin request handlers,
>>> add
>>> > the following configuration to your  Solrconfig.xml
>>> >
>>> > I am pretty certain the front end error has nothing to do with the
>>> actual
>>> > error. I have seen some other folks on the distro with the same problem,
>>> > but none of the threads have a solution (that I could find). Below is
>>> the
>>> > stack trace. I also tried with different versions of Lucene but none
>>> > worked. Note: my index is EMPTY and I am not migrating over an index
>>> build
>>> > with a previous version of lucene. I think I ran into this a while ago
>>> with
>>> > an earlier version of trunk, but I don't recall doing anything to fix
>>> it.
>>> > Anyhow, if anyone has an idea with this one, please let me know.
>>> >
>>> > Thanks!
>>> > Briggs Thompson
>>> >
>>> > SEVERE: null:java.lang.NoSuchFieldError: LUCENE_50
>>> > at
>>> >
>>> org.apache.solr.analysis.SynonymFilterFactory$1.createComponents(SynonymFilterFactory.java:83)
>>> > at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:83)
>>> > at
>>> >
>>> org.apache.lucene.analysis.synonym.SynonymMap$Builder.analyze(SynonymMap.java:120)
>>> > at
>>> >
>>> org.apache.lucene.analysis.synonym.SolrSynonymParser.addInternal(SolrSynonymParser.java:99)
>>> > at
>>> >
>>> org.apache.lucene.analysis.synonym.SolrSynonymParser.add(SolrSynonymParser.java:70)
>>> > at
>>> >
>>> org.apache.solr.analysis.SynonymFilterFactory.loadSolrSynonyms(SynonymFilterFactory.java:131)
>>> > at
>>> >
>>> org.apache.solr.analysis.SynonymFilterFactory.inform(SynonymFilterFactory.java:93)
>>> > at
>>> >
>>> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:584)
>>> > at org.apache.solr.schema.IndexSchema.(IndexSchema.java:112)
>>> > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:812)
>>> > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:510)
>>> > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:333)
>>> > at
>>> >
>>> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:282)
>>> > at
>>> >
>>> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:101)
>>> > at
>>> >
>>> org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:277)
>>> > at
>>> >
>>> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:258)
>>> > at
>>> >
>>> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:382)
>>> > at
>>> >
>>> org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:103)
>>> > at
>>> >
>>> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4649)
>>> > at
>>> >
>>> org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5305)
>>> > at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
>>> > at
>>> >
>>> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:899)
>>> > at
>>> org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:875)
>>> > at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:618)
>>> > at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:963)
>>> > at
>>> >
>>> org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1600)
>>> > at
>>> java.util.concurrent.Executors$R

Re: Trunk error in Tomcat

2012-07-03 Thread Vadim Kisselmann
Hi Stefan,
sorry, I overlooked your latest comment with the new issue in SOLR-3238 ;)
Should I open a new issue? I haven't tested newer
trunk versions for a couple of months because
Solr Cloud with an external ZK and Tomcat fails too, but I can do it
and post all the errors I find in my log files.
Regards
Vadim



2012/7/3 Stefan Matheis :
> Hey Vadim
>
> Right now JIRA is Down for Maintenance, but afaik there was another comment 
> asking for more informations. I'll check Eric's Issue today or tomorrow and 
> see how we can handle (and hopefully fix) that.
>
> Regards
> Stefan
>
>
> On Tuesday, July 3, 2012 at 4:00 PM, Vadim Kisselmann wrote:
>
>> same problem here:
>>
>> https://mail.google.com/mail/u/0/?ui=2&view=btop&ver=18zqbez0n5t35&q=tomcat%20v.kisselmann&qs=true&search=query&th=13615cfb9a5064bd&qt=kisselmann.1.tomcat.1.tomcat's.1.v.1&cvid=3
>>
>>
>> https://issues.apache.org/jira/browse/SOLR-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230056#comment-13230056
>>
>> i use an older solr-trunk version from february/march, it works. with
>> newer versions from trunk i get the same error: "This interface
>> requires that you activate the admin request handlers..."
>>
>> regards
>> vadim
>
>
>


Re: Trunk error in Tomcat

2012-07-05 Thread Vadim Kisselmann
Hi Stefan,
ok, I will test the latest version from trunk with Tomcat in the next
few days and open a new issue :)
regards
Vadim


2012/7/3 Stefan Matheis :
> On Tuesday, July 3, 2012 at 8:10 PM, Vadim Kisselmann wrote:
>> sorry, i overlooked your latest comment with the new issue in SOLR-3238 ;)
>> Should i open an new issue?
>
>
> NP Vadim, yes a new Issue would help .. all available Information too :)


Solr 4.0 IllegalStateException: this writer hit an OutOfMemoryError; cannot commit

2012-07-10 Thread Vadim Kisselmann
Hi folks,
my test server with Solr 4.0 from trunk (version 1292064 from late
February) throws this exception...


auto commit error...:java.lang.IllegalStateException: this writer hit
an OutOfMemoryError; cannot commit
at 
org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2650)
at 
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2804)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2786)
at 
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:391)
at org.apache.solr.update.CommitTracker.run(CommitTracker.java:197)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)


My server has 24GB RAM, 8GB for the JVM. I index roughly 20 docs per
second, my index is small with 10 million docs. It runs
for a couple of weeks and then suddenly I get these errors..
I can't see any problems in VisualVM with my GC. It's all ok, memory
consumption is about 6GB, no swapping, no I/O problems.. it's all
green :)
What's going on on this machine? :)  My uncommitted docs are gone, right?

Best regards
Vadim


Re: Solr 4.0 IllegalStateException: this writer hit an OutOfMemoryError; cannot commit

2012-07-10 Thread Vadim Kisselmann
Hi Robert,

> Can you run Lucene's checkIndex tool on your index?

No, unfortunately not. This Solr should run without stoppage; a
Tomcat restart is ok, but not more :)
I tested newer trunk versions a couple of months ago, but they all fail
with Tomcat.
I will test 4.0-alpha in the next days with Tomcat and open a Jira issue
if it doesn't work with it.

> do you have another exception in your logs? To my knowledge, in all
> cases that IndexWriter throws an OutOfMemoryError, the original
> OutOfMemoryError is also rethrown (not just this IllegalStateException
> noting that at some point, it hit OOM.

Hmm, I checked older logs and found something new which I had not
seen in VisualVM: "Java heap space" problems, just before the OOM.
My JVM has 8GB -Xmx/-Xms, 16GB for the OS, nothing else on this machine.
According to the logs these errors pop up during normal operation: no optimizes,
no high load (max. 30 queries per minute) or anything special at that time.

SCHWERWIEGEND: null:ClientAbortException:  java.net.SocketException: Broken pipe
SCHWERWIEGEND: null:java.lang.OutOfMemoryError: Java heap space
SCHWERWIEGEND: auto commit error...:java.lang.IllegalStateException:
this writer hit an OutOfMemoryError; cannot commit
SCHWERWIEGEND: Error during auto-warming of
key:org.apache.solr.search.QueryResultKey@7cba935e:java.lang.OutOfMemoryError:
Java heap space
SCHWERWIEGEND: org.apache.solr.common.SolrException: Internal Server Error
SCHWERWIEGEND: null:org.apache.solr.common.SolrException: Internal Server Error

I knew these failures from working on virtual machines with Solr 1.4,
big indexes and ridiculously small -Xmx sizes.
But on real hardware, with enough RAM and fast disks/CPUs, it's new for me :)

Best regards
Vadim


Re: Solr 4.0 IllegalStateException: this writer hit an OutOfMemoryError; cannot commit

2012-07-11 Thread Vadim Kisselmann
Hi Simon,
I checked my log files one more time to get the error timestamps.
I got the first error at 14:37:

06.07.2012 14:37:52 org.apache.solr.common.SolrException log
SCHWERWIEGEND: null:ClientAbortException:  java.net.SocketException: Broken pipe
at 
org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:358)
at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:323)

Next one, and the first Java heap Space error at 17:35:
06.07.2012 17:35:36 org.apache.solr.common.SolrException log
SCHWERWIEGEND: null:java.lang.OutOfMemoryError: Java heap space
at 
org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.(FreqProxTermsWriterPerField.java:248)
at 
org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.newInstance(FreqProxTermsWriterPerField.java:269)
at 
org.apache.lucene.index.ParallelPostingsArray.grow(ParallelPostingsArray.java:48)
at 
org.apache.lucene.index.TermsHashPerField$PostingsBytesStartArray.grow(TermsHashPerField.java:307)
at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:330)

Commit failure a couple of seconds later:
06.07.2012 17:35:38 org.apache.solr.common.SolrException log
SCHWERWIEGEND: auto commit error...:java.lang.IllegalStateException:
this writer hit an OutOfMemoryError; cannot commit
at 
org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2650)

followed by 10 Java heap space exceptions, and one minute later at 17:36
the first auto-warming exception:
06.07.2012 17:36:26 org.apache.solr.common.SolrException log
SCHWERWIEGEND: Error during auto-warming of key:pubDate:[1340971496000
TO 1341576296000]:java.lang.OutOfMemoryError: Java heap space
06.07.2012 17:36:28 org.apache.solr.common.SolrException log
SCHWERWIEGEND: Error during auto-warming of key:pubDate:[1340971495000
TO 1341576295000]:java.lang.OutOfMemoryError: Java heap space

> it really seems that you are hitting an OOM during auto warming. can
> this be the case for your failure.
> Can you raise the JVM memory and see if you still hit the spike and go
> OOM? this is very unlikely a IndexWriter problem. I'd rather look at
> your warmup queries ie. fieldcache, FieldValueCache usage. Are you
> sorting / facet on anything?

The auto-warming problems began one minute after the "Java
heap" exceptions, so I think these are follow-on problems.
I configured very small caches (max. sizes between 512 and 2048) for my use case.
The warming queries look like this, with sorting, but without faceting:

   ag
   pubDate:[NOW-1DAY TO *]
   pubDate desc
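
With tags, such a warming entry in solrconfig.xml would look roughly like
this (which of the stripped values above is q, fq or sort is a guess):

  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">ag</str>
        <str name="fq">pubDate:[NOW-1DAY TO *]</str>
        <str name="sort">pubDate desc</str>
      </lst>
    </arr>
  </listener>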


Do you think that 8GB for the JVM is not enough? Raising the JVM memory
might solve the problem..
As mentioned, this server ran a long time with the same config
without problems; I am surprised that this problem appeared at a
time
without heavy usage... now it's running smoothly again after the restart
yesterday, so I don't know when the problem will appear again.

I'll try to update to 4.0 alpha today, run it with Tomcat and report :)

Best regards
Vadim





2012/7/10 Simon Willnauer :
> it really seems that you are hitting an OOM during auto warming. can
> this be the case for your failure.
> Can you raise the JVM memory and see if you still hit the spike and go
> OOM? this is very unlikely a IndexWriter problem. I'd rather look at
> your warmup queries ie. fieldcache, FieldValueCache usage. Are you
> sorting / facet on anything?
>
> simon
>
> On Tue, Jul 10, 2012 at 4:49 PM, Vadim Kisselmann
>  wrote:
>> Hi Robert,
>>
>>> Can you run Lucene's checkIndex tool on your index?
>>
>> No, unfortunately not. This Solr should run without stoppage, an
>> tomcat-restart is ok, but not more:)
>> I tested newer trunk-versions a couple of months ago, but they fail
>> all with tomcat.
>> i would test 4.0-alpha in next days with tomcat and open an jira-issue
>> if it doesn't work with it.
>>
>>> do you have another exception in your logs? To my knowledge, in all
>>> cases that IndexWriter throws an OutOfMemoryError, the original
>>> OutOfMemoryError is also rethrown (not just this IllegalStateException
>>> noting that at some point, it hit OOM.
>>
>> Hmm, i checked older logs and found something new, what i have not
>> seen in VisualVM. "Java heap space"-Problems, just before OOM.
>> My JVM has 8GB -Xmx/-Xms, 16GB for OS, nothing else on this machine.
>> This Errors pop up's during normal run according logs, no optimizes,
>> high loads(max. 30 queries per minute) or something special at this time.
>>
>> SCHWERWIEGEND: null:ClientAbortException:  java.net.SocketException: Broken 
>> pipe
>> SCHWERWIEGEND: null:java.lang.OutOfMemoryError: Java heap space

Re: Trunk error in Tomcat

2012-07-12 Thread Vadim Kisselmann
It works, with a "few" changes :) I think we don't need a new issue in Jira.

Solr 4.0 is no longer the Solr 4.0 of late February.
There were some changes to solrconfig.xml in this time.
I migrated my Solr 4.0 trunk config, which worked until late February, into
a new config from 4.0 alpha.

A couple of changes which I noticed:
- abortOnConfigurationError:true is gone
- luceneMatchVersion was changed to LUCENE_50
- a couple of new jars included for velocity and lang
- new directoryFactory = solr.directoryFactory:solr.NRTCachingDirectoryFactory
- indexDefaults replaced by indexConfig
- updateLog added
- replication handler for SolrCloud added
- names for handlers were changed, like "/select" for "search"
- new handler added
and so on...

 This "AdminHandler"-Exception is still there, when i use the
clusteringComponent, see here:
SCHWERWIEGEND: null:org.apache.solr.common.SolrException: Error
loading class 'solr.clustering.ClusteringComponent'

But if i comment it out, Solr starts without errors.
The paths to the clustering jar in ../contrib/clustering/lib/ is
correct and the needed jars are there, eventually we need new
jar-files?
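
For reference, the lib directives I mean are roughly these (the relative
paths depend on the install layout, so treat them as an example):

  <lib dir="../../contrib/clustering/lib/" regex=".*\.jar" />
  <lib dir="../../dist/" regex="apache-solr-clustering-\d.*\.jar" />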

Best regards
Vadim




2012/7/5 Stefan Matheis :
> Great, thanks Vadim
>
>
>
> On Thursday, July 5, 2012 at 9:34 AM, Vadim Kisselmann wrote:
>
>> Hi Stefan,
>> ok, i would test the latest version from trunk with tomcat in next
>> days and open an new issue:)
>> regards
>> Vadim
>>
>>
>> 2012/7/3 Stefan Matheis > (mailto:matheis.ste...@googlemail.com)>:
>> > On Tuesday, July 3, 2012 at 8:10 PM, Vadim Kisselmann wrote:
>> > > sorry, i overlooked your latest comment with the new issue in SOLR-3238 
>> > > ;)
>> > > Should i open an new issue?
>> >
>> >
>> >
>> >
>> > NP Vadim, yes a new Issue would help .. all available Information too :)
>
>


Re: Pb installation Solr/Tomcat6

2012-07-14 Thread Vadim Kisselmann
Same problem.
Here, too, tomcat6 needs the right to read/write your index.
regards
vadim


2012/7/14 Bruno Mannina :
> I found the problem I think, It was a permission problem on the schema.xml
>
> schema.xml was only readable by the solr user.
>
> Now I have the same problem with the solr index directory
>
> On 14/07/2012 14:00, Bruno Mannina wrote:
>
>> Dear Solr users,
>>
>> I try to run solr/ with tomcat but I have always this error:
>> Can't find resource 'schema.xml' in classpath or
>> '/home/solr/apache-solr-3.6.0/example/solr/./conf/', cwd='/var/lib/tomcat6
>>
>> but schema.xml is inside the directory
>> '/home/solr/apache-solr-3.6.0/example/solr/./conf/'
>>
>> http://localhost:8080/manager/html => works fine, I see Applications
>> /solr, fonctionnelle True
>>
>> but when I click on solr/ (http://localhost:8080/solr/) I get this error.
>>
>> Could you help me to solve this problem, it makes me crazy.
>>
>> thanks a lot,
>> Bruno
>>
>>
>> Tomcat6
>> Ubuntu 12.04
>> Solr 3.6
>>
>>
>
>


Does SolrEntityProcessor fulfill my requirements?

2012-07-18 Thread Vadim Kisselmann
Hi folks,

I have this case:
I want to upgrade my Solr 4.0 from trunk to Solr 4.0 alpha. The index
structure has changed, so I can't replicate.
10 cores are in use, each with 30 million docs. Assume that all fields
are stored and indexed.
What is the best way to export the docs from all cores on one machine
with Solr 4.0 trunk to identically named cores on another machine with Solr 4.0
alpha?
SolrEntityProcessor can be one solution, but does it work with this
amount of data? I want to reindex all docs at once and not in "small"
parts. I find no examples
of bigger reindex attempts with SolrEntityProcessor.
XSLT as option two?
What would be the best solution for this, what do you think?
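
A data-config.xml with SolrEntityProcessor would presumably look roughly
like this per core (URL, core name and rows are just placeholders):

  <dataConfig>
    <document>
      <entity name="copy"
              processor="SolrEntityProcessor"
              url="http://old-host:8080/solr/core1"
              query="*:*"
              rows="500"
              fl="*"/>
    </document>
  </dataConfig>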

Best Regards
Vadim


Re: Problem to start solr-4.0.0-BETA with tomcat-6.0.20

2012-08-24 Thread Vadim Kisselmann
A guess:
do you use your "old" solrconfig.xml files from older installations?
If so, compare the default config with yours.


2012/8/23 Claudio Ranieri :
> I made this installation on a new Tomcat.
> Solr 3.4.*, 3.5.*, 3.6.* work with the jars in 
> $TOMCAT_HOME/webapps/solr/WEB-INF/lib, but with Solr 4.0 beta it doesn't work. I 
> needed to add the jars into $TOMCAT_HOME/lib.
> The problem with the cast seems to be in the source code.
>
>
> -Mensagem original-
> De: Karthick Duraisamy Soundararaj [mailto:karthick.soundara...@gmail.com]
> Enviada em: quinta-feira, 23 de agosto de 2012 09:22
> Para: solr-user@lucene.apache.org
> Assunto: Re: Problem to start solr-4.0.0-BETA with tomcat-6.0.20
>
> Not sure if this can help. But once I had a similar problem with Solr 3.6.0 
> where tomcat refused to find one of the classes that existed. I deleted the 
> tomcat's webapp directory and then it worked fine.
>
> On Thu, Aug 23, 2012 at 8:19 AM, Erick Erickson 
> wrote:
>
>> First, I'm no Tomcat expert here's the Tomcat Solr page, but
>> you've probably already seen it:
>> http://wiki.apache.org/solr/SolrTomcat
>>
>> But I'm guessing that you may have old jars around somewhere and
>> things are getting confused. I'd blow away the whole thing and start
>> over, whenever I start copying jars around I always lose track of
>> what's where.
>>
>> Have you successfully had any other Solr operate under Tomcat?
>>
>> Sorry I can't be more help
>> Erick
>>
>> On Wed, Aug 22, 2012 at 9:47 AM, Claudio Ranieri
>>  wrote:
>> > Hi,
>> >
>> > I tried to start the solr-4.0.0-BETA with tomcat-6.0.20 but does not
>> work.
>> > I copied the apache-solr-4.0.0-BETA.war to $TOMCAT_HOME/webapps.
>> > Then I
>> copied the directory apache-solr-4.0.0-BETA\example\solr to
>> C:\home\solr-4.0-beta and adjusted the file
>> $TOMCAT_HOME\conf\Catalina\localhost\apache-solr-4.0.0-BETA.xml to
>> point the solr/home to C:/home/solr-4.0-beta. With this configuration,
>> when I startup tomcat I got:
>> >
>> > SEVERE: org.apache.solr.common.SolrException: Invalid
>> > luceneMatchVersion
>> 'LUCENE_40', valid values are: [LUCENE_20, LUCENE_21, LUCENE_22,
>> LUCENE_23, LUCENE_24, LUCENE_29, LUCENE_30, LUCENE_31, LUCENE_32,
>> LUCENE_33, LUCENE_34, LUCENE_35, LUCENE_36, LUCENE_CURRENT ] or a string in 
>> format 'VV'
>> >
>> > So I changed the line in solrconfig.xml:
>> >
>> > <luceneMatchVersion>LUCENE_40</luceneMatchVersion>
>> >
>> > to
>> >
>> > <luceneMatchVersion>LUCENE_CURRENT</luceneMatchVersion>
>> >
>> > So I got a new error:
>> >
>> > Caused by: java.lang.ClassNotFoundException:
>> solr.NRTCachingDirectoryFactory
>> >
>> > This class is within the file apache-solr-core-4.0.0-BETA.jar but
>> > for
>> some reason classloader of the class is not loaded. I then moved all
>> jars in $TOMCAT_HOME\webapps\apache-solr-4.0.0-BETA\WEB-INF\lib to
>> $TOMCAT_HOME\lib.
>> > After this setup, I got a new error:
>> >
>> > SEVERE: java.lang.ClassCastException:
>> org.apache.solr.core.NRTCachingDirectoryFactory can not be cast to
>> org.apache.solr.core.DirectoryFactory
>> >
>> > So I changed the line in solrconfig.xml:
>> >
>> > > >
>> class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
>> >
>> > to
>> >
>> > > >
>> class="${solr.directoryFactory:solr.NIOFSDirectoryFactory}"/>
>> >
>> > So I got a new error:
>> >
>> > Caused by: java.lang.ClassCastException:
>> org.apache.solr.spelling.DirectSolrSpellChecker can not be cast to
>> org.apache.solr.spelling.SolrSpellChecker
>> >
>> > How can I resolve the problem of classloader?
>> > How can I resolve the problem of cast of NRTCachingDirectoryFactory
>> > and
>> DirectSolrSpellChecker?
>> > I can not startup the solr 4.0 beta with tomcat.
>> > Thanks,
>> >
>> >
>> >
>> >
>>
>
>
>
> --
> --
> Karthick D S
> Master's in Computer Engineering ( Software Track ) Syracuse University 
> Syracuse - 13210 New York United States of America


Re: flush (delete all document) solr 4 Beta

2012-08-27 Thread Vadim Kisselmann
Your docs are only marked as deleted.
You should optimize after the commit, then they will really be removed.
It's easier and faster to stop your Jetty/Tomcat, drop your index
directory and start your servlet container again...
If that's not possible, then optimize.
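
Via plain XML posts to /update, that sequence is roughly:

  <delete><query>*:*</query></delete>
  <commit/>
  <optimize/>
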
regards
Vadim


2012/8/27 Jamel ESSOUSSI :
> Hi,
>
> I should flush solr (delete all existing documents)
> --> for doing this, I have the following code:
>
>
> HttpSolrServer server = new HttpSolrServer(url);
>
> server.setSoTimeout(1000);
> server.setConnectionTimeout(100);
> server.setDefaultMaxConnectionsPerHost(100);
> server.setMaxTotalConnections(100);
> server.setFollowRedirects(false);
> server.setAllowCompression(true);
> server.setMaxRetries(1);
> server.setParser(new XMLResponseParser());
>
> UpdateResponse ur = server.deleteByQuery("*:*");
>
> server.commit(true, true);
>
> As a result, I still have all documents --> ur.getStatus() is "0"
> and the Solr documents were not deleted
>
> --> I don't get any server or client errors
>
> Can you explain to me why it did not work?
>
> Thanks
>
>
>
>
>
>
>
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/flush-delete-all-document-solr-4-Beta-tp4003434.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Problem to start solr-4.0.0-BETA with tomcat-6.0.20

2012-08-28 Thread Vadim Kisselmann
Hi Claudio,
Great to hear that it works.
Anyone can edit the wiki; you only need to log in.
Regards
Vadim
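
P.S. For reference, the solr.xml change described below amounts to roughly
this (core names are just placeholders):

  <solr persistent="true" sharedLib="lib">
    <cores adminPath="/admin/cores">
      <core name="collection1" instanceDir="collection1"/>
    </cores>
  </solr>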


2012/8/27 Claudio Ranieri :
> I solved the problem.
> I added the parameter sharedLib="lib" in $SOLR_HOME/solr.xml (<solr persistent="true" sharedLib="lib">) and moved all jars from 
> $TOMCAT_HOME/webapps/solr/WEB-INF/lib to $SOLR_HOME/lib
> This information could be included in the wiki Solr / Tomcat.
>
> Claudio Ranieri | Especialista Sistemas de Busca | S.A O Estado de S.Paulo
> Av. Eng. Caetano Álvares, 55 - Limão - São Paulo - SP - 02598-900
> + 55 11 3856-5790 | + 55 11 9344-2674
>
>
>
>
>
> -Mensagem original-
> De: Claudio Ranieri [mailto:claudio.rani...@estadao.com]
> Enviada em: segunda-feira, 27 de agosto de 2012 10:34
> Para: solr-user@lucene.apache.org
> Assunto: RES: Problem to start solr-4.0.0-BETA with tomcat-6.0.20
>
> Can anyone help me?
>
>
> -Mensagem original-
> De: Claudio Ranieri [mailto:claudio.rani...@estadao.com]
> Enviada em: sexta-feira, 24 de agosto de 2012 11:40
> Para: solr-user@lucene.apache.org
> Assunto: RES: Problem to start solr-4.0.0-BETA with tomcat-6.0.20
>
> Hi Vadim,
> No, I used the entire apache-solr-4.0.0-BETA\example\solr (schema.xml, 
> solrconfig.xml ...)
>
>
> -Mensagem original-
> De: Vadim Kisselmann [mailto:v.kisselm...@gmail.com] Enviada em: sexta-feira, 
> 24 de agosto de 2012 07:26
> Para: solr-user@lucene.apache.org
> Assunto: Re: Problem to start solr-4.0.0-BETA with tomcat-6.0.20
>
> a presumption:
> do you use your "old" solrconfig.xml files from older installations?
> when yes, compare the default config and yours.
>
>
> 2012/8/23 Claudio Ranieri :
>> I made this instalation on a new tomcat.
>> With Solr 3.4.*, 3.5.*, 3.6.* works with jars into 
>> $TOMCAT_HOME/webapps/solr/WEB-INF/lib, but with solr 4.0 beta doesn´t work. 
>> I needed to add the jars into $TOMCAT_HOME/lib.
>> The problem with the cast seems to be in the source code.
>>
>>
>> -Mensagem original-
>> De: Karthick Duraisamy Soundararaj
>> [mailto:karthick.soundara...@gmail.com]
>> Enviada em: quinta-feira, 23 de agosto de 2012 09:22
>> Para: solr-user@lucene.apache.org
>> Assunto: Re: Problem to start solr-4.0.0-BETA with tomcat-6.0.20
>>
>> Not sure if this can help. But once I had a similar problem with Solr 3.6.0 
>> where tomcat refused to find one of the classes that existed. I deleted the 
>> tomcat's webapp directory and then it worked fine.
>>
>> On Thu, Aug 23, 2012 at 8:19 AM, Erick Erickson 
>> wrote:
>>
>>> First, I'm no Tomcat expert here's the Tomcat Solr page, but
>>> you've probably already seen it:
>>> http://wiki.apache.org/solr/SolrTomcat
>>>
>>> But I'm guessing that you may have old jars around somewhere and
>>> things are getting confused. I'd blow away the whole thing and start
>>> over, whenever I start copying jars around I always lose track of
>>> what's where.
>>>
>>> Have you successfully had any other Solr operate under Tomcat?
>>>
>>> Sorry I can't be more help
>>> Erick
>>>
>>> On Wed, Aug 22, 2012 at 9:47 AM, Claudio Ranieri
>>>  wrote:
>>> > Hi,
>>> >
>>> > I tried to start the solr-4.0.0-BETA with tomcat-6.0.20 but does
>>> > not
>>> work.
>>> > I copied the apache-solr-4.0.0-BETA.war to $TOMCAT_HOME/webapps.
>>> > Then I
>>> copied the directory apache-solr-4.0.0-BETA\example\solr to
>>> C:\home\solr-4.0-beta and adjusted the file
>>> $TOMCAT_HOME\conf\Catalina\localhost\apache-solr-4.0.0-BETA.xml to
>>> point the solr/home to C:/home/solr-4.0-beta. With this
>>> configuration, when I startup tomcat I got:
>>> >
>>> > SEVERE: org.apache.solr.common.SolrException: Invalid
>>> > luceneMatchVersion
>>> 'LUCENE_40', valid values are: [LUCENE_20, LUCENE_21, LUCENE_22,
>>> LUCENE_23, LUCENE_24, LUCENE_29, LUCENE_30, LUCENE_31, LUCENE_32,
>>> LUCENE_33, LUCENE_34, LUCENE_35, LUCENE_36, LUCENE_CURRENT ] or a string in 
>>> format 'VV'
>>> >
>>> > So I changed the line in solrconfig.xml:
>>> >
>>> > LUCENE_40
>>> >
>>> > to
>>> >
>>> > LUCENE_CURRENT
>>> >
>>> > So I got a new error:
>>> >
>>> > Caused by: java.lang.ClassNotFoundException:
>>> solr.NRTCachingDirectoryFacto

Proximity(tilde) combined with wildcard, AutomatonQuery ?

2012-09-26 Thread Vadim Kisselmann
Hi guys,

Assume I have a simple query like this, with a wildcard and tilde:

"japa* fukushima"~10

instead of "japan fukushima"~10 OR "japanese fukushima"~10, etc.

Do we have a solution in Solr 4.0 to handle this kind of query?
Does the AutomatonQuery/Filter cover this case?

Best regards
Vadim


Re: Proximity(tilde) combined with wildcard, AutomatonQuery ?

2012-09-27 Thread Vadim Kisselmann
Hi Ahmet,
thanks for your reply:)
I see that it does not come with the 4.0 release, because the given
patches do not work with this version.
Right?
Best regards
Vadim


2012/9/26 Ahmet Arslan :
>
>> we assume i have a simple query like this with wildcard and
>> tilde:
>>
>> "japa* fukushima"~10
>>
>> instead of "japan fukushima"~10 OR "japanese fukushima"~10,
>> etc.
>>
>> Do we have a solution in Solr 4.0 to work with these kind of
>> queries?
>
> Vadim, two open jira issues:
>
> https://issues.apache.org/jira/browse/SOLR-1604
> https://issues.apache.org/jira/browse/LUCENE-1486
>


Re: How to run Solr Cloud using Tomcat?

2012-09-27 Thread Vadim Kisselmann
Hi Roy,
Yep, it works with Tomcat 6 and an external ZooKeeper.
I will publish a blog post about it tomorrow on sentric.ch.
The post is ready, but I had no time to publish it in the last
couple of days :)
Best regards
Vadim



2012/9/27 Markus Jelsma :
> Hi - on Debian systems there's a /etc/default/tomcat properties file you can 
> use to set your flags.
>
> -Original message-
>> From:Benjamin, Roy 
>> Sent: Thu 27-Sep-2012 19:57
>> To: solr-user@lucene.apache.org
>> Subject: How to run Solr Cloud using Tomcat?
>>
>> I've gone through the guide on running Solr Cloud using Jetty but it's not
>> practical to use JAVA_OPTS etc on real cloud deployments. I don't see how
>> to extend these instructions to running on Tomcat.
>>
>> Has anyone run Solr Cloud under Tomcat successfully?  Did they document how?
>>
>> Thanks
>>
>> Roy
>>


Re: Proximity(tilde) combined with wildcard, AutomatonQuery ?

2012-10-05 Thread Vadim Kisselmann
Hi Ahmet,
thank you, it sounds great:)
I will test it in the next days and give feedback.
Best regards
Vadim



2012/10/5 Ahmet Arslan :
> Hi Vadim,
>
> I attached a zip (solr plugin) file to SOLR-1604. This not a patch. This is 
> supposed to work with solr 4.0. Some tests fails but it should  work with 
> "pol* tel*"~5 types of queries.
>
> Ahmet
>
> --- On Thu, 9/27/12, Vadim Kisselmann  wrote:
>
>> From: Vadim Kisselmann 
>> Subject: Re: Proximity(tilde) combined with wildcard, AutomatonQuery ?
>> To: solr-user@lucene.apache.org
>> Date: Thursday, September 27, 2012, 10:38 AM
>> Hi Ahmet,
>> thanks for your reply:)
>> I see that it does not come with the 4.0 release, because
>> the given
>> patches do not work with this version.
>> Right?
>> Best regards
>> Vadim
>>
>>
>> 2012/9/26 Ahmet Arslan :
>> >
>> >> we assume i have a simple query like this with
>> wildcard and
>> >> tilde:
>> >>
>> >> "japa* fukushima"~10
>> >>
>> >> instead of "japan fukushima"~10 OR "japanese
>> fukushima"~10,
>> >> etc.
>> >>
>> >> Do we have a solution in Solr 4.0 to work with
>> these kind of
>> >> queries?
>> >
>> > Vadim, two open jira issues:
>> >
>> > https://issues.apache.org/jira/browse/SOLR-1604
>> > https://issues.apache.org/jira/browse/LUCENE-1486
>> >
>>


Re: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7

2012-10-15 Thread Vadim Kisselmann
Hi Rogerio,
I can imagine what it is. Tomcat extracts the war files in
/var/lib/tomcatXX/webapps.
If you already ran an older Solr version on your server, the old
extracted Solr war could still be there (keyword: Tomcat cache).
Delete the /var/lib/tomcatXX/webapps/solr folder and restart Tomcat;
then Tomcat should deploy your new war file.
Best regards
Vadim



2012/10/14 Rogerio Pereira :
> I'll try to be more specific Jack.
>
> I just download the apache-solr-4.0.0.zip, from this archive I took the
> core1 and core2 folders from multicore example and rename them to
> collection1 and collection2, I also did all necessary changes on solr.xml
> and solrconfig.xml and schema.xml on these two cores to reflect the new
> names.
>
> After this step I just tried to deploy and war file on tomcat pointing to
> the the directory (solr/home) where these two cores are located, solr.xml
> is there, with collection1 and collection2 properly configured.
>
> The question is, no matter what is contained in solr.xml, this file isn't
> read at Tomcat startup, I tried to cause a parser error on solr.xml by
> removing closing tags, but even with this change I can't get at least a
> parser error.
>
> I hope to be clear now.
>
>
> 2012/10/14 Jack Krupansky 
>
>> I can't quite parse "the same multicore deployment as we have on apache
>> solr 4.0 distribution archive". Could you rephrase and be more specific.
>> What "archive"?
>>
>> Were you already using 4.0-ALPHA or BETA (or some snapshot of 4.0) or are
>> you moving from pre-4.0 to 4.0? The directory structure did change in 4.0.
>> Look at the example/solr directory.
>>
>> -- Jack Krupansky
>>
>> -Original Message- From: Rogerio Pereira
>> Sent: Sunday, October 14, 2012 10:01 AM
>> To: solr-user@lucene.apache.org
>> Subject: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7
>>
>>
>> Hi,
>>
>> I tried to perform the same multicore deployment as we have on apache solr
>> 4.0 distribution archive, I created a directory for solr/home with solr.xml
>> inside and two subdirectories collection1 and collection2, these two cores
>> are properly configured with conf folder and solrconfi.xml and schema.xml,
>> on Tomcat I setup the system property pointing to solr/home path,
>> unfortunatelly when I start tomcat the solr.xml is ignored and only the
>> default collection1 is loaded.
>>
>> As a test, I made changes on solr.xml to cause parser errors, and guess
>> what? These errors aren't reported on tomcat startup.
>>
>> The same thing doesn't happens on multicore example that comes on
>> distribution archive, now I'm trying to figure out what's the black magic
>> happening.
>>
>> Let me do the same kind of deployment on Windows and Mac OSX, if persist,
>> I'll update this thread.
>>
>> Regards,
>>
>> Rogério
>>
>
>
>
> --
> Regards,
>
> Rogério Pereira Araújo
>
> Blogs: http://faces.eti.br, http://ararog.blogspot.com
> Twitter: http://twitter.com/ararog
> Skype: rogerio.araujo
> MSN: ara...@hotmail.com
> Gtalk/FaceTime: rogerio.ara...@gmail.com
>
> (0xx62) 8240 7212
> (0xx62) 3920 2666


Re: how solr4.0 and zookeeper run on weblogic

2012-10-16 Thread Vadim Kisselmann
Hi,
These are JAVA_OPTS params; you can find and set this stuff in the
startManagedWeblogic script.
Best regards
Vadim



2012/10/16 rayvicky :
> who can help me ?
> where do I set   -DzkRun -Dbootstrap_conf=true
> -DzkHost=localhost:9080   -DnumShards=2
> in weblogic
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/how-solr4-0-and-zookeeper-run-on-weblogic-tp4013882.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Re: how solr4.0 and zookeeper run on weblogic

2012-10-18 Thread Vadim Kisselmann
Hi,
What does your update/add command look like?
Regards
Vadim


2012/10/18 rayvicky :
> I made it work on WebLogic.
> But when I add or update the index, it errors out:
>
>
> <2012-10-17 ?Χ03?47·?3? CST> unexpected error occurred while retrieving the session for Web application: 
> weblogic.servlet.internal.WebAppServletContext@425eab87 - appName: 'solr', 
> name: 'solr', context-path: '/solr', spec-version: '2.5'.
> weblogic.utils.NestedRuntimeException: Cannot parse POST parameters of 
> request: '/solr/collection1/update'
> at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.mergePostParams(ServletRequestImpl.java:2021)
> at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.parseQueryParams(ServletRequestImpl.java:1901)
> at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.peekParameter(ServletRequestImpl.java:2047)
> at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.initSessionInfoWithContext(ServletRequestImpl.java:2602)
> at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.initSessionInfo(ServletRequestImpl.java:2506)
> Truncated. see log file for complete stacktrace
> java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at 
> weblogic.servlet.internal.PostInputStream.read(PostInputStream.java:142)
> at 
> weblogic.utils.http.HttpChunkInputStream.readChunkSize(HttpChunkInputStream.java:109)
> at 
> weblogic.utils.http.HttpChunkInputStream.initChunk(HttpChunkInputStream.java:71)
> Truncated. see log file for complete stacktrace
>>
> <2012-10-17 ?Χ03?47·?3? CST>
> <[weblogic.servlet.internal.WebAppServletContext@425eab87 - appName: 'solr', 
> name: 'solr', context-path: '/solr', spec-version: '2.5'] Servlet failed with 
> Exception
> java.lang.IllegalStateException: Failed to retrieve session: Cannot parse 
> POST parameters of request: '/solr/collection1/update'
> at 
> weblogic.servlet.security.internal.SecurityModule.getUserSession(SecurityModule.java:486)
> at 
> weblogic.servlet.security.internal.ServletSecurityManager.checkAccess(ServletSecurityManager.java:81)
> at 
> weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2116)
> at 
> weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2086)
> at 
> weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1406)
> Truncated. see log file for complete stacktrace
>>
> <2012-10-17 ?Χ03?47·?3? CST> failed
>  weblogic.utils.NestedRuntimeException: Cannot parse POST parameters of 
> request: '/solr/collection1/update'.
> weblogic.utils.NestedRuntimeException: Cannot parse POST parameters of 
> request: '/solr/collection1/update'
> at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.mergePostParams(ServletRequestImpl.java:2021)
> at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.parseQueryParams(ServletRequestImpl.java:1901)
> at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.peekParameter(ServletRequestImpl.java:2047)
> at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.initSessionInfoWithContext(ServletRequestImpl.java:2602)
> at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.initSessionInfo(ServletRequestImpl.java:2506)
> Truncated. see log file for complete stacktrace
> java.io.IOException: Malformed chunk
> at 
> weblogic.utils.http.HttpChunkInputStream.initChunk(HttpChunkInputStream.java:67)
> at 
> weblogic.utils.http.HttpChunkInputStream.read(HttpChunkInputStream.java:142)
> at 
> weblogic.utils.http.HttpChunkInputStream.read(HttpChunkInputStream.java:182)
> at 
> weblogic.servlet.internal.ServletInputStreamImpl.read(ServletInputStreamImpl.java:222)
> at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.mergePostParams(ServletRequestImpl.java:1995)
> Truncated. see log file for complete stacktrace
>>
>
> how to handle it ?
>
> thanks,
> ray.
>
>
> 2012-10-18
>
>
>
> zongweilei
>
>
>
> From: Jan_Høydahl_/_Cominvent_[via_Lucene]
> Sent: 2012-10-17  23:13:10
> To: rayvicky
> Cc:
> Subject: Re: how solr4.0 and zookeeper run on weblogic
>
> Did it work for you? You probably also have to set -Djetty.port=8080 in order 
> for local ZK not to be started on port 9983. It's confusing, but you can also 
> edit solr.xml to achieve the same.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> 17. okt. 2012 kl. 10:06 skrev rayvicky <[hidden email]>:
>
>> thanks
>>
>>
>>
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/how-solr4-0-and-zookeeper-run-on-weblogi

Re: Solr Cloud Implementation with Apache Tomcat

2012-11-04 Thread Vadim Kisselmann
Hi Guru,
Here is my blog post about this:
http://www.sentric.ch/blog/setting-up-solr-4-0-beta-with-tomcat-and-zookeeper
It's pretty simple, just follow the mentioned steps.
Best regards
Vadim



2012/9/5 bsargurunathan :
> Hi Markus,
>
> Can you please tell me the exact file name in the tomcat folder?
> Means where I have to set the properties?
> I am using Windows machine and I have the Tomcat6.
>
>
> Thanks,
> Guru
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-Cloud-Implementation-with-Apache-Tomcat-tp4005209p4005535.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Out Of Memory =( Too many cores on one server?

2012-11-16 Thread Vadim Kisselmann
Hi,
Your JVM needs more RAM. My setup works well with 10 cores and 300 million
docs, Xmx 8GB, Xms 8GB, 16GB for the OS.
But it's as Bernd mentioned: the memory consumption depends on the
number of fields and the fieldCache.
Best Regards
Vadim



2012/11/16 Bernd Fehling :
> I guess you should give JVM more memory.
>
> When starting to find a good value for -Xmx I "oversized" and  set
> it to Xmx20G and Xms20G. Then I monitored the system and saw that JVM is
> between 5G and 10G (java7 with G1 GC).
> Now it is finally set to Xmx11G and Xms11G for my system with 1 core and 38 
> million docs.
> But JVM memory depends pretty much on number of fields in schema.xml
> and fieldCache (sortable fields).
>
> Regards
> Bernd
>
> Am 16.11.2012 09:29, schrieb stockii:
>> Hello.
>>
>> if my server is running for a while i get some OOM Problems. I think the
>> problem is, that i running to many cores on one Server with too many
>> documents.
>>
>> this is my server concept:
>> 14 cores.
>> 1 with 30 million docs
>> 1 with 22 million docs
>> 1 with growing 25 million docs
>> 1 with 67 million docs
>> and the other cores are under 1 million docs.
>>
>> all these cores are running fine in one jetty and searching is very fast and
>> we are satisfied with this.
>> yesterday we got OOM.
>>
>> Do you think that we should "outsource" the big cores into another virtual
>> instance of the server? so that the JVM not share the memory and going OOM?
>> starting with: MEMORY_OPTIONS="-Xmx6g -Xms2G -Xmn1G"
>>