Re: It cost some many memory with solrj 3.5 & how to decrease it?

2012-04-05 Thread a sd
Hi Erick,
Thanks for the reply.
I watched the JVM at runtime with "jconsole" and "jmap".
1. When "-Xmx" was not set, the "Old Gen" area filled up to about 1.5 GB,
most of it instances of "String". Once the whole heap reached its maximum
(about 2 GB), the JVM ran GC, which wasted CPU time, and throughput dropped
sharply from 100,000 docs per minute to 10,000 docs per minute. As an
experiment, I deliberately set "-Xmx1024m", and throughput fell to 1,000
docs per minute.
2. With "-Xmx4096m", the "Old Gen" grew to 2.1 GB and the total JVM heap
to 3 GB, but the rate of 100,000 docs per minute could be sustained.
In all of the tests above I changed only the client's settings. The client
connected to the same Solr server each time, and I emptied the "data"
directory under the Solr home before every test.
By the way, I know the client code is ugly and uses a lot of heap itself,
but I am not permitted to improve it until I obtain a benchmark with
solrj 3.5 that matches what the old version achieved with solrj 1.4.
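
The per-pool figures described above ("Old Gen" usage against the heap maximum) can also be read from inside the client process. The following is a minimal sketch using the standard java.lang.management API; the class name HeapPools is made up for illustration, and the pool names it prints (for example an "Old Gen" pool) depend on the JVM and the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of solrj: reports the heap pools of the current JVM.
final class HeapPools {

    /** Returns one line per heap pool: pool name, used MB, and max MB if defined. */
    static String describe() {
        StringBuilder out = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 means the maximum is undefined
            out.append(String.format("%-25s used=%d MB max=%s%n",
                    pool.getName(),
                    usage.getUsed() >> 20,
                    max < 0 ? "undefined" : (max >> 20) + " MB"));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Calling HeapPools.describe() periodically from the indexing loop would show whether the old generation keeps growing between batches or levels off.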
B.R
murphy

On Fri, Apr 6, 2012 at 5:54 AM, Erick Erickson wrote:

> "What's memory"? Really, how are you measuring it?
>
> If it's virtual, you don't need to worry about it. Is this
> causing you a real problem or are you just nervous about
> the difference?
>
> Best
> Erick
>
> On Wed, Apr 4, 2012 at 11:23 PM, a sd  wrote:
> > Hi all,
> > I have written a program that sends data to Solr through the "update"
> > request handler. With the server and client library (namely solrj) at
> > version 4.0 or 3.2, the JVM heap size was about 1.0 GB, but when I moved
> > all of them to Solr 3.5 (both server and client libs), the heap size
> > peaked at 3.0 GB! The server configuration and the program are
> > identical. What is wrong with the new solrj 3.5? I have read the source
> > code, and there is no difference between solrj 3.2 and solrj 3.5 in the
> > parts my program may invoke. How can I reduce the memory used by
> > solrj 3.5?
> > Any advice will be appreciated!
> > murphy
>


Re: It cost some many memory with solrj 3.5 & how to decrease it?

2012-04-06 Thread a sd
Digging deeper into the update test, I logged every "elapsetime" value of
UpdateResponse. The results are listed below.
Adding/updating one document generally took almost 20 ms, so I counted a
document that took less than 20 ms as a "normal" log entry and every other
document as "abnormal".
I could not get a working Solr 1.4 setup, so I used Solr 3.2, which
performed the same as Solr 1.4 during the test.

Solr 3.5 vs. Solr 3.2

Solr 3.5
sum of docs: 31998
sum of elapsetime: 1218344 ms
average: 38.0744 ms/doc
sum of normal docs: 28409
sum of normal elapsetime: 442258 ms
normal average: 15.5675 ms/doc
normal percentage: 28409/31998 = 88.78%
abnormal docs: 3590

Solr 3.2
sum of docs: 31998
sum of elapsetime: 852935 ms
average: 26.6559 ms/doc
sum of normal docs: 28416
sum of normal elapsetime: 443045 ms
normal average: 15.5914 ms/doc
normal percentage: 28416/31998 = 88.80%
abnormal docs: 3160


What can be concluded from these numbers?
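
The summary above can be reproduced from the raw log with a few lines of code. This is a minimal sketch, assuming the elapsed times have already been collected into an array of milliseconds (one entry per document) and using the 20 ms cut-off described in this message; the class name ElapsedStats and its methods are made up for illustration.

```java
// Hypothetical helper: splits per-document elapsed times into "normal" and "abnormal".
final class ElapsedStats {
    final int docs;        // number of documents logged
    final long totalMs;    // sum of all elapsed times
    final int normalDocs;  // documents that took less than the threshold
    final long normalMs;   // sum of elapsed times of the normal documents

    ElapsedStats(long[] elapsedMs, long thresholdMs) {
        long total = 0;
        long normalTotal = 0;
        int normalCount = 0;
        for (long ms : elapsedMs) {
            total += ms;
            if (ms < thresholdMs) { // strictly below the threshold counts as "normal"
                normalTotal += ms;
                normalCount++;
            }
        }
        this.docs = elapsedMs.length;
        this.totalMs = total;
        this.normalDocs = normalCount;
        this.normalMs = normalTotal;
    }

    int abnormalDocs() {
        return docs - normalDocs;
    }

    double averageMs() {
        return docs == 0 ? 0.0 : (double) totalMs / docs;
    }

    double normalAverageMs() {
        return normalDocs == 0 ? 0.0 : (double) normalMs / normalDocs;
    }

    double normalPercent() {
        return docs == 0 ? 0.0 : 100.0 * normalDocs / docs;
    }

    public static void main(String[] args) {
        ElapsedStats s = new ElapsedStats(new long[] {10, 15, 30, 45}, 20);
        System.out.printf("avg=%.4f ms/doc, normal avg=%.4f ms/doc, normal=%.2f%%, abnormal docs=%d%n",
                s.averageMs(), s.normalAverageMs(), s.normalPercent(), s.abnormalDocs());
    }
}
```

Because abnormalDocs() is derived as the total minus the normal count, the two categories always add up to the number of documents logged, which makes the summary easy to cross-check.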

B.R.

murphy





Re: How to get a list of values of a specified field

2012-04-11 Thread a sd
The field type is "solr.string"; the values are actually arbitrary
sequences of characters, "_", digits, etc.

On Wed, Apr 11, 2012 at 7:06 PM, Marcelo Carvalho Fernandes <
mcf2...@gmail.com> wrote:

> What type of content do you have in this field?
>
> ---
> Marcelo Carvalho Fernandes
>
> On Wednesday, April 11, 2012, a sd  wrote:
> > Hi all,
> > I want to get all the values of a specified field. The field is of type
> > "solr.string".
> > I can achieve this with the "facet" feature, but there is a problem:
> > the default "facet" query returns all the values at once. If a field
> > has millions of values or more, that is a disaster for the application.
> > Is there a way to count the values first and then fetch them one
> > segment at a time by setting "facet.offset" and "facet.limit"
> > iteratively?
> > Thanks for your attention.
> > B.R.
> > murphy
> >
>
> --
> 
> Marcelo Carvalho Fernandes
> +55 21 8272-7970
> +55 21 2205-2786
>


Re: How to get a list of values of a specified field

2012-04-11 Thread a sd
I know, I know. This is a very expensive operation, and the requirement is
odd, but it is also very real: we actually need to go through all the
documents in Lucene again and again.
My original intention was to list all the possible values of a specified
field and then split the whole traversal into a series of sub-jobs, one per
field value.
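
One way to list the values in bounded chunks is to page through the TermsComponent in index order, passing the last term of each page as the exclusive lower bound of the next. The sketch below only builds the request URLs and uses no solrj classes. It assumes that a "/terms" request handler is configured on the server; the base URL and field name in the usage lines are placeholders, and the class name TermsPageUrl is made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds TermsComponent request URLs for paging through a field.
final class TermsPageUrl {

    /**
     * Builds the URL for one page of terms.
     *
     * @param baseUrl   Solr base URL (assumed to expose a "/terms" handler)
     * @param field     field whose indexed terms are listed
     * @param afterTerm last term of the previous page, or null for the first page
     * @param pageSize  maximum number of terms to return
     */
    static String build(String baseUrl, String field, String afterTerm, int pageSize) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("/terms?terms=true")
                .append("&terms.fl=").append(encode(field))
                .append("&terms.sort=index") // index order keeps paging stable
                .append("&terms.limit=").append(pageSize);
        if (afterTerm != null) {
            // Resume strictly after the last term already seen.
            url.append("&terms.lower=").append(encode(afterTerm))
               .append("&terms.lower.incl=false");
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder URL and field name, for illustration only.
        System.out.println(build("http://localhost:8983/solr", "myfield", null, 1000));
        System.out.println(build("http://localhost:8983/solr", "myfield", "abc_123", 1000));
    }
}
```

Each page of terms can then be handed to a separate sub-job, so no single request has to return millions of values.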
By the way, I have a suggestion: could Solr/Lucene make some of its
classes/interfaces "public"? I have studied the Lucene/Solr source and
found some utilities that would conveniently meet my needs, but sadly they
are all declared private, protected, or with default (package) access.
B.R.
murphy

On Wed, Apr 11, 2012 at 11:59 PM, Erick Erickson wrote:

> Consider using the TermsComponent
> (http://wiki.apache.org/solr/TermsComponent)
> You could get some number of terms from
> your field at a time by judicious use of, say,
> facet.prefix if you wanted.
>
> But why do you want to do this? It's kind of an
> odd requirement, and since you say there are
> millions of values this will be expensive
>
> Best
> Erick
>


Re: Does the lucene can read the index file from solr?

2012-04-12 Thread a sd
Hi neosky, how do you do it? I need to do this too. Thanks.

On Thu, Apr 12, 2012 at 9:35 PM, neosky  wrote:

> Thanks! I will try again.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Does-the-lucene-can-read-the-index-file-from-solr-tp3902927p3905364.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>