Re: Get Config / Schema, 1.3-dev Broken?

2008-03-02 Thread Chris Hostetter

: add:
: 
:   

Actually, this seems like a legitimate bug to me ... 1.3 should work fine 
for people using 1.2 configs, and the links on the admin screen should 
work for people regardless of their config -- that may be tricky in cases 
where they register multiple instances of the ShowFileRequestHandler, or 
no instances, but for the simple compatibility case of a solrconfig.xml 
that has a  section, and does not already have an instance 
of ShowFileRequestHandler registered, why don't we auto register one for 
them using the  list?

: This is using a requestHandler rather than a jsp file...  CHANGES.txt explains
: this too

there isn't currently anything about this in the "Upgrading from Solr 1.2" 
section of CHANGES.txt ... i think we should fix this to be backwards 
compatible, but if we don't we should call it out as a specific task 
people need to do when upgrading.  I'll open a Jira to track it.

: > Recently, using the latest SVN code, it seems that the links to view the
: > schema & config files have been broken.
: > 
: > Urls such as /solr/admin/file/?file=solrconfig.xml result in a 404 error.
: > Has anyone else noticed this behavior? I just wanted to point it out if so.



-Hoss



RE: Commit performance problem

2008-03-02 Thread justin alexander

a script for posting large sets (23GB here)
 
http://www.nabble.com/file/p15786630/post3.sh post3.sh 
-- 
View this message in context: 
http://www.nabble.com/Commit-preformance-problem-tp15434972p15786630.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: too many open files , is it a leak of handler there?

2008-03-02 Thread 陈亮亮
Yonik, can I ask where I can get the fixed code, or the patch? I cannot find 
it in Jira. I am new here. Thank you ^_^
- Original Message - 
From: "Yonik Seeley" <[EMAIL PROTECTED]>
To: 
Sent: Saturday, March 01, 2008 10:53 AM
Subject: Re: too many open files , is it a leak of handler there?


>I just committed a fix for this.
> Thanks for tracking this down!
> 
> -Yonik
> 
> 2008/2/29 陈亮亮 <[EMAIL PROTECTED]>:
>> OK, I have compared DirectSolrConnection.java and 
>> SolrDispatchFilter.java, and found that DirectSolrConnection really does 
>> not call req.close() as SolrDispatchFilter does, which is said to free the 
>> resources. I guess this is the handle leak; I will try it and see. ^_^
>>
>>
>> - Original Message -
>>  From: "陈亮亮" <[EMAIL PROTECTED]>
>>  To: 
>>  Sent: Friday, February 29, 2008 4:51 PM
>>  Subject: too many open files , is it a leak of handler there?
>>
>>
>>  >When I embedded Solr in my application last night, I encountered the 
>> "too many open files" exception, just like the one described in 
>> "http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200702.mbox/[EMAIL PROTECTED]". 
>> I used DirectSolrConnection to integrate Solr with my 
>> application.
>>  >
>>  >
>>  >I debugged Solr while it was searching, and found that every time Solr 
>> completes a query the searcher's refcount increases by one. I know Solr uses 
>> the refcount to close the searcher safely, but shouldn't it be +1 when a 
>> search starts and -1 (that is, a call to decref()) after the search is done? 
>> Otherwise, when a commit takes place it just decreases the refcount by 1, the 
>> refcount is still > 1, and the searcher is not really closed, which leaves 
>> the file handles open and finally causes the "too many open files" problem. 
>> Is it a bug? Or should I not use DirectSolrConnection?
>>  >
>>  > By the way, "ulimit -n 1" did not work in my case; it just delays 
>> the failure. I use "lsof |grep solr| wc -l" to monitor the open handles, and 
>> the count keeps increasing, so I think there is a handle leak here.
>>  >
>>  > Any help would be appreciated, thank you!!!
>
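
The refcount behaviour described in the quoted report can be sketched in a few lines. This is only an illustrative model, not Solr's code: the class and function names (RefCountedSearcher, Request, run_query, simulate) are hypothetical, and the assumption that the owner holds one reference which a commit later drops is taken from the description above rather than from the Solr source.

```python
class RefCountedSearcher:
    """Hypothetical stand-in for a searcher that is closed when its refcount hits 0."""

    def __init__(self):
        self.refcount = 1      # assumed: one reference held by the owner
        self.closed = False

    def incref(self):
        self.refcount += 1

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.closed = True  # only now would the underlying files be released


class Request:
    """Hypothetical request object that borrows the searcher until close()."""

    def __init__(self, searcher):
        searcher.incref()
        self.searcher = searcher

    def close(self):
        if self.searcher is not None:
            self.searcher.decref()
            self.searcher = None


def run_query(searcher, close_request=True):
    req = Request(searcher)
    try:
        return "results"        # placeholder for the real search work
    finally:
        if close_request:       # skipping this models the reported leak
            req.close()


def simulate(queries, close_request):
    searcher = RefCountedSearcher()
    for _ in range(queries):
        run_query(searcher, close_request)
    searcher.decref()           # assumed: a commit drops the owner's reference
    return searcher.refcount, searcher.closed
```

With close_request=False every query leaves one reference behind, so the old searcher never reaches zero after a commit; with close_request=True it does.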

Re: invalid XML character

2008-03-02 Thread Brian Whitman



I'm pretty sure it's a bad idea :-)  I was just explaining why it
wasn't really feasible to do on the server side.



This particular case came from this solr.py: 
https://issues.apache.org/jira/browse/SOLR-216

By the way, is that going to become the official 1.3 solr python  
client? It would be nice because then someone who knows more about  
unicode/python will go in and fix this :) If someone wants to point me  
to a place that lists invalid xml characters I can probably figure it  
out.









Re: invalid XML character

2008-03-02 Thread Walter Underwood
Section 2.2 of the XML spec. Three characters from the 0x00-0x1F block
are allowed: 0x09, 0x0A, 0x0D.

Annotated version: http://www.xml.com/axml/testaxml.htm

Section 2.2 in current official spec: http://www.w3.org/TR/REC-xml/#charsets
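
A minimal sketch of how a client could drop such characters before posting, assuming Python 3 and the Char production from section 2.2 of the XML 1.0 spec. The function name strip_invalid_xml_chars is made up for illustration and is not part of solr.py.

```python
import re

# Everything NOT matched by the XML 1.0 "Char" production (section 2.2):
# Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text, replacement=""):
    """Return text with every character outside the XML 1.0 Char range replaced."""
    return _INVALID_XML_CHARS.sub(replacement, text)
```

For example, strip_invalid_xml_chars("a\x00b\tc") keeps the tab and drops the NUL. Whether to drop or replace the bad characters is a choice for the client; the replacement argument is there for that.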

wunder

On 3/2/08 6:44 AM, "Brian Whitman" <[EMAIL PROTECTED]> wrote:

> If someone wants to point me
> to a place that lists invalid xml characters I can probably figure it
> out.



Re: Question regarding Solr ranking

2008-03-02 Thread Chris Hostetter
: I am not really clear to what the analysis mode is supposed to give me. It
: requires me to specify a field when I specify a query. What does that do?
: Also, I don't see anything in the analyzer to explain the weighting of a
: particular document.

i think what Otis meant is that the analysis tool would help you verify 
that your Analyzers are doing what you expect them to be doing.

If you try that with your locRvwText and the text you are asking about you 
would see that RemoveDuplicatesTokenFilterFactory does not make it the 
same as a single instance of "Pizza" ... per the docs...

"Filters out any tokens which are at the same logical position 
in the tokenstream as a previous token with the same text. ..."

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-b05ef0377d71df53b47b9dd9cc28c26d95097a0b

so it isn't removing any tokens in your situation because they do not 
exist at the same logical position.
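
To make the "same logical position" rule concrete, here is a rough model of the documented behaviour. It is a sketch only, not the Lucene/Solr implementation: tokens are represented as hypothetical (text, position_increment) pairs, where an increment of 0 means "same position as the previous token".

```python
def remove_duplicates(tokens):
    """Drop a token if the same text was already seen at the same position.

    tokens: list of (text, position_increment) pairs (illustrative format).
    """
    seen_at_position = set()
    kept = []
    for text, pos_inc in tokens:
        if pos_inc > 0:
            seen_at_position = set()  # moved on to a new logical position
        if text in seen_at_position:
            continue                  # duplicate at this position: filtered out
        seen_at_position.add(text)
        kept.append((text, pos_inc))
    return kept
```

Two consecutive "pizza" tokens, each with increment 1, sit at different positions and both survive. Only something like a stemmer or synonym filter emitting a second "pizza" with increment 0 would produce a token that gets removed.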



-Hoss



Re: about the >, < operation in solr

2008-03-02 Thread Chris Hostetter

: Neither Lucene nor Solr support those operators.  I believe there is or 
: used to be a way to specify an open begin/end for the range query, but I 
: don't recall the exact details at the moment.

field:[* TO X]  ...  and field:[A TO *]

there is (unfortunately) no mechanism in the QueryParser syntax for 
distinguishing between "less than" and "less than or equal to"
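
A small helper along these lines can build the open-ended forms on the client side. It is a sketch under stated assumptions: the function name range_query is made up, the field name in the usage example is hypothetical, and values are inserted as-is with no escaping of special query characters.

```python
def range_query(field, lower=None, upper=None):
    """Build an inclusive range clause; None means an open end ("*")."""
    lo = "*" if lower is None else str(lower)
    hi = "*" if upper is None else str(upper)
    return "%s:[%s TO %s]" % (field, lo, hi)
```

For example, range_query("price", upper=100) gives price:[* TO 100], which matches values up to and including 100, since the square-bracket form is inclusive at both ends.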


-Hoss



Re: Fastest Solr query

2008-03-02 Thread Chris Hostetter
: The fastest solr query I can find is any query on unused dynamic field name:
: unused_dynamic_field_s:3 

: A better ping query would be 
: 
:   q=unused_dynamic_field_s:3

faster isn't necessarily better for a ping query ... the goal of a ping 
query is typically to get a baseline sense of how long the box is taking 
to respond to a fixed query for comparison with other solr instances.  i 
tend to recommend that the ping query be something fairly complex, at least 
as intensive as the most intensive query you expect to see from clients.


-Hoss



Re: too many open files , is it a leak of handler there?

2008-03-02 Thread Yonik Seeley
2008/3/2 陈亮亮 <[EMAIL PROTECTED]>:
> Yonik, can I ask where I can get the fixed code, or the patch? I cannot find 
> it in Jira. I am new here. Thank you ^_^

I committed it to "trunk" (the most current development version in subversion).
The most recent nightly build should have it.

-Yonik




Re: too many open files , is it a leak of handler there?

2008-03-02 Thread 陈亮亮
thanks Yonik 

Re: How long does optimize take on your Solr installation?

2008-03-02 Thread Norberto Meijome
On Fri, 29 Feb 2008 13:02:21 -0500
"Yonik Seeley" <[EMAIL PROTECTED]> wrote:

> On Fri, Feb 29, 2008 at 12:45 AM, Walter Underwood
> <[EMAIL PROTECTED]> wrote:
> > Good point. My numbers are from a full rebuild. Let's collect maximum
> >  times, to keep it simple. --wunder  
> 
> You may see more variation than you expect since optimization is done
> in stages of mergeFactor segments.  In the same environment, you could
> add a single extra doc, and then an optimize would be faster than a
> previous run because that add happened to force a bunch of merges.

Hi all,

Does providing "my optimise takes x minutes on this hardware on a data set this
big" actually tell us useful information, other than rough ideas of how long an
optimise operation could take for {those variables}? I mean, the data,
configuration, etc. you are working with make quite a bit of difference.

As Walter mentioned, so many variables at hand can get scary... but they
could be grouped as:
1) your SOLR setup (schema, # of docs, configuration)
2) your hardware and OS configuration.

I would guess that, to get a proper understanding and provide the most useful
information, they could be treated separately.

For example, for test 2), a sample SOLR configuration and data set could be
provided along with a set of test scripts. Then anyone can report back on
how fast their hardware / config runs SOLR-PERF-TEST_1 (optimised for
overall speed) vs SOLR-PERF-TEST_2 (optimised for commit times) vs ... whatever.

I am not too sure how to have a standard test for the first group... maybe the
data and configuration examples from 2) would be useful enough for fine-tuning,
as examples (similar to MySQL 'large' and 'huge' configurations)...

just a thought...
B

_
{Beto|Norberto|Numard} Meijome

Do not take away the camels hump, you may be stopping him from being a camel.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.


Re: Redirect the solr request to the two diffrent folders

2008-03-02 Thread Shalin Shekhar Mangar
Not sure what problem you're trying to solve, but this can be achieved
using the MultiCore support in Solr 1.3 (under development).

Look at http://wiki.apache.org/solr/MultiCore for more details.

On Thu, Feb 28, 2008 at 9:11 AM, wkwickramanayake <[EMAIL PROTECTED]> wrote:
>
>  Hi all,
>  I'm having a problem in my current project. I need to create two
>  schema files and keep them in two different Solr directories
>  (like solr_dev and solr_prod), but requests are not directed to the relevant
>  folder. Can someone please tell me how to do this?
>  Thanks
>
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: what's the schedule of the release of solr 1.3?

2008-03-02 Thread Shalin Shekhar Mangar
I would like to propose a milestone or a beta build to make our lives easier.

On Sun, Mar 2, 2008 at 4:35 AM, Lance Norskog <[EMAIL PROTECTED]> wrote:
> An alternative would be for someone to give a subversion checkout number
>  against 1.3-dev which represents a solid working checkout.
>
>  There are a lot of people using 1.3-dev in production, could you all please
>  tell us what checkout number you are using?
>
>  Cheers,
>
>  Lance
>
>
>  -Original Message-
>  From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
>  Sent: Thursday, February 28, 2008 8:33 PM
>  To: solr-user@lucene.apache.org
>
>
> Subject: Re: what's the schedule of the release of solr 1.3?
>
>  Hi Feng,
>
>  Somebody just asked this over on solr-dev.  As far as I know, no concrete
>  discussions about this had taken place recently, which means nothing planned
>  for March if not longer.
>
>  Do you need something that's in 1.3-dev?
>
>  Otis
>  --
>  Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>  - Original Message 
>  > From: Feng Gao <[EMAIL PROTECTED]>
>  > To: solr-user@lucene.apache.org
>  > Sent: Thursday, February 28, 2008 5:09:48 PM
>  > Subject: what's the schedule of the release of solr 1.3?
>  >
>  > Hi,
>  >
>  >
>  >
>  > There are so many new features in solr 1.3. What's the schedule of the
>  > release of solr 1.3?
>  >
>  >
>  >
>  > Thanks
>  >
>  >
>  >
>  > Feng
>  >
>  >
>  >
>  >
>
>
>
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: what's the schedule of the release of solr 1.3?

2008-03-02 Thread Noble Paul നോബിള്‍ नोब्ळ्
+1
Even if it is not a release, if we can have a milestone build (not a
release) we can start our QA on that.
We can do without all the latest and greatest features (distributed
search etc.).
At present, it is a rapidly moving target against which we need to work.
--Noble



Re: what's the schedule of the release of solr 1.3?

2008-03-02 Thread Chris Hostetter

: Even if it is not a release, if we can have a milestone build (not a
: release) we can start our QA on that.
: We can do without all the latest and greatest features (distributed
: search etc.).

I don't really understand this sentiment.  Anyone is free to create any 
builds they want at any point and do anything they want with them -- I 
could make a build for you right now based on the latest trunk (r632966) 
and call it the "SOLR-492 milestone build" but how is that any more helpful 
than you picking your own version based on the features and bug fixes you 
actually care about and making your own "milestone build" to test against?

Lance's suggestion that people using various trunk builds (who consider 
them "stable enough") report back on which svnversion they built from is a 
great one -- i would happily throw out a data point if i were using a 
trunk build anywhere in production, but i don't see how a community 
"milestone" build makes any sense when every commit is a milestone in one 
form or another.

: >  > An alternative would be for someone to give a subversion checkout number
: >  >  against 1.3-dev which represents a solid working checkout.
: >  >
: >  >  There are a lot of people using 1.3-dev in production, could you all please
: >  >  tell us what checkout number you are using?



-Hoss