Re: Deduplication patch not working in nightly build

2009-01-10 Thread Grant Ingersoll
I've seen similar errors when large background merges happen while
looping over a result set.  See http://lucene.grantingersoll.com/2008/07/16/mysql-solr-and-communications-link-failure/




On Jan 9, 2009, at 12:50 PM, Mark Miller wrote:

You're basically writing segments more often now, and somehow avoiding
a longer merge, I think. Also, deduplication is probably adding enough
extra data to your index to hit a sweet spot where a merge takes too
long. Or something to that effect - MySQL is especially sensitive to
timeouts when doing a select * on a huge db in my testing. I didn't
understand your answer on the autocommit - I take it you are using it?
Or no?


All a guess, but it definitely points to a merge taking a bit long and
causing a timeout. I think you can relax the MySQL timeout settings
if that is it.


I'd like to get to the bottom of this as well, so any other info you  
can provide would be great.


- Mark

Marc Sturlese wrote:

Hey Shalin,

In the beginning (when the error was appearing) I had
32

and no maxBufferedDocs set

Now I have:
32
50

I think that by setting maxBufferedDocs to 50 I am forcing more disk
writing than I would like... but at least it works fine (though a bit
slower, obviously).


I keep saying that the weirdest thing is that I don't have that
problem using Solr 1.3, just with the nightly...

Even though it's good that it works well now, it would be great if
someone could give me an explanation of why this is happening.


Shalin Shekhar Mangar wrote:


On Fri, Jan 9, 2009 at 9:23 PM, Marc Sturlese wrote:



hey there,
I didn't have autoCommit set to true but I have it sorted! The error
stopped appearing after setting the property maxBufferedDocs in
solrconfig.xml. I can't exactly understand why but it just worked.
Anyway, maxBufferedDocs is deprecated, would ramBufferSizeMB do the
same?





What I find strange is this line in the exception:
"Last packet sent to the server was 202481 ms ago."

Something took very very long to complete and the connection got
closed by the time the next row was fetched from the opened resultset.

Just curious, what was the previous value of maxBufferedDocs and
what did you change it to?




--
View this message in context:
http://www.nabble.com/Deduplication-patch-not-working-in-nightly-build-tp21287327p21374908.html
Sent from the Solr - User mailing list archive at Nabble.com.




--
Regards,
Shalin Shekhar Mangar.


--
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
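
For scale, the "Last packet sent to the server was 202481 ms ago" figure
quoted in this thread works out to a little over three minutes. A minimal
Python sketch of that arithmetic follows. The helper names and the regular
expression are illustrative assumptions, not part of Solr or the MySQL
driver, and the 60-second timeout in the usage lines is a hypothetical
value chosen only for the comparison.

```python
import re

# Illustrative pattern for the driver message quoted in the thread; the
# wording in other driver versions may differ (assumption).
LAST_PACKET_RE = re.compile(r"Last packet sent to the server was (\d+) ms ago")


def idle_millis(message):
    """Return the idle time in milliseconds found in the message, or None."""
    match = LAST_PACKET_RE.search(message)
    return int(match.group(1)) if match else None


def describe(millis):
    """Format a millisecond count as whole minutes plus seconds."""
    minutes, rem_ms = divmod(millis, 60_000)
    return f"{minutes} min {rem_ms / 1000:.1f} s"


def exceeds(millis, timeout_seconds):
    """True if the idle time is longer than a timeout given in seconds."""
    return millis > timeout_seconds * 1000


if __name__ == "__main__":
    ms = idle_millis("Last packet sent to the server was 202481 ms ago.")
    print(describe(ms))
    print(exceeds(ms, 60))  # 60 s is a hypothetical timeout, not a known setting
```

Any step that keeps the connection idle for longer than the configured
timeout, such as a long merge between row fetches, would be consistent with
the guess made in this thread, though the sketch itself proves nothing about
the cause.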

Re: Beginner: importing own data

2009-01-10 Thread phil cryer
Ah!  That got me going, thanks so much!  I've also created an all_Text
field in my schema where I can dump a bunch of other fields so they're
searchable.  Again, I appreciate all the above replies.

P

On Fri, Jan 9, 2009 at 10:48 AM, Shalin Shekhar Mangar wrote:
> You were searching for "1899" which is the value of the "date" field in the
> document you added. You need to specify q=date:1899 to search on the date
> field.
>
> You can also use the "" element in schema.xml to specify
> the field on which you'd like to search if no field name is specified in the
> query. Typically, one creates a catch-all field which copies data from all
> the fields you want to search on.
>
> http://wiki.apache.org/solr/SchemaXml#head-b80c539a0a01eef8034c3776e49e8fe1c064f496
>
> Also look at the DisMax queries:
>
> http://wiki.apache.org/solr/DisMaxRequestHandler
>
> On Fri, Jan 9, 2009 at 8:35 PM, phil cryer wrote:
>>
>> Otis
>> Thanks for your reply, I wrote out a long email explaining the steps I
>> took, and the results, but it was returned by the Solr-user email
>> server stamped as spam.  I've put my note on pastebin, you can see it
>> here: http://pastebin.cryer.us/pastebin.php?show=m359e2e47
>>
>> I'd appreciate any feedback, I know I'm close to getting this working,
>> just can't see what I'm missing.
>>
>> Thank you
>>
>> P
>>
>> On Thu, Jan 8, 2009 at 4:19 PM, Otis Gospodnetic wrote:
>> > Phil,
>> >
>> > The easiest thing to do at this stage in the Solr learning experience is
>> > to restart Solr (servlet container) and redo the search.  Results should
>> > start showing up then because this will effectively reopen the index.
>> >
>> >
>> > Otis
>> > --
>> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> >
>> >
>> >
>> > ----- Original Message -----
>> >> From: phil cryer 
>> >> To: solr-user@lucene.apache.org
>> >> Sent: Thursday, January 8, 2009 5:00:29 PM
>> >> Subject: Beginner: importing own data
>> >>
>> >> So I have Solr running, I've run through the tutorials online, can
>> >> import data from the example xml and see the results, so it works!
>> >> Now, I take some xml data I have, convert it over to the add / doc
>> >> type that the demo ones are, run it and find out which fields aren't
>> >> defined in schema.xml, I add them there until they're all there and I
>> >> can finally import my own xml into solr w/o error.  But, when I go to
>> >> query solr, it's not there.  Again, I'm using the same procedure that
>> >> I used on the example xml files, and they did the 'commit' at the end,
>> >> so I'm doing something wrong.
>> >>
>> >> Is that all I need to do, define my fields in schema.xml and then
>> >> import via post.jar?  It seems to work, but no results are ever found
>> >> by solr.  I'm open to trying any debugging or whatever, I need to
>> >> figure this out before I can start learning solr.
>> >>
>> >> Thanks
>> >>
>> >> P
>> >
>> >
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
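
The two steps discussed in this thread, posting documents in the add / doc
format and then querying a specific field with q=date:1899, can be sketched
in Python as below. This is only an illustration: the function names are
made up, the base URL assumes the stock example server on localhost port
8983, and the snippet builds the payload and the URL without contacting a
server.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumption: the example Solr server from the tutorial, on its default port.
SOLR_BASE = "http://localhost:8983/solr"


def build_add_xml(fields):
    """Build an <add><doc>...</doc></add> message from a dict of field values."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", {"name": name})
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_select_url(field, value, base=SOLR_BASE):
    """Build a query URL that searches one named field."""
    query = urlencode({"q": f"{field}:{value}"})
    return f"{base}/select?{query}"


if __name__ == "__main__":
    print(build_add_xml({"date": 1899}))
    print(build_select_url("date", "1899"))
```

Note that urlencode percent-encodes the colon, so the field-qualified query
appears as q=date%3A1899 in the URL. Every field name passed to
build_add_xml would still have to be defined in schema.xml, as described
in the original question.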


Re: UUID field type documentation and ExtractingRequestHandler

2009-01-10 Thread Chris Hostetter
: The UUID field type is not documented on the Wiki.

*many* things are not documented on the wiki ... the javadocs are the 
primary source of info about what fieldtypes and analysis factories and 
such are available.

the wiki docs are primarily a place for people to add "tips & tricks" or
extra information about using various features beyond just the simple
basics of its options/params.




-Hoss