Re: Split one string into many fields

2007-01-22 Thread Ryan McKinley
looks like we won't save the discussion for later :) At this point though, I can't for the life of me remember what Ryan said to convince me that it made sense to have a DocumentParser concept that UpdateHandlers could delegate to -- as opposed to the UpdateHandler doing it directly :) We wer

Re: Split one string into many fields

2007-01-22 Thread Chris Hostetter
: > ...When we get to it, I'd like to hear why it (things like PDF parsing) : > should be inside Solr rather than outside using our update interfaces : : Same here. I wouldn't say that i think it *should* be inside of Solr, just that it *could* be inside of Solr. the use case i imagine is wh

Re: Split one string into many fields

2007-01-22 Thread Bertrand Delacretaz
On 1/22/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: ...When we get to it, I'd like to hear why it (things like PDF parsing) should be inside Solr rather than outside using our update interfaces Same here. I haven't had time to follow the recent (rich) design discussions about this stuff, b

Re: Split one string into many fields

2007-01-21 Thread Yonik Seeley
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: Deep within the "Update Plugin" discussion, Hoss and I agreed that adding an interface and registry for DocumentParsers is a good idea: interface SolrDocumentParser { Document parse(ContentStream content); } SolrDocumentParser parser = cor
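
For readers skimming the archive, the parser-plus-registry idea quoted above could look roughly like the sketch below. It is only an illustration: `ContentStream`, `Doc`, `DocumentParser`, `REGISTRY`, `handleUpdate` and the `"semicolon-subjects"` key are hypothetical stand-ins written for this example, not Solr classes, and the preview above is cut off before it shows how the real lookup was meant to work.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ParserRegistrySketch {
    /** Hypothetical stand-in for the content stream mentioned in the thread. */
    public interface ContentStream {
        String asText();
    }

    /** Hypothetical stand-in for a document: field name mapped to its values. */
    public static class Doc {
        public final Map<String, List<String>> fields = new HashMap<>();

        public void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    /** Same shape as the quoted interface, but using the stand-in types. */
    public interface DocumentParser {
        Doc parse(ContentStream content);
    }

    /** Registry keyed by a parser name; the choice of key is an assumption. */
    public static final Map<String, DocumentParser> REGISTRY = new HashMap<>();

    static {
        // Example parser: turns "a; b; c" into three values of one field.
        REGISTRY.put("semicolon-subjects", content -> {
            Doc doc = new Doc();
            for (String part : content.asText().split(";")) {
                String value = part.trim();
                if (!value.isEmpty()) {
                    doc.add("subject", value);
                }
            }
            return doc;
        });
    }

    /** What an update handler might do: look up a parser and delegate to it. */
    public static Doc handleUpdate(String parserName, ContentStream content) {
        DocumentParser parser = REGISTRY.get(parserName);
        if (parser == null) {
            throw new IllegalArgumentException("no parser registered: " + parserName);
        }
        return parser.parse(content);
    }

    public static void main(String[] args) {
        Doc doc = handleUpdate("semicolon-subjects", () -> "subject1; subject2; subject- 3");
        System.out.println(doc.fields.get("subject"));
    }
}
```

The point of the indirection is that the handler only knows the parser's name, so new input formats can be added by registering another parser rather than changing the handler.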

Re: Split one string into many fields

2007-01-21 Thread Ryan McKinley
> > In the case I'm looking at, it would be cleaner and more safe to have > it on the server side... Safer? It precludes adding a subject with a ';' in it... well, in *this* case it is :) An aside: your need sounds like it's part of that much bigger issue of processing documents and splitti

Re: Split one string into many fields

2007-01-21 Thread Yonik Seeley
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > > > > I want something that is equivalent to splitting the string on the > > client side and filling multiple *fields* not just tokens. > > Oh, I was talking about indexing only. > aaah. > Why is it that multiple fields are needed? Multipl

Re: Split one string into many fields

2007-01-21 Thread Ryan McKinley
> > I want something that is equivalent to splitting the string on the > client side and filling multiple *fields* not just tokens. Oh, I was talking about indexing only. aaah. Why is it that multiple fields are needed? Multiple tokens are indistinguishable from multiple fields during searc

Re: Split one string into many fields

2007-01-21 Thread Yonik Seeley
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: Maybe the name is wrong, but it is something to tell the updateHandler to use the tokenizer and filters (normally used for analysis) to convert the single field into many fields. I want something that is equivalent to splitting the string on t

Re: Split one string into many fields

2007-01-21 Thread Ryan McKinley
On 1/21/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > Are you suggesting something like this: > > > sortMissingLast="true" omitNorms="true"> > > > > > > ... > > Exactly, except fo

Re: Split one string into many fields

2007-01-21 Thread Yonik Seeley
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: Are you suggesting something like this: ... Exactly, except for that bit... what's that? -Yonik

Re: Split one string into many fields

2007-01-21 Thread Ryan McKinley
Are you suggesting something like this: ... On 1/21/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: > Is there any easy way to split a string into a multi-field on the server: From an ind

Re: Split one string into many fields

2007-01-21 Thread Yonik Seeley
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: Is there any easy way to split a string into a multi-field on the server: From an indexing perspective, yes... just assign a tokenizer that splits on ';' I don't think we currently have such a configurable Tokenizer though. The (hypothetic

Split one string into many fields

2007-01-21 Thread Ryan McKinley
Is there any easy way to split a string into a multi-field on the server: given: subject1; subject2; subject- 3 I would like: subject1 subject2 subject- 3 Thanks for any pointers ryan
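
As a point of comparison for the server-side options discussed in the replies, here is a minimal sketch of the client-side alternative: split the string before sending it, then add each piece as a separate value of a multi-valued field. It uses only the Java standard library; the class name `SubjectSplitter` and its `split` method are made up for this example, and trimming whitespace around each value is an assumption about the desired result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class SubjectSplitter {
    /** Splits a delimited string into trimmed, non-empty values. */
    public static List<String> split(String raw, String delimiter) {
        List<String> values = new ArrayList<>();
        // Pattern.quote treats the delimiter literally, not as a regex.
        for (String part : raw.split(Pattern.quote(delimiter))) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                values.add(trimmed);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Each printed line would become one value of the multi-valued field.
        for (String value : split("subject1; subject2; subject- 3", ";")) {
            System.out.println(value);
        }
    }
}
```

Note that only whitespace around each value is trimmed, so the inner space in "subject- 3" is preserved. As pointed out in one of the replies, any delimiter-based scheme also rules out values that themselves contain ';'.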