looks like we won't save the discussion for later :)
At this point though, I can't for the life of me remember what Ryan said to
convince me that it made sense to have a DocumentParser concept that
UpdateHandlers could delegate to -- as opposed to the UpdateHandler doing
it directly :)
We wer
: > ...When we get to it, I'd like to hear why it (things like PDF parsing)
: > should be inside Solr rather than outside using our update interfaces
:
: Same here.
I wouldn't say that i think it *should* be inside of Solr, just that it
*could* be inside of Solr. the use case i imagine is wh
On 1/22/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
...When we get to it, I'd like to hear why it (things like PDF parsing)
should be inside Solr rather than outside using our update interfaces
Same here.
I haven't had time to follow the recent (rich) design discussions
about this stuff, b
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
Deep within the "Update Plugin" discussion, Hoss and I agreed that
adding an interface and registry for DocumentParsers is a good idea:
interface SolrDocumentParser
{
Document parse(ContentStream content);
}
SolrDocumentParser parser = cor
>
> In the case I'm looking at, it would be cleaner and more safe to have
> it on the server side...
Safer? It precludes adding a subject with a ';' in it...
well, in *this* case it is :)
An aside: your need sounds like it's part of that much bigger issue of
processing documents and splitti
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> >
> > I want something that is equivalent to splitting the string on the
> > client side and filling multiple *fields* not just tokens.
>
> Oh, I was talking about indexing only.
>
aaah.
> Why is it that multiple fields are needed? Multipl
>
> I want something that is equivalent to splitting the string on the
> client side and filling multiple *fields* not just tokens.
Oh, I was talking about indexing only.
aaah.
Why is it that multiple fields are needed? Multiple tokens are
indistinguishable from multiple fields during searc
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
Maybe the name is wrong, but it is something to tell the updateHandler
to use the tokenizer and filters (normally used for analysis) to
convert the single field into many fields.
I want something that is equivalent to splitting the string on t
On 1/21/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> Are you suggesting something like this:
>
>
> sortMissingLast="true" omitNorms="true">
>
>
>
>
>
> ...
>
>
Exactly, except fo
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
Are you suggesting something like this:
...
Exactly, except for that bit... what's that?
-Yonik
Are you suggesting something like this:
...
On 1/21/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
> Is there any easy way to split a string into a multi-field on the server:
From an ind
On 1/21/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
Is there any easy way to split a string into a multi-field on the server:
From an indexing perspective, yes... just assign a tokenizer that splits on ';'
I don't think we currently have such a configurable Tokenizer though.
The (hypothetic
Is there any easy way to split a string into a multi-field on the server:
given:
subject1; subject2; subject- 3
I would like:
subject1
subject2
subject- 3
Thanks for any pointers
ryan
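
For reference, the client-side alternative mentioned in the replies (split the string before sending it and fill one field value per piece) could look roughly like the sketch below. This is only an illustration: the class name SubjectSplitter and its split method are made up for this example and are not part of Solr, and it assumes ';' never appears inside a subject.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side helper (not a Solr class): turns one
// ';'-delimited string into separate values, one per field instance.
public class SubjectSplitter {

    // Splits on ';', trims surrounding whitespace, and drops empty pieces.
    // Assumption: ';' is never a legitimate character inside a subject.
    public static List<String> split(String raw) {
        List<String> values = new ArrayList<>();
        if (raw == null) {
            return values;
        }
        for (String part : raw.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                values.add(trimmed);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Each printed line would become its own value of a multi-valued field.
        for (String subject : split("subject1; subject2; subject- 3")) {
            System.out.println(subject);
        }
    }
}
```

Whitespace inside a value (as in "subject- 3") is kept; only the padding around each piece is trimmed.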