>
> In the case I'm looking at, it would be cleaner and safer to have
> it on the server side...
Safer? It precludes adding a subject with a ';' in it...
Well, in *this* case it is :)
An aside: your need sounds like it's part of that much bigger issue of
processing documents and splitting them up into multiple fields, or at
least processing certain fields in a way that can add other fields.
Yes, it is. I'm working with data that is almost structured, but I'd
like to have some level of validation and reprocessing before sticking
it in Solr. I'll use SOLR-104, as that seems like the right thing.
I'm not sure what a general solution would look like in that case.
For example, you might have a field called "mail-headers", and want
that split up into multiple fields.
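
To make the "mail-headers" example concrete, here is a rough sketch of what such a splitter might do. Everything in it is an assumption for illustration: MailHeaderSplitter is a hypothetical helper, not anything in Solr, and the "header_" field prefix is made up. It turns one raw header block into one multi-valued field per header name and joins folded continuation lines.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of Solr): splits a raw "mail-headers" value
// into one multi-valued field per header name.
final class MailHeaderSplitter {
    private MailHeaderSplitter() {}

    static Map<String, List<String>> split(String rawHeaders) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        String currentName = null;
        StringBuilder currentValue = new StringBuilder();

        for (String line : rawHeaders.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // a blank line ends the header block
            }
            boolean continuation = line.charAt(0) == ' ' || line.charAt(0) == '\t';
            if (continuation) {
                // Folded header: append to the value being collected, if any.
                if (currentName != null) {
                    currentValue.append(' ').append(line.trim());
                }
                continue;
            }
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // not a "Name: value" line, skip it
            }
            flush(fields, currentName, currentValue);
            // Assumed naming scheme: "header_" + lower-cased header name.
            currentName = "header_" + line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            currentValue.setLength(0);
            currentValue.append(line.substring(colon + 1).trim());
        }
        flush(fields, currentName, currentValue);
        return fields;
    }

    // Stores the header collected so far; repeated headers become extra values.
    private static void flush(Map<String, List<String>> fields, String name, StringBuilder value) {
        if (name != null) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value.toString());
        }
    }
}
```

Each map entry would then become one (possibly multi-valued) field on the document. Which headers deserve their own field, and how they should be named, is exactly the part that seems hard to generalize.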
Another longer term thing to keep our eye on is UIMA (added to the
Apache incubator not that long ago).
Deep within the "Update Plugin" discussion, Hoss and I agreed that
adding an interface and registry for DocumentParsers is a good idea:
  interface SolrDocumentParser
  {
    // Convert a single content stream into a single document.
    Document parse(ContentStream content);
  }

  SolrDocumentParser parser = core.getDocumentParser("text/html");
This would let update plugins share (pluggable) logic for how to
convert a single stream into a single document... this is more than
we are talking about doing now, but something (else) to keep in mind.
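
For illustration only, the registry half of that idea could be sketched roughly as below. None of this is existing Solr code: DocumentParserSketch and DocumentParserRegistry are hypothetical names, and String and Map stand in for ContentStream and Document so that the sketch compiles on its own. Keying the lookup on the bare content type (parameters such as charset dropped, case ignored) is also an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Stand-in for the proposed parser interface. String and Map replace
// ContentStream and Document purely to keep the sketch self-contained.
interface DocumentParserSketch {
    Map<String, String> parse(String content);
}

// Hypothetical registry: maps a content type to the parser that handles it.
final class DocumentParserRegistry {
    private final Map<String, DocumentParserSketch> parsers = new HashMap<>();

    void register(String contentType, DocumentParserSketch parser) {
        parsers.put(normalize(contentType), parser);
    }

    DocumentParserSketch getDocumentParser(String contentType) {
        DocumentParserSketch parser = parsers.get(normalize(contentType));
        if (parser == null) {
            throw new IllegalArgumentException("No parser registered for " + contentType);
        }
        return parser;
    }

    // Assumption: "Text/HTML; charset=UTF-8" and "text/html" share one parser.
    private static String normalize(String contentType) {
        int semi = contentType.indexOf(';');
        String base = semi >= 0 ? contentType.substring(0, semi) : contentType;
        return base.trim().toLowerCase(Locale.ROOT);
    }
}
```

An update plugin would register its parsers once at startup and then look one up per incoming stream, so several plugins could share the same parsing logic.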