Hi Hoss,

Thanks for the input!

Something rather strange happened. I fixed my regex so that instead
of returning just 1,000 it returns 1,000.00, and voila, it
worked! So parsing group separators is apparently already supported;
it's just that the format also expects a
decimal separator and digits after it. Weird, huh?

                <field column="price"
                       regex=".*?\$(\d*\.\d*)"
                       sourceColName="rawDescription"
                       />
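
For comparison, here is a minimal Java sketch of doing the same
extraction outside the import config and stripping the group separators
before the value ever reaches the field type. The class name, method
name, and pattern are hypothetical and only illustrate the idea; the
pattern assumes US-style prices such as $1,000 or $1,000.00.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or the DataImportHandler.
class PriceExtractor {
    // Assumed input shape: a dollar sign, digits with optional comma
    // group separators, and an optional decimal part.
    private static final Pattern PRICE =
            Pattern.compile("\\$(\\d[\\d,]*(?:\\.\\d+)?)");

    // Returns the first price found with the commas removed,
    // or null when the text contains no price.
    static String extractCanonical(String raw) {
        Matcher m = PRICE.matcher(raw);
        if (!m.find()) {
            return null;
        }
        return m.group(1).replace(",", "");
    }
}
```

With this approach the indexed value never contains a comma, so it does
not matter whether the field type tolerates group separators.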

- Pulkit

On Fri, Sep 16, 2011 at 10:53 AM, Chris Hostetter
<hossman_luc...@fucit.org> wrote:
>
> : It is pretty obvious from this that the "sdouble" schema fieldtype is
> : not setup to parse out group-separators from a number.
>
> correct.  the numeric (and date) field types are all designed to deal with
> conversion of the canonical string representation.
>
> : 1) Then my question is which type of schema fieldtype will parse out
> : the comma group-separator from 1,000?
>
> that depends on how you want to interpret/use those values..
>
> : 2) Also, shouldn't we think about making locale based parsing be part
> : of this stack trace as well?
>
> Not in the field types.
>
> 1) adding extra parse logic there would be inefficient for people who are
> only ever sending well formed data.
> 2) as a client/server setup, it would be a bad idea for the server to
> assume the client is using the same locale
>
> The right place in the stack for this type of logic would be in an
> UpdateProcessor (for indexing docs) or in a
> QueryParser/DocTransformer (for querying / writing back values in the
> results).
>
> Solr could certainly use some more general purpose UpdateProcessors for
> parsing various non-canonical input formats (we've talked about one for
> doing rule based SimpleDateParsing as well) if you'd like to take a stab
> at writing one and contributing it.
>
>
> -Hoss
>
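
As a rough illustration of the parsing step such an UpdateProcessor
might perform, here is a minimal sketch that uses only the JDK's
java.text.NumberFormat. It is not Solr code and does not use any Solr
API; the class and method names are hypothetical, and the locale is
passed in explicitly because, as noted above, the server should not
assume the client's locale.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper: converts a locale-formatted number into a
// plain string without group separators.
class LocaleNumberNormalizer {
    static String toCanonical(String input, Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        if (nf instanceof DecimalFormat) {
            // Parse into BigDecimal to avoid floating-point rounding.
            ((DecimalFormat) nf).setParseBigDecimal(true);
        }
        String trimmed = input.trim();
        ParsePosition pos = new ParsePosition(0);
        Number n = nf.parse(trimmed, pos);
        // Reject input that is only partially numeric, e.g. "1,000abc".
        if (n == null || pos.getIndex() != trimmed.length()) {
            throw new IllegalArgumentException("Not a number: " + input);
        }
        return (n instanceof BigDecimal)
                ? ((BigDecimal) n).toPlainString()
                : n.toString();
    }
}
```

The same input string can mean different values under different
locales, which is one reason to keep this logic out of the field types
and make the locale an explicit setting.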
