A field type based on BigDecimal could be useful, but that would be a fair 
amount more work.
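
To illustrate what an exact decimal type would preserve, here is a minimal sketch using Python's decimal module as a stand-in for Java's BigDecimal. This is not a Solr field type; the value is the one quoted later in this thread, and the variable names are only for illustration.

```python
from decimal import Decimal

text = "194.846189733028000"   # database value quoted later in this thread

exact = Decimal(text)   # arbitrary-precision decimal, similar in spirit to BigDecimal
approx = float(text)    # IEEE 754 binary64, the same format as a Java double

print(str(exact))       # every digit survives, including the trailing zeros
print(repr(approx))     # shortest decimal string that round-trips the double

# Ordering still works on the exact type, which is what a range query needs.
print(Decimal("194.846189733027999") < exact < Decimal("194.846189733028001"))
```

The decimal value keeps the trailing zeros because it stores the digits and the scale; the double stores only the nearest binary value.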

Double is usually sufficient for big data analysis, especially if you are doing 
simple aggregates (which is most of what Solr can do). 

If you want to do something fancier, you’ll need a database, not a search 
engine. As I usually do, I’ll recommend MarkLogic, which is pretty awesome 
stuff. Solr would not be in my top handful of solutions for big data analysis.

Personally, I’d stuff it all in JSON in Amazon S3 and run map-reduce against 
it. If you need to do something like that, you could store a JSON blob in Solr 
with the exact values, and use approximate fields to narrow things down. Of 
course, MarkLogic has a graceful interface to Hadoop.
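
The "exact JSON blob plus approximate field" idea can be sketched as a two-step filter. The code below is plain Python with no Solr calls; the field names value_d and exact_json are hypothetical, and the first step only stands in for a range query on the double field.

```python
import json
from decimal import Decimal

def make_doc(doc_id, exact_text):
    """Build one document; the field names value_d and exact_json are hypothetical."""
    return {
        "id": doc_id,
        "value_d": float(exact_text),                     # approximate, range-queryable
        "exact_json": json.dumps({"value": exact_text}),  # exact digits kept as text
    }

def exact_range(docs, low_text, high_text):
    """Return ids whose exact value lies in [low_text, high_text]."""
    # Step 1: coarse filter on doubles. Rounding to double preserves order,
    # so no true match is dropped, but near-boundary false positives can pass.
    low_d, high_d = float(low_text), float(high_text)
    candidates = [d for d in docs if low_d <= d["value_d"] <= high_d]
    # Step 2: re-check the survivors against the exact decimal values.
    low, high = Decimal(low_text), Decimal(high_text)
    return [
        d["id"] for d in candidates
        if low <= Decimal(json.loads(d["exact_json"])["value"]) <= high
    ]

docs = [
    make_doc("a", "249.817354253824052"),
    make_doc("b", "249.817354253824049"),
    make_doc("c", "194.846189733028000"),
]
print(exact_range(docs, "249.817354253824050", "249.817354253824060"))  # ['a']
```

The exact check only runs on the few documents that survive the coarse filter, so the approximate field does the narrowing and the blob settles the boundary cases.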

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

On May 19, 2015, at 4:09 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> Well, double is all you've got, so that's what you have to work with.
> _Every_ float is an approximation when you get out to some number of
> decimal places, so you don't really have any choice. Of course it'll
> affect the result. The question is whether it affects the result
> enough to matter, which is application-specific.
> 
> Best,
> Erick
> 
> On Tue, May 19, 2015 at 12:05 PM, Vishal Swaroop <vishal....@gmail.com> wrote:
>> Also 10481.5711458735456*79* indexes to 10481.571145873546 using double
>> <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0"
>> positionIncrementGap="0" omitNorms="false"/>
>> 
>> On Tue, May 19, 2015 at 2:57 PM, Vishal Swaroop <vishal....@gmail.com>
>> wrote:
>> 
>>> Thanks Erick... I can ignore the trailing zeros
>>> 
>>> I am indexing data from a Vertica database. *double* is very close,
>>> but SOLR indexes only 14 digits after the decimal point.
>>> e.g. the actual db value has 15 digits after the decimal point, i.e. 249.81735425382405*2*
>>> 
>>> SOLR indexes 14 digits after the decimal point, i.e. 249.81735425382405
>>> 
>>> As these values will be used for big data analysis, I am wondering if
>>> it might impact the result.
>>> <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0"
>>> positionIncrementGap="0" omitNorms="false"/>
>>> 
>>> Any suggestions ?
>>> 
>>> Regards
>>> 
>>> 
>>> On Tue, May 19, 2015 at 1:41 PM, Erick Erickson <erickerick...@gmail.com>
>>> wrote:
>>> 
>>>> Why do you want to keep trailing zeros? The original input is
>>>> preserved in the "stored" portion and will be returned if you specify
>>>> the field in your "fl" list. I'm assuming here that you're looking at
>>>> the actual indexed terms, and I don't really understand why the trailing
>>>> zeros are important.
>>>> 
>>>> Do not use strings.
>>>> 
>>>> Best
>>>> Erick
>>>> 
>>>> On Tue, May 19, 2015 at 10:22 AM, Vishal Swaroop <vishal....@gmail.com>
>>>> wrote:
>>>>> Thank you John and Jack...
>>>>> 
>>>>> Looks like double is much closer... it removes trailing zeros...
>>>>> a) Is there a way to keep trailing zeros?
>>>>> double : 194.846189733028000 indexes to 194.846189733028
>>>>> <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0"
>>>>> positionIncrementGap="0" omitNorms="false"/>
>>>>> 
>>>>> b) If I use "String", will there be an issue doing range queries?
>>>>> 
>>>>> float
>>>>> <fieldType name="float" class="solr.TrieFloatField" precisionStep="0"
>>>>> positionIncrementGap="0" omitNorms="false"/>
>>>>> 277.677836785372000 indexes to 277.67783
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, May 19, 2015 at 11:56 AM, Jack Krupansky <jack.krupan...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> "double" (solr.TrieDoubleField) gives more precision
>>>>>> 
>>>>>> See:
>>>>>> 
>>>>>> 
>>>>>> https://lucene.apache.org/solr/5_1_0/solr-core/org/apache/solr/schema/TrieDoubleField.html
>>>>>> 
>>>>>> -- Jack Krupansky
>>>>>> 
>>>>>> On Tue, May 19, 2015 at 11:27 AM, Vishal Swaroop <vishal....@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Please suggest which numeric field type to use so that I can get the
>>>>>>> complete value.
>>>>>>> 
>>>>>>> e.g value in database is : 194.846189733028000
>>>>>>> 
>>>>>>> If I index it as float, SOLR indexes it as 194.84619, whereas I need the
>>>>>>> complete value, i.e. 194.846189733028000
>>>>>>> I will also be doing range query on this field.
>>>>>>> 
>>>>>>> <fieldType name="float" class="solr.TrieFloatField" precisionStep="0"
>>>>>>> positionIncrementGap="0"/>
>>>>>>> 
>>>>>>> <field name="value" type="float" indexed="true"  stored="true"
>>>>>>> multiValued="false" />
>>>>>>> 
>>>>>>> Regards
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>>> 
