Re: Field for 'species' data?

Erick Erickson Sat, 05 Jul 2014 17:05:26 -0700

re: do this in an update processor or in other parts of the pipeline:

Whichever is easier; the result will be the same. Personally I like
putting stuff like this in other parts of the pipeline, if for no other reason
than that the load isn't concentrated on the Solr machine.


In particular if you enrich the document in the pipeline, you can then
scale up indexing by having multiple processes running the pipeline on
multiple clients. Eventually, you'll hit the Solr node's limits, but it'll
be later than if you do all your processing there.
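
For what it's worth, here is a minimal sketch of what enriching in the
pipeline could look like, in Python. Everything in it is an assumption for
illustration: the toy child-to-parent taxonomy, the field names `species`
and `species_path`, and the depth-prefixed path encoding (a common
convention for hierarchical facets in Solr, not necessarily what the Drupal
ApacheSolr module expects).

```python
# Hypothetical, heavily simplified taxonomy: child -> parent (None marks the root).
# In a real pipeline this lookup would come from an external taxonomy source.
PARENT = {
    "Eukaryota": None,
    "Viridiplantae": "Eukaryota",
    "Arabidopsis": "Viridiplantae",
    "Arabidopsis thaliana": "Arabidopsis",
}


def lineage(leaf, parent=PARENT):
    """Return the path from the root down to `leaf`, inclusive."""
    path = []
    node = leaf
    while node is not None:
        path.append(node)
        node = parent[node]  # raises KeyError for an unknown term
    return path[::-1]


def enrich(doc, parent=PARENT):
    """Return a copy of `doc` with one depth-prefixed token per ancestor level."""
    path = lineage(doc["species"], parent)
    enriched = dict(doc)  # leave the caller's document untouched
    enriched["species_path"] = [
        f"{depth}/" + "/".join(path[: depth + 1]) for depth in range(len(path))
    ]
    return enriched


if __name__ == "__main__":
    print(enrich({"id": "1", "species": "Arabidopsis thaliana"}))
```

Since `enrich` only reads the lookup table and returns a new dict, any
number of client processes can run it on their own slice of the documents
before sending them to Solr.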

It may be a little easier to manage since you don't have to worry about
getting your custom Jars to the solr nodes as you would in the update
processor case.
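
To illustrate: with the enrichment done client-side, the pipeline only has
to post finished documents to the stock update handler, so nothing custom
needs to be installed on the Solr side. A rough sketch that builds (but does
not send) such a request; the host, port, core name `collection1` and the
`commit=true` parameter are assumptions, so check the update handler URL for
your Solr version.

```python
import json
import urllib.request


def build_update_request(docs, base_url="http://localhost:8983/solr/collection1"):
    """Build, without sending, a JSON update request for already-enriched docs.

    `base_url` is a hypothetical core URL; adjust it to your installation.
    """
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_update_request([{"id": "1", "species": "Arabidopsis thaliana"}])
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send
it; that step is left out here so the sketch runs without a Solr node.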

But really, whatever is most convenient and meets your SLA. If you
are _already_ going to have a pipeline, doing the enrichment there means
fewer moving parts.

Best,
Erick

On Sat, Jul 5, 2014 at 9:02 AM, Dan Bolser <dbolser....@gmail.com> wrote:
> The latter
> On 5 Jul 2014 16:39, "Jack Krupansky" <j...@basetechnology.com> wrote:
>
>> So, the immediate question is whether the value in the Solr source
>> document has the full taxonomy path for the species, or just parts, and
>> some external taxonomy definition must be consulted to "fill in" the rest
>> of the hierarchy path for that species.
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Dan Bolser
>> Sent: Saturday, July 5, 2014 10:36 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Field for 'species' data?
>>
>> One requirement is that the hierarchical facet implementation matches
>> whatever the Drupal ApacheSolr module does with taxonomy terms.
>>
>> The key thing is to add the taxonomy to the doc, which only has one 'leaf'
>> term.
>> On 5 Jul 2014 15:01, "Jack Krupansky" <j...@basetechnology.com> wrote:
>>
>>> Focus on your data model and queries first, then you can decide on the
>>> implementation.
>>>
>>> Take a semi-complex example and manually break it down into field values
>>> and then write some queries, including filters, in English, that do the
>>> required navigation. Once you have a handle on what fields you need to
>>> populate, the analysis and processing details can be worked out.
>>>
>>> -- Jack Krupansky
>>>
>>> -----Original Message----- From: Dan Bolser
>>> Sent: Saturday, July 5, 2014 4:49 AM
>>> To: solr-user
>>> Subject: Re: Field for 'species' data?
>>>
>>> I'm super noob... Why choose to write it as a custom update request
>>> processor rather than an analysis pipeline?
>>>
>>> Cheers, Dan.
>>> On 5 Jul 2014 03:45, "Alexandre Rafalovitch" <arafa...@gmail.com> wrote:
>>>
>>>> Do that with a custom update request processor.
>>>>
>>>> Just remember Solr is there to find things, not to preserve structure. So
>>>> mangle your data until you can find it.
>>>>
>>>> Also check if SirenDB would fit your requirements if you want to encode
>>>> the information as a complex structure.
>>>>
>>>> Regards,
>>>>     Alex
>>>>
>>>>
>>>>
>>>
>>
