Re: embedded documents

Michael Pitsounis Tue, 26 Aug 2014 16:24:07 -0700

Hello,

I posted the source code and quick installation instructions at
http://www.solrfromscratch.com/2014/08/20/embedded-documents-in-solr/


The parser supports deeply nested maps/arrays.
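
For anyone reading along without the download, here is a rough Python sketch of the store-then-flatten idea discussed in this thread. It is not the posted JsonLoader code (which is Java). The function names, the handling of the `_json` suffix, and the choice to treat arrays as multi-valued fields are all assumptions made for illustration.

```python
import json

JSON_SUFFIX = "_json"  # assumption: suffix marking fields that hold raw JSON


def flatten(prefix, value, out):
    """Collect leaf values of a nested map/array under dotted field names."""
    if isinstance(value, dict):
        for key, child in value.items():
            flatten(f"{prefix}.{key}", child, out)
    elif isinstance(value, list):
        # Assumption: array elements share the parent's field name (multi-valued).
        for child in value:
            flatten(prefix, child, out)
    else:
        out.setdefault(prefix, []).append(value)


def to_solr_fields(doc):
    """Return a flat dict: raw JSON kept as a string, leaves as dotted fields."""
    fields = {}
    for name, value in doc.items():
        if name.endswith(JSON_SUFFIX) and isinstance(value, (dict, list)):
            # Stored copy of the original object/array.
            fields[name] = json.dumps(value, sort_keys=True)
            leaves = {}
            flatten(name[: -len(JSON_SUFFIX)], value, leaves)
            for leaf_name, values in leaves.items():
                fields[leaf_name] = values[0] if len(values) == 1 else values
        else:
            fields[name] = value
    return fields


doc = {
    "titles_json": {"FR": "This is the FR title", "EN": "This is the EN title"},
    "id": 1000003,
    "guid": "3b2f2998-85ac-4a4e-8867-beb551c0b3c6",
}
flat = to_solr_fields(doc)
```

With the example document from the original post, `flat` keeps `titles_json` as a JSON string and adds `titles.FR` and `titles.EN` as separate fields. Mixed arrays and maps, which Jack flagged as problematic, are handled here by simply dropping the array index from the field name; other conventions are equally plausible.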

Tell me if you have any problems with the download.


M.





On Mon, Aug 25, 2014 at 10:29 PM, Jack Krupansky <j...@basetechnology.com>
wrote:

> And a comparison to Elasticsearch would be helpful, since ES gets a lot of
> mileage from their super-easy JSON support. IOW, how much of the ES
> "advantage" is eliminated.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Noble Paul
> Sent: Monday, August 25, 2014 1:59 PM
> To: solr-user@lucene.apache.org
> Subject: Re: embedded documents
>
> The simplest use case is to dump the entire JSON using split=/&f=/** . I am
> planning to add an alias for the same (SOLR-6343).
>
> Support for nested docs is missing for now and we will need to add it. A
> ticket needs to be opened.
>
>
> On Mon, Aug 25, 2014 at 6:45 AM, Jack Krupansky <j...@basetechnology.com>
> wrote:
>
>> Thanks, Erik, but... I've read that Jira several times over the past
>> month, and it is far too cryptic for me to make any sense of what it is
>> really trying to do. A simpler approach is clearly needed.
>>
>> My perception of SOLR-6304 is not that it indexes a single JSON object as
>> a single Solr document, but that it generates a collection of separate
>> documents, somewhat analogous to Lucene block/child documents, but... not
>> quite.
>>
>> I understood the request on this message thread to be the flattening of a
>> single nested JSON object to a single Solr document.
>>
>> IMHO, we need to be trying to make Solr more automatic and more
>> approachable, not an even more complicated "toolkit".
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Erik Hatcher
>> Sent: Monday, August 25, 2014 9:32 AM
>>
>> To: solr-user@lucene.apache.org
>> Subject: Re: embedded documents
>>
>> Jack et al - there’s now this, which is available in the any-minute
>> release of Solr 4.10: https://issues.apache.org/jira/browse/SOLR-6304
>>
>> Erik
>>
>> On Aug 25, 2014, at 5:01 AM, Jack Krupansky <j...@basetechnology.com>
>> wrote:
>>
>>> That's a completely different concept, I think - the ability to return a
>>> single field value as a structured JSON object in the "writer", rather than
>>> simply "loading" from a nested JSON object and distributing the key values
>>> to normal Solr fields.
>>>
>>> -- Jack Krupansky
>>>
>>> -----Original Message----- From: Bill Bell
>>> Sent: Sunday, August 24, 2014 7:30 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: embedded documents
>>>
>>> See my Jira. It supports it via json.fsuffix=_json&wt=json
>>>
>>> http://mail-archives.apache.org/mod_mbox/lucene-dev/201304.mbox/%3CJIRA.12641293.1365394604231.125944.1365397875874@arcas%3E
>>>
>>> Bill Bell
>>> Sent from mobile
>>>
>>>
>>> On Aug 24, 2014, at 6:43 AM, "Jack Krupansky" <j...@basetechnology.com>
>>> wrote:
>>>>
>>>> Indexing and query of raw JSON would be a valuable addition to Solr, so
>>>> maybe you could simply explain more precisely your data model and
>>>> transformation rules. For example, when multi-level nesting occurs, what
>>>> does your loader do?
>>>>
>>>> Maybe if the field names were derived by concatenating the full path of
>>>> JSON key names, like titles_json.FR, field naming for nested objects could
>>>> be handled in a fully automated manner.
>>>>
>>>> I had been thinking of filing a Jira proposing exactly that, so that
>>>> even the most deeply nested JSON maps could be supported, although
>>>> combinations of arrays and maps would be problematic.
>>>>
>>>> -- Jack Krupansky
>>>>
>>>> -----Original Message----- From: Michael Pitsounis
>>>> Sent: Wednesday, August 20, 2014 7:14 PM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: embedded documents
>>>>
>>>> Hello everybody,
>>>>
>>>> I had a requirement to store complicated JSON documents in Solr.
>>>>
>>>> I have modified the JsonLoader to accept complicated JSON documents with
>>>> arrays/objects as values.
>>>>
>>>> It stores the object/array, then flattens it and indexes the fields.
>>>>
>>>> E.g., a basic example document:
>>>>
>>>> {
>>>>      "titles_json": {"FR": "This is the FR title", "EN": "This is the EN title"},
>>>>      "id": 1000003,
>>>>      "guid": "3b2f2998-85ac-4a4e-8867-beb551c0b3c6"
>>>> }
>>>>
>>>> It will store titles_json:{"FR":"This is the FR title", "EN":"This is the EN title"}
>>>> and then index the fields
>>>>
>>>> titles.FR:"This is the FR title"
>>>> titles.EN:"This is the EN title"
>>>>
>>>>
>>>> Do you see any problems with this approach?
>>>>
>>>>
>>>>
>>>> Regards,
>>>> Michael Pitsounis
>>>>
>>>>
>>>
>>>
>
> --
> -----------------------------------------------------
> Noble Paul
>
