Re: Best practice to setup schemas for documents having different structures

Vishal Sharma Thu, 06 Nov 2014 20:34:14 -0800

Thanks for the response guys! Appreciate it.

Vishal Sharma
Team Lead, Grazitti Interactive
T: +1 650 641 1754
E: vish...@grazitti.com
www.grazitti.com

On Wed, Nov 5, 2014 at 11:09 PM, Ryan Cooke <r...@docurated.com> wrote:

> We define all fields as wildcard (dynamic) fields with a suffix indicating
> the field type. We then use something like Java annotations to map POJO
> variables to field types and append the correct suffix. This lets us use one
> very generic schema across all of our collections, and we rarely need to
> update it. Our inspiration for this method comes from the Ruby library Sunspot.
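
A minimal sketch of the annotation approach described above, in plain Java with no Solr dependency. The `Indexed` annotation, the `SuffixMapper` class, and the `_s` / `_i` suffixes are hypothetical names chosen for illustration, not Docurated's actual code. It assumes the schema declares matching dynamic fields such as `*_s` and `*_i`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

class SuffixMapper {

    // Hypothetical annotation: names the type suffix to append to a field.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Indexed {
        String suffix();
    }

    // Example POJO based on the Student object from the original question.
    static class Student {
        @Indexed(suffix = "_s") String name;
        @Indexed(suffix = "_i") int registrationNumber;
        @Indexed(suffix = "_s") String courseEnrolled;

        Student(String name, int registrationNumber, String courseEnrolled) {
            this.name = name;
            this.registrationNumber = registrationNumber;
            this.courseEnrolled = courseEnrolled;
        }
    }

    // Builds a "field name + suffix" -> value map from the annotated fields.
    static Map<String, Object> toDocument(Object pojo) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Field field : pojo.getClass().getDeclaredFields()) {
            Indexed indexed = field.getAnnotation(Indexed.class);
            if (indexed == null) {
                continue; // skip fields that are not meant to be indexed
            }
            field.setAccessible(true);
            try {
                doc.put(field.getName() + indexed.suffix(), field.get(pojo));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        // Prints a map whose keys carry the type suffixes, e.g. name_s.
        System.out.println(toDocument(new Student("Ada", 42, "CS101")));
    }
}
```

The resulting map could then be copied into whatever document object the indexing client expects. Because the suffix lives on the POJO, adding a new field would not require a schema change as long as a matching wildcard pattern already exists.
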
>
> - Ryan
>
>
>
> ---
> Ryan Cooke
> VP of Engineering
> Docurated
> (646) 535-4595
>
> On Wed, Nov 5, 2014 at 9:59 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > It Depends (tm).
> >
> > You have a lot of options, and it all depends on your data and
> > use-case. In general, there is very little cost involved when a doc
> > does _not_ use a field you've defined in a schema. That is, if you
> > have 100 fields defined and only use 10, the other 90 don't take
> > up space in each doc. There is some overhead with very many fields,
> > but probably not enough that you'd notice.
> >
> > 1> You could have a single schema that contains all your fields and
> > use it amongst a bunch of indexes (cores). This is particularly easy
> > in the new "configset" pattern.
> >
> > 2> You could have a single schema that contains all your fields and
> > use it in a single index. That index could contain all your different
> > docs with, say, a "type" field to let you search subsets easily.
> >
> > 3> You could have a different schema for each index and put each type
> > of doc in its own index.
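
Option 2 above could be sketched roughly as follows, in plain Java with no Solr client. The field names (`type`, `student_name`, `course_id`, and so on) and the `TypedIndex` class are assumptions made for illustration. The string returned by `typeFilter` is the kind of value that would be passed to Solr as a filter query (`fq`), and `ofType` is only an in-memory stand-in for what that filter does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TypedIndex {

    // Student doc: only the student fields are set, plus the "type" marker.
    static Map<String, Object> student(String name, int regNo, String course) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("type", "student");
        doc.put("student_name", name);
        doc.put("registration_number", regNo);
        doc.put("course_enrolled", course);
        return doc;
    }

    // Course doc: shares the schema but uses a different subset of fields.
    static Map<String, Object> course(String id, String name, int durationWeeks) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("type", "course");
        doc.put("course_id", id);
        doc.put("course_name", name);
        doc.put("course_duration_weeks", durationWeeks);
        return doc;
    }

    // Filter query string that restricts a search to one document type.
    static String typeFilter(String type) {
        return "type:" + type;
    }

    // In-memory stand-in for applying the type filter to the index.
    static List<Map<String, Object>> ofType(List<Map<String, Object>> index, String type) {
        List<Map<String, Object>> hits = new ArrayList<>();
        for (Map<String, Object> doc : index) {
            if (type.equals(doc.get("type"))) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> index = new ArrayList<>();
        index.add(student("Ada", 42, "CS101"));
        index.add(course("CS101", "Intro to Computing", 12));
        System.out.println(typeFilter("student"));
        System.out.println(ofType(index, "student"));
    }
}
```
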
> >
> > <1> I don't really like at all. If you're going to have different
> > indexes, I think it's far easier to maintain if there are individual
> > schemas.
> >
> > Between <2> and <3> it's a tossup. <2> will skew the relevance
> > calculations because all the terms are in a single index. So your
> > relevance calculations for students will be influenced by the terms in
> > course docs and vice versa. That said, you may not notice, as the
> > effect is subtle.
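
To make the skew concrete, here is a small worked example. It assumes the classic Lucene TF-IDF inverse document frequency, idf = 1 + ln(numDocs / (docFreq + 1)); other similarity implementations such as BM25 differ in detail but share the same dependence on index-wide counts. The document counts are invented purely for illustration.

```java
class IdfSkew {

    // Classic Lucene TF-IDF inverse document frequency (assumed formula).
    static double idf(long numDocs, long docFreq) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        // Hypothetical counts: 1,000 student docs, 50 of which mention "physics".
        double studentsOnly = idf(1000, 50);

        // Add 200 course docs, 150 of which also mention "physics".
        double combined = idf(1000 + 200, 50 + 150);

        // The term looks far more common in the combined index, so its idf
        // (and therefore its weight in student searches) drops.
        System.out.printf("students-only idf: %.3f%n", studentsOnly);
        System.out.printf("combined idf:      %.3f%n", combined);
    }
}
```

With separate indexes, as in option 3, each document type would keep its own term statistics, so a term that is common among courses would not lose weight in student searches.
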
> >
> > I generally prefer <3>, but I've seen <2> work well too.
> >
> > Best,
> > Erick
> >
> > On Tue, Nov 4, 2014 at 9:34 PM, Vishal Sharma <vish...@grazitti.com>
> > wrote:
> > > This is something I have been thinking about for a long time now.
> > >
> > > What is the best practice for setting up schemas for documents that
> > > have different fields?
> > >
> > > Should we just create one schema with a lot of fields, or multiple
> > > schemas for different data structures?
> > >
> > > Here is an example: I have two objects, students and courses:
> > >
> > > Student:
> > >
> > >    - Student Name
> > >    - Student Registration number
> > >    - Course Enrolled for
> > >
> > > Course:
> > >
> > >    - Course ID
> > >    - Course Name
> > >    - Course duration
> > >
> > > What should the ideal schema setup look like?
> > >
> > > Any guidance would be greatly appreciated.
> >
>
