We define all fields as wildcard fields, with a suffix indicating the field type. We then use something like Java annotations to map POJO variables to field types, so the correct suffix is appended automatically. This lets us use one very generic schema across all of our collections, and we rarely need to update it. Our inspiration for this method comes from the Ruby library Sunspot.
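As a rough illustration of the annotation idea, here is a minimal sketch in plain Java. It is not our production code. The `Indexed` annotation, the `SolrType` enum, the `Student` class, and the `_s` / `_i` / `_t` suffixes are all hypothetical names chosen for the example, and it assumes the schema declares matching wildcard patterns such as `*_s` and `*_i`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

class SuffixMapperDemo {

    // Hypothetical field types; each carries the suffix that the
    // matching wildcard field in the schema is assumed to use.
    enum SolrType {
        STRING("_s"), INTEGER("_i"), TEXT("_t");

        final String suffix;

        SolrType(String suffix) {
            this.suffix = suffix;
        }
    }

    // Hypothetical annotation marking a POJO variable as indexable.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Indexed {
        SolrType value();
    }

    // Example POJO; only annotated variables end up in the document.
    static class Student {
        @Indexed(SolrType.STRING)
        String name;

        @Indexed(SolrType.INTEGER)
        int registrationNumber;

        String internalNote; // not annotated, so it is skipped

        Student(String name, int registrationNumber) {
            this.name = name;
            this.registrationNumber = registrationNumber;
        }
    }

    // Builds a map of suffixed field names to values via reflection.
    static Map<String, Object> toDocument(Object pojo) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Field field : pojo.getClass().getDeclaredFields()) {
            Indexed indexed = field.getAnnotation(Indexed.class);
            if (indexed == null) {
                continue;
            }
            field.setAccessible(true);
            try {
                doc.put(field.getName() + indexed.value().suffix, field.get(pojo));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read " + field.getName(), e);
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        // Prints a map with the keys name_s and registrationNumber_i.
        System.out.println(toDocument(new Student("Ada", 1042)));
    }
}
```

The resulting map could then be copied into whatever document object the indexing client expects. Adding a new variable to a POJO needs only an annotation, not a schema change.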

- Ryan

---
Ryan Cooke
VP of Engineering
Docurated
(646) 535-4595

On Wed, Nov 5, 2014 at 9:59 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> It Depends (tm).
>
> You have a lot of options, and it all depends on your data and
> use-case. In general, there is very little cost involved when a doc
> does _not_ use a field you've defined in a schema. That is, if you
> have 100's of fields defined and only use 10, the other 90 don't take
> up space in each doc. There is some overhead with many many fields,
> but probably not so you'd notice.
>
> 1> you could have a single schema that contains all your fields and
> use it amongst a bunch of indexes (cores). This is particularly easy
> in the new "configset" pattern.
>
> 2> You could have a single schema that contains all your fields and
> use it in a single index. That index could contain all your different
> docs with, say, a "type" field to let you search subsets easily.
>
> 3> You could have a different schema for each index and put all of the
> docs in the same index.
>
> <1> I don't really like at all. If you're going to have different
> indexes, I think it's far easier to maintain if there are individual
> schemas.
>
> Between, <2> and <3> it's a tossup. <2> will skew the relevance
> calculations because all the terms are in a single index. So your
> relevance calculations for students will be influenced by the terms in
> courses docs and vice-versa. That said, you may not notice as it's
> subtle.
>
> I generally prefer <3> but I've seen <2> serve as well.
>
> Best,
> Erick
>
> On Tue, Nov 4, 2014 at 9:34 PM, Vishal Sharma <vish...@grazitti.com>
> wrote:
> > This is something I have been thinking for a long time now.
> >
> > What is the best practice for setting up the Schemas for documents having
> > different fields?
> >
> > Should we just create one schema with lot of fields or multiple schemas for
> > different data structures?
> >
> > Here is an example: I have two objects, students and courses:
> >
> > Student:
> >
> >    - Student Name
> >    - Student Registration number
> >    - Course Enrolled for
> >
> > Course:
> >
> >    - Course ID
> >    - Course Name
> >    - Course duration
> >
> > What should the ideal schema setup look like?
> >
> > Any guidance would be strongly appreciated.
> >
> > Vishal Sharma
> > Team Lead, Grazitti Interactive
> > T: +1 650 641 1754
> > E: vish...@grazitti.com
> > www.grazitti.com