Hi Eli

That's true, there are many cases where you don't need composite indexes, as per the documentation I provided a link to. However, the specific example you gave does need one. (Maybe you don't actually plan to use the specific example you provided, but I don't have anything else to go on.)
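
To make the rule concrete, here is a minimal Python sketch of the check described by the documentation rules quoted further down this thread, plus the index.yaml entries the proposed naming scheme would need. The helper names needs_composite_index and index_entries are made up for illustration and are not part of the App Engine SDK. The y<year> and month-name property names are taken from the example in this thread, and the sketch only covers the four cases the quoted rules list.

```python
# Illustrative sketch only: these helpers are hypothetical and are NOT part
# of the App Engine SDK. They restate the index.yaml rules quoted in this
# thread so the example query can be checked against them.

MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def needs_composite_index(equality_props=(), inequality_props=(),
                          sort_orders=(), has_ancestor=False,
                          keys_descending=False):
    """Return True if a query of this shape falls under one of the quoted rules."""
    equality = set(equality_props)
    inequality = set(inequality_props)
    # Rule 1: queries with multiple sort orders.
    if len(sort_orders) > 1:
        return True
    # Rule 2: a sort order on keys in descending order.
    if keys_descending:
        return True
    # Rule 3: inequality filter(s) on one property plus equality
    # filter(s) over other properties.
    if inequality and (equality - inequality):
        return True
    # Rule 4: inequality filters combined with an ancestor filter.
    if inequality and has_ancestor:
        return True
    return False


def index_entries(kind, years):
    """Build one index.yaml entry per year/month pair (assumed naming scheme)."""
    entries = []
    for year in years:
        for month in MONTHS:
            entries.append(
                "- kind: %s\n"
                "  properties:\n"
                "  - name: y%d\n"
                "  - name: %s\n" % (kind, year, month)
            )
    return entries


if __name__ == "__main__":
    # The query from this thread: y2009 = 6 AND June > 15
    print(needs_composite_index(equality_props=["y2009"],
                                inequality_props=["June"]))
    print("indexes:\n")
    print("".join(index_entries("meStats", [2009])))
```

For the query in this thread, needs_composite_index(equality_props=["y2009"], inequality_props=["June"]) returns True, and index_entries("meStats", [2009, 2010]) produces 24 entries, one per year/month pair.
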
I did actually try it myself before posting, and the specific indexes I mentioned get created in index.yaml. If you run dev_appserver with --require_indexes and the indexes in question are not present, you get:

NeedIndexError: This query requires a composite index that is not defined. You must update the index.yaml file in your application root.
This query needs this index:
meStats
  properties:
  - name: y2009
  - name: June

Rgds

T

On Nov 4, 5:37 am, Eli <[email protected]> wrote:
> I suggest you watch the IO talk where Brett Slatkin discusses Merge
> Joins and pre-computing ranges.
>
> http://www.youtube.com/watch?v=AgaL6NGpkB8
>
> Watch the last half (past 34 min).. and maybe pay attention to the
> section that's just after (41 minutes).
>
> This implies you do not need composite indexes (or to create any new
> indexes beyond the default ones) for all sorts of queries if you
> construct your data in the right way.
>
> I will test this out tonight to provide a proof of concept.
>
> On Nov 3, 10:12 am, Tim Hoffman <[email protected]> wrote:
>
> > Hi
> >
> > On Nov 3, 10:26 pm, Eli Jones <[email protected]> wrote:
> >
> > > I haven't done any testing on this yet since I'd have to fill up tens
> > > of gigs of information to see real live performance numbers.
> > >
> > > I'm hoping the implicit partitioning makes it so that one doesn't need
> > > manually created indexes (just the default ones.)
> > >
> > > The example I showed would be a schema for storing a daily int statistic.
> > >
> > > The 'June' column entries would show the day of that month and the
> > > 'y2009' column would have the 6 value since June is the 6th month of
> > > the year.
> > >
> > > If I wanted stats for June, my select would look like this:
> > >
> > > Select * From meStats Where y2009 = 6 AND June > 15
> >
> > But the minute you do this ">" you will then need an index that looks
> > like
> >
> > - kind: meStats
> >   properties:
> >   - name: y2009
> >   - name: June
> >
> > and so on for every year month combination where you do a
> > comparison.
> >
> > I think you should have a read about how indexes are created and
> > accessed before you try optimising something that probably doesn't
> > need it.
> >
> > Note the rules from the defining index doc
> > http://code.google.com/appengine/docs/python/datastore/queriesandinde...
> >
> > Other forms of queries require their indexes to be specified in
> > index.yaml, including:
> >
> >   * queries with multiple sort orders
> >   * queries with a sort order on keys in descending order
> >   * queries with one or more inequality filters on a property and
> >     one or more equality filters over other properties
> >   * queries with inequality filters and ancestor filters
> >
> > You fall into the third rule, which as I said earlier will mean you
> > need to manually specify in index.yaml a massive number of indexes
> >
> > Rgds
> >
> > T
> >
> > > This would/should implicitly hit the june rows for 2009 and get the
> > > stats for every day after the 15th.
> > >
> > > You could munge around your column names and the values inserted to
> > > get different data reporting behaviour..
> > >
> > > The main, potential value is the implicit partitioning (where you
> > > don't need to manually define a bunch of schemas up front).
> > >
> > > On 11/3/09, Tim Hoffman <[email protected]> wrote:
> > >
> > > > Hi
> > > >
> > > > Have you tried this?
> > > >
> > > > For starters you can't assign values to numbers.
> > > >
> > > > ie no matter what you do you can't assign 2009 = 'abc'
> > > >
> > > > You would need to use some other identifier as you mentioned and then
> > > > specify something like
> > > > year_2009 = db.IntegerProperty(name=2009) or something similar.
> > > >
> > > > I also see a problem with this strategy with regard to index
> > > > definitions.
> > > > Whilst running the SDK the indexes will get created as you define data,
> > > > however once you are running in the real Google environment you will
> > > > need to make sure you have already defined all possible indexes that
> > > > you plan to use before you create any new data (or reindex everything),
> > > > which means indexes for all years you plan to hold data for and
> > > > search, and months, and combinations of the two.
> > > >
> > > > I am not sure this is a particularly good approach, but then I am not
> > > > sure I get what you are actually doing.
> > > >
> > > > Have you compared the performance of lookups between the two
> > > > strategies, also remembering if you are actually interested in
> > > > year/month then you are actually using composite indexes. I wonder if
> > > > you will ever use the month only index (apart from comparing months
> > > > with months for all years in no particular order)
> > > >
> > > > Rgds
> > > >
> > > > T
> > > >
> > > > On Nov 3, 12:22 am, Eli <[email protected]> wrote:
> > > >> Here's something I've been wondering about Expando.
> > > >>
> > > >> Say you define an Expando model like so:
> > > >>
> > > >> class meStats(db.Expando):
> > > >>     meNumber = db.IntegerProperty(required=True)
> > > >>
> > > >> And, then you begin populating it like so:
> > > >>
> > > >> meEntity1 = meStats(meNumber = 200,
> > > >>                     June = 14,
> > > >>                     2009 = 6)
> > > >>
> > > >> meEntity1.put()
> > > >>
> > > >> meEntity2 = meStats(meNumber = 381,
> > > >>                     July = 21,
> > > >>                     2009 = 7)
> > > >>
> > > >> meEntity2.put()
> > > >>
> > > >> ..and so on.
> > > >>
> > > >> The "July" column only has indexes for entities that have "July"
> > > >> defined.. correct? So, in effect, I am creating a partitioned index
> > > >> for a table that can grow indefinitely.. and each time I get to a new
> > > >> year/month combo, I am inserting into new indexes..?
> > > >> (instead of inserting into an ever increasing, monolithic "Month"
> > > >> column index..)
> > > >>
> > > >> Mainly, I'm packing the pertinent information into the column names
> > > >> and column values (instead of making the column name just some dummy
> > > >> value like "Month").. this allows me to implicitly create the
> > > >> partitioned table/index (I think of it as a partitioned index since it
> > > >> is, schematically [as far as I'm concerned], one table.)
> > > >>
> > > >> You could give the columns better names.. maybe "June_Day" and maybe
> > > >> "2009_Month" if you wanted...
> > > >>
> > > >> Does this make sense? Have I misunderstood how Expando handles
> > > >> indexes?
> > > >>
> > > >> Another way to word this question would be:
> > > >>
> > > >> Is there a difference between the indexes created for the June and
> > > >> July entries in the above Expando model and the below Model models:
> > > >>
> > > >> class meJune09Stats(db.Model):
> > > >>     meNumber = db.IntegerProperty(required=True)
> > > >>     June = db.IntegerProperty(required=True)
> > > >>     2009 = db.IntegerProperty(required=True)
> > > >>
> > > >> class meJuly09Stats(db.Model):
> > > >>     meNumber = db.IntegerProperty(required=True)
> > > >>     July = db.IntegerProperty(required=True)
> > > >>     2009 = db.IntegerProperty(required=True)
> > > >>
> > > >> Thanks for any information.
> > >
> > > --
> > > Sent from my mobile device
