On Fri, 2006-12-01 at 11:58 -0800, Chris Hengge wrote: > Ok, well... I think people lost the scope of my question.. I'm happy > using the first method that was given to my post, I have stated in two > emails that the order doesn't matter.. > > What I asked was why the order was changed, or more directly, what is > the command actually doing to my data? I'm sure the order isn't > totally random, but based on how the items are checked and dropped. > > The reason I care is because I'm just nosey like that and what to know > what it is doing differently then the method I mentioned in the start > of this thread. Your original method stepped through list1 and tested each element for existence in list2. Since you stated that the proportion of duplicates is low, most searches of list2 are unsuccessful and require checking every element of list2 from beginning to end. For long lists the search time adds up.
sets and dictionaries are hash based. The location in the collection is based on a hash value. A check for membership computes the hash which is used to pinpoint the location in the collection. There are issues where different elements have identical hashes, but those are handled efficiently. So searching for matches is dramatically faster than stepping through a long list. I believe the ordering within sets and dictionaries happens to match the hash ordering. for x in aset: print hash(x) comes out in ascending order on the 3 sets I checked. That's not really a surprise. In terms of correct programs, the order of set and dictionary items should be viewed as arbitrary. > > Never did I question the validity of the answer the first reply gave > me, it works for what I need, not only that, it works well for what I > need. I never put any stipulation on the order of the final data, so I > didn't expect an answer that was order related. > > On 12/1/06, Tor Hildrum <[EMAIL PROTECTED]> wrote: > On 11/30/06, John Fouhy <[EMAIL PROTECTED]> wrote: > > > For the same reason that dictionaries don't preserve order. > > Basically, sets are (I think) implemented using a hash > table. You can > > read about hash tables on wikipedia (or many other places), > but one of > > the components of a hash table is a function mapping keys to > integers > > in a particular range. > > Why not just call a sigar for a sigar. > > A set is a set, it may be implemented using a hash or it may > be > implemed using some other datastructure. It could be > implemented using > lists which preserves order, all though that doesn't make much > sense. > How it is implemented does not really matter here. > > http://en.wikipedia.org/wiki/Set > > If you want a collection of ordered objects, you don't want a > set. Not > even if the current implementation of sets in Python did > preserve > order. Doing so could not be considered as anything else than > a ugly > hack or exploitation of the current implementation. And would > be > likely to break in the future. > > Tor > _______________________________________________ > Tutor maillist - Tutor@python.org > http://mail.python.org/mailman/listinfo/tutor > > _______________________________________________ > Tutor maillist - Tutor@python.org > http://mail.python.org/mailman/listinfo/tutor -- Lloyd Kvam Venix Corp _______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor