Re: [PATCH 1/4] Add an abstract incremental hash data type

Trevor Saunders Wed, 23 Jul 2014 18:51:13 -0700

On Thu, Jul 24, 2014 at 02:00:24AM +0200, Oleg Endo wrote:
> On Wed, 2014-07-23 at 11:37 +0200, Richard Biener wrote:
> > On Fri, Jul 18, 2014 at 3:08 AM, Trevor Saunders <tsaund...@mozilla.com> 
> > wrote:
> > > On Thu, Jul 17, 2014 at 06:36:31AM +0200, Andi Kleen wrote:
> > >> On Wed, Jul 16, 2014 at 10:40:53PM -0400, Trevor Saunders wrote:
> > >> >
> > >> > > + public:
> > >> > > +
> > >> > > +  /* Start incremential hashing, optionally with SEED.  */
> > >> > > +  void begin (hashval_t seed = 0)
> > >> > > +  {
> > >> > > +    val = seed;
> > >> >
> > >> > why isn't this the ctor?
> > >>
> > >> It's standard for hash classes to have explicit begin()/end().
> > >> All the existing ones I've seen work this way.
> > >
> > >  I only know of one vaguelly similar thing
> > >  http://mxr.mozilla.org/mozilla-central/source/mfbt/SHA1.h#37  which
> > >  doesn't do that, and a bunch of people doing something doesn't
> > >  necessarily mean it makes sense.  Now there may be a good reason it
> > >  does make sense, but unless these other people need begin() to be
> > >  fallible I don't see it.
> > 
> > I agree with Trevor here.  Please make begin() the constructor.
> > Btw, what will be the way to plug in an alternative hash function?
> > That is, there doesn't seem to be a separation of interface
> > and implementation in your patch (like with a template or a base-class
> > you inherit from).
> 
> Isn't that what boost / std hash is actually doing?  Maybe something
> like the attached example could be an improvement?


I don't really see why that makes it any easier to plug in a different
hash algorithm

> As with boost / std hash, the hash function is a template functor that
> can be specialized on various types.  The 'incremental_hash' simply
> plugs together a hash value combiner and a hash function.  Types that
> are not implemented by the hash function (in this example) are caught at
> compile time, since implicit conversions do not take place.  However, it
> should also possible to write hash functions with a bunch of operator ()
> overloads to utilize implicit conversions.

at that point it seems like its simpler and equivelent to just overload
a global function.

> One advantage of this construct is that new types can be added along
> with hash function specializations, without the need to touch any of the
> existing hashing facilities.
> 
> I don't know how/whether this would fit with the already existing hash
> map stuff (hash-table.h, hash-map.h).  It seems those don't support user
> specified hash functions.

they do, see e.g. graphite-htab.h

Trev

> 
> Cheers,
> Oleg
> 

> typedef unsigned int hashval_t;
> typedef long long HOST_WIDE_INT;
> typedef unsigned int size_t;
> 
> // -----------------------------
> // hash function and some specializations
> 
> struct raw_data
> {
>   const void* data;
>   size_t len;
> 
>   raw_data (const void* d, size_t l) : data (d), len (l) { }
> 
>   template <typename T>
>   explicit raw_data (const T& val) : data (&val), len (sizeof (T)) { }
> };
> 
> template <typename T> struct my_hash;
> 
> template <> struct my_hash<unsigned int>
> {
>   hashval_t operator () (unsigned int val);
> };
> 
> template <> struct my_hash<HOST_WIDE_INT>
> {
>   hashval_t operator () (HOST_WIDE_INT val);
> };
> 
> template <> struct my_hash<raw_data>
> {
>   hashval_t operator () (raw_data val);
> };
> 
> template <> struct my_hash<const void*>
> {
>   hashval_t operator () (const void* val);
> };
> 
> template <typename T> struct my_hash<T*>
> {
>   hashval_t operator () (T* val)
>   {
>     return my_hash<const void*> () (val);
>   }
> };
> 
> template <typename T> struct my_hash<const T*>
> {
>   hashval_t operator () (const T* val)
>   {
>     return my_hash<const void*> () (val);
>   }
> };
> 
> // -----------------------------
> // a possible hash combiner
> struct my_hash_combiner
> {
>   hashval_t operator () (hashval_t a, hashval_t b);
> };
> 
> 
> // -----------------------------
> // generic incremental hash
> 
> template <template<typename> class HashFunction, class HashValueCombiner>
> class incremental_hash
> {
> public:
>   incremental_hash (void) : m_value (0) { }
>   incremental_hash (hashval_t seed) : m_value (seed) { }
> 
>   hashval_t value (void) const { return m_value; }
> 
>   template <typename T> incremental_hash& add (const T& val)
>   {
>     m_value = HashValueCombiner () (m_value, HashFunction<T> () (val));
>     return *this;
>   }
> 
> private:
>   hashval_t m_value;
> };
> 
> // hash function of an incremental_hash is its current value, regardless
> // of the actual hash function used.
> template <template<typename> class Func, class Combiner>
> struct my_hash< incremental_hash<Func, Combiner> >
> {
>   hashval_t operator () (const incremental_hash<Func, Combiner>& val)
>   {
>     return val.value ();
>   }
> };
> 
> 
> // -----------------------------
> // use case
> 
> struct my_object
> {
>   int a, b, c, d;
> };
> 
> typedef incremental_hash<my_hash, my_hash_combiner> my_inc_hash;
> 
> int main (int argc, const char* argv[])
> {
>   my_object some_obj;
> 
>   my_inc_hash hash;
> 
>   hash.add (5U)
>       .add<unsigned int> (234)     // explicit hash func, implicit conversion.
>       .add (HOST_WIDE_INT (2135))
>       .add (raw_data (argv, argc)) // raw data hash
>       .add (raw_data (some_obj))   // iterative_hash_object macro
>       .add (argv)                  // pointer hash
>       .add (argv[0]);              // pointer hash
> 
>   my_inc_hash hash2 (0x5533);
>   hash2.add (123U)
>        .add (hash);  // 'merge' two incremental hashes.
> 
>   return hash2.value ();
> }

Re: [PATCH 1/4] Add an abstract incremental hash data type

Reply via email to