[PATCH] start of rewrite of unordered_{set,map}

2015-09-16 Thread Geoff Pike
This change is the start of a rewrite of unordered_set and
unordered_map.

(By the way, I am new to GCC and do not have SVN write access.)

SUMMARY OF MOTIVATION FOR THE CHANGE

1. Speed.  Linked lists are slow.  Without them, we save memory,
allowing us to default to a lower load factor; or, the savings could
be RAM, if desired.  Initial tests are showing a 20%ish improvement on
several basic benchmarks (with FDO, on x86-64).

2. Defense to hash flooding. This benefit is harder to analyze, but I
think I'm making a reasonable proposal.

PLAN

I plan to spread the changs out over multiple patches.  This first
patch has no effect USE_HYBRID_WILD_CUCKOO_FOR_UNORDERED_SET is
defined to 1 (or similar).  The patch isn't ready yet to turn that on
by default; I intend to suggest flipping the default after this and
other to-be-written patches are submitted.  That said, I would love to
get feedback, so I can understand any substantial flaws sooner rather
than later.

DETAILS

"Hash flooding" is an attempt by an adversary to flood a hash table
with keys that cause an unusual amount of time (or RAM or ...) to be
consumed.  For example, an adversary might present many keys with the
same hash code.

Here are several common ideas to combat hash flooding, and counter
arguments to each:

1. A traditional hash table with a "great" hash function.  Counter
argument: no matter what hash function is selected, there is still a
timing attack that can force O(n^1.5) computational steps by the
attackee via only n steps by the adversary.  Also, "great" hash
functions aren't cheap.

2. A hash table that uses linked lists in lightly-occupied buckets and
balanced binary trees in heavily-occupied buckets.  Counter argument:
Worst case O(n log n) time with perhaps a lot of wasted RAM as well.
Also, this introduces the need for an ordering on the elements.

3. De-amortized Cuckoo with "great" hash functions.  Counter argument:
when hash-flooded, the "queue" or "stash" in de-amortized Cuckoo can become
long enough that a lot of basic operations that much longer than normal.
Also, "great" hash functions aren't cheap.

On general principles, it would be nice to avoid a performance penalty
in the happy, common cases where no hash flooding is in progress.
Furthermore, if hash flooding is in progress, access to keys not
provided by the adversary should still be (mostly) fast.

My belief is we can overcome all these issues and still be backward
compatible with the requirements of C++'s unordered_set and unordered_map.
I haven't decided exactly how to tackle unordered_multi{map,set} yet, but
I don't forsee any huge problems there.

Another key feature of the scheme I've begun implementing is that,
while it's preferable to let it work with a family of hash functions
(like Cuckoo does), if only one is available, as will be the case in a
lot of existing code, then it still works.  When offered a single hash
function to use, the code creates a family of hash functions by using
a random post-processing step, i.e., the output of the given hash
function is rehashed by a randomly-selected hash-code-to-hash-code
function.  This isn't great for avoiding hash flooding, but I suspect
it's better than nothing.

Comments in the patch contain additional algorithmic details etc.

KNOWN ISSUES

o The C++ standard says very little about what local iterators are
supposed to do.  What it seems to require: begin(bucket_number) takes
O(1) time, and if two elements have the same hash code then a single
local iteration should contain both.  The current patch takes that at
face value, and simply makes begin(0) yield a local iterator that
contains everything (like begin()) and begin(n) == end(n) for n != 0.
Unfortunately, although the standard doesn't say one may assume it,
there may be existing code that assumes that begin(n) has something to
do with the elements whose hash code is n mod bucket_count(). If my
shortcut isn't acceptable then I'll have to switch to a more complex,
slower scheme.  That'll hurt a bit, but I don't think it'll destroy
the value of what I'm proposing.  Luckily, local iterators are an
obscure, seldom-used feature.

o Allocators need to be addressed (currently the prototype just uses
new and delete for everything).

o Other details need to be addressed, notably: swapping exactly what
should be swapped in the swap method, making begin() take O(1) time,
and properly handling exceptions thrown by the hasher and other code
that we call.

o Portability needs to be addressed.

o I'm still working on switching everything over to GCC coding style

o Some optimizations (notably caching hash codes) need to be investigated/added.

o Performance gains without FDO aren't nearly as impressive. I haven't
yet looked into this discrepancy.

Index: libstdc++-v3/include/Makefile.am
===
--- libstdc++-v3/include/Makefile.am (revision 227597)
+++ libstdc++-v3/include/Makefile.am (working copy)
@@ -96,6 +96

Re: [PATCH] start of rewrite of unordered_{set,map}

2015-09-16 Thread Geoff Pike
Silly me, pasting the patch at the bottom of my original message
didn't work. Patch is attached.

Also, to clarify, I am primarily seeking high-level comments; I am new
here and don't want to waste anybody's time.
Index: libstdc++-v3/include/Makefile.am
===
--- libstdc++-v3/include/Makefile.am(revision 227597)
+++ libstdc++-v3/include/Makefile.am(working copy)
@@ -96,6 +96,7 @@
${bits_srcdir}/codecvt.h \
${bits_srcdir}/concept_check.h \
${bits_srcdir}/cpp_type_traits.h \
+   ${bits_srcdir}/cuckoo.h \
${bits_srcdir}/deque.tcc \
${bits_srcdir}/enable_special_members.h \
${bits_srcdir}/forward_list.h \
Index: libstdc++-v3/include/Makefile.in
===
--- libstdc++-v3/include/Makefile.in(revision 227597)
+++ libstdc++-v3/include/Makefile.in(working copy)
@@ -386,6 +386,7 @@
${bits_srcdir}/codecvt.h \
${bits_srcdir}/concept_check.h \
${bits_srcdir}/cpp_type_traits.h \
+   ${bits_srcdir}/cuckoo.h \
${bits_srcdir}/deque.tcc \
${bits_srcdir}/enable_special_members.h \
${bits_srcdir}/forward_list.h \
Index: libstdc++-v3/include/bits/cuckoo.h
===
--- libstdc++-v3/include/bits/cuckoo.h  (revision 0)
+++ libstdc++-v3/include/bits/cuckoo.h  (working copy)
@@ -0,0 +1,3103 @@
+// hybrid wild cuckoo hashtable implementation -*- C++ -*-
+
+// Copyright (C) 2010-2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file bits/cuckoo.h
+ *  This is an internal header file, included by other library headers.
+ *  Do not attempt to use it directly. @headername{cuckoo}
+ * This file implements Hybrid Wild Cuckoo and Wild Cuckoo.  The latter is used
+ * as a building block for Hybrid Wild Cuckoo when all the bells and whistles
+ * are required (as they are for a C++ standards-compliant 
unordered_{set,map}).
+ * It may also be interesting on its own.
+ *
+ * For simplicity we use one table (as is also done in some variants of 
Cuckoo).
+ * We also force the table size to be a power of 2 in this implementation,
+ * though there is no theoretical reason to do so.  The number of elements in
+ * the table is _M_num_elements in the code, but called "n" here; the number of
+ * buckets is _M_num_buckets in the code, but called "b" here.
+ *
+ * As usual in such discussions, the big-O bounds stated below
+ * assume that hashing an element is O(1).  We also ignore the bitset
+ * size and z, parameters typically set to small integers, when
+ * discussing big-O bounds.  E.g., we never write O(z); we write O(1)
+ * for that.
+ *
+ * In Wild Cuckoo, the algorithm for inserting an item is similar to Cuckoo
+ * except that the set of possible "nests" for an item can grow dynamically.
+ * In particular, the algorithm for find-or-insert-if-not-present (often called
+ * "insert" among C++ programmers) is as follows:
+ *  1. Compute g1 = H1(key) and g2 = H2(key), where H1 and H2 are hash 
functions.
+ *  2. Let h1 = g1 % b and h2 = g2 % b
+ *  3. Look for the key in all its possible nests.  Typically these are just
+ *  h1, h1^1, h1^2, ..., h1^z, h2, h2^1, h2^2, ..., and h2^z.  (Say, for z = 
4.)
+ *  However, we also keep a map of  pairs to information on additional
+ *  possible nests, which can be anywhere in the table.  Pairs with additional
+ *  possible nests are called "extended."  More details on this
+ *  are below.  A key point is that, unlike in all previous
+ *  multiple-choice-hashing schemes, a key with any hash codes can appear
+ *  anywhere in the table...  Hence the "Wild" in Wild Cuckoo.
+ *  4. If the key is not found in any of its nests then it must be inserted,
+ *  into a empty nest if possible.  If no nest is free then we have two
+ *  choices:
+ *o If the size of all extensions to  is below the "extension cap,"
+ *  a constant, and either  is e

Re: [PATCH] start of rewrite of unordered_{set,map}

2015-09-18 Thread Geoff Pike
Thanks Jonathan! I will work on what you suggest.

Regarding bug 55815: thank you for pointing that out. I'll update the
bug this weekend.

Geoff