On Mon, Jan 12, 2015 at 06:26:14PM +0100, Tom van der Woerdt wrote:
> > On 12 Jan 2015, at 16:25, Philipp Winter <p...@nymity.ch> wrote:
> >  Versions | Amount total  | Amount w/o duplicate hosts
> >  ---------+---------------+---------------------------
> >  1 and 2  | 34,648 (9%)   | 21,552 (23%)
>
> We debugged this last week on IRC, as 1,2 is an invalid combination
> according to the specification. After correlating the ip addresses, we
> concluded that this is GFW scanning and not actual client usage.

I'm sure some of the 1+2 is GFW scanning, but probably not all of it.
Mainstream tor definitely sends 1+2 when using a v2 handshake.

https://gitweb.torproject.org/tor.git/tree/src/or/connection_or.c?id=b0c32106b3559b4ee9fabfb1a49e2e328c850305#n2122

/** Array of recognized link protocol versions. */
static const uint16_t or_protocol_versions[] = { 1, 2, 3, 4 };
/** Number of versions in <b>or_protocol_versions</b>. */
static const int n_or_protocol_versions =
  (int)( sizeof(or_protocol_versions)/sizeof(uint16_t) );

/** Send a VERSIONS cell on <b>conn</b>, telling the other host about the
 * link protocol versions that this Tor can support.
 *
 * If <b>v3_plus</b>, this is part of a V3 protocol handshake, so only
 * allow protocol version v3 or later.  If not <b>v3_plus</b>, this is
 * not part of a v3 protocol handshake, so don't allow protocol v3 or
 * later.
 **/
int
connection_or_send_versions(or_connection_t *conn, int v3_plus)
{
  var_cell_t *cell;
  int i;
  int n_versions = 0;
  const int min_version = v3_plus ? 3 : 0;
  const int max_version = v3_plus ? UINT16_MAX : 2;
  tor_assert(conn->handshake_state &&
             !conn->handshake_state->sent_versions_at);
  cell = var_cell_new(n_or_protocol_versions * 2);
  cell->command = CELL_VERSIONS;
  for (i = 0; i < n_or_protocol_versions; ++i) {
    uint16_t v = or_protocol_versions[i];
    if (v < min_version || v > max_version)
      continue;
    set_uint16(cell->payload+(2*n_versions), htons(v));
    ++n_versions;
  }
  cell->payload_len = n_versions * 2;

  connection_or_write_var_cell_to_buf(cell, conn);
  conn->handshake_state->sent_versions_at = time(NULL);

  var_cell_free(cell);
  return 0;
}

> Are you sure you are deduplicating correctly? That's a lot of hosts.

Even if it were only GFW probing, GFW rarely uses duplicate IPs, except
for a few. Most IPs you will only see once or twice over the course of
months.

David Fifield