[Declassifying this discussion and posting on [tor-dev]]

David Goulet <dgou...@ev0ke.net> writes:

> Hello HS elves!
>
> I wrote a document to organize my thoughts and also list what we have in
> the bug tracker right now about HS behaviours that we want to
> understand/measure/assess/track.
>
> It's a bit long but you can skip the first section describing the
> tickets and go right into the How and the work to be done.
>
> Nick, you will see there is a SponsorS component but I didn't go into
> hard details there. We all know we need a testing network but for now
> I'm more focused on making sure we can collect the right data (for HS).
>
> A very important part I would like feedback on is the "HS health service",
> for which I would like us all to agree on its usefulness and on the way
> to do it properly.
>
> Cheers!
> David
>
> This document describes the methodology and technical details of a hidden
> service measurement framework/tool/<insert what this is>.
>
> NOTE: This is NOT intended to be run in the real Tor network. ONLY testing.
>
> Why and What
> -----
>
> The goal is to answer some questions we have regarding HS behaviours. Most
> of them have a ticket assigned to them but need an experiment and/or added
> feature(s) so we can measure what we need.
>
> - Is rend_cache_clean_v2_descs_as_dir cutoff crazy high?
>   https://trac.torproject.org/projects/tor/ticket/13207
>
>   In order to address this, it seems we need a way to measure all the
>   interactions with the cache of an HSDir and a client. We need to assess
>   the rend cache cleanup timing values, which will also help with the
>   upload and refetch timings.
>
> - What's the average number of hsdir fetches before we get the hsdesc?
>   https://trac.torproject.org/projects/tor/ticket/13208
>
>   Using the control port for that is trivial but this needs a testing
>   network to be set up and to have actual load on it.
>
>   It could also be set up as a feature of an "HS health measurement tool"
>   with a client fetching the same .onion address over and over, at random
>   intervals.
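
On the control-port approach for #13208: here is a minimal sketch of the
counting step, assuming the HS_DESC control events have already been logged
as plain text lines. The field layout (action, then onion address) is my
reading of the control spec and should be checked against it; the sample
lines, address and fingerprints below are made up for illustration.

```python
from collections import defaultdict


def fetches_before_success(event_lines):
    """Map each onion address to a list of fetch counts.

    Each list entry is the number of HS_DESC REQUESTED events seen for that
    address before the matching HS_DESC RECEIVED event.
    """
    pending = defaultdict(int)    # REQUESTED events since the last RECEIVED
    results = defaultdict(list)   # completed counts per address
    for line in event_lines:
        fields = line.split()
        # Assumed shape: 650 HS_DESC <action> <address> ...
        if len(fields) < 4 or fields[0] != "650" or fields[1] != "HS_DESC":
            continue
        action, address = fields[2], fields[3]
        if action == "REQUESTED":
            pending[address] += 1
        elif action == "RECEIVED":
            results[address].append(pending.pop(address, 0))
        # FAILED events are ignored here; the retry shows up as the next
        # REQUESTED event.
    return dict(results)


# Hypothetical log excerpt: one failed fetch, then a successful one.
SAMPLE = [
    "650 HS_DESC REQUESTED exampleonionaddr NO_AUTH $AAAA descid1",
    "650 HS_DESC FAILED exampleonionaddr NO_AUTH $AAAA",
    "650 HS_DESC REQUESTED exampleonionaddr NO_AUTH $BBBB descid2",
    "650 HS_DESC RECEIVED exampleonionaddr NO_AUTH $BBBB",
]

counts = fetches_before_success(SAMPLE)
total_fetches = sum(sum(v) for v in counts.values())
total_successes = sum(len(v) for v in counts.values())
average = total_fetches / total_successes
print(counts, average)
```

This only answers "how many fetches"; the harder part remains the loaded
testing network that produces a meaningful log in the first place.
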
>
> - Write a hidden service hsdir health measurer
>   https://trac.torproject.org/projects/tor/ticket/13209
>
>   This is a useful one, being able to correlate relay churn and HS desc.
>   fetch. This one needs more brainstorming on how we could set up some sort
>   of client or service that reports/logs the results of crunching the
>   consensus for the HSDirs of a specific .onion address that we know and
>   control.
>
> - Refactor rend_client_refetch_v2_renddesc()
>   https://trac.torproject.org/projects/tor/ticket/13223
>
>   Ensure correctness of this very important function that does fetches for
>   the client. It's in there that the HSDirs (with replicas) are looped on
>   so the descriptor can be fetched.
>
> - Maybe we want three preemptive internal circs for hidden services?
>   https://trac.torproject.org/projects/tor/ticket/13239
>
>   That's pretty trivial to measure and quantify with the tracing
>   instrumentation added in Tor. No need for a new feature but an experiment
>   has to be designed to measure 2 internal circuits versus 3.
>
> - rend_consider_services_upload() sets initial next_upload_time which is
>   clobbered when first intro point established?
>   https://trac.torproject.org/projects/tor/ticket/13483
>
>   Is the RendPostPeriod option working correctly? What's the exact
>   relation in time between service->desc_is_dirty and the upload time of a
>   new descriptor?
>
> - Do we have edge cases with rend_consider_descriptor_republication()? Can
>   we refactor it to be cleaner?
>   https://trac.torproject.org/projects/tor/ticket/13484
>
>   This is a core function that is called every second so we should make
>   sure it is behaving as expected and not trying to do unneeded uploads.
>

Hello, nice list of tickets. Here are some more ideas if you are looking
for more brainstorming action.

There is #3733 which is about a behavior that affects performance and
could benefit from a testing network.

And there is #8950 which is about the number of IPs per hidden service.
It's very unclear whether this functionality works as intended or whether
it's a good privacy idea.

And there is also #13222 but it's probably easier to hack the solution
here than to measure its severity.

>
> How
> -----
>
> Here are some steps I think are needed to be able to measure and answer the
> Why section.
>
> 1) Dump the uploaded/fetched HS descriptors in a human readable way.
>   * Allows us to track descriptors over time while testing and analyse them
>     afterwards by correlating events with a readable desc. This kind of
>     feature will also be useful for people crawling HS on SponsorR.
>   * Should be a control event like for instance (ONLY client side):
>
>       setconf HSDESC_DUMP /tmp/my/dir
>
> 2) Count how many HSDirs (including replicas) have been probed for one
>    single .onion request. (Which should be repeated a lot for significant
>    results.)
>   * Why have we probed 1 or 5?
>   * What made us retry? Failure code?
>   * Was the descriptor actually alive on the HSDir? If not, when did
>     it move? (Correlate timings between HSDir and client in a testing
>     network.)
>
> 3) HS desc cache tracker. We want to know, very precisely, how things are
>    moving in the cache, especially on the HSDir cache side.
>   * When and why is an HS desc removed?
>   * Why hasn't it been stored in the cache?
>   * Count how often, and when, a descriptor is requested.
>
> 4) Track the HS descriptor upload. Log at what time it was done. Use this
>    to correlate with RendPostPeriod or when desc_is_dirty is set. This
>    should also be correlated with the actual state of the HSDir. Did it
>    already have it? Is the HSDir gone?
>
>
> What to be done
> ----------------
>
> * Collect data
>
>   "Collect it all" --> https://i.imgur.com/tVXAcGGl.jpg
>
>   It's clear that we have to collect more data from the HS subsystem. Most
>   of it can be collected through the control port but some is missing.
>   Measuring precise timing of HS actions (for instance, let's say
>   descriptor store) is not possible with the control port right now and
>   also might not be that relevant since the job of this feature is to
>   report high level events and push commands to the tor daemon.
>
>   Tracing should be used here with a set of events added to the HS
>   subsystem to collect the information we need so it can be analyzed after
>   the experiment is run. This is only for performance measurement; the rest
>   should use the control port as much as possible.
>
> * Testing network (much SponsorS)
>
>   Once we are able to extract all the data we need, it is time to design
>   experiments that allow us to run scenarios and collect/analyze what we
>   want. A scenario could be this example, with a set of questions we want
>   to answer going with it:
>
>   * 50 clients randomly accessing an HS in a busy tor network.
>     - What is the failure rate of desc. fetch, RP establishment, ...?
>     - What are the timings of each component of the HS subsystem?
>     - What are the outliers of the whole process of establishing a
>       connection to the HS?
>     - How much did relay churn affect HS reachability?
>
>   And dump a human readable report/graphs, whatever is useful for us to
>   investigate or assess the HS functionalities.
>
> * HS health service
>
>   ref: https://trac.torproject.org/projects/tor/ticket/13209
>
>   What about a web page that prints the result of:
>
>   1) Fetch the last 3 consensuses (thus 3 hours)
>   2) Find the union of all HSDirs responsible for a.onion (we control that
>      HS and it should be up at all times, else the results are
>      meaningless.)
>   3) Fetch the descriptor on each of them
>   4) Graph/log how many of them had it, thus giving us a probability of
>      reaching the HS within a time period.
>
>   So 3) is the tricky one. There are multiple possible ways of achieving
>   that:
>
>   i) New SOCKS command to tor that a client could use.
>      - Command would have an onion address with it and the reply should be
>        0 or 1 (successful attempt or not) along with the HSDir fingerprint.
>
>   ii) Control event.
>
>       setconf HSDESC_FETCH_ALL <this_is_a.onion>
>       [...]
>       Prints out the results as they come in with the HSDir information.
>
>   iii) A weird way of doing this with an option "tor --fetch-on-all-hs-dir
>        this_address.onion", print out the results and quit.
>
>   I much prefer i) and ii) here. Not sure which one is best though.

Hm, I think I like (ii) here. It doesn't seem to be much more work than (i)
and a few researchers have been asking for such functionality for years.
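
For steps 2) and 4) of the health service, the bookkeeping itself is small
once step 3) exists. Below is a rough sketch, assuming we already have, for
each of the last three consensuses, the set of HSDir fingerprints responsible
for our onion address, plus a per-HSDir fetch result. How those inputs are
produced (the ring computation and the actual fetch) is left out. The
function name, data shapes and fingerprints are hypothetical.

```python
def health_report(responsible_by_consensus, fetch_ok):
    """Summarize descriptor availability across responsible HSDirs.

    responsible_by_consensus: list of sets of HSDir fingerprints, one set
        per consensus (hypothetical input, computed elsewhere).
    fetch_ok: dict mapping fingerprint -> True if that HSDir returned the
        descriptor (hypothetical input, from whatever implements step 3).
    """
    # Step 2: union of all HSDirs responsible in any of the consensuses.
    hsdirs = set().union(*responsible_by_consensus)
    # Step 4: which of them actually had the descriptor.
    have = sorted(fp for fp in hsdirs if fetch_ok.get(fp, False))
    fraction = len(have) / len(hsdirs) if hsdirs else 0.0
    return {"hsdirs": sorted(hsdirs), "have_desc": have, "fraction": fraction}


# Made-up example: the responsible set drifts a little with each consensus.
consensus_sets = [{"A", "B", "C"}, {"B", "C", "D"}, {"C", "D", "E"}]
results = {"A": False, "B": True, "C": True, "D": True, "E": False}

report = health_report(consensus_sets, results)
print(report)
```

An HSDir missing from the fetch results is counted as not having the
descriptor, which seems like the conservative choice for a health page.
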