Hi,

> On 15 May 2019, at 22:40, David Goulet <dgou...@torproject.org> wrote:
>
>> On 08 May (13:27:31), Iain Learmonth wrote:
>> Hi All,
>>
>> I'm working on #28322 to improve the monitoring of Tor Metrics services,
>> but this also has the side effect of monitoring network health. For
>> example, we'd like to know when Onionoo messes up and starts reporting
>> zero relays, but we also get to learn for free in the same check how
>> many relays we have and alert if that number does something weird.
>>
>> What would be the most useful checks to add here?
>>
>> * Range of expected total relays
>> * Range of expected relays with Guard flag
>> * Range of expected relays with Exit flag
>> * Range of expected consensus weight in each position
>
> For all of them, what could be reported is if a large fraction disappears all
> of a sudden.
>
> Losing for instance 500 relays at once is something worth our attention imo.
> Same goes with Exit relays... if we drop from 900 to 500, it is scary.
>
> For the consensus weight, I would report the outliers. Maybe someone is gaming
> us and so a HUGE value compared to our usual top 10 means something is up.
>
> As for what the good values are, I don't know but I think you can probably
> figure out the average relay we lose/gain every day and scale that like 3
> times for a warning?

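To make the "average daily change, scaled 3 times" idea above concrete, here is a rough sketch. The function names and the relay counts are made up for illustration; they are not taken from Onionoo or from any existing monitoring check.

```python
from statistics import mean

def daily_changes(counts):
    """Absolute day-over-day changes in a series of daily relay counts."""
    return [abs(b - a) for a, b in zip(counts, counts[1:])]

def warn_threshold(counts, scale=3):
    """Warning threshold: `scale` times the average absolute daily change."""
    return scale * mean(daily_changes(counts))

def should_warn(history, latest, scale=3):
    """True if the jump from the last known count to `latest` exceeds the threshold."""
    return abs(latest - history[-1]) > warn_threshold(history, scale)

# Hypothetical daily totals, for illustration only.
history = [6500, 6510, 6490, 6505, 6495, 6500]

print(warn_threshold(history))     # 3 x the mean absolute daily change
print(should_warn(history, 6000))  # a 500-relay drop
print(should_warn(history, 6490))  # an ordinary fluctuation
```

The same function could be run separately on the Guard and Exit counts, since each has its own typical day-to-day movement.
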
Maybe it's also worth checking how many times each rule would have triggered
over the past year? If the statistics are normally distributed, you could use
4 standard deviations, so that each rule (falsely) triggers about once a year.

T

_______________________________________________
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
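
P.S. A rough sketch of both ideas, assuming the changes really are normally distributed. How often a 4 standard deviation rule fires by chance depends on how often the check runs, so `checks_per_year` is an assumption you would need to set: under this model, hourly checks (8760 a year) give about 0.55 expected false alarms a year, daily checks about 0.02. The function names and sample data are hypothetical.

```python
import math
from statistics import mean, pstdev

def false_alarm_probability(k):
    """Two-sided tail probability P(|Z| > k) for a standard normal Z."""
    return math.erfc(k / math.sqrt(2))

def expected_false_alarms(k, checks_per_year):
    """Expected number of chance triggers per year for a k-sigma rule."""
    return checks_per_year * false_alarm_probability(k)

def count_triggers(changes, k=4):
    """Backtest: how many past changes lie more than k std devs from the mean."""
    mu = mean(changes)
    sigma = pstdev(changes)
    if sigma == 0:
        return 0  # no variation at all, so nothing can be an outlier
    return sum(1 for c in changes if abs(c - mu) > k * sigma)

print(expected_false_alarms(4, 8760))  # hourly checks
print(expected_false_alarms(4, 365))   # daily checks

# Made-up change series: small noise plus one large jump.
changes = [1, -1] * 20 + [100]
print(count_triggers(changes, k=4))
```

Note that the backtest needs a reasonably long history: with only a handful of samples, a single outlier inflates the standard deviation so much that it can never reach 4 standard deviations.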