Re: [tor-talk] Double-checking a couple questions about node churn rate

stn Wed, 08 Oct 2014 12:24:12 -0700

this is "gold" to me.

what i mean is i can do some stats analysis for the torproject if the dataset 
exists and there is a question defined. this churn question is a nice 
example.

the datasets are identified
instructions exist to parse the data (willing to try)
delimiters or data tags are given ("r" lines, "p" lines)
question clearly stated.

_to give more people access to the data_
a wishlist might include a general tool for parsing data and perhaps exporting 
to common formats (csv, libreoffice spreadsheet, etc.).
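
a rough sketch of what such a tool might look like, in python. it only pulls 
the nickname and identity from each "r" line and the summary from the "p" line 
that follows it. the field positions are an assumption based on the usual 
consensus layout ("r nickname identity ..."), and the names parse_consensus and 
consensus_to_csv are made up for illustration, not an existing torproject tool.

```python
import csv
import io


def parse_consensus(lines):
    """Yield one dict per relay found in the lines of a consensus document.

    Assumption: an "r" line looks like "r nickname identity ..." and the
    optional "p" line that follows belongs to that same relay.
    """
    relay = None
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "r" and len(parts) >= 3:
            # A new "r" line starts a new relay entry; emit the previous one.
            if relay is not None:
                yield relay
            relay = {"nickname": parts[1], "identity": parts[2], "exit_policy": ""}
        elif parts[0] == "p" and relay is not None:
            relay["exit_policy"] = " ".join(parts[1:])
    if relay is not None:
        yield relay


def consensus_to_csv(lines, out):
    """Write the parsed relays to the file-like object `out` as CSV."""
    writer = csv.DictWriter(out, fieldnames=["nickname", "identity", "exit_policy"])
    writer.writeheader()
    for relay in parse_consensus(lines):
        writer.writerow(relay)


if __name__ == "__main__":
    # Made-up sample lines, not real relay data.
    sample = [
        "r relayA AAAA digest1 2014-10-07 12:00:00 10.0.0.1 9001 0",
        "s Fast Running",
        "p reject 1-65535",
        "r relayB BBBB digest2 2014-10-07 12:00:00 10.0.0.2 9001 0",
        "p accept 80,443",
    ]
    buf = io.StringIO()
    consensus_to_csv(sample, buf)
    print(buf.getvalue())
```

the resulting csv should open in a spreadsheet or import into most stats 
packages. for a real file, pass an open file object instead of the sample list.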

you might be surprised with the breadth and pool of people who can help once 
they can easily import a dataset into their stats, matlab or spreadsheet 
package they happen to know. 

decent psychology programmes require a good understanding of statistical 
analysis, for example. there are many psych students out there. i've seen some 
social psych profs give matlab homework on social networks, for example.

i guess if you know how social networking data can be amassed and interpreted 
you can figure out how to obfuscate that data too.  just a thought. 

 i'd still like to see tools that put out garbage that looks like data useful to 
marketing and sales, but i digress. maybe everyone looking average within 0.5 
SD can work.  since cookies were invented.  

_my skills are "menh"_ in the large scheme of things but i don't mind the work 
and i take it seriously.

multiple regression i can do fairly easily, but multiple X and multiple Y (e.g. 
factor analysis, principal components analysis, cluster analysis, discriminant 
function analysis) would be a project that would take a good reference and 
study.

i am willing but skills are old.  

i will try parsing some data on my own and see if i can "massage" the format to 
enter a stats package or two.  

if a dataset can be loaded into an open/libre/neo office spreadsheet file i can 
export from there to what i need: CSV, a tab-delimited variant, or perhaps a 
user-defined delimiter. i'll have to check. my s/w is old but useful.

as i don't know tor well enough to come up with questions of my own, questions 
like this that help the torproject learn about itself in the wild interest me 
a lot if they are helpful to the project's needs.

best
steve

On Oct 7, 2014, at 10:04 PM, Karsten Loesing wrote:

> If you want to do some more analysis, fetch the latest consensus
> tarball(s) and write a script that compares contained fingerprints ("r"
> lines) for the churn question and exit policy summaries ("p" lines) for
> the exit-policy-change question:
> 
> https://collector.torproject.org/archive/relay-descriptors/consensuses/
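
a minimal sketch of the comparison described above, in python, assuming each 
consensus has already been read into a list of lines. the identity is taken as 
the third field of each "r" line, and "churn" is measured here as the fraction 
of relays that left or joined between two consensuses. that definition, and the 
function names, are assumptions for illustration only.

```python
def policies_by_relay(lines):
    """Map each relay identity ("r" line, third field) to its "p" line summary."""
    result = {}
    current = None
    for line in lines:
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "r":
            current = parts[2]
            result[current] = ""
        elif parts and parts[0] == "p" and current is not None:
            result[current] = " ".join(parts[1:])
    return result


def compare_consensuses(old_lines, new_lines):
    """Compare two consensuses: relays that left, joined, or changed policy."""
    old = policies_by_relay(old_lines)
    new = policies_by_relay(new_lines)
    left = set(old) - set(new)
    joined = set(new) - set(old)
    # Relays present in both whose exit policy summary differs.
    changed = {rid for rid in set(old) & set(new) if old[rid] != new[rid]}
    return {
        "left": left,
        "joined": joined,
        "policy_changed": changed,
        # Assumed definition: share of the old set that is gone,
        # and share of the new set that is newly arrived.
        "left_fraction": len(left) / len(old) if old else 0.0,
        "joined_fraction": len(joined) / len(new) if new else 0.0,
    }


if __name__ == "__main__":
    # Made-up sample lines, not real relay data.
    old = ["r a AAAA d 2014-10-07 12:00:00 10.0.0.1 9001 0", "p reject 1-65535",
           "r b BBBB d 2014-10-07 12:00:00 10.0.0.2 9001 0", "p accept 80"]
    new = ["r b BBBB d 2014-10-07 13:00:00 10.0.0.2 9001 0", "p accept 80,443",
           "r c CCCC d 2014-10-07 13:00:00 10.0.0.3 9001 0", "p reject 1-65535"]
    print(compare_consensuses(old, new))
```

running this over consecutive consensuses from the tarball would give one row 
per pair, which could then go into a spreadsheet or stats package.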

-- 
tor-talk mailing list - tor-talk@lists.torproject.org
To unsubscribe or change other settings go to
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-talk
