It has been shown that a person can be "fingerprinted" by their writing style (preference for certain words, spelling mistakes, eccentricities in grammar, etc.), and that this fingerprint can be used to determine, with a high degree of statistical certainty, whether that person authored an anonymous document. The process of analyzing and fingerprinting a person's writing style is called stylometry. It has been shown that stylometry can be performed against pools of up to 100,000 candidate authors with a surprising degree of success. I hope you will all agree that this poses a significant threat to the anonymity of Tor users. For more information on the threat stylometry poses to privacy, freedom of speech, and, more specifically, Tor users, please see the following paper: http://www.cs.berkeley.edu/~dawnsong/papers/2012%20On%20the%20Feasibility%20of%20Internet-Scale%20Author%20Identification.pdf
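To make the idea of a writing-style "fingerprint" concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the method used in the paper above or in any existing tool: the short FUNCTION_WORDS list, the function names, and the choice of cosine similarity are all hypothetical, and real stylometric systems rely on far richer feature sets and proper classifiers.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny feature set (an assumption for illustration).
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in",
                  "that", "it", "is", "however", "whilst", "upon")


def fingerprint(text):
    """Return the relative frequency of each word in FUNCTION_WORDS."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_likely_author(anonymous_text, known_samples):
    """Pick the author whose known sample is most similar to the anonymous text.

    known_samples maps an author name to a sample of that author's writing.
    """
    target = fingerprint(anonymous_text)
    return max(
        known_samples,
        key=lambda author: cosine_similarity(target, fingerprint(known_samples[author])),
    )
```

Calling most_likely_author() with an anonymous text and a dictionary of known writing samples returns the name whose word-usage profile is closest. Even this crude comparison shows why habitual word choices can leak identity.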
Several members of the online privacy community have expressed interest in a tool that helps circumvent stylometry, as seen on the Tails bug tracker and in a few threads on tor-talk. There is a tool called Anonymouth that sets out to do this by pointing out stylometric "giveaways" in input text, but it is quite unstable and is aimed at researchers rather than everyday end users, which makes it quite difficult to use. For this reason, I am attempting to replicate the functionality of Anonymouth in a stripped-down, easy-to-use Python application, which I believe may someday be suitable for prepackaging in the Tails OS and inclusion in the Debian repositories.

Development will begin in mid-January 2015 at the latest; source code will be made available under the MIT license on May 1st, 2015. As much as I would like to release it earlier, I am developing this software as part of my senior thesis at my college, and I must not accept outside code contributions until I have turned in my project for grading. It is my hope that I and any other interested developers will continue to work on this project long after May 1st.

In the spirit of meeting the needs of the privacy community, I am interested in hearing what potential users have to say about the design of such a tool. As of now, I envision it as a GUI desktop application that provides suggestions for preserving anonymity, much like Anonymouth, but targeted at Tails/Tor users rather than researchers. I also hope to at least partially automate the anonymization process, perhaps by automatically substituting certain words with synonyms or slightly adjusting the structure of a sentence in order to get rid of glaring indicators of writing style.

Please contact pagea (at) allegheny.edu if you would like to be notified once the source code is available.
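As a rough illustration of the two behaviours described above (pointing out stylometric giveaways and automatically substituting synonyms), a minimal Python sketch might look like the following. Everything in it is an assumption made for the sake of example: the SYNONYMS table and the function names are hypothetical, this is not how Anonymouth works, and a usable tool would need a much larger vocabulary and some awareness of context.

```python
import re

# Hypothetical table mapping distinctive words to more common alternatives.
SYNONYMS = {
    "whilst": "while",
    "upon": "on",
    "utilize": "use",
    "thusly": "so",
}

WORD_PATTERN = re.compile(r"[A-Za-z']+")


def flag_giveaways(text):
    """Return the distinctive words found in text, in order of appearance."""
    return [w for w in WORD_PATTERN.findall(text) if w.lower() in SYNONYMS]


def neutralize(text):
    """Replace each distinctive word with its more common alternative."""
    def swap(match):
        word = match.group(0)
        replacement = SYNONYMS.get(word.lower())
        if replacement is None:
            return word  # not a flagged word; leave it untouched
        # Preserve a leading capital so sentence starts still look right.
        return replacement.capitalize() if word[0].isupper() else replacement

    return WORD_PATTERN.sub(swap, text)
```

For example, flag_giveaways() could drive the suggestions shown to the user, while neutralize() would be the optional automatic pass. A blind word swap like this can change meaning, which is one reason to keep the user in the loop.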
For a (very rough) idea of what I hope to accomplish with this project, please see a draft of my research proposal here: https://pdf.yt/d/HsAyoE0VGCYsnVxU

I look forward to reading your comments.

Cheers,
Alden Page