command line micro wiki written in Python

2017-01-31 Thread Paul Wolf
I've created a command line utility for managing text files. It's written in 
Python: 

https://github.com/paul-wolf/yewdoc-client

It makes heavy use of the fantastic Click module by Armin Ronacher: 
http://click.pocoo.org/5/

This can be thought of in different ways: 

* A micro-wiki

* A note-taking application

* A file manager

The key aspects are 

* Entirely command-line driven

* Text documents only with a slight preference for Markdown

* Make it easy to operate on documents without having to remember where they 
are or the exact names

* Editor agnostic (vim, emacs, Sublime, Atom, etc.)

Here's how to create a document: 

yd edit "shopping list"

After editing, saving, and closing, you can find it again: 

➜  yd ls -l shop
   15981278  md   61   2017-01-20 12:15:43   shopping list
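
The partial-name lookup shown above could work along these lines. This is only 
an illustrative sketch: `find_documents` and the in-memory list of titles are 
hypothetical and are not taken from yewdoc's code.

```python
def find_documents(titles, fragment):
    """Return every title that contains `fragment`, ignoring case."""
    needle = fragment.lower()
    return [title for title in titles if needle in title.lower()]


# Hypothetical document titles, echoing the examples in this post.
titles = ["shopping list", "emacs", "Python3"]
matches = find_documents(titles, "shop")
```

With a lookup like this, `shop` is enough to reach "shopping list" without 
remembering the full name or where the file lives.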

While some people might gasp at the idea of a command line wiki, I find using 
the command line with a text editor like emacs the best workflow. I also like 
not having to worry about where a file of this kind is located. You can also 
use it to track configuration files: 

yd take ~/.emacs --symlink

Now I can edit this. Because the `--symlink` option makes the entry a link to 
the actual file, the configuration file itself is updated: 

➜  yd ls -l emacs
  1c608cd7  md  113   2016-12-16 10:53:24   emacs
   ln 183a5b80 txt 5608   2017-01-15 12:59:39   /Users/paul/.emacs
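
Why a symlinked entry keeps the real file current can be shown with the 
standard library alone. This sketch is not yewdoc's implementation: 
`take_with_symlink`, the store directory and the file names are hypothetical, 
and it assumes a POSIX system where creating symlinks needs no special 
privileges.

```python
import os
import tempfile


def take_with_symlink(source_path, store_dir, uid):
    """Register a file by linking to it instead of copying it."""
    link_path = os.path.join(store_dir, uid)
    os.symlink(os.path.abspath(source_path), link_path)
    return link_path


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a real configuration file such as ~/.emacs.
        config = os.path.join(tmp, "dot-emacs")
        with open(config, "w") as f:
            f.write("; original\n")

        store = os.path.join(tmp, "store")
        os.mkdir(store)
        link = take_with_symlink(config, store, "183a5b80")

        # Editing through the link writes to the original file.
        with open(link, "a") as f:
            f.write("; edited through the link\n")

        with open(config) as f:
            return f.read()
```

Calling `demo()` returns the original file's contents, including the line that 
was written through the link.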

Using the cloud sync options lets me get my current config file wherever I am, 
completely up-to-date. For a more wiki-like experience, I can load all the 
documents with a common tag like so:

yd browse wolf

This converts all the documents tagged with 'wolf' and loads them in a browser 
with a simple navigation sidebar.

Convert a Markdown document called "Python3" to .odt: 

➜ yd convert Python3 odt
python3.odt

There is an optional cloud synchronisation feature that connects to a specific 
endpoint, https://doc.yew.io: 

➜ yd sync

This requires registration at https://doc.yew.io, which can also be done via 
command line. But that is entirely optional.

I'd be interested in feedback if there is any interest in using this kind of 
utility. I'll expose a Python API so bulk operations can be done easily on the 
command line or via scripts.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: command line micro wiki written in Python

2017-02-02 Thread paul . wolf
On Tuesday, 31 January 2017 23:39:41 UTC, Ben Finney  wrote:

> The Python community has a stronger (?) preference for reStructuredText
> format. Can that be the default?
> 
> That is, I want my text files to be named ‘foo’ (no suffix) or ‘foo.txt’
> (because they're primarily text), and have the default be to parse them
> as reStructuredText.

Good point. It should at least be possible to provide an arbitrary default so 
you can have rst as default for new files. And perhaps as you say also start 
with rst as the main default. 

Regarding no suffix, whatever the default is gets used for newly created files, 
but not for the 'take' command which I use mainly to link to (with --symlink) 
config files, like:

yd take ~/.emacs

for instance. 



Template language for random string generation

2014-08-08 Thread Paul Wolf
This is a proposal with a working implementation for a random string generation 
template syntax for Python. `strgen` is a module for generating random strings 
in Python using a regex-like template language. Example: 

>>> from strgen import StringGenerator as SG
>>> SG("[\l\d]{8:15}&[\d]&[\p]").render()
u'F0vghTjKalf4^mGLk'

The template ([\l\d]{8:15}&[\d]&[\p]) draws 8 to 15 letters and digits, adds 
one digit and one punctuation character, and shuffles the result, so the string 
is 10 to 17 characters long (the example above has 17). It is guaranteed to 
have at least one digit (maybe more) and exactly one punctuation character. 
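
For comparison, here is roughly what that template asks for when written by 
hand with the standard library. It is a sketch under assumptions, not strgen's 
implementation: it takes \l to mean ASCII letters, \d digits, \p 
`string.punctuation`, and "&" to mean "combine and shuffle". The name 
`render_equivalent` is made up for the example.

```python
import random
import string

# OS entropy source; the post says strgen prefers SystemRandom when available.
_rng = random.SystemRandom()


def render_equivalent():
    # [\l\d]{8:15}: between 8 and 15 letters or digits
    count = _rng.randint(8, 15)
    alphabet = string.ascii_letters + string.digits
    chars = [_rng.choice(alphabet) for _ in range(count)]
    # &[\d]: one more character that is certainly a digit
    chars.append(_rng.choice(string.digits))
    # &[\p]: exactly one punctuation character
    chars.append(_rng.choice(string.punctuation))
    # "&" is read here as: mix the pieces so their positions are random
    _rng.shuffle(chars)
    return "".join(chars)
```

Under these assumptions every result has between 10 and 17 characters, at 
least one digit and exactly one punctuation character.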

If you look at various forums, like Stack Overflow, on how to generate random 
strings with Python, especially for passwords and other hopefully secure 
tokens, you will see dozens of variations of this: 

   >>> import random
   >>> import string
   >>> mypassword = ''.join(random.choice(string.ascii_uppercase +
   ...     string.digits) for x in range(10))

There is nothing wrong with this (it's the right answer and is very fast), but 
the pattern has recurring problems:

* It encourages cryptographically weak methods
* It does not guarantee a result that includes the different classes of 
characters, which is easy to forget
* It does not provide variable-length or minimum-length output
* It's a lot of typing, and the resulting code is vastly different each time, 
making it hard to understand what features were implemented, especially for 
those new to the language
* You can extend it to include whatever requirements you want, but it's a 
constant exercise in wheel reinvention that is extremely verbose, error-prone 
and confusing for exactly the same purposes each time
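
To make the last point concrete, this is one way the one-liner tends to grow 
once those requirements are bolted on by hand. It is an illustrative sketch 
only: the function name, the default lengths and the retry-until-valid approach 
are choices made for this example, not anything prescribed by strgen.

```python
import random
import string


def hand_rolled_password(min_len=8, max_len=15):
    rng = random.SystemRandom()  # OS entropy instead of the default generator
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        length = rng.randint(min_len, max_len)  # variable length
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        # Retry until every character class is represented.
        if (any(c in string.ascii_lowercase for c in candidate)
                and any(c in string.ascii_uppercase for c in candidate)
                and any(c in string.digits for c in candidate)
                and any(c in string.punctuation for c in candidate)):
            return candidate
```

Each project that needs this ends up with its own slightly different version, 
which is the repetition the list above describes.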

This application (generation of random strings for passwords, vouchers, secure 
ids, test data, etc.) is so general, it seems to beg for a general solution. 
So, why not have a standard way of expressing these using a simple template 
language? 

strgen: 

* Is far less verbose than commonly offered solutions
* Trivial editing of the pattern lets you incorporate additional important 
features (variable length, minimum length, additional character classes, etc.)
* Uses a pattern language superficially similar to regular expressions, so it's 
easy to learn
* Uses SystemRandom class (if available, or falls back to Random)
* Supports > 2.6 through 3.3
* Supports Unicode
* Uses a parse tree, so you can have complex - nested - expressions to do 
tricky data generation tasks, especially for test data generation

In my opinion, it would make using Python for this application much easier and 
more consistent for very common requirements. The template language could 
easily be a cross-language standard like regex.  

You can `pip install strgen`. 

It's on Github: https://github.com/paul-wolf/strgen



Re: Template language for random string generation

2014-08-08 Thread Paul Wolf
On Friday, 8 August 2014 10:22:33 UTC+1, Chris Angelico  wrote:
> But I eyeballed your code, and I'm seeing a lot of
> u'string' prefixes, which aren't supported on 3.0-3.2 (they were
> reinstated in 3.3 as per PEP 414), so a more likely version set would
> 
> be 2.6+, 3.3+. What's the actual version support?
> ChrisA

I'm going to have to assume you are right that I only tested on 3.3, skipping 
versions > 2.7 and < 3.3. I'll create an issue for that. 


Re: Template language for random string generation

2014-08-08 Thread Paul Wolf
On Friday, 8 August 2014 12:20:36 UTC+1, Ned Batchelder  wrote:
> On 8/8/14 5:42 AM, Paul Wolf wrote:
>
> Don't bother trying to support <=3.2.  It will be far more difficult
> than it is worth in terms of adoption of the library.
>
> Also, you don't need to write a "proposal" for your library. You've
> written the library, and it's on PyPI.  You aren't trying to add it to 

Thanks for that. I'll follow that advice. 


Re: Template language for random string generation

2014-08-08 Thread Paul Wolf
On Friday, 8 August 2014 12:29:09 UTC+1, Chris Angelico  wrote:
> Debian Wheezy can spin up a Python 3 from source anyway, and
> presumably ditto for any other Linux distro that's distributing 3.1 or
> 3.2; most other platforms should have a more modern Python available
> one way or another.
>
> ChrisA

Yes, agreed. I'll update the version info. 


Re: Template language for random string generation

2014-08-08 Thread Paul Wolf
On Friday, 8 August 2014 23:03:18 UTC+1, Ian  wrote:
> On Fri, Aug 8, 2014 at 3:01 AM, Paul Wolf  wrote:
> 
> > * Uses SystemRandom class (if available, or falls back to Random)
> A simple improvement would be to also allow the user to pass in a
> Random object

That is not a bad idea. I'll create an issue for it. 

It is a design goal to use the standard library within the implementation so 
users have a guarantee about exactly how the data is generated. But your 
suggestion is not inconsistent with that. 
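
A minimal sketch of that suggestion, using a plain helper function instead of 
strgen itself. The name `random_string` and its `rng` parameter are 
hypothetical and only illustrate the idea of an injectable random source that 
still defaults to the standard library.

```python
import random
import string


def random_string(length, alphabet=string.ascii_letters + string.digits, rng=None):
    """Build a random string, drawing from `rng` if the caller supplies one."""
    if rng is None:
        rng = random.SystemRandom()  # default: OS entropy source
    return "".join(rng.choice(alphabet) for _ in range(length))


# A seeded generator makes the output reproducible, which is handy for tests.
first = random_string(12, rng=random.Random(42))
second = random_string(12, rng=random.Random(42))
```

Here `first` and `second` are identical because both generators start from the 
same seed, while calls without `rng` keep using `SystemRandom`.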

> 
> Have you given any thought to adding a validation mode, where the user
> provides a template and a string and wants to know if the string
> matches the template?

Isn't that what regular expressions are? Or do you have a clarifying use case? 

strgen is provided as the converse of regular expressions. 



Re: Template language for random string generation

2014-08-10 Thread Paul Wolf
On Sunday, 10 August 2014 13:43:04 UTC+1, Devin Jeanpierre  wrote:
> On Fri, Aug 8, 2014 at 2:01 AM, Paul Wolf  wrote:
>
> > This is a proposal with a working implementation for a random string 
> > generation template syntax for Python. `strgen` is a module for generating 
> > random strings in Python using a regex-like template language. Example:
> >
> > >>> from strgen import StringGenerator as SG
> > >>> SG("[\l\d]{8:15}&[\d]&[\p]").render()
> > u'F0vghTjKalf4^mGLk'
>
> Why aren't you using regular expressions? I am all for conciseness,
> but using an existing format is so helpful...
>
> Unfortunately, the equivalent regexp probably looks like
> r'(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z])[a-zA-Z0-9]{8:15}'
>
> (I've been working on this kind of thing with regexps, but it's still
> incomplete.)
>
> > * Uses SystemRandom class (if available, or falls back to Random)
>
> This sounds cryptographically weak. Isn't the normal thing to do to
> use a cryptographic hash function to generate a pseudorandom sequence?
>
> Someone should write a cryptographically secure pseudorandom number
> generator library for Python. :(
>
> (I think OpenSSL comes with one, but then you can't choose the seed.)
>
> -- Devin

> Why aren't you using regular expressions?

I guess you answered your own question with your example: 

* No one will want to write that expression
* The regex expression doesn't work anyway
* The purpose of regex is just too different from the purpose of strgen
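
One concrete reason the quoted expression does not work in Python's `re` 
module: `{8:15}` is not a repetition count (that would be `{8,15}`), so the 
braces are matched as literal text. The sketch below shows the difference; the 
variable names and the sample string are made up for the illustration.

```python
import re

# Pattern exactly as quoted in the thread.
QUOTED = r'(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z])[a-zA-Z0-9]{8:15}'
# Same pattern with the quantifier spelled the way `re` expects.
FIXED = r'(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z])[a-zA-Z0-9]{8,15}'

CANDIDATE = "F0vghTjKalf4"  # 12 characters: upper case, lower case, digits

quoted_result = re.fullmatch(QUOTED, CANDIDATE)  # None: "{8:15}" is literal text
fixed_result = re.fullmatch(FIXED, CANDIDATE)    # a match object

# The same effect in miniature: the malformed braces only match themselves.
literal_braces = re.fullmatch(r"[a-z]{2:3}", "a{2:3}")  # a match object
no_repeat = re.fullmatch(r"[a-z]{2:3}", "ab")           # None
```

Even the corrected pattern describes a different set of strings from the 
strgen template, since it demands both upper and lower case letters and allows 
no punctuation.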

The purpose of strgen is to make life easier for developers and provide 
benefits that get pushed downstream (to users of the software that gets 
produced with it). Adopting a syntax similar to regex is only necessary or 
useful to the extent it achieves that. 

I should also clarify that when I say the strgen template language is the 
converse of regular expressions, this is the case conceptually, not formally. 
Matching text strings is fundamentally different from producing randomized 
strings. For instance, a template language that validates the output would have 
to do frequency analysis. But that is getting too far off the purpose of 
strgen, although such a mechanism would certainly have its place. 

> This sounds cryptographically weak.

Whether using SystemRandom is cryptographically weak is not something I'm 
taking up here. Someone already suggested allowing the class to accept a 
different random source provider. That's an excellent idea. I wanted to make 
sure strgen does whatever developers would do anyway when hand-coding with the 
Python Standard Library, except that it is vastly more flexible, easier to edit 
and shorter. strgen is two things: a proposed standard way of expressing a 
string generation specification that relies heavily on randomness, and a 
wrapper around the standard library. I specifically did not want to try to 
write better cryptographic routines. 


Re: Template language for random string generation

2014-08-10 Thread Paul Wolf
On Sunday, 10 August 2014 17:47:48 UTC+1, Ian  wrote:
> On Sun, Aug 10, 2014 at 10:34 AM, Paul Wolf  wrote:
>
> > For instance, a template language that validates the output would have to 
> > do frequency analysis. But that is getting too far off the purpose of 
> > strgen, although such a mechanism would certainly have its place.
>
> I don't think that would be necessary. The question being asked with
> validation is "can this string be generated from this template", not
> "is this string generated from this template with relatively high
> probability".

Sorry, I meant frequency incidence within a produced string. And I understood 
Devin's point to be: for any given strgen expression that produces a set of 
strings, is there always a regex that captures the exact same set? And if so, 
is one of the syntaxes (strgen) not theoretically superfluous, leaving aside 
verbosity? I think that is an entirely valid and interesting question. I'd have 
said before that it is not the case, but now I'm not so sure. I would still be 
sure that the strgen syntax is more fit for purpose for generating strings than 
regex on the basis of ease of use.
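
For the single template from the original post, a regex covering the same set 
of strings can at least be sketched, which says nothing about the general 
case. The sketch assumes that \l means ASCII letters, \d digits, \p 
`string.punctuation`, and that "&" shuffles the combined pieces; the names 
`SAME_SET` and `could_be_generated` are made up for the example.

```python
import re
import string

ALNUM = "A-Za-z0-9"
PUNCT = re.escape(string.punctuation)  # escaped so it is safe inside [...]

SAME_SET = re.compile(
    r"(?=.*[0-9])"  # at least one digit somewhere
    # exactly one punctuation character, everything else a letter or digit
    + r"(?=[" + ALNUM + r"]*[" + PUNCT + r"][" + ALNUM + r"]*\Z)"
    # total length: 8 to 15 letters/digits, plus one digit, plus one punctuation
    + r".{10,17}\Z"
)


def could_be_generated(candidate):
    """True if `candidate` is in the set the template is assumed to produce."""
    return SAME_SET.match(candidate) is not None
```

The example output from the original post, `F0vghTjKalf4^mGLk`, passes this 
check, while a string with two punctuation characters or no digit does not. 
The pattern is already harder to read than the template it mirrors.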


Re: Template language for random string generation

2014-08-10 Thread Paul Wolf
On Sunday, 10 August 2014 17:31:01 UTC+1, Steven D'Aprano  wrote:
> Devin Jeanpierre wrote:
>
> > On Fri, Aug 8, 2014 at 2:01 AM, Paul Wolf  wrote:
> >> This is a proposal with a working implementation for a random string
> >> generation template syntax for Python. `strgen` is a module for
> >> generating random strings in Python using a regex-like template language.
> >> Example:
> >>
> >> >>> from strgen import StringGenerator as SG
> >> >>> SG("[\l\d]{8:15}&[\d]&[\p]").render()
> >> u'F0vghTjKalf4^mGLk'
> > 
> > Why aren't you using regular expressions? I am all for conciseness,
> > but using an existing format is so helpful...
>
> You've just answered your own question:
>
> > Unfortunately, the equivalent regexp probably looks like
> > r'(?=.*[0-9])(?=.*[A-Z])(?=.*[a-z])[a-zA-Z0-9]{8:15}'
>
> Apart from being needlessly verbose, regex syntax is not appropriate because
> it specifies too much, specifies too little, and specifies the wrong
> things. It specifies too much: regexes like ^ and $ are meaningless in this
> case. It specifies too little: there's no regex for the "shuffle operator".
> And it specifies the wrong things: regexes like (?= ...) as used in your
> example are for matching, not generating strings, and it isn't clear
> what "match any character but don't consume any of the string" means when
> generating strings.
>
> Personally, I think even the OP's specified language is too complex. For
> example, it supports literal text, but given the use-case (password
> generators) do we really want to support templates like "password[\d]"? I
> don't think so, and if somebody did, they can trivially say "password" +
> SG('[\d]').render().
>
> Larry Wall (the creator of Perl) has stated that one of the mistakes with
> Perl's regular expression mini-language is that the Huffman coding is
> wrong. Common things should be short, uncommon things can afford to be
> longer. Since the most common thing for password generation is to specify
> character classes, they should be short, e.g. d rather than [\d] (one
> character versus four).
>
> The template given could potentially be simplified to:
>
> "(LD){8:15}&D&P"
>
> where the round brackets () are purely used for grouping. Character codes
> are specified by a single letter. (I use uppercase to avoid the problem
> that l & 1 look very similar. YMMV.) The model here is custom format codes
> from spreadsheets, which should be comfortable to anyone who is familiar
> with Excel or OpenOffice. If you insist on having the facility to including
> literal text in your templates, might I suggest:
>
> "'password'd"  # Literal string "password", followed by a single digit.
>
> but personally I believe that for the use-case given, that's a mistake.
>
> Alternatively, date/time templates use two-character codes like %Y %m etc,
> which is better than 
>
> > (I've been working on this kind of thing with regexps, but it's still
> > incomplete.)
> > 
> >> * Uses SystemRandom class (if available, or falls back to Random)
> > 
> > This sounds cryptographically weak. Isn't the normal thing to do to
> > use a cryptographic hash function to generate a pseudorandom sequence?
>
> I don't think that using a good, but not cryptographically-strong, random
> number generator to generate passwords is a serious vulnerability. What's
> your threat model? Attacks on passwords tend to be one of a very few:
>
> - dictionary attacks (including tables of common passwords and 
>   simple transformations of words, e.g. 'pas5w0d');
>
> - brute force against short and weak passwords;
>
> - attacking the hash function used to store passwords (not the password
>   itself), e.g. rainbow tables;
>
> - keyloggers or some other way of stealing the password (including
>   phishing sites and the ever-popular "beat them with a lead pipe 
>   until they give up the password");
> 
&