Re: 'Straße' ('Strasse') and Python 2

2014-01-12 Thread Peter Otten
[email protected] wrote:

> >>> sys.version
> 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]
> >>> s = 'Straße'
> >>> assert len(s) == 6
> >>> assert s[5] == 'e'
> >>>
> 
> jmf

Signifying nothing. (Macbeth)

Python 2.7.2+ (default, Jul 20 2012, 22:15:08) 
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = "Straße"
>>> assert len(s) == 6
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError
>>> assert s[5] == "e"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-12 Thread Stefan Behnel
Peter Otten, 12.01.2014 09:31:
> [email protected] wrote:
> 
>> >>> sys.version
>> 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]
>> >>> s = 'Straße'
>> >>> assert len(s) == 6
>> >>> assert s[5] == 'e'
>> >>>
>>
>> jmf
> 
> Signifying nothing. (Macbeth)
> 
> Python 2.7.2+ (default, Jul 20 2012, 22:15:08) 
> [GCC 4.6.1] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> s = "Straße"
> >>> assert len(s) == 6
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AssertionError
> >>> assert s[5] == "e"
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AssertionError

The point I think he was trying to make is that Linux is better than
Windows, because the latter fails to fail on these assertions for some reason.

Stefan :o)


-- 
https://mail.python.org/mailman/listinfo/python-list


[newbie] starting geany from within idle does not work

2014-01-12 Thread Jean Dupont
I'm using the latest Raspbian on a Raspberry Pi and I'd like to start IDLE so 
that it uses Geany instead of Leafpad. At first sight this seems a trivial task:
Right-click on the IDLE icon-->Open with: Geany (instead of the 
default Leafpad)-->OK
LXTerminal-->lxpanelctl restart

However, if I then click on IDLE followed by File-->New Window, a Leafpad session 
is opened and not a Geany session.
Is there a workaround for this?

thanks in advance
jean
p.s. I posted this question before in the Raspberry Pi forum but nobody seems
to know the answer
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-12 Thread Ned Batchelder

On 1/12/14 2:50 AM, [email protected] wrote:
> >>> sys.version
> 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]
> >>> s = 'Straße'
> >>> assert len(s) == 6
> >>> assert s[5] == 'e'
> >>>
>
> jmf

Dumping random snippets of Python sessions here is useless.  If you are 
trying to make a point, you have to put some English around it.  You 
know what is in your head, but we do not.


--
Ned Batchelder, http://nedbatchelder.com

--
https://mail.python.org/mailman/listinfo/python-list


Python: 404 Error when trying to login a webpage by using 'urllib' and 'HTTPCookieProcessor'

2014-01-12 Thread KMeans Algorithm
I'm trying to log in to a web page using 'urllib' and this piece of code:

-
import urllib2,urllib,os

opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
login = urllib.urlencode({'username':'john', 'password':'foo'})
url = "https://www.mysite.com/loginpage"
req = urllib2.Request(url, login)
try:
    resp = urllib2.urlopen(req)
    print resp.read()
except urllib2.HTTPError, e:
    print ":( Error = " + str(e.code)


But I get a "404" error (Not Found). The page 
"https://www.mysite.com/loginpage" does exist (please note the httpS, since I'm 
not sure whether this is the key to my problem).

If I try with

---
resp = urllib2.urlopen(url)   

(with no 'login' data), it works ok but, obviously, I'm not logged in.

What am I doing wrong? Thank you very much.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-12 Thread Mark Lawrence

On 12/01/2014 09:00, Stefan Behnel wrote:
> Peter Otten, 12.01.2014 09:31:
>> [email protected] wrote:
>>
>>> >>> sys.version
>>> 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]
>>> >>> s = 'Straße'
>>> >>> assert len(s) == 6
>>> >>> assert s[5] == 'e'
>>> >>>
>>>
>>> jmf
>>
>> Signifying nothing. (Macbeth)
>>
>> Python 2.7.2+ (default, Jul 20 2012, 22:15:08)
>> [GCC 4.6.1] on linux2
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> s = "Straße"
>> >>> assert len(s) == 6
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> AssertionError
>> >>> assert s[5] == "e"
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> AssertionError
>
> The point I think he was trying to make is that Linux is better than
> Windows, because the latter fails to fail on these assertions for some reason.
>
> Stefan :o)




The point he's trying to make is that he also reads the python-dev 
mailing list, where Steven D'Aprano posted this very example, stating it 
is "Python 2 nonsense".  Fixed in Python 3.  Don't mention... :)


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


[ANN] Oktest.py 0.12.0 released - a new-style testing library

2014-01-12 Thread Makoto Kuwata
Hi,

I released Oktest 0.12.0.
https://pypi.python.org/pypi/Oktest/

Oktest is a new-style testing library for Python.

## unittest
self.assertEqual(x, y)
self.assertNotEqual(x, y)
self.assertGreaterEqual(x, y)
self.assertIsInstance(obj, cls)
self.assertRegexpMatches(text, rexp)

## Oktest.py
ok (x) == y
ok (x) != y
ok (x) >= y
ok (obj).is_a(cls)
ok (text).match(rexp)


Install
  $ easy_install oktest

User's Guide
  http://www.kuwata-lab.com/oktest/oktest-py_users-guide.html

Changes
  http://www.kuwata-lab.com/oktest/oktest-py_CHANGES.txt


Highlight on this release
-

This release contains new and important enhancements.

* [enhance] `ok (actual) == expected' reports unified diff.
  Example::

AssertionError:
--- expected
+++ actual
@@ -1,3 +1,3 @@
 {'email': '[email protected]',
- 'gender': 'Female',
+ 'gender': 'female',
  'username': 'Haruhi'}

* [enhance] @at_end decorator registers callback which is called
  at end of test case. ::

  @test("example to remove temporary file automatically")
  def _(self):
    ## create dummy file
    tmpfile = 'dummy.txt'
    with open(tmpfile, 'w') as f:
      f.write("blablabla")
    ## register callback to delete dummy file at end of test case
    @at_end
    def _():
      os.unlink(tmpfile)
    ## do test
    with open(tmpfile) as f:
      ok (f.read()) == "blablabla"

* [enhance] New assertions for WebOb/Werkzeug response object. ::

 ok (resp).is_response(200)  # status code
 ok (resp).is_response((302, 303))   # status code
 ok (resp).is_response('200 OK') # status line
 ok (resp).is_response(200, 'image/jpeg')# content-type
 ok (resp).is_response(200, re.compile(r'^image/(jpeg|png|gif)$'))
 ok (resp).is_response(302).header("Location", "/")  # header
 ok (resp).is_response(200).json({"status": "OK"})   # json data
 ok (resp).is_response(200).body("Hello")   # response body
 ok (resp).is_response(200).body(re.compile(".*?"))

* [bugfix] @todo decorator now supports fixture injection. ::

 @test('example')
 @todo # error on previous version but not on this release
 def _(self, x):
     assert False


You can see all enhancements and changes. See
  http://www.kuwata-lab.com/oktest/oktest-py_CHANGES.txt


Have fun!

--
regards,
makoto kuwata
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python: 404 Error when trying to login a webpage by using 'urllib' and 'HTTPCookieProcessor'

2014-01-12 Thread Chris Angelico
On Sun, Jan 12, 2014 at 11:17 PM, KMeans Algorithm  wrote:
> What am I doing wrong? Thank you very much.

I can't say what's actually wrong, but I have a few ideas for getting
more information out of the system...

> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())

You don't do anything with this opener - could you have a cookie problem?

> req = urllib2.Request(url, login)
>
> But I get a "404" error (Not Found). The page 
> "https://www.mysite.com/loginpage" does exist (note please the httpS, since 
> I'm not sure if this the key of my problem).
>
> If I try with
>
> ---
> resp = urllib2.urlopen(url)
> 
> (with no 'login' data), it works ok but, obviously, I'm not logged in.

Note that adding a data parameter changes the request from a GET to a
POST. I'd normally expect the server to respond 404 to both or
neither, but it's theoretically possible.
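A minimal sketch of the two points above, written with the Python 3 names (urllib.request, http.cookiejar) rather than the Python 2 urllib2 used in the question. The URL and credentials are the placeholders from the original post, and fetch() is a hypothetical helper that is defined here but never called, so nothing touches the network.

```python
import urllib.error
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

url = "https://www.mysite.com/loginpage"   # placeholder URL from the question
login = urllib.parse.urlencode({"username": "john", "password": "foo"}).encode("ascii")

# The opener only retains cookies if requests are actually sent through it.
jar = CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

get_req = urllib.request.Request(url)               # no data: GET
post_req = urllib.request.Request(url, data=login)  # data present: POST
print(get_req.get_method(), post_req.get_method())  # GET POST


def fetch(request):
    """Hypothetical helper: send the request via the cookie-aware opener."""
    try:
        with opener.open(request) as resp:
            return resp.geturl(), resp.read()
    except urllib.error.HTTPError as e:
        # e.url is the URL that finally failed, which helps spot a redirect.
        return e.url, "{} {}".format(e.code, e.reason)
```

Calling fetch(post_req) instead of urlopen(req) would send the POST through the opener, so any cookies set during a redirect are kept.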

It's also possible that you're getting redirected, and that (maybe
because cookies aren't being retained??) the destination is 404. I'm
not familiar with urllib2, but if you get a response object back, you
can call .geturl() on it - no idea how that goes with HTTP errors,
though.

You may want to look at the exception's .reason attribute - might be
more informative than .code.

As a last resort, try firing up Wireshark or something and watch
exactly what gets sent and received. I went looking through the docs
for a "verbose" mode or a "debug" setting but can't find one - that'd
be ideal if it exists, though.

Hope that's of at least some help!

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python: 404 Error when trying to login a webpage by using 'urllib' and 'HTTPCookieProcessor'

2014-01-12 Thread Chris Angelico
On Sun, Jan 12, 2014 at 11:17 PM, KMeans Algorithm  wrote:
> The page "https://www.mysite.com/loginpage" does exist

PS. If it's not an intranet site and the URL isn't secret, it'd help
if we could actually try things out. One of the tricks I like to use
is to access the same page with a different program/library - maybe
wget, or bare telnet, or something like that. Sometimes one succeeds
and another doesn't, and then you dig into the difference (once I
found that a web server failed unless the request headers were in a
particular order - that was a pain to (a) find, and (b) work around!).

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [ANN] Oktest.py 0.12.0 released - a new-style testing library

2014-01-12 Thread Roy Smith
In article ,
 Makoto Kuwata  wrote:

> Hi,
> 
> I released Oktest 0.12.0.
> https://pypi.python.org/pypi/Oktest/

Wow, this looks neat.  We use nose, but I'm thinking your ok() style 
exceptions should work just fine with nose.  Just the notational 
convenience alone is worth it.  Typing "ok(x) <= 1" is so much nicer 
than "assert_less(x, 1)".

Unfortunately, I was unable to download it.  There seems to be a broken 
redirect at pypi:

$ wget http://pypi.python.org/packages/source/O/Oktest/Oktest-0.12.0.tar.gz
--2014-01-12 07:57:01--  http://pypi.python.org/packages/source/O/Oktest/Oktest-0.12.0.tar.gz
Resolving pypi.python.org (pypi.python.org)... 199.27.72.184, 199.27.72.185
Connecting to pypi.python.org (pypi.python.org)|199.27.72.184|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://pypi.python.org/packages/source/O/Oktest/Oktest-0.12.0.tar.gz [following]
--2014-01-12 07:57:01--  https://pypi.python.org/packages/source/O/Oktest/Oktest-0.12.0.tar.gz
Connecting to pypi.python.org (pypi.python.org)|199.27.72.184|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2014-01-12 07:57:02 ERROR 404: Not Found.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Open Question - I'm a complete novice in programming so please bear with me...Is python equivalent to C, C++ and java combined?

2014-01-12 Thread Rotwang

On 12/01/2014 05:58, Chris Angelico wrote:
> [...]
>
> (BTW, is there no better notation than six nested for/range for doing
> 6d6? I couldn't think of one off-hand, but it didn't really much
> matter anyway.)


If you're willing to do an import, then how about this:

>>> from itertools import product
>>> len([x for x in product(range(1, 7), repeat = 6) if sum(x) < 14])/6**6
0.03587962962962963
--
https://mail.python.org/mailman/listinfo/python-list


Python example source code

2014-01-12 Thread ngangsia akumbo
where can i find example source code by topic?
Any help please
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Open Question - I'm a complete novice in programming so please bear with me...Is python equivalent to C, C++ and java combined?

2014-01-12 Thread Chris Angelico
On Mon, Jan 13, 2014 at 1:36 AM, Rotwang  wrote:
> On 12/01/2014 05:58, Chris Angelico wrote:
>>
>> (BTW, is there no better notation than six nested for/range for doing
>> 6d6? I couldn't think of one off-hand, but it didn't really much
>> matter anyway.)
>
>
> If you're willing to do an import, then how about this:
>
> >>> from itertools import product
> >>> len([x for x in product(range(1, 7), repeat = 6) if sum(x) < 14])/6**6
> 0.03587962962962963

Should have known itertools would have something. That's a little more
complicated to try to explain, but it's a lot shorter.
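One way to explain it, as a sketch: product(range(1, 7), repeat=n) yields the same tuples that n nested for/range loops would visit. The helper name count_below is made up for this illustration, and the two-dice case is written out both ways so the results can be compared.

```python
from itertools import product


def count_below(dice, limit):
    """Count ordered rolls of `dice` six-sided dice whose total is below `limit`."""
    return sum(1 for roll in product(range(1, 7), repeat=dice) if sum(roll) < limit)


# Two dice written as explicit nested loops, for comparison.
nested = sum(1 for a in range(1, 7) for b in range(1, 7) if a + b < 5)

print(nested == count_below(2, 5))   # True
print(count_below(6, 14) / 6**6)     # the 6d6 probability quoted above
```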

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python example source code

2014-01-12 Thread bob gailer

On 1/12/2014 9:37 AM, ngangsia akumbo wrote:
> where can i find example source code by topic?

There are several Python tutorials on the internet. They have good code 
examples.


Most modules also have module-specific examples.

There are also some web sites that may address your needs. I will leave 
it to others to list these.


What do you mean by topic? Give us some examples of topics and maybe we 
can help more.

--
https://mail.python.org/mailman/listinfo/python-list


Re:Python example source code

2014-01-12 Thread Dave Angel
 ngangsia akumbo  Wrote in message:
> where can i find example source code by topic?
> Any help please
> 

http://code.activestate.com/recipes/langs/python/

http://code.activestate.com/recipes/sets/2-python-cookbook-edition-2/

http://shop.oreilly.com/product/mobile/0636920027072.do

http://www.mindviewinc.com/Books/Python3Patterns/Index.php

https://pypi.python.org/pypi
 
-- 
DaveA



-- 
https://mail.python.org/mailman/listinfo/python-list


extracting string.Template substitution placeholders

2014-01-12 Thread Eric S. Johansson
As part of speech recognition accessibility tools that I'm building, I'm 
using string.Template. In order to construct on-the-fly grammar, I need 
to know all of the identifiers before the template is filled in. what is 
the best way to do this?


can string.Template handle recursive expansion i.e. an identifier 
contains a template.


Thanks
--- eric
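A possible approach to both questions, as a sketch only: string.Template exposes its compiled regular expression as the `pattern` attribute, whose `named` and `braced` groups capture the identifiers. The helper names placeholders() and expand() are invented here. Template does not expand recursively on its own, so expand() simply repeats safe_substitute() until the text stops changing. Python 3.11 also added Template.get_identifiers() for the first task.

```python
from string import Template


def placeholders(tmpl):
    """Return the identifiers used in a Template, in order of first appearance."""
    names = []
    for match in tmpl.pattern.finditer(tmpl.template):
        name = match.group("named") or match.group("braced")
        if name and name not in names:
            names.append(name)
    return names


def expand(text, mapping, max_depth=10):
    """Substitute repeatedly so that values may themselves contain placeholders."""
    for _ in range(max_depth):
        new_text = Template(text).safe_substitute(mapping)
        if new_text == text:
            return new_text
        text = new_text
    raise ValueError("placeholders still changing after max_depth passes")


print(placeholders(Template("$who likes ${what}, $$5 off, says $who")))
print(expand("$greeting", {"greeting": "hello $name", "name": "world"}))
```

One caveat with the repeated-pass approach: an escaped `$$` becomes a plain `$` after the first pass and may be read as a placeholder on the next, so templates that rely on escapes would need extra care.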
--
https://mail.python.org/mailman/listinfo/python-list


Re: Python example source code

2014-01-12 Thread ngangsia akumbo
On Sunday, January 12, 2014 4:06:13 PM UTC+1, Dave Angel wrote:
> ngangsia akumbo  Wrote in message:
> 


Thanks bro

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python: 404 Error when trying to login a webpage by using 'urllib' and 'HTTPCookieProcessor'

2014-01-12 Thread xDog Walker
On Sunday 2014 January 12 04:42, Chris Angelico wrote:
> As a last resort, try firing up Wireshark or something and watch
> exactly what gets sent and received. I went looking through the docs
> for a "verbose" mode or a "debug" setting but can't find one - that'd
> be ideal if it exists, though.

I think you can set debug on httplib before using urllib to get the header 
traffic printed. I don't recall exactly how to do it though.
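A sketch of one way to do this, shown with the Python 3 module names: the HTTP and HTTPS handlers accept a debuglevel argument, which is passed down to the underlying connection so that request and response headers are printed. Python 2's urllib2.HTTPHandler and urllib2.HTTPSHandler take the same argument, as far as I know. The example URL in the comment is a placeholder and no request is made here.

```python
import urllib.request

# Handlers created with debuglevel=1 print the header traffic to stdout.
debug_opener = urllib.request.build_opener(
    urllib.request.HTTPHandler(debuglevel=1),
    urllib.request.HTTPSHandler(debuglevel=1),
)

# debug_opener.open("https://www.example.com/")  # would show the headers exchanged
```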

-- 
Yonder nor sorghum stenches shut ladle gulls stopper torque wet 
strainers.

-- 
https://mail.python.org/mailman/listinfo/python-list


Problem writing some strings (UnicodeEncodeError)

2014-01-12 Thread Paulo da Silva
Hi!

I am using a python3 script to produce a bash script from lots of
filenames got using os.walk.

I have a template string for each bash command in which I replace a
special string with the filename and then write the command to the bash
script file.

Something like this:

shf=open(bashfilename,'w')
filenames=getfilenames() # uses os.walk
for fn in filenames:
    ...
    cmd=templ.replace("",fn)
    shf.write(cmd)

For certain filenames I got a UnicodeEncodeError exception at
shf.write(cmd)!
I use utf-8 and have # -*- coding: utf-8 -*- in the source .py.

How can I fix this?

Thanks for any help/comments.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Problem writing some strings (UnicodeEncodeError)

2014-01-12 Thread Albert-Jan Roskam

On Sun, 1/12/14, Paulo da Silva  wrote:

 Subject: Problem writing some strings (UnicodeEncodeError)
 To: [email protected]
 Date: Sunday, January 12, 2014, 4:36 PM
 
 Hi!
 
 I am using a python3 script to produce a bash script from
 lots of
 filenames got using os.walk.
 
 I have a template string for each bash command in which I
 replace a
 special string with the filename and then write the command
 to the bash
 script file.
 
 Something like this:
 
 shf=open(bashfilename,'w')
 filenames=getfilenames() # uses os.walk
 for fn in filenames:
     ...
     cmd=templ.replace("",fn)
     shf.write(cmd)
 
 For certain filenames I got a UnicodeEncodeError exception
 at
 shf.write(cmd)!
 I use utf-8 and have # -*- coding: utf-8 -*- in the source
 .py.
 
 How can I fix this?
 
 Thanks for any help/comments.
 

==> What is the output of locale.getpreferredencoding(False)? That is the 
default value of the "encoding" parameter of the open function.
shf=open(bashfilename,'w', encoding='utf-8') might work, though on my Linux 
machine locale.getpreferredencoding(False) returns utf-8.
help(open)
...
   In text mode, if encoding is not specified the encoding used is platform
dependent: locale.getpreferredencoding(False) is called to get the
current locale encoding. (For reading and writing raw bytes use binary
mode and leave encoding unspecified.)
...
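A small sketch of the suggestion, assuming Python 3: print what the locale reports, then open the output file with an explicit encoding so the result no longer depends on the platform default. The file name and the written command are made up for the demonstration.

```python
import locale
import os
import tempfile

# What open() would use if no encoding were given.
print(locale.getpreferredencoding(False))

path = os.path.join(tempfile.mkdtemp(), "demo.sh")   # throwaway demo file
with open(path, "w", encoding="utf-8", newline="\n") as shf:
    shf.write("echo 'Stra\u00dfe'\n")                # \u00df is the letter ß

with open(path, "rb") as f:
    raw = f.read()

print(len(raw))   # ß takes two bytes in UTF-8
```

An explicit encoding does not help when the text contains lone surrogates from undecodable file names; that case needs the bytes or surrogateescape route.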


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Problem writing some strings (UnicodeEncodeError)

2014-01-12 Thread Peter Otten
Paulo da Silva wrote:

> I am using a python3 script to produce a bash script from lots of
> filenames got using os.walk.
> 
> I have a template string for each bash command in which I replace a
> special string with the filename and then write the command to the bash
> script file.
> 
> Something like this:
> 
> shf=open(bashfilename,'w')
> filenames=getfilenames() # uses os.walk
> for fn in filenames:
>     ...
>     cmd=templ.replace("",fn)
>     shf.write(cmd)
> 
> For certain filenames I got a UnicodeEncodeError exception at
> shf.write(cmd)!
> I use utf-8 and have # -*- coding: utf-8 -*- in the source .py.
> 
> How can I fix this?
> 
> Thanks for any help/comments.

You make it harder to debug your problem by not giving the complete 
traceback. If the error message contains 'surrogates not allowed' like in 
the demo below

>>> with open("tmp.txt", "w") as f:
...     f.write("\udcef")
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcef' in 
position 0: surrogates not allowed

you have filenames that are not valid UTF-8 on your harddisk. 

A possible fix would be to use bytes instead of str. For that you need to 
open `bashfilename` in binary mode ("wb") and pass bytes to the os.walk() 
call. 

Or you just go and fix the offending names.
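The bytes approach might look roughly like this. It is a sketch under assumptions: write_commands() is a hypothetical stand-in for the poster's loop, `<FILE>` is an invented placeholder (the real one did not survive in the post), and the demonstration runs in throwaway temporary directories.

```python
import os
import tempfile


def write_commands(root, template, out_path):
    """Write one command per file found under root; everything stays bytes."""
    with open(out_path, "wb") as shf:
        # A bytes argument makes os.walk() yield bytes paths, untouched by any codec.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                shf.write(template.replace(b"<FILE>", path))


# Small demonstration.
demo_root = os.fsencode(tempfile.mkdtemp())
open(os.path.join(demo_root, b"a.txt"), "wb").close()
script = os.path.join(tempfile.mkdtemp(), "commands.sh")
write_commands(demo_root, b"ls -l <FILE>\n", script)

with open(script, "rb") as f:
    written = f.read()
print(written.endswith(b"a.txt\n"))   # True
```

The sketch does no shell quoting, so names containing spaces or shell metacharacters would still need escaping before the script is usable.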


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python example source code

2014-01-12 Thread Joel Goldstick
On Sun, Jan 12, 2014 at 10:13 AM, ngangsia akumbo  wrote:
>
> On Sunday, January 12, 2014 4:06:13 PM UTC+1, Dave Angel wrote:
> > ngangsia akumbo  Wrote in message:
> >
>
>
> Thanks bro
>
> --
> https://mail.python.org/mailman/listinfo/python-list


Don't forget Python Module of the Week: pymotw.com/




-- 
Joel Goldstick
http://joelgoldstick.com
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: python first project

2014-01-12 Thread Emile van Sebille

On 01/11/2014 09:14 PM, ngangsia akumbo wrote:
> From all indication it is a very huge project.

Yep -- I built such a system in the late 70's with a team of seven over 
two-three years.  Then modifications and improvements continued over the 
next 20 years keeping about 2-4 programmers busy full time.

> How much do you thing all this will cost if we were to put the system all 
> complete.

A lot.  In today's dollars a million or two to do it right at a 
minimalist level.  Going for the gold will be much more.


IMHO you'd be better off researching the existing software market for an 
application suite that 'best fits' their needs and allows for 
customization to fine tune things.


I'm now working with OpenERP which is python based and is OSS with a 
subscription model to ensure an upgrade path.  It already has most of 
what you're looking for built in or available as third party addons and 
is of a quality that you couldn't hope to attain in years of effort. 
Which reflects the millions they've invested.


see http://www.openerp.com for more.

For an example of a commercially available entry level alternative costs 
check out:


http://www.erpsoftwareblog.com/2012/10/microsoft-dynamics-gp-2013-pricing-and-costs/

Overall a much better choice than starting from scratch.

That said, it wouldn't surprise me if the CEO has already looked 
into alternatives and been put off by the costs involved.  (S)he is 
trying to cheap their way through things by deluding themselves into an 
its-not-that-big-a-problem way of thinking, and I wouldn't involve 
myself in that train wreck.


Call me a sceptic -- it's true.  :)

HTH,

Emile




--
https://mail.python.org/mailman/listinfo/python-list


Re: python first project

2014-01-12 Thread ngangsia akumbo
On Sunday, January 12, 2014 5:37:41 PM UTC+1, Emile van Sebille wrote:
> On 01/11/2014 09:14 PM, ngangsia akumbo wrote:
>
> For an example of a commercially available entry level alternative costs
> check out:
>
> That said, it wouldn't surprise me that the CEO hasn't already looked
> into alternatives and been put off by the costs involved.  (S)he is
> trying to cheap their way through things by deluding themselves into a
> its-not-that-big-a-problem way of thinking that I wouldn't involve
> myself in that train wreck.
>
> Call me a sceptic -- it's true.  :)


HAHAHAHAH, LOL THAT IS TRUE YOU SPOKE LIKE A MAGICIAN. 

WHEN I START PUTTING THE CODE UP FOR STOCK/BOOKKEEPING
I WILL NEED YOUR ASSISTANCE.

THANKS
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python example source code

2014-01-12 Thread Emile van Sebille

On 01/12/2014 06:37 AM, ngangsia akumbo wrote:
> where can i find example source code by topic?


I'd recommend http://effbot.org/librarybook/  even though it's v2 
specific and somewhat dated.


Emile



--
https://mail.python.org/mailman/listinfo/python-list


Re: Problem writing some strings (UnicodeEncodeError)

2014-01-12 Thread Emile van Sebille

On 01/12/2014 07:36 AM, Paulo da Silva wrote:
> Hi!
>
> I am using a python3 script to produce a bash script from lots of
> filenames got using os.walk.
>
> I have a template string for each bash command in which I replace a
> special string with the filename and then write the command to the bash
> script file.
>
> Something like this:
>
> shf=open(bashfilename,'w')
> filenames=getfilenames() # uses os.walk
> for fn in filenames:
>     ...
>     cmd=templ.replace("",fn)
>     shf.write(cmd)
>
> For certain filenames I got a UnicodeEncodeError exception at
> shf.write(cmd)!
> I use utf-8 and have # -*- coding: utf-8 -*- in the source .py.
>
> How can I fix this?


Not sure exactly, but I'd try


shf=open(bashfilename,'wb')

as a start.

HTH,

Emile


--
https://mail.python.org/mailman/listinfo/python-list


Re: parametized unittest

2014-01-12 Thread CraftyTech
On Saturday, January 11, 2014 11:34:30 PM UTC-5, Roy Smith wrote:
> In article ,
>  "W. Trevor King"  wrote:
>
> > On Sat, Jan 11, 2014 at 08:00:05PM -0800, CraftyTech wrote:
> > > I'm finding it hard to use unittest in a for loop.  Perhaps something
> > > like:
> > >
> > > for val in range(25):
> > >   self.assertEqual(val,5,"not equal")
> > >
> > > The loop will break after the first failure.  Anyone have a good
> > > approach for this?  please advise.
> >
> > If Python 3.4 is an option, you can stick to the standard library and
> > use subtests [1].
>
> Or, as yet another alternative, if you use nose, you can write test
> generators.
>
> https://nose.readthedocs.org/en/latest/writing_tests.html#test-generators

Thank you all for the feedback.  I now have what I need.  Cheers 

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python example source code

2014-01-12 Thread ngangsia akumbo
On Sunday, January 12, 2014 5:38:03 PM UTC+1, Joel Goldstick wrote:
> On Sun, Jan 12, 2014 at 10:13 AM, ngangsia akumbo  wrote:
> 

> Don't forget Python Module of the Week  pymotw.com/


Thanks

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python example source code

2014-01-12 Thread ngangsia akumbo
On Sunday, January 12, 2014 5:52:19 PM UTC+1, Emile van Sebille wrote:
> On 01/12/2014 06:37 AM, ngangsia akumbo wrote:
> 

> I'd recommend http://effbot.org/librarybook/  even though it's v2 
> specific and somewhat dated.

Thank very much , it is very nice
-- 
https://mail.python.org/mailman/listinfo/python-list


Data peeping function?

2014-01-12 Thread Thor Whalen
The first thing I do once I import new data (as a pandas dataframe) is to 
.head() it, .describe() it, and then kick around a few specific stats according 
to what I see.

But I'm not satisfied with .describe(). Amongst others, non-numerical columns 
are ignored, and off-the-shelf stats will be computed for any numerical column.

I've been shopping around for a "data peeping" function that would:

(1) Have a hands-off mode where simply typing
   diagnose_this(data)
the function would figure things out on its own, and notify me when in doubt. 
For example, it would assume that any string data with not too many unique values 
should be considered categorical, and compute appropriate statistics.

(2) Perform standard diagnoses and print them out. For example, (a) missing 
values? (b) heterogeneously formatted data? (c) columns with only one unique 
value? etc.

(3) Be parametrizable, if I so choose.

Does anyone know of such a function?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python example source code

2014-01-12 Thread memilanuk

On 01/12/2014 06:37 AM, ngangsia akumbo wrote:
> where can i find example source code by topic?
> Any help please



nullege.com is usually helpful...

--
https://mail.python.org/mailman/listinfo/python-list


Re: Problem writing some strings (UnicodeEncodeError)

2014-01-12 Thread Paulo da Silva
Em 12-01-2014 16:23, Peter Otten escreveu:
> Paulo da Silva wrote:
> 
>> I am using a python3 script to produce a bash script from lots of
>> filenames got using os.walk.
>>
>> I have a template string for each bash command in which I replace a
>> special string with the filename and then write the command to the bash
>> script file.
>>
>> Something like this:
>>
>> shf=open(bashfilename,'w')
>> filenames=getfilenames() # uses os.walk
>> for fn in filenames:
>>     ...
>>     cmd=templ.replace("",fn)
>>     shf.write(cmd)
>>
>> For certain filenames I got a UnicodeEncodeError exception at
>> shf.write(cmd)!
>> I use utf-8 and have # -*- coding: utf-8 -*- in the source .py.
>>
>> How can I fix this?
>>
>> Thanks for any help/comments.
> 
> You make it harder to debug your problem by not giving the complete 
> traceback. If the error message contains 'surrogates not allowed' like in 
> the demo below
> 
> >>> with open("tmp.txt", "w") as f:
> ...     f.write("\udcef")
> ... 
> Traceback (most recent call last):
>   File "<stdin>", line 2, in <module>
> UnicodeEncodeError: 'utf-8' codec can't encode character '\udcef' in 
> position 0: surrogates not allowed

That is the situation. I just lost it and it would take a few hours to
reproduce the situation. Sorry.


> 
> you have filenames that are not valid UTF-8 on your harddisk. 
> 
> A possible fix would be to use bytes instead of str. For that you need to 
> open `bashfilename` in binary mode ("wb") and pass bytes to the os.walk() 
> call. 
This is my first time with python3, so I am confused!

As far as I can tell, os.walk is returning the
filenames exactly as they are on disk. Just bytes like in C.

My template is a string. What is the result of the replace command? Is
there any change in the filename from os.walk contents?

Now, if the result of the replace has the replaced filename unchanged
how do I "convert" it to bytes type, without changing its contents, so
that I can write to the bashfile opened with "wb"?


> 
> Or you just go and fix the offending names.
This is impossible in my case.
I need a bash script with the names as they are on disk.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-12 Thread MRAB

On 2014-01-12 08:31, Peter Otten wrote:
> [email protected] wrote:
>
>> >>> sys.version
>> 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]
>> >>> s = 'Straße'
>> >>> assert len(s) == 6
>> >>> assert s[5] == 'e'
>>
>> jmf
>
> Signifying nothing. (Macbeth)
>
> Python 2.7.2+ (default, Jul 20 2012, 22:15:08)
> [GCC 4.6.1] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> s = "Straße"
> >>> assert len(s) == 6
> Traceback (most recent call last):
>   File "", line 1, in 
> AssertionError
> >>> assert s[5] == "e"
> Traceback (most recent call last):
>   File "", line 1, in 
> AssertionError

The point is that in Python 2 'Straße' is a bytestring and its length
depends on the encoding of the source file. If the source file is UTF-8
then 'Straße' is a string literal with 7 bytes between the single
quotes.
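
The same byte counts can be checked from Python 3 by encoding the text
explicitly. This is only a small sketch (the variable names are made up
for illustration); it shows why the two assertions pass with a Latin-1
source file and fail with a UTF-8 one:

```python
# "Stra\u00dfe" is 'Straße'; the escape keeps this snippet ASCII-only.
word = "Stra\u00dfe"

utf8_bytes = word.encode("utf-8")      # what a UTF-8 source file contains
latin1_bytes = word.encode("latin-1")  # what a Latin-1 source file contains

# In UTF-8 the sharp s takes two bytes, so the byte string has 7 bytes
# and index 5 is the second byte of the sharp s, not "e".
print(len(utf8_bytes), utf8_bytes[5:6])      # 7 b'\x9f'

# In Latin-1 every character is one byte, so both assertions hold.
print(len(latin1_bytes), latin1_bytes[5:6])  # 6 b'e'
```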
--
https://mail.python.org/mailman/listinfo/python-list


Re: python first project

2014-01-12 Thread MRAB

On 2014-01-12 06:04, Chris Angelico wrote:
> On Sun, Jan 12, 2014 at 4:14 PM, ngangsia akumbo wrote:
>> What options do you think i can give the Ceo. Because from what you
>> have outline, i think i will like to follow your advice.
>>
>> If it is just some recording data stuff then some spreadsheet can
>> do the work.
>>
>> From all indication it is a very huge project.
>>
>> How much do you thing all this will cost if we were to put the
>> system all complete.
>
> If you currently do all your bills and things on paper, then this
> job is going to be extremely daunting. Even if you don't write a
> single line of code (ie you buy a ready-made system), you're going to
> have to convert everybody to doing things the new way. In that case,
> I would recommend getting some people together to discuss exactly
> what you need to do, and then purchase an accounting, warehousing, or
> inventory management system, based on what you actually need it to
> do.
>
> On the other hand, if it's already being done electronically, your
> job is IMMENSELY easier. Easier, but more complex to describe,
> because what you're really asking for is a program that will get
> certain data out of your accounting/inventory management system and
> display it. The difficulty of that job depends entirely on what
> you're using for that data entry.


You should also consider whether you need to do it all at once or could
do it incrementally. Look at what functionality you might want and where
you might get the greatest benefit and start there. Doing it that way
will reduce the chances of you committing a lot of resources (time and
money) building a system, only to find at the end that you either left
something out or added something that you didn't really need after all.
--
https://mail.python.org/mailman/listinfo/python-list


Re: Problem writing some strings (UnicodeEncodeError)

2014-01-12 Thread Peter Otten
Paulo da Silva wrote:

> Em 12-01-2014 16:23, Peter Otten escreveu:
>> Paulo da Silva wrote:
>> 
>>> I am using a python3 script to produce a bash script from lots of
>>> filenames got using os.walk.
>>>
>>> I have a template string for each bash command in which I replace a
>>> special string with the filename and then write the command to the bash
>>> script file.
>>>
>>> Something like this:
>>>
>>> shf=open(bashfilename,'w')
>>> filenames=getfilenames() # uses os.walk
>>> for fn in filenames:
>>> ...
>>> cmd=templ.replace("",fn)
>>> shf.write(cmd)
>>>
>>> For certain filenames I got a UnicodeEncodeError exception at
>>> shf.write(cmd)!
>>> I use utf-8 and have # -*- coding: utf-8 -*- in the source .py.
>>>
>>> How can I fix this?
>>>
>>> Thanks for any help/comments.
>> 
>> You make it harder to debug your problem by not giving the complete
>> traceback. If the error message contains 'surrogates not allowed' like in
>> the demo below
>> 
>> >>> with open("tmp.txt", "w") as f:
>> ...     f.write("\udcef")
>> ...
>> Traceback (most recent call last):
>>   File "", line 2, in 
>> UnicodeEncodeError: 'utf-8' codec can't encode character '\udcef' in
>> position 0: surrogates not allowed
> 
> That is the situation. I just lost it and it would take a few houres to
> repeat the situation. Sorry.
> 
> 
>> 
>> you have filenames that are not valid UTF-8 on your harddisk.
>> 
>> A possible fix would be to use bytes instead of str. For that you need to
>> open `bashfilename` in binary mode ("wb") and pass bytes to the os.walk()
>> call.
> This is my 1st time with python3, so I am confused!
> 
> As much I could understand it seems that os.walk is returning the
> filenames exactly as they are on disk. Just bytes like in C.

No, they are decoded with the preferred encoding. With UTF-8 that can fail, 
and if it does the surrogateescape error handler replaces the offending 
bytes with special codepoints:

>>> import os
>>> with open(b"\xe4\xf6\xfc", "w") as f: f.write("whatever")
... 
8
>>> os.listdir()
['\udce4\udcf6\udcfc']

You can bypass the decoding process by providing a bytes argument to 
os.listdir() (or os.walk() which uses os.listdir() internally):

>>> os.listdir(b".")
[b'\xe4\xf6\xfc']

To write these raw bytes into a file the file has of course to be binary, 
too.
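
The round trip can also be seen without touching the disk. The snippet
below is a sketch only: the byte string is the one from the session
above, while the placeholder text "FILE" and the variable names are
invented for the example.

```python
# Raw filename bytes that are not valid UTF-8 (same as in the session above).
raw_name = b"\xe4\xf6\xfc"

# Decoding with surrogateescape maps each undecodable byte to a lone
# surrogate code point instead of raising an error.
decoded = raw_name.decode("utf-8", errors="surrogateescape")

# str.replace() leaves those code points untouched.
# "FILE" is a hypothetical placeholder, not the one from the original script.
cmd = 'ls "FILE"'.replace("FILE", decoded)

# Encoding with the same handler restores the original bytes exactly.
cmd_bytes = cmd.encode("utf-8", errors="surrogateescape")

# A strict encode is what fails with "surrogates not allowed".
try:
    decoded.encode("utf-8")
    strict_error = None
except UnicodeEncodeError as exc:
    strict_error = exc.reason

print(repr(decoded), cmd_bytes, strict_error)
```

On POSIX systems os.fsencode() and os.fsdecode() apply this kind of
conversion using the filesystem encoding.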

> My template is a string. What is the result of the replace command? Is
> there any change in the filename from os.walk contents?
> 
> Now, if the result of the replace has the replaced filename unchanged
> how do I "convert" it to bytes type, without changing its contents, so
> that I can write to the bashfile opened with "wb"?
> 
> 
>> 
>> Or you just go and fix the offending names.
> This is impossible in my case.
> I need a bash script with the names as they are on disk.

I think instead of the hard way sketched out above it will be sufficient to 
specify the error handler when opening the destination file

shf = open(bashfilename, 'w', errors="surrogateescape")

but I have not tried it myself. Also, some bytes may need to be escaped, 
either to be understood by the shell, or to address security concerns:

>>> import os
>>> template = "ls "
>>> for filename in os.listdir():
...     print(template.replace("", filename))
... 
ls foo; rm bar
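
One way to combine both points, given here as an untested-in-production
sketch: open the script with errors="surrogateescape" and quote each
name with shlex.quote() from the standard library (Python 3.3+). The
function name write_commands, the file name demo.sh and the "ls" command
are made up for the example.

```python
import os
import shlex
import tempfile

def write_commands(script_path, filenames, command="ls"):
    """Write one shell command per filename, quoting each name."""
    # surrogateescape turns the escaped code points back into the raw
    # bytes found on disk; newline="\n" keeps line endings predictable.
    with open(script_path, "w", encoding="utf-8",
              errors="surrogateescape", newline="\n") as shf:
        for name in filenames:
            # shlex.quote wraps unsafe names in single quotes.
            shf.write("{} {}\n".format(command, shlex.quote(name)))

names = [
    "foo; rm bar",                                       # shell metacharacters
    b"\xe4\xf6\xfc".decode("utf-8", "surrogateescape"),  # not valid UTF-8
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.sh")
    write_commands(path, names)
    with open(path, "rb") as f:
        script_bytes = f.read()

print(script_bytes)
```

With the quoting in place, "foo; rm bar" is passed to ls as a single
argument instead of being run as two commands.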


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Open Question - I'm a complete novice in programming so please bear with me...Is python equivalent to C, C++ and java combined?

2014-01-12 Thread Grant Edwards
On 2014-01-11, pintreo mardi  wrote:

> Hi, I've just begun to learn programming, I have an open question for
> the group: Is the Python language an all in one computer language
> which could replace C, C++, Java etc..

No.  Python cannot replace C in a number of application areas:

 * Bare-metal applications without an OS.

 * Low-resource applications with limited memory (like a few KB).

 * Device drivers and kernel modules for OSes like Linux, Unix, (and,
   AFAIK, Windows).

 * Computationally intensive applications where there isn't a library
   available written in C or Fortran to do the heavy lifting.

For general application programming on a server or PC, then Python can
replace many/most uses of C/C++/Java.


-- 
Grant Edwards               grant.b.edwards        Yow! Look into my eyes and
                                  at               try to forget that you have
                              gmail.com            a Macy's charge card!
-- 
https://mail.python.org/mailman/listinfo/python-list


efficient way to process data

2014-01-12 Thread Larry Martell
I have a Python app that queries a MySQL DB. The query has this form:

SELECT a, b, c, d, AVG(e), STD(e), CONCAT(x, ',', y) as f
FROM t
GROUP BY a, b, c, d, f

x and y are numbers (378.18, 2213.797 or 378.218, 2213.949 or
10053.490, 2542.094).

The business issue is that if either x or y in 2 rows that are in the
same a, b, c, d group are within 1 of each other then they should be
grouped together. And to make it more complicated, the tolerance is
applied as a rolling continuum. For example, if the x and y in a set
of grouped rows are:

row 1: 1.5, 9.5
row 2: 2.4, 20.8
row 3: 3.3, 40.6
row 4: 4.2, 2.5
row 5: 5.1, 10.1
row 6: 6.0, 7.9
row 7: 8.0, 21.0
row 8: 100, 200

1 through 6 get combined because all their X values are within the
tolerance of some other X in the set that's been combined. 7's Y value
is within the tolerance of 2's Y, so that should be combined as well.
8 is not combined because neither its X nor its Y value is within the
tolerance of any X or Y in the set that was combined.

AFAIK, there is no way to do this in SQL. In Python I can easily parse
the data and identify the rows that need to be combined, but then I've
lost the ability to calculate the average and std across the combined
data set. The only way I can think of to do this is to remove the
grouping from the SQL and do all the grouping and aggregating myself.
But this query often returns 20k to 30k rows after grouping. It could
easily be 80k to 100k rows or more that I have to process if I remove
the grouping and I think that will end up being very slow.
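
For reference, a minimal sketch of how the grouping and aggregation
could be done in Python once the ungrouped rows for one a, b, c, d
partition are in memory. It sorts by x and by y and links neighbours
within the tolerance using a union-find structure, so the cost is
dominated by two sorts rather than by comparing every pair of rows. The
e values below are invented purely for illustration; the x and y values
are the eight rows listed above. pstdev is used on the assumption that
MySQL's STD() is the population standard deviation.

```python
from statistics import mean, pstdev

def group_rows(rows, tol=1.0):
    """rows: list of (x, y, e). Returns lists of row indices per group."""
    parent = list(range(len(rows)))

    def find(i):
        # Follow parent links to the group representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    for col in (0, 1):  # 0 = x, 1 = y
        order = sorted(range(len(rows)), key=lambda i: rows[i][col])
        # Linking sorted neighbours is enough: any two values within tol
        # of each other are connected through the values between them.
        for a, b in zip(order, order[1:]):
            if rows[b][col] - rows[a][col] <= tol:
                union(a, b)

    groups = {}
    for i in range(len(rows)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

# x, y from the example above; the third column (e) is hypothetical.
rows = [
    (1.5, 9.5, 10.0), (2.4, 20.8, 12.0), (3.3, 40.6, 11.0),
    (4.2, 2.5, 13.0), (5.1, 10.1, 9.0), (6.0, 7.9, 14.0),
    (8.0, 21.0, 8.0), (100.0, 200.0, 20.0),
]

groups = group_rows(rows)
stats = [(mean(rows[i][2] for i in g), pstdev([rows[i][2] for i in g]))
         for g in groups]
print(groups)  # [[0, 1, 2, 3, 4, 5, 6], [7]]
print(stats)
```

Rows 1 to 7 (indices 0 to 6) end up in one group and row 8 on its own,
which matches the description above.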

Anyone have any ideas how I can efficiently do this?

Thanks!
-larry
-- 
https://mail.python.org/mailman/listinfo/python-list


python query on firebug extention

2014-01-12 Thread JAI PRAKASH SINGH
Hello,

I am working with the selenium module in Python. I know how to load the
Firebug extension with selenium, but I want to know how to use the
Firebug extension with the requests module or with mechanize. I have
searched a lot but have been unable to find anything. Please help.

I want a technique similar to this:

from selenium import webdriver

fp = webdriver.FirefoxProfile()

fp.add_extension(extension='firebug-.8.4.xpi')
fp.set_preference("extensions.firebug.currentVersion", "1.8.4")
browser = webdriver.Firefox(firefox_profile=fp)


but for the requests module or the mechanize module.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Problem writing some strings (UnicodeEncodeError)

2014-01-12 Thread Paulo da Silva
> 
> I think instead of the hard way sketched out above it will be sufficient to 
> specify the error handler when opening the destination file
> 
> shf = open(bashfilename, 'w', errors="surrogateescape")
This seems to fix everything!
I tried with a small test set and it worked.

> 
> but I have not tried it myself. Also, some bytes may need to be escaped, 
> either to be understood by the shell, or to address security concerns:
> 

Since I am putting the file names between "", the only char that needs to
be escaped is the " itself.

I'm gonna try with the real thing.

Thank you very much for the fix and for everything I have learned here.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: efficient way to process data

2014-01-12 Thread Petite Abeille

On Jan 12, 2014, at 8:23 PM, Larry Martell  wrote:

> AFAIK, there is no way to do this in SQL.

Sounds like a job for window functions (aka analytic functions) [1][2].

[1] http://www.postgresql.org/docs/9.3/static/tutorial-window.html
[2] 
http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions004.htm#SQLRF06174
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Problem writing some strings (UnicodeEncodeError)

2014-01-12 Thread Peter Otten
Paulo da Silva wrote:

>> but I have not tried it myself. Also, some bytes may need to be escaped,
>> either to be understood by the shell, or to address security concerns:
>>
> 
> Since I am puting the file names between "", the only char that needs to
> be escaped is the " itself.

What about the escape char?

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python: 404 Error when trying to login a webpage by using 'urllib' and 'HTTPCookieProcessor'

2014-01-12 Thread Terry Reedy

On 1/12/2014 7:17 AM, KMeans Algorithm wrote:

> But I get a "404" error (Not Found). The page
> "https://www.mysite.com/loginpage" does exist

Firefox tells me the same thing. If that is a phony address, you should
have said so.



--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: efficient way to process data

2014-01-12 Thread Chris Angelico
On Mon, Jan 13, 2014 at 6:53 AM, Petite Abeille
 wrote:
> On Jan 12, 2014, at 8:23 PM, Larry Martell  wrote:
>
>> AFAIK, there is no way to do this in SQL.
>
> Sounds like a job for window functions (aka analytic functions) [1][2].

That's my thought too. I don't think MySQL has them, though, so it's
either going to have to be done in Python, or the database back-end
will need to change. Hard to say which would be harder.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python example source code

2014-01-12 Thread Denis McMahon
On Sun, 12 Jan 2014 06:37:18 -0800, ngangsia akumbo wrote:

> where can i find example source code by topic?
> Any help please

You don't want to be looking at source code yet, you want to be talking 
to the users of the system you're trying to design to find out what their 
requirements are.

-- 
Denis McMahon, [email protected]
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: efficient way to process data

2014-01-12 Thread Chris Angelico
On Mon, Jan 13, 2014 at 6:23 AM, Larry Martell  wrote:
> I have an python app that queries a MySQL DB. The query has this form:
>
> SELECT a, b, c, d, AVG(e), STD(e), CONCAT(x, ',', y) as f
> FROM t
> GROUP BY a, b, c, d, f
>
> x and y are numbers (378.18, 2213.797 or 378.218, 2213.949 or
> 10053.490, 2542.094).
>
> The business issue is that if either x or y in 2 rows that are in the
> same a, b, c, d group are within 1 of each other then they should be
> grouped together. And to make it more complicated, the tolerance is
> applied as a rolling continuum. For example, if the x and y in a set
> of grouped rows are:
>
> row 1: 1.5, 9.5
> row 2: 2.4, 20.8
> row 3: 3.3, 40.6
> row 4: 4.2, 2.5
> row 5: 5.1, 10.1
> row 6: 6.0, 7.9
> row 7: 8.0, 21.0
> row 8: 100, 200
>
> 1 through 6 get combined because all their X values are within the
> tolerance of some other X in the set that's been combined. 7's Y value
> is within the tolerance of 2's Y, so that should be combined as well.
> 8 is not combined because neither the X or Y value is within the
> tolerance of any X or Y in the set that was combined.

Trying to get my head around this a bit more. Are columns a/b/c/d
treated as a big category (eg type, brand, category, model), such that
nothing will ever be grouped that has any difference in those four
columns? If so, we can effectively ignore them and pretend we have a
table with exactly one set (eg stick a WHERE clause onto the query
that stipulates their values). Then what you have is this:

* Aggregate based on proximity of x and y
* Emit results derived from e

Is that correct?

So here's my way of writing it.

* Subselect: List all values for x, in order, and figure out which
ones are less than the previous value plus one
* Subselect: Ditto, for y.
* Outer select: Somehow do an either-or group. I'm not quite sure how
to do that part, actually!

A PGSQL window function would cover the two subselects - at least, I'm
fairly sure it would. I can't quite get the whole thing, though; I can
get a true/false flag that says whether it's near to the previous one
(that's easy), and creating a grouping column value should be possible
from that but I'm not sure how.
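
For what it's worth, the usual way to turn such a flag into a grouping
column is a running total of the "starts a new group" flags; in a
database with window functions that would roughly be LAG() to build the
flag and a running SUM() over it. Here is the same idea as a Python
sketch for a single sorted column (the function name island_ids is made
up, and the values are the x column from the example):

```python
from itertools import accumulate

def island_ids(sorted_values, tol=1.0):
    """Assign a group number to each value of an already sorted list."""
    if not sorted_values:
        return []
    # Flag is 1 when the gap to the previous value exceeds the tolerance,
    # i.e. when this value starts a new group; the first value gets 0.
    flags = [0] + [1 if b - a > tol else 0
                   for a, b in zip(sorted_values, sorted_values[1:])]
    # The running sum of the flags is the group number.
    return list(accumulate(flags))

xs = [1.5, 2.4, 3.3, 4.2, 5.1, 6.0, 8.0, 100.0]
print(island_ids(xs))  # [0, 0, 0, 0, 0, 0, 1, 2]
```

This only covers one column at a time; the either-or merge across x and
y still has to be done separately.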

But an either-or grouping is a bit trickier. The best I can think of
is to collect all the y values for each group of x values, and then if
any two groups 'overlap' (ie have points within 1.0 of each other),
merge the groups. That's going to be seriously tricky to do in SQL, I
think, so you may have to go back to Python on that one.

My analysis suggests that, whatever happens, you're going to need
every single y value somewhere. So it's probably not worth trying to
do any grouping/aggregation in SQL, since you need to further analyze
all the individual data points. I can't think of any way better than
just leafing through the whole table (either in Python or in a stored
procedure - if you can run your script on the same computer that's
running the database, I'd do that, otherwise consider a stored
procedure to reduce network transfers) and building up mappings.

Of course, "I can't think of a way" does not equate to "There is no
way". There may be some magic trick that I didn't think of, or some
arcane incantation that gets what you want. Who knows? If you can
produce an ASCII art Mandelbrot set [1] in pure SQL, why not this!

ChrisA

[1] http://wiki.postgresql.org/wiki/Mandelbrot_set
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Problem writing some strings (UnicodeEncodeError)

2014-01-12 Thread Paulo da Silva
Em 12-01-2014 20:29, Peter Otten escreveu:
> Paulo da Silva wrote:
> 
>>> but I have not tried it myself. Also, some bytes may need to be escaped,
>>> either to be understood by the shell, or to address security concerns:
>>>
>>
>> Since I am puting the file names between "", the only char that needs to
>> be escaped is the " itself.
> 
> What about the escape char?
> 
Just this: fn=fn.replace('"','\\"')

So far I didn't find any problem, but the script is still running.
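
A slightly more defensive variant, as a sketch only: inside bash double
quotes the backslash, the dollar sign and the backtick are special as
well as the double quote, so all four get a backslash in front. The
function name is invented for the example, and it does not try to
handle newlines in names or history expansion with "!" in interactive
shells.

```python
def escape_for_double_quotes(name):
    """Backslash-escape the characters that stay special inside "..."."""
    # The backslash must come first, otherwise the backslashes added for
    # the other characters would themselves be doubled.
    for ch in ('\\', '"', '$', '`'):
        name = name.replace(ch, '\\' + ch)
    return name

for sample in ['a"b', 'cost $5', 'back\\slash', '`id`']:
    print('"{}"'.format(escape_for_double_quotes(sample)))
```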

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: efficient way to process data

2014-01-12 Thread Larry Martell
On Sun, Jan 12, 2014 at 2:53 PM, Petite Abeille
 wrote:
>
> On Jan 12, 2014, at 8:23 PM, Larry Martell  wrote:
>
>> AFAIK, there is no way to do this in SQL.
>
> Sounds like a job for window functions (aka analytic functions) [1][2].
>
> [1] http://www.postgresql.org/docs/9.3/static/tutorial-window.html
> [2] 
> http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions004.htm#SQLRF06174

Unfortunately, MySQL does not support this.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: efficient way to process data

2014-01-12 Thread Larry Martell
On Sun, Jan 12, 2014 at 5:18 PM, Chris Angelico  wrote:
> On Mon, Jan 13, 2014 at 6:53 AM, Petite Abeille
>  wrote:
>> On Jan 12, 2014, at 8:23 PM, Larry Martell  wrote:
>>
>>> AFAIK, there is no way to do this in SQL.
>>
>> Sounds like a job for window functions (aka analytic functions) [1][2].
>
> That's my thought too. I don't think MySQL has them, though, so it's
> either going to have to be done in Python, or the database back-end
> will need to change. Hard to say which would be harder.

Changing the database is not feasible.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: efficient way to process data

2014-01-12 Thread Larry Martell
On Sun, Jan 12, 2014 at 5:43 PM, Dennis Lee Bieber
 wrote:
> On Sun, 12 Jan 2014 14:23:17 -0500, Larry Martell 
> declaimed the following:
>
>>I have an python app that queries a MySQL DB. The query has this form:
>>
>>SELECT a, b, c, d, AVG(e), STD(e), CONCAT(x, ',', y) as f
>>FROM t
>>GROUP BY a, b, c, d, f
>>
>>x and y are numbers (378.18, 2213.797 or 378.218, 2213.949 or
>>10053.490, 2542.094).
>>
>
> Decimal (Numeric) or floating/real. If the latter, the internal storage
> may not be exact (378.18 and 378.17999 may both "display" as
> 378.18, but will not match for grouping).

In the database they are decimal. They are being converted to char by
the CONCAT(x, ',', y).

>>The business issue is that if either x or y in 2 rows that are in the
>>same a, b, c, d group are within 1 of each other then they should be
>>grouped together. And to make it more complicated, the tolerance is
>>applied as a rolling continuum. For example, if the x and y in a set
>>of grouped rows are:
>>
> As I understand group by, it will first group by "a", WITHIN the "a"
> groups it will then group by "b"... Probably not a matter germane to the
> problem as you are concerning yourself with the STRING representation of
> "x" and "y" with a comma delimiter -- which is only looked at if the
> "a,b,c,d" are equal... Thing is, a string comparison is going to operate
> strictly left to right -- it won't even see your "y" value unless all the
> "x" value is equal.

Yes, that is correct. The original requirement was to group by (X, Y),
so the CONCAT(x, ',', y) was correct and working. Then the requirement
was changed to apply the tolerance as I described.

>
> You may need to operate using subselects... So that you can specify
> something like
>
> where   abs(s1.x -s2.x) < tolerance or abs(s1.y-s2.y) < tolerance
> and (s1.a = s2.a ... s1.d = s2.d)
>
> s1/s2 are the subselects (you may need a primary key <> primary key to
> avoid having it output a record where the two subselects are for the SAME
> record -- or maybe not, since you /do/ want that record also output). Going
> to be a costly query since you are basically doing
>
> foreach r1 in s1
> foreach r2 in s2
> emit r2 when...

Speed is an issue here, and while the current query performs well, in
my experience subqueries and self joins do not. I'm going to try to
do it all in Python and see how it performs. The other option is to
pre-process the data on the way into the database. Doing that will
eliminate some of the data partitioning, as all of the data that could
be joined will be in the same input file. I'm just not sure if it will
be OK to actually munge the data. I'll find that out tomorrow.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: efficient way to process data

2014-01-12 Thread Larry Martell
On Sun, Jan 12, 2014 at 6:27 PM, Chris Angelico  wrote:
> On Mon, Jan 13, 2014 at 6:23 AM, Larry Martell  
> wrote:
>> I have an python app that queries a MySQL DB. The query has this form:
>>
>> SELECT a, b, c, d, AVG(e), STD(e), CONCAT(x, ',', y) as f
>> FROM t
>> GROUP BY a, b, c, d, f
>>
>> x and y are numbers (378.18, 2213.797 or 378.218, 2213.949 or
>> 10053.490, 2542.094).
>>
>> The business issue is that if either x or y in 2 rows that are in the
>> same a, b, c, d group are within 1 of each other then they should be
>> grouped together. And to make it more complicated, the tolerance is
>> applied as a rolling continuum. For example, if the x and y in a set
>> of grouped rows are:
>>
>> row 1: 1.5, 9.5
>> row 2: 2.4, 20.8
>> row 3: 3.3, 40.6
>> row 4: 4.2, 2.5
>> row 5: 5.1, 10.1
>> row 6: 6.0, 7.9
>> row 7: 8.0, 21.0
>> row 8: 100, 200
>>
>> 1 through 6 get combined because all their X values are within the
>> tolerance of some other X in the set that's been combined. 7's Y value
>> is within the tolerance of 2's Y, so that should be combined as well.
>> 8 is not combined because neither the X or Y value is within the
>> tolerance of any X or Y in the set that was combined.
>
> Trying to get my head around this a bit more. Are columns a/b/c/d
> treated as a big category (eg type, brand, category, model), such that
> nothing will ever be grouped that has any difference in those four
> columns? If so, we can effectively ignore them and pretend we have a
> table with exactly one set (eg stick a WHERE clause onto the query
> that stipulates their values). Then what you have is this:
>
> * Aggregate based on proximity of x and y
> * Emit results derived from e
>
> Is that correct?

There will be multiple groups of a/b/c/d. I simplified the query for
the purposes of posting my question. There is a where clause with
values that come from user input. None, any, or all of a, b, c, or d
could be in the where clause.

> So here's my way of writing it.
>
> * Subselect: List all values for x, in order, and figure out which
> ones are less than the previous value plus one
> * Subselect: Ditto, for y.
> * Outer select: Somehow do an either-or group. I'm not quite sure how
> to do that part, actually!
>
> A PGSQL window function would cover the two subselects - at least, I'm
> fairly sure it would. I can't quite get the whole thing, though; I can
> get a true/false flag that says whether it's near to the previous one
> (that's easy), and creating a grouping column value should be possible
> from that but I'm not sure how.
>
> But an either-or grouping is a bit trickier. The best I can think of
> is to collect all the y values for each group of x values, and then if
> any two groups 'overlap' (ie have points within 1.0 of each other),
> merge the groups. That's going to be seriously tricky to do in SQL, I
> think, so you may have to go back to Python on that one.
>
> My analysis suggests that, whatever happens, you're going to need
> every single y value somewhere. So it's probably not worth trying to
> do any grouping/aggregation in SQL, since you need to further analyze
> all the individual data points. I can't think of any way better than
> just leafing through the whole table (either in Python or in a stored
> procedure - if you can run your script on the same computer that's
> running the database, I'd do that, otherwise consider a stored
> procedure to reduce network transfers) and building up mappings.
>
> Of course, "I can't think of a way" does not equate to "There is no
> way". There may be some magic trick that I didn't think of, or some
> arcane incantation that gets what you want. Who knows? If you can
> produce an ASCII art Mandelbrot set [1] in pure SQL, why not this!
>
> ChrisA
>
> [1] http://wiki.postgresql.org/wiki/Mandelbrot_set

Thanks for the reply. I'm going to take a stab at removing the group
by and doing it all in python. It doesn't look too hard, but I don't
know how it will perform.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: efficient way to process data

2014-01-12 Thread Chris Angelico
On Mon, Jan 13, 2014 at 2:35 PM, Larry Martell  wrote:
> Thanks for the reply. I'm going to take a stab at removing the group
> by and doing it all in python. It doesn't look too hard, but I don't
> know how it will perform.

Well, if you can't switch to PostgreSQL or such, then doing it in
Python is your only option. There are such things as GiST and GIN
indexes that might be able to do some of this magic, but I don't think
MySQL has anything even remotely like what you're looking for.

So ultimately, you're going to have to do your filtering on the
database, and then all the aggregation in Python. And it's going to be
somewhat complicated code, too. Best I can think of is this, as
partial pseudo-code:

last_x = -999
x_map = []; y_map = {}
merge_me = []
for x,y,e in (SELECT x,y,e FROM t WHERE whatever ORDER BY x):
    if x
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: extracting string.Template substitution placeholders

2014-01-12 Thread Steven D'Aprano
On Sun, 12 Jan 2014 10:08:31 -0500, Eric S. Johansson wrote:

> As part of speech recognition accessibility tools that I'm building, I'm
> using string.Template. In order to construct on-the-fly grammar, I need
> to know all of the identifiers before the template is filled in. what is
> the best way to do this?


py> import string
py> t = string.Template("$sub some $text $here")
py> t.template
'$sub some $text $here'

Now just walk the template for $ signs. Watch out for $$ which escapes 
the dollar sign. Here's a baby parser:

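def get_next(text, start=0):
    """Return (placeholder, end_index) for the next $name, or None."""
    while True:
        i = text.find("$", start)
        if i == -1:
            return None
        if text[i:i+2] == '$$':
            # Escaped dollar sign: skip both characters and keep looking.
            start = i + 2
            continue
        j = text.find(' ', i)
        if j == -1:
            j = len(text)
        assert i < j
        return (text[i:j], j)

start = 0
while start < len(t.template):
    result = get_next(t.template, start)
    if result is None:
        # No more placeholders in the rest of the template.
        break
    word, start = result
    print(word)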
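
An alternative that avoids hand-written parsing, offered as a sketch:
string.Template exposes the compiled regular expression it uses as the
class attribute "pattern", with named groups for escaped, named and
braced placeholders. The helper name placeholder_names is made up.
(Python 3.11 and later also provide Template.get_identifiers().)

```python
import string

def placeholder_names(template_text):
    """List the identifiers used in a string.Template source text."""
    names = []
    for match in string.Template.pattern.finditer(template_text):
        # "named" matches $name and "braced" matches ${name}; the
        # "escaped" ($$) and "invalid" groups are simply skipped.
        name = match.group("named") or match.group("braced")
        if name is not None:
            names.append(name)
    return names

names = placeholder_names("$sub some $text $here, $$5 and ${braced}")
print(names)  # ['sub', 'text', 'here', 'braced']
```

Unlike splitting on spaces, this stops at punctuation, so "$here,"
yields "here" rather than "$here,".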


> can string.Template handle recursive expansion i.e. an identifier
> contains a template.

If you mean, recursive expand the template until there's nothing left to 
substitute, then no, not directly. You would have to manually expand the 
template yourself.
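
A manual expansion could look something like the sketch below, which
simply re-applies safe_substitute() until the text stops changing. The
function name, the max_depth limit and the sample mapping are all
assumptions made for the example. One caveat: every pass also turns
"$$" into "$", so templates that rely on escaped dollar signs need
extra care.

```python
import string

def expand(template_text, mapping, max_depth=10):
    """Substitute repeatedly until nothing changes or max_depth is hit."""
    text = template_text
    for _ in range(max_depth):
        # safe_substitute leaves unknown placeholders in place.
        new_text = string.Template(text).safe_substitute(mapping)
        if new_text == text:
            return new_text
        text = new_text
    # Guards against mappings that refer to each other in a cycle.
    raise ValueError("template did not settle after %d passes" % max_depth)

mapping = {"greeting": "hello $name", "name": "world"}
result = expand("$greeting!", mapping)
print(result)  # hello world!
```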


-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list