need some kind of "coherence index" for a group of strings
Hi there, apologies for the generic question. Here is my problem: let's say that I have a list of lists of strings.
list1: # strings are sort of similar to one another
my_nice_string_blabla
my_nice_string_blqbli
my_nice_string_bl0bla
my_nice_string_aru
list2: # strings are mostly different from one another
my_nice_string_blabla
some_other_string
yet_another_unrelated string
wow_totally_different_from_others_too
I would like an algorithm that can look at the strings and determine that strings in list1 are sort of similar to one another, while the strings in list2 are all different. Ideally, it would be nice to have some kind of 'coherence index' that I can exploit to separate lists given a certain threshold.
I was about to concoct something using Levenshtein distance, but then I figured that it would be expensive to compute and I may be reinventing the wheel.
Thanks in advance to python masters that may have suggestions...
--
https://mail.python.org/mailman/listinfo/python-list
Re: need some kind of "coherence index" for a group of strings
On 11/3/2016 6:47 PM, [email protected] wrote:
On Thursday, November 3, 2016 at 1:09:48 PM UTC-7, Neil D. Cerutti wrote:
you may also be able to use some items "off the shelf" from Python's difflib.
I wasn't aware of that module, thanks for the tip!
difflib.SequenceMatcher.ratio() returns a numerical value which represents the "similarity" between two strings. I don't see a precise definition of "similar", but it may do what the OP needs.
I may end up rolling my own algo, but thanks for the tip, this does seem like useful stuff indeed
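One way the difflib suggestion could be turned into a 'coherence index' is to average SequenceMatcher.ratio() over every pair of strings in a list. This is only a sketch: the function name coherence_index and the choice of a plain mean are assumptions, not something proposed in the thread, and the number of pairs grows quadratically with the list size.

```python
from difflib import SequenceMatcher
from itertools import combinations


def coherence_index(strings):
    """Mean pairwise SequenceMatcher ratio; 1.0 means all strings are identical."""
    pairs = list(combinations(strings, 2))
    if not pairs:
        return 1.0  # zero or one string: treated as trivially coherent
    total = sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs)
    return total / len(pairs)


list1 = ["my_nice_string_blabla", "my_nice_string_blqbli",
         "my_nice_string_bl0bla", "my_nice_string_aru"]
list2 = ["my_nice_string_blabla", "some_other_string",
         "yet_another_unrelated string", "wow_totally_different_from_others_too"]

print(coherence_index(list1))  # noticeably higher than the value for list2
print(coherence_index(list2))
```

A threshold between the two printed values would then separate the lists; where to put it depends on the data.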
MemoryError and Pickle
Hi there, Python newbie here.
I am working with large files. For this reason I figured that I would
capture the large input into a list and serialize it with pickle for
later (faster) usage.
Everything has worked beautifully until today when the large data (1GB)
file caused a MemoryError :(
Question for experts: is there a way to refactor this so that data may
be filled/written/released as the scripts go and avoid the problem?
code below.
Thanks
import pickle
import sys
data = list()
for line in sys.stdin:
    try:
        parts = line.strip().split("\t")
        t = parts[0]
        w = parts[1]
        u = parts[2]
        # let's retain in-memory copy of data
        data.append({"ta": t,
                     "wa": w,
                     "ua": u
                     })
    except IndexError:
        print("Problem with line :"+line, file=sys.stderr)
# time to save data object into a pickle file
fileObject = open(filename,"wb")
pickle.dump(data,fileObject)
fileObject.close()
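A minimal sketch of one way to avoid holding everything in memory: pickle each record as it is read, so the file becomes a sequence of small pickles instead of one huge list. The helper names dump_records and load_records are hypothetical, and the reading side has to loop until EOFError, which changes the file format compared with the code above.

```python
import pickle


def dump_records(lines, fileobj):
    """Pickle one small dict per well-formed line instead of one giant list."""
    count = 0
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # malformed line; the original code reported these on stderr
        pickle.dump({"ta": parts[0], "wa": parts[1], "ua": parts[2]}, fileobj)
        count += 1
    return count


def load_records(fileobj):
    """Yield the records back one at a time until the file is exhausted."""
    while True:
        try:
            yield pickle.load(fileobj)
        except EOFError:
            return
```

Usage would be along the lines of dump_records(sys.stdin, open(filename, "wb")) when writing and a for loop over load_records(open(filename, "rb")) when reading. Memory stays low only as long as the consumer does not collect all records into a list again.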
Drowning in a teacup?
notorious pass by reference vs pass by value biting me in the backside here. Proceeding in order.
I need to scan a list of strings. If one of the elements matches the beginning of a search keyword, that element needs to snap to the front of the list.
I achieved that this way:
for i in range(len(mylist)):
    if(mylist[i].startswith(key)):
        mylist = [mylist[i]] + mylist[:i] + mylist[i+1:]
Since I need this code in multiple places, I placed it inside a function
def bringOrderStringToFront(mylist, key):
    for i in range(len(mylist)):
        if(mylist[i].startswith(key)):
            mylist = [mylist[i]] + mylist[:i] + mylist[i+1:]
and called it this way:
if orderstring:
    bringOrderStringToFront(Tokens, orderstring)
right? Nope, wrong! contrary to what I thought I had understood about how parameters are passed in Python, the function is acting on a copy(!) and my original list is unchanged.
I fixed it this way:
def bringOrderStringToFront(mylist, key):
    for i in range(len(mylist)):
        if(mylist[i].startswith(key)):
            mylist = [mylist[i]] + mylist[:i] + mylist[i+1:]
    return(mylist)
and:
if orderstring:
    Tokens = bringOrderStringToFront(Tokens, orderstring)
but I'm left with a sour taste of not understanding what I was doing wrong. Can anyone elaborate? what's the pythonista way to do it right?
Thanks
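The behaviour described above can be reproduced in a few lines. The function does not receive a copy: it receives the same list object, but the assignment mylist = ... rebinds the local name to a new list and leaves the caller's object alone. Mutating the object in place is visible to the caller. The two function names below are made up for this sketch, and stopping at the first match is an assumption.

```python
def bring_to_front_rebinding(mylist, key):
    # Rebinding: 'mylist' now names a brand-new list; the caller's list is untouched.
    for i in range(len(mylist)):
        if mylist[i].startswith(key):
            mylist = [mylist[i]] + mylist[:i] + mylist[i + 1:]


def bring_to_front_in_place(mylist, key):
    # Mutation: pop() and insert() change the very object the caller passed in.
    for i, item in enumerate(mylist):
        if item.startswith(key):
            mylist.insert(0, mylist.pop(i))
            break  # stop at the first match (an assumption of this sketch)


tokens = ["alpha", "beta", "gamma"]
bring_to_front_rebinding(tokens, "ga")
print(tokens)  # ['alpha', 'beta', 'gamma']
bring_to_front_in_place(tokens, "ga")
print(tokens)  # ['gamma', 'alpha', 'beta']
```

Returning the new list, as in the fix above, is just as valid; the two styles differ only in whether the caller's object is modified.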
Re: Drowning in a teacup?
On 04/01/2016 04:27 PM, Fillmore wrote:
notorious pass by reference vs pass by value biting me in the backside here. Proceeding in order.
Many thanks to all of those who replied!
Most probably a stupid question, but I still want to ask
let's look at this:
$ python3.4
Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> line1 = '"String1" | bla'
>>> parts1 = line1.split(" | ")
>>> parts1
['"String1"', 'bla']
>>> tokens1 = eval(parts1[0])
>>> tokens1
'String1'
>>> tokens1[0]
'S'
and now this
>>> line2 = '"String1","String2" | bla'
>>> parts2 = line2.split(" | ")
>>> tokens2 = eval(parts2[0])
>>> tokens2
('String1', 'String2')
>>> tokens2[0]
'String1'
>>> type(tokens1)
<class 'str'>
>>> type(tokens2)
<class 'tuple'>
>>>
the question is: at which point did the language designers decide to betray the
"path of least surprise" principle and create a 'discontinuity' in the language?
Open to the idea that I am getting something fundamentally wrong. I'm new to
Python...
Thanks
one-element tuples [Was: Most probably a stupid question, but I still want to ask]
Sorry guys. It was not my intention to piss off anyone...just trying to
understand how the language works
I guess that the answer to my question is: there is no such thing as a
one-element tuple, and Python will automatically convert a one-element
tuple to a string... hence the behavior I observed is explained...
>>> a = ('hello','bonjour')
>>> b = ('hello')
>>> b
'hello'
>>> a
('hello', 'bonjour')
>>>
Did I get this right this time?
Re: Most probably a stupid question, but I still want to ask
On 04/10/2016 07:30 PM, Stephen Hansen wrote:
There's nothing inconsistent or surprising going on besides you doing something vaguely weird and not really expressing what you find surprising.
well, I was getting some surprising results for some of my data, so I can guarantee that I was surprised!
apparently my 'discontinuity' is mappable to the fact that there's no such thing as one-element tuples in Python, and attempts to create one will result in a string (i.e. an object of a different kind!)...
Re: one-element tuples [Was: Most probably a stupid question, but I still want to ask]
On 04/10/2016 08:13 PM, Fillmore wrote:
Sorry guys. It was not my intention to piss off anyone...just trying to
understand how the language works
I guess that the answer to my question is: there is no such thing as a
one-element tuple, and Python will automatically convert a one-element
tuple to a string... hence the behavior I observed is explained...
>>> a = ('hello','bonjour')
>>> b = ('hello')
>>> b
'hello'
>>> a
('hello', 'bonjour')
>>>
Hold on a sec! it turns out that there is such a thing as single-element tuples in
python:
>>> c = ('hello',)
>>> c
('hello',)
>>> c[0]
'hello'
>>> c[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: tuple index out of range
>>>
So, my original question makes sense. Why was a discontinuation point
introduced by the language designer?
Re: one-element tuples
On 04/10/2016 08:31 PM, Ben Finney wrote:
Can you describe explicitly what that “discontinuation point” is? I'm not seeing it.
Here you go:
>>> a = '"string1"'
>>> b = '"string1","string2"'
>>> c = '"string1","string2","string3"'
>>> ea = eval(a)
>>> eb = eval(b)
>>> ec = eval(c)
>>> type(ea)
<class 'str'>     <--- HERE
>>> type(eb)
<class 'tuple'>
>>> type(ec)
<class 'tuple'>
I can tell you that it exists because it bit me in the butt today... and mind you, I am not saying that this is wrong. I'm just saying that it surprised me.
Re: Most probably a stupid question, but I still want to ask
funny, but it seems to me that you are taking it personally... thank god i even apologized in advance for what was most probably a stupid question..
On 04/10/2016 09:50 PM, Steven D'Aprano wrote:
Fillmore, you should feel very pleased with yourself. All the tens of thousands of Python programmers, and millions of lines of code written in the language, but nobody until now was able to see what you alone had the intelligence and clarity of thought to spot. Well done!
Re: one-element tuples
On 04/10/2016 09:36 PM, Ben Finney wrote:
If the two examples give you different responses (one surprises you, the other does not), I would really like to know *what the surprise is*. What specifically did you expect, that did not happen?
now that I get the role of commas it's not surprising anymore... thanks
Re: one-element tuples
Thank you for trying to help, Martin. So:
On 04/10/2016 09:08 PM, Martin A. Brown wrote:
#1: I would not choose eval() except when there is no other
solution. If you don't need eval(), it may save you some
headache in the future, as well, to find an alternate way.
So, can we help you choose something other than eval()?
What are you trying to do with that usage?
so, I do not quite control the format of the file I am trying to parse.
it has the format:
"str1","str2",,"strN" => more stuff
:
in some cases there is just one "str", which is what caused my problem.
The first "str1" has special meaning and, at times, it can be alone.
The way I handle this is:
parts = line.strip().split(" => ")
tokens = eval(parts[0])
if type(tokens) == str:  # Handle case that there's only one token
    columns.add(tokens)
    rowTokenString = "__Empty__"
    rows.add(rowTokenString)
    value = parts[1][:2]
    addCell(table, rowTokenString, tokens, value)
else:
    columns.add(tokens[0])
    rowTokenString = '"'+'","'.join(tokens[1:]) + '"'
    rows.add(rowTokenString)
    value = parts[1][:2]
    addCell(table, rowTokenString, tokens[0],value)
which admittedly is not very elegant. If you have suggestions on how to avoid
the use of eval() and still achieve the same, I would be delighted to hear them
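One possible way to drop eval() here, sketched under the assumption that the left-hand side always consists of double-quoted strings separated by commas: ast.literal_eval() only accepts literals, and appending a trailing comma makes a lone "str1" come back as a one-element tuple, so the single-token case no longer needs special handling. The helper name parse_tokens is hypothetical.

```python
import ast


def parse_tokens(field):
    """Parse '"a","b"' or a lone '"a"' into a tuple of strings, without eval()."""
    # A trailing comma makes even a single quoted string evaluate to a 1-tuple.
    tokens = ast.literal_eval(field.strip() + ",")
    if not all(isinstance(t, str) for t in tokens):
        raise ValueError("expected only quoted strings: %r" % field)
    return tokens


line = '"str1","str2","str3" => more stuff'
left, right = line.strip().split(" => ", 1)
tokens = parse_tokens(left)
column, row_tokens = tokens[0], tokens[1:]
print(column, row_tokens)  # str1 ('str2', 'str3')
```

With a single token, row_tokens is simply an empty tuple, which the caller can map to the "__Empty__" row used above.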
Re: Most probably a stupid question, but I still want to ask
On 04/10/2016 11:54 PM, Steven D'Aprano wrote:
On Mon, 11 Apr 2016 12:48 pm, Fillmore wrote:
funny, but it seems to me that you are taking it personally... thank god i even apologized in advance for what was most probably a stupid question..
I hope you did get a laugh out of it, because it wasn't meant to be nasty. But it was meant to get you to think about statements about betraying principles and other inflammatory remarks.
I did have a laugh. I don't think I talked about betraying principles. I just mentioned that in my newbie mind, I experienced what I perceived as a discontinuity. My limited understanding of what builds tuples and the (almost always to be avoided) use of eval() were the origin of my perplexities.
I'll make sure I approach the temple of pythonistas bare-footed and with greater humility next time
Re: one-element tuples
On 04/11/2016 12:10 AM, Ben Finney wrote:
So, will we never get your statement of what surprised you between those examples? Clearly there is something of interest here. I'd like to know what the facts of the matter were; “beginner's mind” is a precious resource, not to be squandered.
I thought I had made the point clear with the REPL session below. I had (what seemed to me like) a list of strings getting turned into a tuple. I was surprised that a single string wasn't turned into a single-element tuple. Now that I know that commas create tuples, but lack of commas doesn't, I'm not surprised anymore.
>>> a = '"string1"'
>>> b = '"string1","string2"'
>>> c = '"string1","string2","string3"'
>>> ea = eval(a)
>>> eb = eval(b)
>>> ec = eval(c)
>>> type(ea)
<class 'str'>
>>> type(eb)
<class 'tuple'>
>>> type(ec)
<class 'tuple'>
Re: one-element tuples
On 04/11/2016 10:10 AM, Grant Edwards wrote:
What behaviour did you expect instead? That's still unclear. I must admit this is one of the best trolls I've seen in a while...
shall I take it as a compliment?
I have been dealing with Python for a few weeks...
...and I'm loving it. Sooo much more elegant than Perl...and so much less going back to the manual to look up the syntax of simple data structures and operations...
REPL is so useful
and you guys rock too
cheers
Re: I have been dealing with Python for a few weeks...
On 04/14/2016 10:12 PM, justin walters wrote:
On Thu, Apr 14, 2016 at 1:50 PM, Fillmore wrote:
...and I'm loving it. Sooo much more elegant than Perl...and so much less going back to the manual to lookup the syntax of simple data structures and operations... REPL is so useful and you guys rock too cheers
Good to hear you're enjoying it. Out of curiosity, what were you using Perl for?
I am a programmer by education but have not programmed in years. Sometimes I need to get my point across to people in my team that something can be done. The fact that I could prototype in Perl (now Python) certainly makes it harder for others to argue that this or that cannot be done in (java/PHP) or that it would take a disproportionate amount of time to carry out...
Thanks
Python script reading from sys.stdin and debugger
Hello PyMasters! Long story short:
cat myfile.txt | python -m pdb myscript.py
doesn't work (pdb hijacking stdin?).
Google indicates that someone has fixed this with named pipes, but, call me stupid, I don't understand how I need to set up those pipes, how I need to modify my script and, above all, how I now need to invoke the program.
Help and suggestions appreciated. I am using Python 3.4 on Cygwin and Ubuntu.
Thanks
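One workaround that sidesteps named pipes, sketched under the assumption that the script may be changed to accept a file name: the standard fileinput module reads the files named on the command line and falls back to stdin when none are given. The process function below is only a stand-in for the real work.

```python
import fileinput


def process(lines):
    """Stand-in for the real work: count tab-separated fields on each line."""
    return [len(line.rstrip("\n").split("\t")) for line in lines]


def main(argv=None):
    # fileinput reads the files named in argv (sys.argv[1:] when argv is None),
    # or stdin if no file names are given.
    with fileinput.input(files=argv) as stream:
        return process(stream)
```

With main() called at the bottom of the script, cat myfile.txt | python myscript.py keeps working as before, while python -m pdb myscript.py myfile.txt leaves stdin free for the debugger prompt.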
reduction
My problem. I have lists of substrings associated to values:
['a','b','c','g'] => 1
['a','b','c','h'] => 1
['a','b','c','i'] => 1
['a','b','c','j'] => 1
['a','b','c','k'] => 1
['a','b','c','l'] => 0 # <- Black sheep!!!
['a','b','c','m'] => 1
['a','b','c','n'] => 1
['a','b','c','o'] => 1
['a','b','c','p'] => 1
I can check a bit of data against elements in this list and determine whether the value to be associated to the data is 1 or 0.
I would like to make my matching algorithm smarter so I can reduce the total number of lists:
['a','b','c','l'] => 0 # If "l" is in my data I have a zero
['a','b','c'] => 1 # or a more generic match will do the job
I am trying to think of a way to perform this "reduction", but I have a feeling I am reinventing the wheel.
Is this a common problem that is already addressed by an existing module?
I realize this is vague. Apologies for that.
thank you
Re: reduction
Thank you, guys. Your suggestions are valuable. I think I'll go with the tree
On 05/31/2016 10:22 AM, Fillmore wrote:
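The reduction asked about in this thread can be sketched with a flat dictionary keyed by tuples, which behaves like a one-level version of the tree: rules sharing a prefix collapse into one default for that prefix, and only the exceptions are kept. This is not the tree mentioned in the reply, just a simpler stand-in. It assumes all rules have the same length and that no prefix is itself already a rule; reduce_rules and lookup are made-up names.

```python
from collections import Counter, defaultdict


def reduce_rules(rules):
    """Collapse rules that share a prefix into one default plus its exceptions."""
    groups = defaultdict(dict)
    for key, value in rules.items():
        groups[key[:-1]][key] = value
    reduced = {}
    for prefix, members in groups.items():
        default, _ = Counter(members.values()).most_common(1)[0]
        reduced[prefix] = default
        # Only the rules that disagree with the default need to be kept.
        reduced.update({k: v for k, v in members.items() if v != default})
    return reduced


def lookup(reduced, tokens):
    """Return the value of the longest rule that is a prefix of tokens, or None."""
    tokens = tuple(tokens)
    for end in range(len(tokens), -1, -1):
        if tokens[:end] in reduced:
            return reduced[tokens[:end]]
    return None


rules = {("a", "b", "c", ch): 1 for ch in "ghijkmnop"}
rules[("a", "b", "c", "l")] = 0
reduced = reduce_rules(rules)
print(reduced)  # two rules left: the generic prefix and the black sheep
```

Because lookup() tries the longest prefix first, the specific 'l' rule wins over the generic ['a','b','c'] rule.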
loading trees...
Hi, problem for today. I have a batch file that creates "trees of data". I can save these trees in the form of python code or serialize them with something like pickle.
I then need to run a program that loads the whole forest in the form of a dict() where each item will point to a dynamically loaded tree.
What's my best way to achieve this? Pickle? or creating custom python code? or maybe I am just reinventing the wheel and there are better ways to achieve this?
The idea is that I'll receive a bit of data, determine which tree is suitable for handling it, and dispatch the data to the right tree for further processing...
Thanks
psss...I want to move from Perl to Python
I taught myself Perl as a scripting language over two decades ago. All through this time, I would revert to it from time to time whenever I needed some text manipulation and data analysis script.
My problem? maybe I am stupid, but each time I have to go back and re-learn the syntax, the gotchas, the references and the dereferencing, the different syntax between Perl 4 and Perl 5, that messy CPAN in which every author seems to have a different idea of how things should be done.
I get this feeling I am wasting a lot of time restudying the wheel each time...
I look at Python and it looks so much cleaner.
add to that that it is the language of choice of data miners...
add to that that iNotebook looks powerful
Does Python have Regexps?
How was the Python 2.7 vs Python 3.X issue solved? which version should I go for?
Do you think that switching to Python from Perl is a good idea at 45?
Where do I get started moving from Perl to Python?
which gotchas need I be aware of?
Thank you
Re: psss...I want to move from Perl to Python
So many answers. So much wisdom...thank you everyone
On 01/28/2016 07:01 PM, Fillmore wrote:
Re: psss...I want to move from Perl to Python
+1
On 1/29/2016 10:07 AM, Random832 wrote:
The main source of confusion is that $foo[5] is an element of @foo.
$foo{'x'} is an element of %foo. Both of these have absolutely nothing
to do with $foo.
Re: psss...I want to move from Perl to Python
I actually have a few follow-up questions.
- will iNotebook also work in Python 3?
- What version of Python 3 do you recommend I install on Windows?
- Is Python 3 available also for CygWin?
- I use Ubuntu at home. Will I be able to install Python 3 with apt-get? will I need to uninstall previous versions?
- Is there a good IDE that can be used for debugging? all free IDEs for Perl suck and it would be awesome if Python was better than that.
Thanks
On 1/28/2016 7:01 PM, Fillmore wrote:
I taught myself Perl as a scripting language over two decades ago. All
Re: psss...I want to move from Perl to Python
On 1/29/2016 4:30 PM, Rick Johnson wrote:
People who are unwilling to "expanding their intellectual horizons" make me sick!!!
did I miss something or is this aggressiveness unjustified?
Re: psss...I want to move from Perl to Python
On 01/30/2016 05:26 AM, [email protected] wrote:
Python 2 vs python 3 is anything but "solved". Python 3.5.1 is still suffering from the same buggy behaviour as in Python 3.0 .
Can you elaborate?
Cygwin and Python3
Hi, I am having a hard time making my Cygwin run Python 3.5 (or Python 2.7 for that matter). The command will hang and nothing happens.
A cursory search on the net reveals many possibilities, which might mean a lot of trial and error, which I would very much like to avoid.
Any suggestions on how I can get cygwin and Python3.5 to play together like brother and sister?
thanks
Re: Cygwin and Python3
On 2/9/2016 2:29 PM, [email protected] wrote:
$ ls -l /usr/bin/python
rm /usr/bin/python
$ ln -s /usr/bin/python /usr/bin/python3.2m.exe
$ /usr/bin/python --version
Python 3.2.5
$ pydoc modules
Still no luck (:
~ $ python --version
Python 3.5.1
~ $ python
(..hangs indefinitely) ^C
~ $ pydoc modules
-bash: pydoc: command not found
~ $ echo $PATH
/usr/local/bin:/usr/bin:/cygdrive/c/Python27:/cygdrive/c/Python27/Scripts:/cygdrive/c/Windows/system32:/cygdrive/c/Windows:/cygdrive/c/Windows/System32/Wbem:/cygdrive/c/Windows/System32/WindowsPowerShell/v1.0:/cygdrive/c/Program Files (x86)/Common Files/Roxio Shared/OEM/DLLShared:/cygdrive/c/Program Files (x86)/Common Files/Roxio Shared/OEM/DLLShared:/cygdrive/c/Program Files (x86)/Common Files/Roxio Shared/OEM/12.0/DLLShared:/cygdrive/c/Program Files (x86)/Roxio/OEM/AudioCore:/cygdrive/c/unxutils/bin:/cygdrive/c/unxutils/usr/local/wbin:/cygdrive/c/strawberry/c/bin:/cygdrive/c/strawberry/perl/site/bin:/cygdrive/c/strawberry/perl/bin:/cygdrive/c/Program Files/Intel/WiFi/bin:/cygdrive/c/Program Files/Common Files/Intel/WirelessCommon:/cygdrive/c/Users/user/AppData/Local/Programs/Python/Python35/Scripts:/cygdrive/c/Users/user/AppData/Local/Programs/Python/Python35:%APPDATA%/Python/Scripts:/cygdrive/c/Program Files/Intel/WiFi/bin:/cygdrive/c/Program Files/Common Files/Intel/WirelessCommon
Re: Cygwin and Python3
On 2/9/2016 3:30 PM, [email protected] wrote:
When you run the cygwin installer you have the option of installing 2.7 and 3.2.5, by default it will install 2.7 and 3.2 together. After running the installer run whereis python and use the alternatives to change it or use python3 instead of python
#!/usr/bin/python3
Hope this helps.
I see. I was trying to do it the Perl way. I simply linked the strawberry perl.exe from the cygwin environment and it replaced the built-in perl that sucked. OK. Backtrack. I'll try with a purely cygwin solution... Thank you
Re: Cygwin and Python3
On 2/9/2016 4:47 PM, Fillmore wrote:
OK. Backtrack. I'll try with a purely cygwin solution...
$ python --version
Python 2.7.10
$ python3 --version
Python 3.4.3
Thank you, Alvin
Regex: Perl to Python
Hi, I'm trying to move away from Perl and go to Python.
Regex seems to be the hardest challenge so far.
Perl:
while () {
if (/(\d+)\t(.+)$/) {
print $1." - ". $2."\n";
}
}
into python
pattern = re.compile(r"(\d+)\t(.+)$")
with open(fields_Indexfile,mode="rt",encoding='utf-8') as headerfile:
    for line in headerfile:
        #sys.stdout.write(line)
        m = pattern.match(line)
        print(m.group(0))
headerfile.close()
but I must be getting something fundamentally wrong because:
Traceback (most recent call last):
  File "./slicer.py", line 30, in <module>
    print(m.group(0))
AttributeError: 'NoneType' object has no attribute 'group'
why is 'm' a None?
the input data has this format:
:
3 prop1
4 prop2
5 prop3
Thanks
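A sketch of what a closer translation of the Perl loop might look like. Two differences matter: pattern.match() only matches at the very start of the line, whereas Perl's =~ scans the whole string like re.search(), and both return None when a line does not match, for instance the ":" line in the sample data, so the result has to be tested before calling group(). The function name parse_line is made up for the example.

```python
import re

pattern = re.compile(r"(\d+)\t(.+)$")


def parse_line(line):
    """Return (number, name) like Perl's $1 and $2, or None if nothing matches."""
    m = pattern.search(line)  # search() scans the whole line, as Perl's =~ does
    if m is None:
        return None
    return m.group(1), m.group(2)


for line in [":\n", "3\tprop1\n", "  4\tprop2\n"]:
    fields = parse_line(line)
    if fields:
        print(fields[0] + " - " + fields[1])
```

Note that group(0) is the whole match; the Perl variables $1 and $2 correspond to group(1) and group(2).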
Re: Regex: Perl to Python
Big thank you to everyone who offered their help!
On 03/06/2016 11:38 PM, Fillmore wrote:
Pythonic love
learning Python from Perl here. Want to do things as Pythonically as possible.
I am reading a TSV, but need to skip the first 5 lines. The following
works, but wonder if there's a more Pythonic way to do things. Thanks
ctr = 0
with open(prfile,mode="rt",encoding='utf-8') as pfile:
    for line in pfile:
        ctr += 1
        if ctr < 5:
            continue
        allVals = line.strip().split("\t")
        print(allVals)
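One common idiom for this is itertools.islice, which drops a fixed number of leading items from any iterable without a manual counter. The sketch below assumes the goal really is to skip five lines; the counter loop above, as written, skips only four because the test is ctr < 5. The generator name read_rows is made up for the example.

```python
from itertools import islice


def read_rows(lines, skip=5):
    """Yield the tab-separated fields of every line after the first `skip` lines."""
    for line in islice(lines, skip, None):
        yield line.rstrip("\n").split("\t")
```

With a file this becomes: with open(prfile, encoding="utf-8") as pfile, then a for loop over read_rows(pfile).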
breaking out of outer loops
I must be missing something simple because I can't find a way to break
out of a nested loop in Python.
Is there a way to label loops?
For the record, here's a Perl script of mine I'm trying to port...there
may be 'malformed' lines in a TSV file I'm parsing that are better
discarded than fixed.
my $ctr = 0;
OUTER:
while($line = ) {
    $ctr++;
    if ($ctr < 5) {next;}
    my @allVals = split /\t/,$line;
    my $newline;
    foreach my $i (0..$#allVals) {
        if ($i == 0) {
            if ($allVals[0] =~ /[^[:print:]]/) {next OUTER;}
            $newline = $allVals[0];
        }
        if (defined $headers{$i}) {
            #if column not a number, skip line
            if ($allVals[$i+1] !~ /^\d+$/) {next OUTER;}
            $newline .= "\t".$allVals[$i+1];
        }
    }
    print $newline."\n";
}
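Python has no loop labels, but the inner loop can be moved into a function whose early return plays the role of 'next OUTER'. The sketch below is a rough port of the Perl above, not a tested replacement: headers is assumed to be a set (or dict) of column indexes mirroring %headers, and str.isprintable() and str.isdigit() only approximate the Perl character classes.

```python
def clean_line(line, headers):
    """Return the rebuilt line, or None if the line is malformed and should be dropped."""
    vals = line.rstrip("\n").split("\t")
    if not vals[0].isprintable():
        return None  # plays the role of 'next OUTER'
    kept = [vals[0]]
    for i in range(len(vals)):
        if i in headers:
            # Column i + 1 must hold a number, otherwise the whole line is skipped.
            if i + 1 >= len(vals) or not vals[i + 1].isdigit():
                return None
            kept.append(vals[i + 1])
    return "\t".join(kept)
```

The outer loop then shrinks to: call clean_line(line, headers) for each line, and print the result unless it is None.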
Re: Pythonic love
On 3/7/2016 6:03 PM, [email protected] wrote:
On a side note, your "with open..." line uses inconsistent quoting. You have "" on one string, but '' on another.
Thanks. I'll make sure I flog myself three times later tonight...
Re: breaking out of outer loops
On 3/7/2016 6:17 PM, Ian Kelly wrote:
On Mon, Mar 7, 2016 at 4:09 PM, Fillmore wrote:
I must be missing something simple because I can't find a way to break out of a nested loop in Python. Is there a way to label loops?
No, you can't break out of nested loops,
wow...this is a bit of a WTF moment to me :(
apart from structuring your code such that return does what you want.
Can you elaborate? apologies, but I'm new to python and trying to find my way out of perl
Re: breaking out of outer loops
On 3/7/2016 6:29 PM, Rob Gaddi wrote:
You're used to Perl, you're used to exceptions being A Thing. This is
Python, and exceptions are just another means of flow control.
class MalformedLineError(Exception): pass
for line in file:
    try:
        for part in line.split('\t'):
            if thispartisbadforsomereason:
                raise MalformedLineError()
            otherwisewedothestuff
    except MalformedLineError:
        pass
I am sure you are right, but adapting this thing here into something
that is a fix to my problem seems like abusing my 'system 2' (for those
who read a certain book by a guy called Daniel Kahneman :)
Re: breaking out of outer loops
On 3/7/2016 6:09 PM, Fillmore wrote:
I must be missing something simple because I can't find a way to break out of a nested loop in Python.
Thanks to everyone who has tried to help so far. I suspect this may be a case where I just need to get my head around a new paradigm
Re: breaking out of outer loops
On 3/7/2016 7:08 PM, Chris Angelico wrote:
Yep, which is why we're offering a variety of new paradigms. Because it's ever so much easier to get your head around three than one! We are SO helpful, guys. So helpful. :)
not too dissimilarly from human languages, speaking a foreign language is more often than not a matter of learning to think differently...
Other difference with Perl: Python scripts in a pipe
when I put a Python script in a pipe with other commands, it will refuse to let go silently. Any way I can avoid this?
$ python somescript.py | head -5
line 1
line 3
line 3
line 4
line 5
Traceback (most recent call last):
  File "./somescript.py", line 50, in <module>
    sys.stdout.write(row[0])
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
thanks
Re: Other difference with Perl: Python scripts in a pipe
On 3/10/2016 4:46 PM, Ian Kelly wrote:
On Thu, Mar 10, 2016 at 2:33 PM, Fillmore wrote:
when I put a Python script in pipe with other commands, it will refuse to
let go silently. Any way I can avoid this?
What is your script doing? I don't see this problem.
ikelly@queso:~ $ cat somescript.py
import sys
for i in range(20):
    sys.stdout.write('line %d\n' % i)
you are right. it's the with block :(
import sys
import csv
with open("somefile.tsv", newline='') as csvfile:
    myReader = csv.reader(csvfile, delimiter='\t')
    for row in myReader:
        for i in range(20):
            sys.stdout.write('line %d\n' % i)
$ ./somescript.py | head -5
line 0
line 1
line 2
line 3
line 4
Traceback (most recent call last):
  File "./somescript.py", line 12, in <module>
    sys.stdout.write('line %d\n' % i)
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
Re: Other difference with Perl: Python scripts in a pipe
On 3/10/2016 5:16 PM, Ian Kelly wrote:
Interesting, both of these are probably worth bringing up as issues on the bugs.python.org tracker. I'm not sure that the behavior should be changed (if we get an error, we shouldn't just swallow it) but it does seem like a significant hassle for writing command-line text-processing tools.
is it possible that I am the first one encountering this kind of issue?
non printable (moving away from Perl)
Here's another handy Perl regex which I am not sure how to translate to
Python.
I use it to avoid processing lines that contain funny chars...
if ($string =~ /[^[:print:]]/) {next OUTER;}
:)
Re: Other difference with Perl: Python scripts in a pipe
On 3/10/2016 7:08 PM, INADA Naoki wrote:
No. I see it usually. Python's zen says:
Errors should never pass silently.
Unless explicitly silenced.
When failed to write to stdout, Python should raise Exception. You can silence explicitly when it's safe:
try:
    print(...)
except BrokenPipeError:
    sys.exit(0)
I don't like it. It makes Python not so good for command-line utilities
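A slightly fuller sketch of the explicit silencing described above, following the pattern given in the signal module documentation's note on SIGPIPE: catch BrokenPipeError around the writes and the flush, then point stdout at os.devnull so the implicit flush at interpreter exit does not raise a second time. The helper name write_lines and the generated sample lines are made up for the example.

```python
import os
import sys


def write_lines(lines, out):
    """Write lines to out; return False if the reader closed the pipe early."""
    try:
        for line in lines:
            out.write(line)
        out.flush()
    except BrokenPipeError:
        return False
    return True


def main():
    if not write_lines(("line %d\n" % i for i in range(20)), sys.stdout):
        # Point stdout at devnull so the flush at interpreter exit cannot fail again.
        devnull = os.open(os.devnull, os.O_WRONLY)
        os.dup2(devnull, sys.stdout.fileno())
        sys.exit(1)
```

With main() called at the bottom of the script, piping it into head -5 should end quietly, with a non-zero exit status instead of a traceback.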
Re: non printable (moving away from Perl)
On 03/11/2016 07:13 AM, Wolfgang Maier wrote:
One lesson for Perl regex users is that in Python many things can be solved
without regexes.
How about defining:
printable = {chr(n) for n in range(32, 127)}
then using:
if (set(my_string) - set(printable)):
    break
seems computationally heavy. I have a file with about 70k lines, of which only 20 contain
"funny" chars.
Any idea on how I can create a script that compares Perl speed vs. Python speed
in performing the cleaning operation?
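For the Python half of such a comparison, the standard timeit module is one option. The sketch below times the set-difference approach against str.isprintable() on made-up sample data (LINES is hypothetical, not the real 70k-line file); the Perl side would have to be timed separately, for example with the shell's time command.

```python
import timeit

PRINTABLE = {chr(n) for n in range(32, 127)}
# Hypothetical sample: mostly clean lines plus a few containing a control character.
LINES = ["plain ascii line %d" % i for i in range(1000)] + ["bad\x07line"] * 3


def clean_with_sets(lines):
    """Keep lines whose characters all fall in the ASCII printable set."""
    return [s for s in lines if not (set(s) - PRINTABLE)]


def clean_with_isprintable(lines):
    """Keep lines for which str.isprintable() is true."""
    return [s for s in lines if s.isprintable()]


for func in (clean_with_sets, clean_with_isprintable):
    seconds = timeit.timeit(lambda: func(LINES), number=100)
    print("%-24s %.4f s" % (func.__name__, seconds))
```

The two functions agree on pure-ASCII input like this sample, but not in general: isprintable() also accepts printable non-ASCII characters.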
Re: non printable (moving away from Perl)
On 3/11/2016 2:23 PM, MRAB wrote:
On 2016-03-11 00:07, Fillmore wrote:
Here's another handy Perl regex which I am not sure how to translate to
Python.
I use it to avoid processing lines that contain funny chars...
if ($string =~ /[^[:print:]]/) {next OUTER;}
:)
Python 3 (Unicode) strings have an .isprintable method:
mystring.isprintable()
my strings are UTF-8. Will it work there too?
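On the UTF-8 question: str.isprintable() works on decoded text, so once the file has been opened with encoding="utf-8" (or the bytes decoded by hand) the original encoding no longer matters. A small sketch, with a made-up helper name, that also strips the line ending first, because newline and tab characters are themselves not printable:

```python
def is_clean(line):
    """True if the line, minus its line ending, has only printable characters."""
    return line.rstrip("\r\n").isprintable()


raw = "caf\u00e9 ol\u00e9\n".encode("utf-8")  # bytes as they would sit in a UTF-8 file
text = raw.decode("utf-8")                    # open(..., encoding="utf-8") does this step
print(is_clean(text))           # True: accented letters count as printable
print(is_clean("a\tb\n"))       # False: a tab is not printable
print(is_clean("bell\x07\n"))   # False: control character
```

Applying the check to individual fields after splitting on tabs avoids the tab case.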
issue with CVS module
I have a TSV file containing a few strings like this (double quotes are part of the string):
'"pragma: CacheHandler=08616B7E907744E026C9F044250EA55844CCFD52"'
After Python and the csv module have read the file and re-printed the value, the string has become:
'pragma: CacheHandler=08616B7E907744E026C9F044250EA55844CCFD52'
which is NOT good for me. I went back to Perl and noticed that Perl was correctly leaving the original string intact.
This is what I am using to read the file:
with open(file, newline='') as csvfile:
    myReader = csv.reader(csvfile, delimiter='\t')
    for row in myReader:
and this is what I use to write the cell value
        sys.stdout.write(row[0])
Is there some directive I can give the csv reader to tell it to stop screwing with my text?
Thanks
Re: issue with CVS module
On 3/11/2016 3:05 PM, Joel Goldstick wrote:
Enter the python shell. Import csv
then type help(csv)
It is highly configurable
Possibly, but I am having a hard time letting it know that it should
leave each and every char alone, ignore quoting and just handle strings
as strings. I tried playing with the quoting related parameters, to no
avail:
Traceback (most recent call last):
  File "./myscript.py", line 47, in <module>
    myReader = csv.reader(csvfile, delimiter='\t',quotechar='')
TypeError: quotechar must be set if quoting enabled
I tried adding csv.QUOTE_NONE, but things get messy :(
Traceback (most recent call last):
  File "./myscript.py", line 64, in <module>
    sys.stdout.write("\t"+row[h])
IndexError: list index out of range
Sorry for being a pain, but I am porting from Perl and split
/\t/,$line; was doing the job for me. Maybe I should go back to split on
'\t' for python too...
Re: issue with CVS module
On 3/11/2016 2:41 PM, Fillmore wrote:
Is there some directive I can give the csv reader to tell it to stop
screwing with my text?
OK, I think I reproduced my problem at the REPL:
>>> import csv
>>> s = '"Please preserve my doublequotes"\ttext1\ttext2'
>>> reader = csv.reader([s], delimiter='\t')
>>> for row in reader:
...     print(row[0])
...
Please preserve my doublequotes
>>>
:(
How do I instruct the reader to preserve my doublequotes?
As an aside. split() performs the job correctly...
>>> allVals = s.split("\t")
>>> print(allVals[0])
"Please preserve my doublequotes"
>>>
Re: issue with CVS module
On 3/11/2016 4:14 PM, MRAB wrote:
>>> import csv
>>> s = '"Please preserve my doublequotes"\ttext1\ttext2'
>>> reader = csv.reader([s], delimiter='\t', quotechar=None)
>>> for row in reader:
...     print(row[0])
...
"Please preserve my doublequotes"
>>>
This worked! thank you MRAB
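For reference, the same effect can be requested explicitly with the documented quoting=csv.QUOTE_NONE option, which tells the reader to do no special processing of quote characters. A minimal sketch reusing the sample string from this thread:

```python
import csv

s = '"Please preserve my doublequotes"\ttext1\ttext2'

# QUOTE_NONE: quote characters are treated as ordinary data.
reader = csv.reader([s], delimiter="\t", quoting=csv.QUOTE_NONE)
rows = list(reader)

for row in rows:
    print(row[0])
```

The earlier IndexError may have a separate cause, such as a row with fewer fields than expected, so checking len(row) before indexing with row[h] is a reasonable safeguard.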
Re: issue with CVS module
On 3/11/2016 4:15 PM, Mark Lawrence wrote:
https://docs.python.org/3/library/csv.html#csv.Dialect.doublequote
thanks, but my TSV is not using any particular dialect as far as I understand...
Thank you, anyway
Re: issue with CVS module
On 3/11/2016 2:41 PM, Fillmore wrote:
I have a TSV file containing a few strings like this (double quotes are part of the string):
A big thank you to everyone who helped with this and with other questions. My porting of one of my Perl scripts to Python is over now that the two scripts produce virtually the same result:
$ wc -l test2.txt test3.txt
70823 test2.txt
70822 test3.txt
141645 total
$ diff test2.txt test3.txt
69351d69350
<
there's only an extra empty line at the bottom that I'll leave as a tip to Perl ;)
It was instructive.
argparse
Playing with ArgumentParser. I can't find a way to override the -h and
--help options so that it provides my custom help message.
-h, --help show this help message and exit
Here is what I am trying:
parser = argparse.ArgumentParser("csresolver.py",add_help=False)
parser.add_argument("-h","--help",
                    help="USAGE: | myscript.py [-exf Exception File]")
parser.add_argument("-e","--ext", type=str,
                    help="Exception file")
args = parser.parse_args()
The result:
$ ./myscript.py -h
usage: myscript.py [-h HELP] [-e EXT]
csresolver.py: error: argument -h/--help: expected one argument
am I missing something obvious?
Re: argparse
On 3/11/2016 6:26 PM, Larry Martell wrote:
am I missing something obvious?
https://docs.python.org/2/library/argparse.html#usage
you rock!
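The "expected one argument" error arises because an option added without an action defaults to storing a value. One possible way to keep -h/--help as a flag while controlling the text, sketched here with illustrative usage and help strings that are assumptions rather than the original script's, is to combine the usage= parameter with action="help":

```python
import argparse

parser = argparse.ArgumentParser(
    prog="myscript.py",
    usage="%(prog)s [-e EXCEPTION_FILE]",   # replaces the generated usage line
    add_help=False,                          # drop the built-in -h/--help
)
# action="help" makes -h a flag that prints the help text and exits.
parser.add_argument("-h", "--help", action="help",
                    help="show this custom help message and exit")
parser.add_argument("-e", "--ext", type=str, help="Exception file")

# An explicit argument list is passed so the sketch runs without a command line.
args = parser.parse_args(["-e", "exceptions.txt"])
print(args.ext)
print(parser.format_help())
```

In a real script, parser.parse_args() with no arguments reads sys.argv as usual.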
Perl to Python again
So, now I need to split a string in a way that the first element goes
into a string and the others in a list:
while($line = ) {
my ($s,@values) = split /\t/,$line;
I am trying with:
for line in sys.stdin:
    s,values = line.strip().split("\t")
    print(s)
but no luck:
ValueError: too many values to unpack (expected 2)
What's the elegant python way to achieve this?
Thanks
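In Python 3, the closest match to Perl's my ($s,@values) = split ... is extended unpacking with a starred target, which collects the remaining fields into a list. A minimal sketch; the helper name and sample lines are made up for illustration:

```python
def split_first(line):
    """Return (first_field, list_of_remaining_fields) for a tab-separated line."""
    s, *values = line.rstrip("\n").split("\t")
    return s, values


print(split_first("key\tv1\tv2\tv3\n"))
print(split_first("only_one_field\n"))
```

Note that rstrip("\n") is used instead of strip() so that leading or trailing empty fields are not silently dropped along with the whitespace.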
Re: Perl to Python again
On 3/11/2016 7:12 PM, Martin A. Brown wrote:
Aside from your csv question today, many of your questions could be answered by reading through the manual documenting the standard datatypes (note, I am assuming you are using Python 3).
are you accusing me of being lazy? if that's your accusation, then guilty as charged, but
Re: Perl to Python again
On 03/12/2016 04:40 AM, alister wrote:
On Fri, 11 Mar 2016 19:15:48 -0500, Fillmore wrote:
I'm not sure if you were being accused of being lazy as such but actually being given the suggestion that there are other places that you can find these answers that are probably better for a number of reasons
1) Speed, you don't have to wait for someone to reply, although I hope you are continuing your research whilst waiting
2) Accuracy. I have not seen it here but there are some people who would consider it fun to provide an incorrect or dangerous solution to someone they thought was asking too basic a question
3) Collateral learning, whilst looking for the solution it is highly likely that you will unearth other information that answers questions you have yet to raise.
Alister, you are right, of course. The reality is that I discovered this trove of a newsgroup and I am rather shamelessly taking advantage of it.
Rest assured that I cross check and learn what is associated with each and every answer I get. So nothing is wasted. Hopefully your answers are also useful to others who may find them at a later stage through Google Groups.
Also very important, I am very grateful for the support I am getting from you and from others.
Re: retrieve key of only element in a dictionary (Python 3)
OK, this seems to do the trick, but boy is it a lot of code. Anything more
pythonic?
>>> l = list(d.items())
>>> l
[('squib', '007')]
>>> l[0]
('squib', '007')
>>> l[0][0]
'squib'
>>>
On 03/18/2016 05:33 PM, Fillmore wrote:
I must be missing something simple, but...
Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> d = dict()
>>> d['squib'] = "007"
>>> # I forget that 'squib' is my key to retrieve the only element in d
...
>>> type(d.items())
<class 'dict_items'>
>>> key = d.items()[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'dict_items' object does not support indexing
>>> key,_ = d.items()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: need more than 1 value to unpack
>>> key,b = d.items()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: need more than 1 value to unpack
>>> print(d.items())
dict_items([('squib', '007')])
>>> print(d.items()[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'dict_items' object does not support indexing
>>>
what am I missing? I don't want to iterate over the dictionary.
I know that there's only one element and I need to retrieve the key
thanks
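A few shorter spellings exist for a dictionary known to hold exactly one entry. The unpacking attempts above fail because d.items() yields a single (key, value) tuple, so there is only one element to bind; the tuple itself has to be unpacked one level deeper. A minimal sketch using the same dictionary:

```python
d = {"squib": "007"}

# Take the first (and only) key from the dict's iterator.
key = next(iter(d))

# Single-element unpacking: raises ValueError unless d has exactly one key.
(only_key,) = d

# Unpack the one (key, value) pair nested inside the items view.
[(k, v)] = d.items()

print(key, only_key, k, v)
```

The unpacking forms double as a sanity check, since they raise ValueError if the dictionary ever holds more or fewer than one entry, whereas next(iter(d)) quietly returns the first key.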
retrieve key of only element in a dictionary (Python 3)
I must be missing something simple, but...
Python 3.4.0 (default, Apr 11 2014, 13:05:11)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> d = dict()
>>> d['squib'] = "007"
>>> # I forget that 'squib' is my key to retrieve the only element in d
...
>>> type(d.items())
<class 'dict_items'>
>>> key = d.items()[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'dict_items' object does not support indexing
>>> key,_ = d.items()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: need more than 1 value to unpack
>>> key,b = d.items()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: need more than 1 value to unpack
>>> print(d.items())
dict_items([('squib', '007')])
>>> print(d.items()[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'dict_items' object does not support indexing
>>>
what am I missing? I don't want to iterate over the dictionary.
I know that there's only one element and I need to retrieve the key
thanks
