[Tutor] Help on best way to check presence of item inside list
Dear All,

    clubA= ["mary","luke","amyr","marco","franco","lucia", "sally","genevra"," electra"]
    clubB= ["mary","rebecca","jane","jessica","judit","sharon","lucia", "sally"," Castiel","Sam"]

I have a list of names that I would like to annotate according to their presence in different clubs. My input is a long file containing this:

    mary
    luke
    luigi
    jane
    jessica
    rebecca
    luis

    with open("file.in") as p:
    mit = []
    for i in p:
        lines =i.strip("\n").split("\t")
        if (lines[0] in clubA:
            G =lines[-1] +["clubA"]
        else:
            G = lines[-1] +["no"]
    mit.append(G)

    for i in mit:
        if i.strip("\n").split("\t")[0] in clubB:
            G =lines[-1] +["clubB"]
        else:
            G = lines[-1] +["no"]
        finale.append(G)

I just wonder whether it is appropriate to use loops to check whether a value is present in a list. Is it the right way? I can use a dictionary because I have many repeated names.

In the end I want to have:

    mary clubA clubB
    luke clubA
    luigi no

Thanks in advance for any help.
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Help on best way to check presence of item inside list
On 27/05/2014 09:05, jarod...@libero.it wrote:
> [...]
> I just wonder whether it is appropriate to use loops to check whether a
> value is present in a list. Is it the right way? I can use a dictionary
> because I have many repeated names.
> [...]

You can use the in keyword to check for an item in a list. However, a very quick glance at your code suggests that you could cut out the list completely and do the same using the in keyword against your dict. Better still, I think the defaultdict is what you need here. I'll leave you to look it up as I must dash :)

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
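For readers who want to see what the defaultdict suggestion might look like, here is a minimal sketch. It is not from the thread: the shortened club lists and the names `club_a`, `club_b` and `clubs_by_name` are assumptions made for illustration, and Python 3 is assumed.

```python
from collections import defaultdict

# Shortened, hypothetical versions of the club lists from the question.
club_a = ["mary", "luke", "lucia", "sally"]
club_b = ["mary", "rebecca", "jane", "lucia", "sally"]

# defaultdict(set) creates an empty set the first time a key is indexed,
# so there is no need to check whether the name has been seen before.
clubs_by_name = defaultdict(set)
for name in club_a:
    clubs_by_name[name].add("clubA")
for name in club_b:
    clubs_by_name[name].add("clubB")

# "in" against a dict is a hash lookup, and a membership test
# (unlike indexing) does not create a new key.
is_member = "mary" in clubs_by_name
is_stranger = "luigi" in clubs_by_name
```

Indexing `clubs_by_name["luigi"]` would silently add an empty entry, which is why the sketch tests membership with `in` rather than by indexing.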
Re: [Tutor] Help on best way to check presence of item inside list
"jarod...@libero.it" wrote in message:
> [...]
> with open("file.in") as p:
> mit = []
> for i in p:
>     lines =i.strip("\n").split("\t")
>     if (lines[0] in clubA:
>         G =lines[-1] +["clubA"]
>     else:
>         G = lines[-1] +["no"]
> mit.append(G)
>
> for i in mit:
>     if i.strip("\n").split("\t")[0] in clubB:
>         G =lines[-1] +["clubB"]
>     else:
>         G = lines[-1] +["no"]
>     finale.append(G)
>
> I just wonder whether it is appropriate to use loops to check whether a
> value is present in a list. Is it the right way? I can use a dictionary
> because I have many repeated names.
>
> In the end I want to have
>
> mary clubA clubB
> luke clubA
> luigi no
> Thanks in advance for any help

There are numerous errors in the above code. You should use copy/paste, so we don't waste energy identifying errors that don't even exist in your actual code. As it stands, it wouldn't even compile.

But even if you fix the typos and indentation errors and initialization errors, you still have logic errors if you want the output you specify. First, the second loop doesn't set the lines variable at all, but just uses the value from the last iteration of the first loop. Second, even if you untangle that, luigi would end up with two 'no's, not one.

You don't say what list could have repeats. I don't see any in your sample data. You also don't say how they should be treated. For example, are all seventeen marys in clubA?

Now to your specific question. You aren't using the loops to check the lists for a name. You're quite reasonably using 'in'.

You can accomplish your apparent goal much more reasonably by using a single loop and a more complex if, elif and else.

--
DaveA
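A single loop with if/elif/else, as suggested above, could look roughly like the sketch below. It is only an illustration: the shortened club contents, the use of sets instead of lists, the helper name `annotate` and the in-memory `names` list standing in for the lines of "file.in" are all assumptions, not code from the thread.

```python
# Shortened, hypothetical club data; sets are used here only because
# membership tests on them are fast. Lists would work the same way.
club_a = {"mary", "luke", "lucia", "sally"}
club_b = {"mary", "rebecca", "jane", "lucia", "sally"}

def annotate(name):
    """Return the name followed by its clubs, or by 'no' if it has none."""
    in_a = name in club_a
    in_b = name in club_b
    if in_a and in_b:
        return name + " clubA clubB"
    elif in_a:
        return name + " clubA"
    elif in_b:
        return name + " clubB"
    else:
        return name + " no"

# Stand-in for the stripped lines read from the input file.
names = ["mary", "luke", "luigi", "jane"]

lines = []
for name in names:          # one pass, one decision per name
    lines.append(annotate(name))
```

Because each name is classified exactly once, a name in neither club gets a single "no" rather than two.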
Re: [Tutor] Help on best way to check presence of item inside list
On Tue, May 27, 2014 at 10:05:30AM +0200, jarod...@libero.it wrote:
[...]
> with open("file.in") as p:
> mit = []

You have lost the indentation, which makes this code incorrect. But the rest of the code is too complicated.

> for i in p:
>     lines =i.strip("\n").split("\t")
>     if (lines[0] in clubA:
>         G =lines[-1] +["clubA"]
>     else:
>         G = lines[-1] +["no"]
> mit.append(G)
>
> for i in mit:
>     if i.strip("\n").split("\t")[0] in clubB:
>         G =lines[-1] +["clubB"]
>     else:
>         G = lines[-1] +["no"]
>     finale.append(G)

Look at the result you want to get:

> mary clubA clubB
> luke clubA
> luigi no

That suggests to me that the best data structure is a dict with sets:

    {'mary': set(['clubA', 'clubB']),
     'luke': set(['clubA']),
     'luigi': set(),
    }

Something like this should work:

    names = {}
    with open("file.in") as p:
        # This assumes the data file is one name per line.
        for line in p:
            name = line.strip()
            s = names.get(name, set())  # If name not in the names,
                                        # return an empty set.
            if name in clubA:
                s.add("clubA")
            if name in clubB:
                s.add("clubB")
            names[name] = s
    print(names)

And I think that should work.

--
Steven
Re: [Tutor] Help on best way to check presence of item inside list
jarod...@libero.it wrote:
> [...]
> I just wonder whether it is appropriate to use loops to check whether a
> value is present in a list. Is it the right way? I can use a dictionary
> because I have many repeated names.

You mean you have people who are members in more than one club? You can still use a dictionary if you make the value a list:

    # untested code!
    membership = {}
    club = "club_of_people_who_are_not_in_any_club"
    members = ["Erwin", "Kurt", "Groucho"]
    for name in members:
        # can be simplified with dict.setdefault() or collections.defaultdict
        if name in membership:
            membership[name].append(club)
        else:
            membership[name] = [club]

Put that in a loop over (club, members) pairs, and you'll end up with a dict that maps name --> list_of_clubs.

Then iterate over the lines in the file:

    with open("file.in") as source:
        for line in source:
            name = line.strip()
            if name in membership:
                print(name, *membership[name])
            else:
                print(name, "no")

> In the end I want to have
>
> mary clubA clubB
> luke clubA
> luigi no
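Putting the two fragments above together, the loop over (club, members) pairs might be sketched as follows, using the dict.setdefault() simplification the code comment mentions. The shortened `clubs` data, the helper name `report` and the list of strings standing in for the open file are assumptions for illustration; Python 3 is assumed.

```python
# Hypothetical, shortened (club, members) pairs.
clubs = [
    ("clubA", ["mary", "luke", "lucia", "sally"]),
    ("clubB", ["mary", "rebecca", "jane", "lucia", "sally"]),
]

# Build the name --> list_of_clubs mapping.
membership = {}
for club, members in clubs:
    for name in members:
        # setdefault() returns the existing list, or stores and returns [].
        membership.setdefault(name, []).append(club)

def report(lines):
    """Return one annotated string per input line (an iterable of lines)."""
    out = []
    for line in lines:
        name = line.strip()
        # Names in no club fall back to the single marker "no".
        out.append(" ".join([name] + membership.get(name, ["no"])))
    return out

# Stand-in for iterating over the lines of the real input file.
result = report(["mary\n", "luke\n", "luigi\n"])
```

Since `report` accepts any iterable of lines, an open file object could be passed in place of the list.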
[Tutor] Re: Tutor Digest, Vol 123, Issue 65
Thanks so much for the help and suggestions! I am not using a dictionary because some items are repeated inside the list.

>----Original message----
>From: tutor-requ...@python.org
>Date: 27/05/2014 12.00
>Subject: Tutor Digest, Vol 123, Issue 65
>
>[...]
[Tutor] I am having difficulty grasping 'generators'
I am studying Python on my own (I am between the beginner and intermediate level) and I hadn't met any difficulty until I reached the topic 'Generators and Iterators'. I need an explanation of 'yield' that is as simple as the one for using 'print ()'.

Python 2.6 here!

Thank you.
[Tutor] pipes and redirecting
I'm trying to run the following unix command from within Python as opposed to calling an external Bash script (the reason being I'm doing it multiple times within a for loop which is running through a list):

    dd if=/home/adam/1 bs=4k conv=noerror,notrunc,sync | pbzip2 > 1.img.bz2

The first thing I do is break it into two assignments (I know this isn't strictly necessary but it makes the code easier to deal with):

    ddIf = shlex.split("dd if=/home/adam/1 bs=4k conv=noerror,notrunc,sync")
    compress = shlex.split("pbzip2 > /home/adam/1.img.bz2")

I have looked at the docs here (and the equivalent for Python 3): https://docs.python.org/2/library/subprocess.html. I can get a 'simple' pipe like the following to work:

    p1 = subprocess.Popen(["ps"], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(["grep", "ssh"], stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()
    output = p2.communicate()[0]

I then try to adapt it to my example:

    p1 = subprocess.Popen(ddIf, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(compress, stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()
    output = p2.communicate()[0]

I get the following error:

    pbzip2: *ERROR: File [>] NOT found! Skipping...
    ---
    pbzip2: *ERROR: Input file [/home/adam/1.img.bz2] already has a .bz2 extension! Skipping

I think that the '>' redirect needs to be dealt with using the subprocess module as well but I can't quite put the pieces together. I'd appreciate any guidance. Thanks.
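The error messages suggest that '>' and the file name reached pbzip2 as ordinary arguments: the redirect is a shell feature, and no shell is involved when Popen is given an argument list. One possible way around this, sketched below, is to open the output file in Python and pass it as the second process's stdout. The helper name `pipe_to_file` is hypothetical, and the sketch assumes the compressor reads stdin and writes stdout when given no file arguments, as the original shell pipeline implies for pbzip2.

```python
import subprocess
import sys

def pipe_to_file(producer_cmd, consumer_cmd, out_path):
    """Run the equivalent of `producer | consumer > out_path` without a shell.

    Both commands are argument lists (for example from shlex.split).
    Returns the two exit codes as (producer, consumer).
    """
    with open(out_path, "wb") as out_file:
        p1 = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
        # The open file object plays the role of the shell's '>' redirect.
        p2 = subprocess.Popen(consumer_cmd, stdin=p1.stdout, stdout=out_file)
        p1.stdout.close()   # lets p1 get SIGPIPE if p2 exits early
        p2.wait()
        p1.wait()
    return p1.returncode, p2.returncode

# Hypothetical usage for the command in the question (not run here):
#   pipe_to_file(
#       shlex.split("dd if=/home/adam/1 bs=4k conv=noerror,notrunc,sync"),
#       ["pbzip2"],
#       "/home/adam/1.img.bz2",
#   )
```

Checking both return codes matters in a loop over many files, since a failed dd would otherwise leave a truncated archive unnoticed.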
Re: [Tutor] I am having difficulty grasping 'generators'
> I need an explanation of 'yield' that is as simple as the one for using
> 'print ()'.
> Python 2.6 here!

Ever write any C programs with static variables? Generators can be explained in those terms if you have experience with them.

Alan
Re: [Tutor] I am having difficulty grasping 'generators'
On Tue, May 27, 2014 at 12:27 PM, Degreat Yartey wrote:
> I am studying Python on my own (I am between the beginner and
> intermediate level) and I hadn't met any difficulty until I reached the
> topic 'Generators and Iterators'.
> I need an explanation of 'yield' that is as simple as the one for using
> 'print ()'.

You can think of a generator as almost like a function, except it can return, not just once, but multiple times.

Because it can return multiple times, if we squint at it enough, it acts like a _sequence_, just like the other sequence-like things in Python like files and lists and tuples. That is, as a sequence, it's something that we can walk down, element by element. We can loop over it.

For example, let's say that we wanted to represent the same sequence as range(5). Here's one way we can do it with a generator:

    def upToFive():
        yield 0
        yield 1
        yield 2
        yield 3
        yield 4

Let's try it.

    >>> sequence = upToFive()
    >>> next(sequence)
    0
    >>> next(sequence)
    1
    >>> next(sequence)
    2
    >>> next(sequence)
    3
    >>> next(sequence)
    4
    >>> next(sequence)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    >>>
    >>> for x in upToFive():
    ...     print("I see %d" % x)
    ...
    I see 0
    I see 1
    I see 2
    I see 3
    I see 4

Now this is a toy example. If we wanted range(5), we'd just say "range(5)" and be done with it.

What's neat about generators is that they make it easy to build these sequences while pretending that we're writing a plain function. All of the even numbers, for example, form a sequence that we can make with a generator:

    def onlyEvens():
        n = 0
        while True:
            yield n
            n = n + 2

Let's try running it:

    >>> sequence = onlyEvens()
    >>> next(sequence)
    0
    >>> next(sequence)
    2
    >>> next(sequence)
    4
    >>> next(sequence)
    6

And note that this sequence doesn't stop! We can keep calling next() on it and it will continue to run.

We _can_ write a loop to run over such infinite sequences, but we'll also have to make sure to stop it manually: it won't exhaust otherwise, so doing something like:

    for n in onlyEvens():
        ...

better have something in the "..." that interrupts or returns, or else that loop will never end.
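Two common ways of stopping a loop over an infinite generator are sketched below: an explicit break, and itertools.islice, which takes only the first few items. The function `only_evens` is a renamed copy of the generator above, and the limits 10 and 5 are arbitrary choices for illustration.

```python
from itertools import islice

def only_evens():
    """Yield 0, 2, 4, ... forever."""
    n = 0
    while True:
        yield n
        n += 2

# Option 1: break out of the loop once a condition is met.
below_ten = []
for n in only_evens():
    if n >= 10:
        break               # without this, the loop would never end
    below_ten.append(n)

# Option 2: islice() stops after a fixed number of items,
# so the infinite generator is only advanced five times.
first_five = list(islice(only_evens(), 5))
```

Both approaches work because a generator only computes the next value when it is asked for one.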
Re: [Tutor] I am having difficulty grasping 'generators'
On Tue, May 27, 2014 at 12:27 PM, Degreat Yartey wrote:
> I am studying Python on my own (I am between the beginner and
> intermediate level) and I hadn't met any difficulty until I reached the
> topic 'Generators and Iterators'.
> I need an explanation of 'yield' that is as simple as the one for using
> 'print ()'.
> Python 2.6 here!
> Thank you.

Simply put, a generator is just like a function - except that a function returns a value and then dies, but a generator yields a value and sticks around, waiting to yield another one - and another, and another, etc.

I had some difficulty grasping the concept of generators at first because the examples you get in the tutorials are either trivial (you could do the same thing some other way, and it would be simpler) or horribly complex (and you get bogged down in the problem, rather than getting to grips with generators themselves.)

So here's a real-life example - the first time I actually used generators... and they actually saved me a LOT of hard work and confusion.

I needed to read some database files (from a very old database, for which I don't have an API - so I basically needed to read each record and interpret it byte-by-byte). I wrote a bunch of classes for the different tables in the database (I'll use Patient() as my example), but the heart of the operation is a generator.

    def RecordGenerator(office=None, fileType=None):
        obj = fTypes[fileType]()        # create an empty record so we can read its attributes
        recLen = obj.RecordLength       # such as the record length, the filename, etc.
        with open(getattr(office, obj.TLA) + '.dat','rb') as inFile:
            tmpIn = inFile.read()
        buf = StringIO.StringIO(tmpIn)
        inRec = buf.read(recLen)        # initialize inRec and burn the header record
        while not (len(inRec) < recLen):
            inRec = buf.read(recLen)
            obj = fTypes[fileType](inRec)
            if (obj.Valid):
                yield obj
        buf.close()

Now I'm writing a program that needs to process all the patients in office 01. I write the following:

    for obj in RecordGenerator(01, "Patient"):
        doStuffPart1
        doStuffPart2
        doStuffPart3
        etc.

The generator creates an empty Patient() object, looks at its attributes, and finds that the filename is "01PAT.dat" and the records are 1024 bytes long. It opens the file and reads it into tmpIn, then grabs 1024-byte chunks of tmpIn and passes them to Patient() until it finds a valid (not deleted, not corrupted) patient record. It "yields" that record to the caller, and waits.

When "doStuffPart3" has completed, I go back to the top of the "for" loop and ask for the next patient. RecordGenerator steps through tmpIn, 1024 bytes at a time, until it finds a valid patient, then yields it and waits. When it hits the end of the file (or, more precisely, when the remaining length of tmpIn is less than a full record length), it quits.

Essentially, you write a generator as if it were a function - but you use it as if it were a sequence. Best of both worlds, baby!
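The same pattern can be tried without the original database. The sketch below is a simplified stand-in, not the code described above: the 8-byte record length, the convention that a leading '*' marks a deleted record, the name `read_records` and the sample data are all invented for illustration, and Python 3 with io.BytesIO is assumed.

```python
import io

RECORD_LENGTH = 8   # hypothetical; the post describes 1024-byte records

def read_records(stream, record_length=RECORD_LENGTH):
    """Yield each valid fixed-length record from a binary stream."""
    stream.read(record_length)              # skip the header record
    while True:
        raw = stream.read(record_length)
        if len(raw) < record_length:        # less than a full record: done
            return
        if raw[:1] != b"*":                 # hypothetical "deleted" marker
            yield raw                       # hand one record over, then wait

# In-memory stand-in for a .dat file: header, two valid records,
# one deleted record and a truncated tail.
data = b"HEADER__" + b"mary____" + b"*deleted" + b"luke____" + b"trunc"
records = list(read_records(io.BytesIO(data)))
```

The caller only ever sees valid records, one at a time, while the details of skipping the header, filtering and detecting the end of the data stay inside the generator.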
Re: [Tutor] Move files to a new directory with matching folder names
Thanks all, I got it to work using:

    import os
    import shutil

    destination_prefix="/Data/test/"
    source_prefix="/Data/test1/"
    for year in range(2011, 2013):
        year = str(year)
        for month in range(1, 13):
            # convert to a string with leading 0 if needed
            month = "%02d" % month
            source = os.path.join(source_prefix, year, month+"/")
            #print source
            destination = os.path.join(destination_prefix, year, month+"/")
            for files in os.listdir(source):
                print files
                filepath=os.path.join(source, files)
                print "copying", files, "from:", source, "to:", destination
                shutil.copy(filepath, destination)