Re: [Tutor] Huge list comprehension

2017-06-10 Thread Abdur-Rahmaan Janhangeer
take a look at numpy

and don't necessarily give us the whole code. it becomes too long without
purpose

Abdur-Rahmaan Janhangeer,
Mauritius
abdurrahmaanjanhangeer.wordpress.com

On 6 Jun 2017 03:26, "syed zaidi" wrote:


Hi,

I would appreciate it if you could help me by suggesting a quick and
efficient strategy for comparing multiple lists with one principal list.

I have about 125 lists, each containing about 100,000 numerical entries.

My principal list contains about 6 million entries.

I want to compare each small list with the main list and append yes/no or
0/1 to a new list corresponding to each of the 125 lists.


The program works, but it takes ages to process huge files.
Can someone please tell me how I can make this process faster? Right now it
takes around 2 weeks to complete this task.


The code I have written, which works, is as follows:


sample_name = []

main_op_list,principal_list = [],[]
dictionary = {}

with open("C:/Users/INVINCIBLE/Desktop/T2D_ALL_blastout_batch.txt", 'r') as f:
    reader = csv.reader(f, dialect = 'excel', delimiter='\t')
    list2 = filter(None, reader)
    for i in range(len(list2)):
        col1 = list2[i][0]
        operon = list2[i][1]
        main_op_list.append(operon)
        col1 = col1.strip().split("_")
        sample_name = col1[0]
        if dictionary.get(sample_name):
            dictionary[sample_name].append(operon)
        else:
            dictionary[sample_name] = []
            dictionary[sample_name].append(operon)
locals().update(dictionary) ## converts dictionary keys to variables
##print DLF004
dict_values = dictionary.values()
dict_keys = dictionary.keys()
print dict_keys
print len(dict_keys)
main_op_list_np = np.array(main_op_list)

(DLF002_1,DLF004_1,DLF005_1,DLF006_1,DLF007_1,DLF008_1,
 DLF009_1,DLF010_1,DLF012_1,DLF013_1,DLF014_1,DLM001_1,
 DLM002_1,DLM003_1,DLM004_1,DLM005_1,DLM006_1,DLM009_1,
 DLM011_1,DLM012_1,DLM018_1,DOF002_1,DOF003_1) = ([],[],[],[],[],[],[],[],[],[],
 [],[],[],[],[],[],[],[],[],[],[],[],[])
(DOF004_1,DOF006_1,DOF007_1,DOF008_1,DOF009_1,DOF010_1,
 DOF011_1,DOF012_1,DOF013_1,DOF014_1,DOM001_1,DOM003_1,
 DOM005_1,DOM008_1,DOM010_1,DOM012_1,DOM013_1,DOM014_1,
 DOM015_1,DOM016_1,DOM017_1,DOM018_1,DOM019_1) = ([],[],[],[],[],[],[],[],[],[],
 [],[],[],[],[],[],[],[],[],[],[],[],[])
(DOM020_1,DOM021_1,DOM022_1,DOM023_1,DOM024_1,DOM025_1,DOM026_1) = (
 [],[],[],[],[],[],[])
(NLF001_1,NLF002_1,NLF005_1,NLF006_1,NLF007_1,NLF008_1,
 NLF009_1,NLF010_1,NLF011_1,NLF012_1,NLF013_1,NLF014_1,
 NLF015_1,NLM001_1,NLM002_1,NLM003_1,NLM004_1,NLM005_1,
 NLM006_1,NLM007_1,NLM008_1,NLM009_1,NLM010_1) = ([],[],[],[],[],[],[],[],[],[],
 [],[],[],[],[],[],[],[],[],[],[],[],[])
(NLM015_1,NLM016_1,NLM017_1,NLM021_1,NLM022_1,NLM023_1,
 NLM024_1,NLM025_1,NLM026_1,NLM027_1,NLM028_1,NLM029_1,
 NLM031_1,NLM032_1,NOF001_1,NOF002_1,NOF004_1,NOF005_1,
 NOF006_1,NOF007_1,NOF008_1,NOF009_1,NOF010_1) = ([],[],[],[],[],[],[],[],[],[],
 [],[],[],[],[],[],[],[],[],[],[],[],[])
(NOF011_1,NOF012_1,NOF013_1,NOF014_1,NOM001_1,NOM002_1,
 NOM004_1,NOM005_1,NOM007_1,NOM008_1,NOM009_1,NOM010_1,
 NOM012_1,NOM013_1,NOM015_1,NOM016_1,NOM017_1,NOM018_1,
 NOM019_1,NOM020_1,NOM022_1,NOM023_1,NOM025_1) = ([],[],[],[],[],[],[],[],[],[],
 [],[],[],[],[],[],[],[],[],[],[],[],[])
NOM026_1,NOM027_1,NOM028_1,NOM029_1 = [],[],[],[]


for i in main_op_list_np:
    if i in DLF002: DLF002_1.append('1')
    else: DLF002_1.append('0')
    if i in DLF004: DLF004_1.append('1')
    else: DLF004_1.append('0')
    if i in DLF005: DLF005_1.append('1')
    else: DLF005_1.append('0')
    if i in DLF006: DLF006_1.append('1')
    else: DLF006_1.append('0')
    if i in DLF007: DLF007_1.append('1')
    else: DLF007_1.append('0')
    if i in DLF008: DLF008_1.append('1')
    else: DLF008_1.append('0')
    ##   if main_op_list[i] in DLF009: DLF009_1.append('1')
    ##   else:DLF009_1.append('0')
    if i in DLF010: DLF010_1.append('1')
    else: DLF010_1.append('0')
    if i in DLF012: DLF012_1.append('1')
    else: DLF012_1.append('0')
    if i in DLF013: DLF013_1.append('1')
    else: DLF013_1.append('0')
    if i in DLF014: DLF014_1.append('1')
    else: DLF014_1.append('0')
    if i in DLM001: DLM001_1.append('1')
    else: DLM001_1.append('0')
    if i in DLM002: DLM002_1.append('1')
    else: DLM002_1.append('0')
    if i in DLM003: DLM003_1.append('1')
    else: DLM003_1.append('0')
    if i in DLM004: DLM004_1.append('1')
    else: DLM004_1.append('0')
    if i in DLM005: DLM005_1.append('1')
    else: DLM005_1.append('0')
    if i in DLM006: DLM006_1.append('1')
    else: DLM006_1.append('0')
    if i in DLM009: DLM009_1.append('1')
    else: DLM009_1.append('0')
    if i in DLM011: DLM011_1.append('1')
    else: DLM011_1.append('0')
    if i in DLM012: DLM012_1.append('1')
    else: DLM012_1.append('0')
    if i in DLM018: DLM018_1.append('1')
    else: DLM018_1.append('0')
    if i in DOF002: DOF002_1.append('1')
    else: DOF002_1.append('0')
    if i in DOF003: DOF003_1.append('1')
    else: DOF003_1.append('0')
    if i in DOF004: DOF004_1.append('1')
    else: DOF004_1.app

Re: [Tutor] Huge list comprehension

2017-06-10 Thread Alan Gauld via Tutor
On 10/06/17 08:35, Abdur-Rahmaan Janhangeer wrote:
> take a look at numpy

It seems he already has, np.array is in his code.
It's just the imports that are missing I suspect.

> and don't necessarily give us the whole code. it becomes too long without
> purpose

Yes although in this case it does serve to highlight
the problems with his approach - as highlighted by Peter.
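
For readers of the archive, here is a minimal sketch of one common way to
speed up this kind of comparison. It is not the original poster's code, nor
the reply referred to above. It rests on two observations: an `in` test
against a list scans the list, while an `in` test against a set is a hash
lookup; and a dict keyed by sample name removes the need for one variable
per sample. The function name build_presence_table and the sample rows are
invented for illustration, and the sketch is written for Python 3, whereas
the code in the thread is Python 2.

```python
from collections import defaultdict


def build_presence_table(rows):
    """Build a 0/1 presence list per sample.

    rows is an iterable of (sample_id, operon) pairs, for example the rows
    produced by csv.reader. Returns (main_op_list, presence), where
    presence[sample_name] holds one '1' or '0' per entry of main_op_list.
    """
    main_op_list = []
    operons_by_sample = defaultdict(set)  # sample name -> set of its operons

    for sample_id, operon in rows:
        main_op_list.append(operon)
        # Same convention as the thread: the sample name is the part of the
        # first column before the first underscore.
        sample_name = sample_id.strip().split("_")[0]
        operons_by_sample[sample_name].add(operon)

    # Each membership test is now a set lookup instead of a list scan.
    presence = {
        name: ['1' if op in operons else '0' for op in main_op_list]
        for name, operons in operons_by_sample.items()
    }
    return main_op_list, presence


# Hypothetical sample data, only to show the shape of the result.
rows = [("DLF002_a", "op1"), ("DLF004_b", "op2"), ("DLF002_c", "op3")]
main_ops, presence = build_presence_table(rows)
print(presence["DLF002"])
print(presence["DLF004"])
```

The loop over samples replaces the long hand-written chain of if/else
statements, so adding a sample to the input file needs no code change.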


-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] string reversal using [::-1]

2017-06-10 Thread Vikas YADAV
Question: Why does "123"[::-1] result in "321"?

My thinking is that [::-1] is the same as [0:3:-1], i.e. that the empty places
default to the start and end index of the string object.

So, if we start from index 0 and decrement the index by 1 until we reach 3,
how many indices should we get? I think we should get an infinite number of
indices (0, -1, -2, -3, ...).

This is my confusion.

I hope my question is clear.

Thanks,

Vikas



Re: [Tutor] Python - help with something most essential

2017-06-10 Thread Japhy Bartlett
It's really awkward the way you're using Counter here... you're making new
instances in every lambda (which is not great for memory usage), and then
not actually using the Counter functionality:

return sum(1 for _ in filter(lambda x: Counter(word) == Counter(x.strip()), fileContent))

(the whole point of the Counter() is to get sums, you don't need to do any
of this !!)

I'm not sure that they cared about how you used file.readlines(), I think
the memory comment was a hint about instantiating Counter()s


anyhow, all of this can be much much simpler:

"""
sortedword = sorted(inputWord)  # a list of the letters in the word, in alphabetical order
count = 0

with open(filename) as f:
    for word in f.read().split():  # iterate over every whitespace-separated word in the file
        if sorted(word) == sortedword:
            count += 1

print count
"""

which could in turn probably be written as a one liner.
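
As a rough sketch of what such a one-liner might look like (an illustration
only, not code from the thread): the version below compares sorted letters as
above, but iterates over the file object directly so that only one line is
held in memory at a time. It assumes one word per line, as the original
readFile() did, and is written for Python 3. The name count_anagrams and the
temporary demo file are made up for the example.

```python
import os
import tempfile


def count_anagrams(word, filename):
    """Count the lines of filename that are anagrams of word."""
    target = sorted(word)
    with open(filename) as f:
        # A file object yields one line at a time, so the whole file is
        # never loaded at once. True counts as 1 when summed.
        return sum(sorted(line.strip()) == target for line in f)


# Demo with a throwaway file (hypothetical data).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("listen\nsilent\nenlist\ngoogle\ntinsel\n")
    path = tmp.name

result = count_anagrams("listen", path)
os.remove(path)
print(result)  # 4: listen, silent, enlist, tinsel
```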

So even though you cleaned the code up a bit, it's still quite a bit more
complicated than it needs to be, which makes it seem like your fundamentals
are not great either!





On Tue, Jun 6, 2017 at 2:31 AM, Peter Otten <__pete...@web.de> wrote:

> Schtvveer Schvrveve wrote:
>
> > I need someone's help. I am not proficient in Python and I wish to
> > understand something. I was in a job pre-screening process where I was
> > asked to solve a simple problem.
> >
> > The problem was supposed to be solved in Python and it was supposed to
> > take two arguments: filename and word. The program reads the file which is
> > a .txt file containing a bunch of words and counts how many of those words
> > in the file are anagrams of the argument.
> >
> > First I concocted this solution:
> >
> > import sys
> > from collections import Counter
> >
> > def main(args):
> >     filename = args[1]
> >     word = args[2]
> >     print countAnagrams(word, filename)
> >
> > def countAnagrams(word, filename):
> >
> >     fileContent = readFile(filename)
> >
> >     counter = Counter(word)
> >     num_of_anagrams = 0
> >
> >     for i in range(0, len(fileContent)):
> >         if counter == Counter(fileContent[i]):
> >             num_of_anagrams += 1
> >
> >     return num_of_anagrams
> >
> > def readFile(filename):
> >
> >     with open(filename) as f:
> >         content = f.readlines()
> >
> >     content = [x.strip() for x in content]
> >
> >     return content
> >
> > if __name__ == '__main__':
> >     main(sys.argv)
> >
> > Very quickly I received this comment:
> >
> > "Can you adjust your solution a bit so you less loops (as little as
> > possible) and also reduce the memory usage footprint of you program?"
> >
> > I tried to rework the methods into this:
> >
> > def countAnagrams(word, filename):
> >
> >     fileContent = readFile(filename)
> >
> >     return sum(1 for _ in filter(lambda x: Counter(word) == Counter(x.strip()), fileContent))
> >
> > def readFile(filename):
> >
> >     with open(filename) as f:
> >         content = f.readlines()
> >
> >     return content
> >
> > And I was rejected. I just wish to understand what I could have done for
> > this to be better?
> >
> > I am a Python beginner, so I'm sure there are things I don't know, but I
> > was a bit surprised at the abruptness of the rejection and I'm worried I'm
> > doing something profoundly wrong.
>
> for i in range(0, len(stuff)):
>     ...
>
> instead of
>
> for item in stuff:
>     ...
>
> and
>
> content = file.readlines() # read the whole file into memory
> process(content)
>
> are pretty much the most obvious indicators that you are a total newbie in
> Python. Looks like they weren't willing to give you the time to iron that
> out on the job even though you knew about lambda, Counter, list
> comprehensions and generator expressions which are not newbie stuff.
>
> When upon their hint you did not address the root cause of the unbounded
> memory consumption they might have come to the conclusion that you were
> reproducing snippets you picked up somewhere and thus were cheating.
>


Re: [Tutor] string reversal using [::-1]

2017-06-10 Thread Alan Gauld via Tutor
On 10/06/17 17:39, Vikas YADAV wrote:
> Question: Why does "123"[::-1] result in "321"?
> 
> My thinking is that [::-1] is the same as [0:3:-1], i.e. that the empty places
> default to the start and end index of the string object.

Did you try that? You may be surprised.
The wonderful thing about the >>> prompt is that it's as quick
to try it out as to guess.

Look in the archives for a recent post by Steven (29/5,
"Counting a string backwards") that explains slicing;
it may help.
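
For anyone reading the archive without a prompt to hand, the experiment
suggested above might look like the sketch below (Python 3 syntax; this is an
illustration, not part of the post being referred to). It uses the built-in
slice.indices() method to show which concrete bounds the empty places take
when the step is negative.

```python
s = "123"

# With a negative step, the omitted bounds do NOT default to 0 and len(s).
print(repr(s[::-1]))    # '321'
print(repr(s[0:3:-1]))  # '' : starting at 0 and stepping down never reaches 3

# slice.indices(length) reports the start, stop and step actually used.
start, stop, step = slice(None, None, -1).indices(len(s))
print(start, stop, step)  # 2 -1 -1

# Here stop == -1 means "one position before index 0". It is meant for
# range(); written literally inside s[2:-1:-1] it would be reinterpreted
# as "last character" and give an empty string.
visited = list(range(start, stop, step))
print(visited)  # [2, 1, 0]

reversed_s = "".join(s[i] for i in visited)
print(reversed_s)  # 321
```

So with a step of -1 the walk starts at the last index and stops just before
the beginning, which is why the result is finite and reversed.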


