Re: [Tutor] Python - help with something most essential

2017-06-11 Thread Peter Otten
Japhy Bartlett wrote:

> I'm not sure that they cared about how you used file.readlines(), I think
> the memory comment was a hint about instantiating Counter()s

Then they would have been clueless ;)

Both Schtvveer's original script and his subsequent "Verschlimmbesserung" -- 
beautiful German word for making things worse when trying to improve them --  
use only two Counters at any given time. The second version is very 
inefficient because it builds the same Counter over and over again -- but 
this does not affect peak memory usage much.

Here's the original version that triggered the comment:

[Schtvveer Schvrveve]

> import sys
> from collections import Counter
> 
> def main(args):
>     filename = args[1]
>     word = args[2]
>     print countAnagrams(word, filename)
> 
> def countAnagrams(word, filename):
> 
>     fileContent = readFile(filename)
> 
>     counter = Counter(word)
>     num_of_anagrams = 0
> 
>     for i in range(0, len(fileContent)):
>         if counter == Counter(fileContent[i]):
>             num_of_anagrams += 1
> 
>     return num_of_anagrams
> 
> def readFile(filename):
> 
>     with open(filename) as f:
>         content = f.readlines()
> 
>     content = [x.strip() for x in content]
> 
>     return content
> 
> if __name__ == '__main__':
>     main(sys.argv)
> 
 
I'll refer to it as before.py below. Here's a variant that removes 
readlines(), range(), and the [x.strip() for x in content] list 
comprehension. The goal is minimal changes, not code as I would write it 
from scratch.

# after.py
import sys
from collections import Counter

def main(args):
    filename = args[1]
    word = args[2]
    print countAnagrams(word, filename)

def countAnagrams(word, filename):

    fileContent = readFile(filename)
    counter = Counter(word)
    num_of_anagrams = 0

    for line in fileContent:
        if counter == Counter(line):
            num_of_anagrams += 1

    return num_of_anagrams

def readFile(filename):
    # this relies on garbage collection to close the file
    # which should normally be avoided
    for line in open(filename):
        yield line.strip()

if __name__ == '__main__':
    main(sys.argv)
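
The comment in readFile() points at the one weak spot of after.py. Below is a minimal sketch of a variant that keeps the generator but closes the file deterministically using a with statement. The names read_lines and count_anagrams are made up for this illustration and are not part of either script, and the sketch is written for Python 3.

```python
from collections import Counter


def read_lines(filename):
    # The with block closes the file as soon as the generator is
    # exhausted or explicitly closed, instead of waiting for
    # garbage collection.
    with open(filename) as f:
        for line in f:
            yield line.strip()


def count_anagrams(word, filename):
    counter = Counter(word)
    # Only the current line and two Counters are alive at any time.
    return sum(1 for line in read_lines(filename) if Counter(line) == counter)
```

Memory behaviour should be the same as for after.py, because the lines are still produced one at a time.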

How to measure memory usage? I found

 and as test data I use files containing 10**5 and 10**6 
integers. With that setup (snipping everything but memory usage from the 
time -v output):

$ /usr/bin/time -v python before.py anagrams5.txt 123
6
Maximum resident set size (kbytes): 17340
$ /usr/bin/time -v python before.py anagrams6.txt 123
6
Maximum resident set size (kbytes): 117328


$ /usr/bin/time -v python after.py anagrams5.txt 123
6
Maximum resident set size (kbytes): 6432
$ /usr/bin/time -v python after.py anagrams6.txt 123
6
Maximum resident set size (kbytes): 6432

See the pattern? before.py uses O(N) memory, after.py O(1). 

Run your own tests if you need more datapoints or prefer a different method 
to measure memory consumption.
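
As one possible "different method", here is a rough sketch that uses the standard tracemalloc module (Python 3) instead of time -v. It measures the peak of Python-level allocations, not the resident set size, so the absolute numbers will not match the ones above. The two counting functions only imitate the style of before.py and after.py, and make_test_file() is a hypothetical helper that writes the integers 0..n-1, one per line; the test files used for the measurements above may have looked different.

```python
import os
import tempfile
import tracemalloc
from collections import Counter


def count_with_list(word, filename):
    # before.py style: all stripped lines are held in a list
    with open(filename) as f:
        lines = [line.strip() for line in f]
    counter = Counter(word)
    return sum(1 for line in lines if Counter(line) == counter)


def count_with_generator(word, filename):
    # after.py style: one line at a time
    counter = Counter(word)
    with open(filename) as f:
        return sum(1 for line in f if Counter(line.strip()) == counter)


def peak_bytes(func, *args):
    # Peak size of the Python-level allocations made while func runs
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def make_test_file(n):
    # Hypothetical test data: the integers 0..n-1, one per line
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        f.writelines("%d\n" % i for i in range(n))
    return path


if __name__ == "__main__":
    path = make_test_file(10**5)
    try:
        for func in (count_with_list, count_with_generator):
            result, peak = peak_bytes(func, "123", path)
            print(func.__name__, result, peak)
    finally:
        os.remove(path)
```

If the reasoning above holds, the peak for count_with_list should grow with the number of lines while the peak for count_with_generator stays roughly constant.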

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] string reversal using [::-1]

2017-06-11 Thread Peter Otten
Vikas YADAV wrote:

> Question: Why does "123"[::-1] result in "321"?
> 
>  
> 
> My thinking is [::-1] is same as [0:3:-1], that the empty places defaults
> to start and end index of the string object.
> 
> So, if we start from 0 index and decrement index by 1 till we reach 3, how
> many index we should get? I think we should get infinite number
> of indices (0,-1,-2,-3...).
> 
>  
> 
> This is my confusion.
> 
> I hope my question is clear.

It takes both a slice object and a sequence length to replace the missing 
(i.e. None) values with actual integers. You can experiment with this a bit 
in the interpreter:

>>> class A:
...     def __getitem__(self, index):
...         return index
... 
>>> a = A()
>>> a[::-1]
slice(None, None, -1)
>>> a[::-1].indices(10)
(9, -1, -1)
>>> a[::-1].indices(5)
(4, -1, -1)
>>> a[::].indices(5)
(0, 5, 1)

So what a missing value actually means depends on the context: the length of 
the sequence and the sign of the step.
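
To tie this back to the original question, here is a small sketch that expands a slice into the list of indices it visits. The helper name explicit_indices is made up for this illustration.

```python
def explicit_indices(sequence, sl):
    # slice.indices(length) replaces None values and clamps out-of-range
    # ones; it returns a (start, stop, step) tuple suitable for range()
    start, stop, step = sl.indices(len(sequence))
    return list(range(start, stop, step))


s = "123"
print(slice(None, None, -1).indices(len(s)))       # (2, -1, -1)
print(explicit_indices(s, slice(None, None, -1)))  # [2, 1, 0]
print(explicit_indices(s, slice(0, 3, -1)))        # []
print(repr(s[0:3:-1]))                             # ''
```

So [::-1] is not the same as [0:3:-1]. The latter starts at 0 and can never reach 3 by going downwards, which gives an empty result, not an infinite one. Also note that the -1 in (2, -1, -1) only makes sense as an argument for range(); written into a slice it would mean "the last item", so s[2:-1:-1] is empty, too.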

Note that while I'm a long-time Python user I still smell danger when a 
negative step is involved. For everything but items[::-1] I prefer

reversed(items[start:stop])  # an iterator, not an object of type(items)

when possible.
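
A small sketch of the kind of danger I have in mind. Both function names are made up for this example, and naive_negative_step() is deliberately the tempting but wrong translation of a forward slice into a negative-step slice.

```python
items = list("abcdef")


def reversed_part(items, start, stop):
    # Slice forward first, then reverse; reversed() returns an iterator,
    # so list() is needed to get a list back
    return list(reversed(items[start:stop]))


def naive_negative_step(items, start, stop):
    # Looks equivalent, but for start == 0 the stop value becomes -1,
    # which a slice interprets as "the last item"
    return items[stop - 1:start - 1:-1]


print(reversed_part(items, 1, 4))        # ['d', 'c', 'b']
print(naive_negative_step(items, 1, 4))  # ['d', 'c', 'b']
print(reversed_part(items, 0, 3))        # ['c', 'b', 'a']
print(naive_negative_step(items, 0, 3))  # []
```

The two agree until start is 0; then the negative-step version silently returns an empty list.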
