Re: [Tutor] Understanding a linear runtime implementation of anagram detection
On 10 December 2015 at 12:38, Spyros Charonis wrote: > Dear All, > > I am learning about analysis of algorithms (python 2.7.6). I am reading a > book (Problem solving with Algorithms and Data Structures) where Python is > the language used for implementations. The author introduces algorithm > analysis in a clear and understandable way, and uses an anagram detection > program as a template to compare different runtime implementations > (quadratic, log linear, linear). In the linear, and most efficient > implementation, the code is as follows (comments added by me): This is a good example problem for demonstrating the idea of computational complexity but having actually wanted to write a program that used anagram detection recently I concluded that a quadratic performance can be acceptable or maybe even preferable for this. Here is my own anagram function: def anagram(s1, s2): seen = [False] * len(s2) for c1 in s1: for n2, c2 in enumerate(s2): if c1 == c2 and not seen[n2]: seen[n2] = TrueI would do this by using a dict break else: return False else: return all(seen) This has O(N**2) complexity with N the string lengths. However the real average complexity depends on the distribution of strings. The loops in the above only run to completion if the two strings are anagrams. Most of the time if you need to check whether two strings are anagrams the probability that they are is small. When two strings are not anagrams then the chances are typically high that the first character of s1 is not in s2 at all. What that means is that the most common runtime behaviour is to take the first character c1 from s1 then loop once through s2 verifying that none of the characters in s2 is equal to c1. So in the most common case we have a single iteration through each character of one of the strings, whereas the "linear" complexity algorithm you showed will always loop through both strings fully. There are other differences such as not assuming lower-case ascii characters etc. but these are not so important to this point. If course in practice in Python the above code is inefficient because it is implemented by the interpreter. The following is O(N*log(N)) but will probably out-compete any of the examples shown: def anagram(s1, s2): return sorted(s1) == sorted(s2) The sorted function in Python is very efficient so it will be hard to beat the performance of that unless the strings really are long. This is why anagrams are also a poor example of algorithmic performance: in most contexts where we might practically want to check for anagrams the strings involved are likely to be very short meaning that the big-O type analysis is useless. > My questions are: ... > > 2) > How could I go about adapting this algorithm for multiple strings (say I > had 20 strings and wanted to check if they are anagrams of one another). > > def are_anagrams(*args): > > """ Accepts a tuple of strings and checks if > > they are anagrams of each other """ > > > # Check that neither of strings are null > > for i in args: > > if not i: > > raise TypeError, "Invalid input" > > return None > > > > # Initialize a list of counters for each string > > c = ( [] for i in range(len(args) ) ??? The complexity problem here is that you want to avoid comparing all N strings against each other. If you do then your comparing N*(N-1)/2 (quadratic) strings for anagram-ness. Peter suggested using tuples of counts so that a dict can check them. I would use sorted string keys: def anagrams(*words): bykey = {} for word in words: key = ''.join(sorted(word)) if key not in bykey: bykey[key] = [word] else: bykey[key].append(word) return bykey.values() -- Oscar ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Tutor Digest, Vol 142, Issue 11
Dear Alan, Thank you very much for answering the question. If you don't mind, please kindly let me know which library I should focus on among beautifulsoup, selenium + PhantomJS, and dryscrape. Have a good day regards, Hank On Sun, Dec 13, 2015 at 5:11 PM, wrote: > Send Tutor mailing list submissions to > tutor@python.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://mail.python.org/mailman/listinfo/tutor > or, via email, send a message with subject or body 'help' to > tutor-requ...@python.org > > You can reach the person managing the list at > tutor-ow...@python.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Tutor digest..." > > > Today's Topics: > >1. Re: Calculation error with a simple program (Steven D'Aprano) >2. Re: Calculation error with a simple program (Jim Gallaher) >3. Re: Calculation error with a simple program (Todd Purple) >4. Re: Tutor Digest, Vol 142, Issue 10 (Ken Hammer) >5. Beautiful Soup (Crusier) >6. Re: Beautiful Soup (Alan Gauld) > > > -- > > Message: 1 > Date: Sun, 13 Dec 2015 04:00:25 +1100 > From: Steven D'Aprano > To: tutor@python.org > Subject: Re: [Tutor] Calculation error with a simple program > Message-ID: <20151212170025.gk3...@ando.pearwood.info> > Content-Type: text/plain; charset=us-ascii > > On Sat, Dec 12, 2015 at 01:03:05AM -0600, Jim Gallaher wrote: >> Hi everyone. I'm reading through a beginners Python book and came up >> with a super simple program. I'm not getting any errors and everything >> runs through, but there's a logical calculation error. What the program >> does is take an amount and calculate a couple percentages and add a >> couple fees. >> >> For example, if I put in a value of 1, it will output 752.12 as the sub >> total and 753.12 as the grand total. It's off by 1 on sub total and 2 on >> grand total. > > Check your arithmetic -- with a base price of $1, the sub total is > $752.12 and the grand total is $753.12, exactly as Python calculates. > > > But I wonder whether you have made a logic mistake in your code. You > say: > > dealerPrep = basePrice + 500 > destinationCharge = basePrice + 250 > > This means that the car will cost more than TRIPLE the base price. > Instead of $1, enter a base price of $40,000 and you will see what I > mean: now the dealer prep is $40500 and the destination charge is > $40250, which added to the base price gives $120750 (plus tax and > licence). Surely that's not right, the deal charges more than the cost > of the car for preparation? I think what you want is: > > dealerPrep = 500 > destinationCharge = 250 > > > > -- > Steve > > > -- > > Message: 2 > Date: Sat, 12 Dec 2015 11:34:27 -0600 > From: Jim Gallaher > To: tutor@python.org > Subject: Re: [Tutor] Calculation error with a simple program > Message-ID: <8774a398-4a0f-41df-bbd2-941b49a3c...@gmail.com> > Content-Type: text/plain; charset=us-ascii > > Hi Alan, > > I'm 100 percent sure I'm wrong. :-) I verified it when I fixed the mistake. > The problem was it was adding in the basePrice and the fixed > rates/percentages each time. So I figured it out when Ian said something > about that. > > Thanks for everyone's help! :-) > > -- > > Message: 3 > Date: Sat, 12 Dec 2015 08:13:23 -0500 > From: Todd Purple > To: Jim Gallaher > Cc: tutor@python.org > Subject: Re: [Tutor] Calculation error with a simple program > Message-ID: <6ec6d759-3c19-4a6d-8dd7-6efae1958...@yahoo.com> > Content-Type: text/plain; charset=us-ascii > > > >> On Dec 12, 2015, at 2:03 AM, Jim Gallaher wrote: >> >> Hi everyone. I'm reading through a beginners Python book and came up with a >> super simple program. I'm not getting any errors and everything runs >> through, but there's a logical calculation error. What the program does is >> take an amount and calculate a couple percentages and add a couple fees. >> >> For example, if I put in a value of 1, it will output 752.12 as the sub >> total and 753.12 as the grand total. It's off by 1 on sub total and 2 on >> grand total. >> >> Thanks in advance! Jim Gallaher >> >> # Car Salesman Calculator >> >> # User enters the base price of the car and the program adds tax, license, >> dealer prep, and destination charge. >> >> print("Car Sales Calculator") >> basePrice = int(input("Please enter in the price of the car: ")) >> >> # Misc charges to be added to the total cost of the car >> tax = basePrice * .07 >> license = basePrice * .05 >> dealerPrep = basePrice + 500 >> destinationCharge = basePrice + 250 >> > > I think your main problem is right here. Why do dealerPrep and > destinationCharge include the basePrice in their equation? This will add the > basePrice in multiple times. I would simply set it like this: > > dealerPrep = 500 > destinationCharge = 250 > > If these char
[Tutor] parser recommendations (was Re: Tutor Digest, Vol 142, Issue 11)
On 14/12/15 16:16, Crusier wrote: Please always supply a useful subject line when replying to the digest and also delete all irrelevant text. Some people pay by the byte and we have all received these messages already. > Thank you very much for answering the question. If you don't mind, > please kindly let me know which library I should focus on among > beautifulsoup, selenium + PhantomJS, and dryscrape. I don't know anything about the others but Beautiful soup is good for html, especially badly written/generated html. -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
[Tutor] interface
So far all my python programming has been done using print for output and (raw)input whenever user input is required. I'd like to learn how to provide a graphical interface. There are several choices with pros and cons to each but an alternative more general approach might be to use a web based interface which I believe would make the code less platform dependent. I'd be interested in any comments regarding this idea and would be grateful for any suggestions as to frameworks that might be helpful. I've been looking at django but finding the learning curve to be steep and it may not be the best solution. Of late I've been using Python3; my platform is Ubuntu 14.4. Thanks in advance for any guidance. Alex ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] interface
On 14/12/2015 21:48, Alex Kleider wrote: So far all my python programming has been done using print for output and (raw)input whenever user input is required. I'd like to learn how to provide a graphical interface. There are several choices with pros and cons to each but an alternative more general approach might be to use a web based interface which I believe would make the code less platform dependent. I'd say your belief is simply wrong, many Python desktop GUIs are cross platform. I'd start with tkinter as it's available with Python, there are loads of tutorials available for it and you'll be able to get advice here or on the main Python mailing list. I'd be interested in any comments regarding this idea and would be grateful for any suggestions as to frameworks that might be helpful. I've been looking at django but finding the learning curve to be steep and it may not be the best solution. Django strikes me as pure overkill for your needs. If you must go down the web route I'd start here https://wiki.python.org/moin/WebFrameworks at the section headed "Popular Non Full-Stack Frameworks". Of late I've been using Python3; my platform is Ubuntu 14.4. Thanks in advance for any guidance. Alex -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] interface
Alex Kleider writes: > I'd be interested in any comments regarding this idea and would be > grateful for any suggestions as to frameworks that might be helpful. The collected wisdom of the community includes this page at the Python wiki https://wiki.python.org/moin/GuiProgramming>, and a brief comparison of toolkits is at https://wiki.python.org/moin/GUI%20Programming%20in%20Python>. > I've been looking at django but finding the learning curve to be steep > and it may not be the best solution. That seems a reasonable assessment; if you just want a UI for a not-particularly-website program, Django may be a poor fit. > Of late I've been using Python3; my platform is Ubuntu 14.4. I recommend you start with the default, heavily-documented Tkinter https://docs.python.org/3/library/tkinter.html> and the Ttk extension https://docs.python.org/3/library/tkinter.ttk.html>. Both of these are part of the Python 3 system in Ubuntu. You can find a good guide to Tk programming in Python as one stream of the excellent http://www.tkdocs.com/>. -- \“If you have the facts on your side, pound the facts. If you | `\ have the law on your side, pound the law. If you have neither | _o__) on your side, pound the table.” —anonymous | Ben Finney ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] parser recommendations (was Re: Tutor Digest, Vol 142, Issue 11)
beautifulsoup, selenium + PhantomJS, and dryscrape no knowledge of dryscape, never used it. The other tools/apps are used to handle/parse html/websites. Ssoup can handle xml/html as well as other input structs. Good for being able to parse the resulting struct/dom to extract data, or to change/modify the struct itself. Selenium is a framework, acting as a browser env, allowing you to 'test' the site/html. It's good for certain uses regarding testing. Phantomjs/casperjs are exxentially headless broswers, allow you to also run/parse websites. While Soup is more for static, Phantom because it's an actual headless browser, allows you to deal with dynamic sites as well as static. On Mon, Dec 14, 2015 at 2:56 PM, Alan Gauld wrote: > On 14/12/15 16:16, Crusier wrote: > > Please always supply a useful subject line when replying to the digest > and also delete all irrelevant text. Some people pay by the byte and we > have all received these messages already. > >> Thank you very much for answering the question. If you don't mind, >> please kindly let me know which library I should focus on among >> beautifulsoup, selenium + PhantomJS, and dryscrape. > > I don't know anything about the others but Beautiful soup > is good for html, especially badly written/generated html. > > -- > Alan G > Author of the Learn to Program web site > http://www.alan-g.me.uk/ > http://www.amazon.com/author/alan_gauld > Follow my photo-blog on Flickr at: > http://www.flickr.com/photos/alangauldphotos > > > ___ > Tutor maillist - Tutor@python.org > To unsubscribe or change subscription options: > https://mail.python.org/mailman/listinfo/tutor ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] interface
On 14/12/15 21:48, Alex Kleider wrote: > provide a graphical interface. There are several choices with pros and > cons to each but an alternative more general approach might be to use a > web based interface which I believe would make the code less platform > dependent. Not necessarily since the web server end still needs to live on a particular OS platform and any OS level operations will be just as restrictive for a web app as for a GUI. But a web app is able to be opened up to multiple platforms as clients so it depends on how you intend to deploy it. If its just for you (or for any one user) then the web doesn't help much and adds lots of extra learning. A GUI will involve new concepts too and a lot more code than a CLI like you have been using. But a web app will require whole new languages(HTML/CSS/Javascript) as well as new concepts. A much steeper learning curve unless you already come from a web background. The simplest GUI framework is Tkinter, but its also fairly limiting longer term. For putting a quick form interface on top of an existing app its great but for a full GUI app its less effective. It also has virtually no GUI builder tool support. wxPython is slightly more complex but more powerful and looks more like the native UI on each platform. (There is also Dabo which is wxPython based but has an IDE with GUI builder and is very good for database oriented apps.) pyQt/Side and PyGTK are both much more complex but also more powerful. IDES with GUI Builders exist for both although none of them worked very well for me. They are all cross platform and you more or less trade power and flexibility for simplicity. Personally I'd start with Tkinter - its a good general intro to the concepts of GUI programming and standard Python so portable to other users without extra downloads. Then decide which of the other three to migrate too if you need more power. Personally I find Tkinter good enough for almost all my GUI needs, but I'm not distributing them or selling them. All of the above is strictly my personal opinion of course. Others may disagree. -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
[Tutor] is v == for small ints
Somebody posted a question about this today and I approved it but it hasn't shown up. We have had about 6 or 7 such problems in the last month. Mainly they have been thank-you messages so I didn't make an issue of it but a couple have been genuine questions, like this was. So if you have posted recently and it hasn't got through, my apologies, but I have no idea what is going wrong. As to the content of this message, the gist was that he had tried 'is' and == with both small and large ints. When using small ints the results were the same for both operators but for large ints they differed. He knew the conceptual difference between 'is' and == but didn't understand why the results were different for large/small ints. The answer is of course that small ints are cached internally as an optimisation so all references to 5, for example, refer to the same object in memory and therefore both is (object identity) and == (value equality) are true. Large ints result in two separate objects each with the same value so 'is' becomes False and == remains true. This behaviour should never be relied on since the definition of a "small int" is potentially implementation dependant and could even change with Python version or platform. Always use == to test integers. Apologies again about the missing messages. -- Alan G Author of the Learn to Program web site http://www.alan-g.me.uk/ http://www.amazon.com/author/alan_gauld Follow my photo-blog on Flickr at: http://www.flickr.com/photos/alangauldphotos ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] is v == for small ints
On Tue, Dec 15, 2015 at 01:54:30AM +, Alan Gauld wrote: > This behaviour should never be relied on since the definition > of a "small int" is potentially implementation dependant and > could even change with Python version or platform. Always > use == to test integers. It isn't just *potentially* implementation dependent, it certainly is implementation dependant. Different versions of Python have used different rules for which small ints are cached, and the rules for when they will be re-used or cached. For example, this is with Python 2.7: py> a = 11 py> b = 11 py> a is b False py> a = 12; b = 12 py> a is b True You CANNOT rely on this behaviour, it WILL change from version to version. -- Steve ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] interface
Thank you, gentlemen (Alan, Ben, Mark,) for your advice. The consensus seems to be in favour of tkinter so I'll head in that direction. Cheers, Alex ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] interface
Alex Kleider writes: > Thank you, gentlemen (Alan, Ben, Mark,) for your advice. > The consensus seems to be in favour of tkinter > so I'll head in that direction. Keep in mind that Tkinter is the “one obvious way to do it” for GUIs in Python. It is the general-purpose toolkit included with the standard library. There are other ways (i.e. other GUI toolkits for Python) that may be better suited to your specific case, but are not the obvious general default — and they impose a higher deployment burden because they are third-party libraries. So, start with Tkinter and get it working that way. Then, once you better understand your specific case by making a working solution, you can re-assess. Tkinter might be just fine, or you may want to try others. But start with Tkinter, yes. -- \ “It seems intuitively obvious to me, which means that it might | `\ be wrong.” —Chris Torek | _o__) | Ben Finney ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor