Re: [Tutor] Help error 504
Danny Yoo wrote:

> Unit tests that depend on external dependencies can be "flaky": they
> might fail for reasons that you don't anticipate. I'd recommend not
> depending on an external web site like this: it puts load on someone
> else, which they might not appreciate.

If you have a well-defined error, like an exception raised in a specific
function, by all means write a unit test that ensures that your code can
handle it. Whether you use unittest.mock or the technique you describe
below is a matter of taste.

Even if your ultimate goal is a comprehensive set of unit tests, a tool
like httpbin has its merits, as it may help you find out what actually
happens in real-life situations. Example: will a server error raise an
exception in urlopen(), or is it sometimes raised in the request.read()
method?

Also, mocking can give you a false sense of security. coverage.py may
report 100% coverage for a function like

    def process(urlopen=urllib.request.urlopen):
        result = []
        for url in [1, 2]:  # placeholder "URLs" no real server accepts
            try:
                r = urlopen(url)
                result.append(r.read())
            except urllib.error.HTTPError:
                pass
        return result

which will obviously fail once you release it into the wild.

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
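[Editor's sketch of the unittest.mock route mentioned above. The fetch()
helper and its URLs are made-up stand-ins, not code from this thread;
the point is that a mocked urlopen lets the error path be exercised
without putting load on any external site.]

```python
# Sketch only: fetch() and its URLs are hypothetical, not from the thread.
import urllib.error
import urllib.request
from unittest import mock

def fetch(url, urlopen=urllib.request.urlopen):
    """Return the response body, or None if the server reports an HTTP error."""
    try:
        return urlopen(url).read()
    except urllib.error.HTTPError:
        return None

# Simulate a 504 Gateway Timeout without any network traffic.
broken = mock.Mock(side_effect=urllib.error.HTTPError(
    "http://example.invalid", 504, "Gateway Timeout", {}, None))
assert fetch("http://example.invalid", urlopen=broken) is None

# Simulate a successful response.
ok = mock.Mock()
ok.return_value.read.return_value = b"hello"
assert fetch("http://example.invalid", urlopen=ok) == b"hello"
```

Injecting urlopen as a parameter (as the process() example above already
does) is what makes this style of test possible.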
[Tutor] value range checker
Hello,

I have written a function that checks the validity of values. The ranges
of valid values are stored in a database table. Such a table contains
three columns: category, min and max. One record of such a table
specifies the range for a certain category, but a category may be spread
out over multiple records. For example, the category-min-max tuples
("cat a", 1, 1), ("cat a", 3, 3), ("cat a", 6, 10) correspond to a range
for category A of 1-1, 3-3, 6-10, which is the same as 1, and 3, and 6,
7, 8, 9, 10.

The code below does exactly what I want:

    import collections
    import bisect
    import math

    def get_valid_value_lookup(records):
        """Translate value range information (from a database table)
        into a dictionary of the form {category: [valid values]}."""
        Boundaries = collections.namedtuple("Boundaries", "category min max")
        records = [Boundaries(*record) for record in records]
        boundaries = collections.defaultdict(list)
        crap = [boundaries[record.category].__iadd__(range(record.min, record.max + 1))
                for record in records]
        return dict(boundaries)

    def is_valid(lookup, category, value):
        """Return True if value is a member of the list for the given
        category, False otherwise."""
        try:
            return value in lookup[category]
        except KeyError:
            raise KeyError("Invalid category: %r" % category)

    def is_valid2(lookup, category, value):
        """Return True if value is a member of the list for the given
        category, False otherwise. This version also knows how to deal
        with floats."""
        try:
            L = lookup[category]
        except KeyError:
            raise KeyError("Invalid category: %r" % category)
        adjusted_value = value if int(value) in (L[0], 0, L[-1]) else math.ceil(value)
        try:
            chopfunc = bisect.bisect_right if value < L[0] else bisect.bisect_left
            return L[chopfunc(L, value)] == adjusted_value
        except IndexError:
            return False

    if __name__ == '__main__':
        L = [("cat a", 1, 1), ("cat a", 3, 3), ("cat a", 6, 10),
             ("cat b", 1, 9), ("cat c", 1, 2), ("cat c", 5, 9)]
        lookup = get_valid_value_lookup(L)
        assert not is_valid(lookup, "cat a", 999)  # 999 is invalid for "cat a"
        assert is_valid(lookup, "cat a", 10)
        assert not is_valid2(lookup, "cat a", 0.1)
        assert not is_valid2(lookup, "cat a", -1)
        assert is_valid2(lookup, "cat a", 6.1)

        L2 = [(1, -5, 1), (1, 3, 3), (1, 6, 10),
              (2, 1, 9), (3, 1, 2), (3, 5, 9)]
        lookup = get_valid_value_lookup(L2)
        assert is_valid2(lookup, 1, -4.99)
        assert is_valid2(lookup, 1, -5)

My questions:

[1] @ is_valid: is there a better way to do this? I mostly don't like
the use of the __iadd__ dunder method.
[2] @ is_valid2: perhaps an answer to my previous question. Is this a
better approach?
[3] I am inheriting this system. It feels a bit strange that these
range-check values are stored in a database. Would yaml be a better
choice? Some of the tables are close to 200 records.

Thank you in advance!
Albert-Jan
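[Editor's note on question [1]: one common alternative, sketched below
(not a reply from the list), is a plain for loop over sets. It avoids
the side-effecting __iadd__ comprehension and makes membership tests
O(1) instead of a linear scan over a list.]

```python
# Sketch of an alternative lookup builder using sets instead of the
# side-effecting __iadd__ list comprehension.
import collections

def get_valid_value_lookup(records):
    """Map each category to the set of its valid (integer) values."""
    boundaries = collections.defaultdict(set)
    for category, lo, hi in records:
        boundaries[category].update(range(lo, hi + 1))
    return dict(boundaries)

lookup = get_valid_value_lookup(
    [("cat a", 1, 1), ("cat a", 3, 3), ("cat a", 6, 10)])
assert 7 in lookup["cat a"]
assert 2 not in lookup["cat a"]
assert 999 not in lookup["cat a"]
```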
Re: [Tutor] value range checker
On 26/08/15 14:19, Albert-Jan Roskam wrote:

> I have written a function that checks the validity of values.
> The ranges of valid values are stored in a database table.

That's an unusual choice because:

1) Using a database normally only makes sense in the case where you are
already using the database to store the other data. But in that case
you would normally get validation done using a database constraint.
2) For small amounts of data the database introduces a significant
overhead. Databases are good for handling large amounts of data.
3) A database is rather inflexible since you need to initialise it,
create it, etc. Which limits the number of environments where it can be
used.

> Such a table contains three columns: category, min and max.
> ... a category may be spread out over multiple records.

And searching multiple rows is even less efficient.

> Would yaml be a better choice? Some of the tables are close to 200
> records.

Mostly I wouldn't use a data format per se (except for persistence
between sessions). I'd load the limits into a Python set and let the
validation be a simple member-of check. Unless you are dealing with
large ranges rather than sets of small ranges. Even with complex
options I'd still opt for a two-tier data structure.

But mostly I'd query any design that requires a lot of standalone data
validation (unless its function is to be a bulk data loader or
similar). I'd probably be looking at having the data stored as objects
that do their own validation at creation/modification time.

If I was doing a bulk data loader/checker I'd probably create a
validation function for each category and add it to a dictionary. So
I'd write a make_validator() function that took the validation data and
created a specific validator function for that category. Very simple
example:

    def make_validator(min, max, *values):
        def validate(value):
            return (min <= value <= max) or value in values
        return validate

    ...
    for category in categories:
        lookup[category] = make_validator(min, max, *valueList)
    ...
    if lookup[category](my_value):
        # process valid value
    else:
        raise ValueError

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
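[Editor's sketch: a runnable elaboration of the closure-factory pattern
above. The category table and its (min, max, extra values) layout are
assumptions for illustration, not part of the original post.]

```python
# Sketch: one validator closure per category. The `categories` layout
# (category -> (min, max, extra values)) is a made-up illustration.
def make_validator(lo, hi, extra=()):
    extra = frozenset(extra)
    def validate(value):
        return lo <= value <= hi or value in extra
    return validate

categories = {
    "cat a": (6, 10, (1, 3)),  # range 6-10 plus the stray values 1 and 3
    "cat b": (1, 9, ()),
}
lookup = {cat: make_validator(lo, hi, extra)
          for cat, (lo, hi, extra) in categories.items()}

assert lookup["cat a"](7)      # inside the range
assert lookup["cat a"](3)      # one of the extras
assert not lookup["cat a"](2)  # neither
assert lookup["cat b"](9)
```

The dictionary-of-closures keeps the per-category logic in one place, so
validating a value is just `lookup[category](value)`.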
[Tutor] How should my code handle db connections? Should my db manager module use OOP?
My ongoing project will be centered around an SQLite db. Since almost
all data needed by the program will be stored in this db, my thought is
that I should create a connection to this db shortly after program
startup and keep this connection open until program closure. I am
assuming that opening and closing a db connection has enough overhead
that I should only do this once. But I do not *know* that this is true.
Is it? If not, then the alternative would make more sense, i.e., open
and close the db as needed.

In the first iteration of my project, my intent is to create and
populate the db with tables external to the program. The program will
only add entries to tables, query the db, etc. That is, the structure
of the db will be pre-set outside of the program, and the program will
only deal with data interactions with the db.

My intent is to make the overall design of the program OO, but I am
wondering how to handle the db manager module. Should I go OO here as
well? With each pertinent method handling a very specific means of
interacting with the db? Or go a procedural route with functions
similar to the aforementioned methods? It is not clear to me that OOP
provides a real benefit here, but, then again, I am learning how to OOP
during this project as well, so I don't have enough knowledge yet to
realistically answer this question.

TIA!

-- 
boB
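[Editor's sketch: one rough way to answer the overhead question
empirically with the standard-library sqlite3 module. The file name,
table, and workload below are made up; this only illustrates the
connect-per-operation style and how to time the open/close cycle.]

```python
# Sketch: time sqlite3 connection setup/teardown and show the
# connect-per-unit-of-work alternative. Names here are made up.
import os
import sqlite3
import tempfile
import timeit
from contextlib import closing

path = os.path.join(tempfile.mkdtemp(), "demo.db")

def add_entry(name):
    # Open, use, and close a connection per unit of work;
    # `with conn:` wraps the statements in a transaction.
    with closing(sqlite3.connect(path)) as conn:
        with conn:
            conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT)")
            conn.execute("INSERT INTO items (name) VALUES (?)", (name,))

def list_entries():
    with closing(sqlite3.connect(path)) as conn:
        return [row[0] for row in conn.execute("SELECT name FROM items")]

add_entry("spam")
assert list_entries() == ["spam"]

# Measure the bare open/close cost; numbers vary by machine and filesystem.
per_call = timeit.timeit(lambda: sqlite3.connect(path).close(), number=200) / 200
print("open+close: %.6f seconds per call" % per_call)
```

If the measured cost turns out to be negligible for your workload, the
open-as-needed style buys simplicity and avoids holding a connection
(and any locks) across the whole program run.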
Re: [Tutor] How should my code handle db connections? Should my db manager module use OOP?
On Wed, Aug 26, 2015 at 07:11:42PM -0500, boB Stepp wrote:

> My ongoing project will be centered around an SQLite db. Since almost
> all data needed by the program will be stored in this db, my thought
> is that I should create a connection to this db shortly after program
> startup and keep this connection open until program closure.

If you do this, you will (I believe) hit at least three problems:

- Now only one program can access the DB at a time. Until the first
  program closes, nobody else can open it.

- Your database itself is vulnerable to corruption. SQLite is an easy
  to use database, but it doesn't entirely meet the ACID requirements
  of a real DB.

- If your database lives on a NTFS partition, which is very common for
  Linux/Unix users, then if your program dies, the database will very
  likely be broken.

I don't have enough experience with SQLite directly to be absolutely
sure of these things, but Firefox uses SQLite for a bunch of things
that (in my opinion) don't need to be in a database, and it suffers
from these issues, especially on Linux when using NTFS. For example, if
Firefox dies, when you restart you may lose all your bookmarks,
history, and, most bizarrely of all, the back button stops working.

-- 
Steve
Re: [Tutor] How should my code handle db connections? Should my db manager module use OOP?
On Aug 26, 2015 9:03 PM, "Steven D'Aprano" wrote:

> - If your database lives on a NTFS partition, which is very common
>   for Linux/Unix users
> these issues, especially on Linux when using NTFS.

Surely you mean NFS, as in Network FileSystem, rather than NTFS as in
New Technology FileSystem? :)

-- 
Zach
(On a phone)
Re: [Tutor] How should my code handle db connections? Should my db manager module use OOP?
Hi there,

> My ongoing project will be centered around an SQLite db.

Not a bad way to start. There are many possible ways to access SQL DBs.
I'll talk about one of my favorites, since I'm a big fan of sqlalchemy
[0], which provides a broad, useful toolkit for dealing with SQL DBs
and an abstraction layer.

To start, often the question is: why use any such abstraction tool,
given the additional complexity of a module, a.k.a. another layer of
code? Briefly, my main two reasons:

A) abstraction of the data model from the SQL implementation for the
Python program (allows switching from SQLite to another DBAPI, e.g.
postgres, later with a minimum of effort)

B) somebody has already implemented the tricky bits, such as ORMs (see
below), failover, connection pooling (see below) and other DB-specific
features

> Since almost all data needed by the program will be stored in this
> db, my thought is that I should create a connection to this db
> shortly after program startup and keep this connection open until
> program closure.

That is one possible approach. But consider using a "connection
pooling" technique that somebody else has already implemented and
tested. This saves your time for working on the logic of your program.
There are many different pooling strategies, which include things like
"Use only one connection at a time.", "Connect on demand.", "Hold a
bunch of connections open and let me use one when I need one, and I'll
release it when I'm done.", and even "When the connection fails, retry
quietly in the background until a successful connection can be
re-established."

> I am assuming that opening and closing a db connection has enough
> overhead that I should only do this once. But I do not *know* that
> this is true. Is it? If not, then the alternative would make more
> sense, i.e., open and close the db as needed.

Measure, measure, measure. Profile it before coming to such a
conclusion. You may be correct, but it behooves you to measure.
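[Editor's sketch: the "hold a bunch of connections and hand one out on
demand" pooling strategy described above, in miniature. This is an
illustration only, not how sqlalchemy implements its pools; real pools
add timeouts, connection recycling, failure detection, and more.]

```python
# Minimal illustration of a connection pool: a queue of reusable
# sqlite3 connections. Not production code.
import queue
import sqlite3

class TinyPool:
    def __init__(self, path, size=3):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(sqlite3.connect(path, check_same_thread=False))

    def acquire(self):
        return self._q.get()  # blocks until a connection is free

    def release(self, conn):
        self._q.put(conn)

# Note: each ":memory:" connection is its own private DB, which is fine
# for this trivial demo but not for sharing data between connections.
pool = TinyPool(":memory:")
conn = pool.acquire()
assert conn.execute("SELECT 1 + 1").fetchone()[0] == 2
pool.release(conn)
```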
(My take on an old computing adage: premature optimization can lead you
down unnecessarily painful or time-consuming paths.)

N.B. Only you (or your development cohort) can anticipate the load on
the DB, the growth of records (i.e. data set size), the growth of the
complexity of the project, or the user count. So, even if the
measurements tell you one thing, be sure to consider the longer-term
plan for the data and application. Also, see Steven D'Aprano's comments
about concurrency and other ACIDic concerns.

> In the first iteration of my project, my intent is to create and
> populate the db with tables external to the program. The program will
> only add entries to tables, query the db, etc. That is, the structure
> of the db will be pre-set outside of the program, and the program
> will only deal with data interactions with the db.

If the structure of the DB is determined outside the program, this
sounds like a great reason to use an Object Relational Mapper (ORM). An
ORM which supports reflection (sqlalchemy does) can create Pythonic
objects for you.

> My intent is to make the overall design of the program OO, but I am
> wondering how to handle the db manager module. Should I go OO here as
> well? With each pertinent method handling a very specific means of
> interacting with the db? Or go a procedural route with functions
> similar to the aforementioned methods? It is not clear to me that OOP
> provides a real benefit here, but, then again, I am learning how to
> OOP during this project as well, so I don't have enough knowledge yet
> to realistically answer this question.

I'm not sure I can weigh in intelligently here (OOP v. procedural), but
I'd guess that you could get that Object-Oriented feel by taking
advantage of an ORM, rather than writing one yourself.
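[Editor's sketch of the reflection idea, assuming SQLAlchemy 1.4+ is
available. The `users` table is a made-up stand-in for a schema that
was created outside the program.]

```python
# Sketch, assuming SQLAlchemy 1.4+; `users` stands in for a schema
# created outside the program.
from sqlalchemy import MetaData, Table, create_engine, select

engine = create_engine("sqlite://")  # in-memory DB for the demo

# Pretend this schema was set up externally.
with engine.begin() as conn:
    conn.exec_driver_sql(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.exec_driver_sql("INSERT INTO users (name) VALUES ('alice')")

# Reflect the pre-existing table instead of declaring it in Python.
metadata = MetaData()
users = Table("users", metadata, autoload_with=engine)

assert [c.name for c in users.columns] == ["id", "name"]
with engine.connect() as conn:
    names = [row.name for row in conn.execute(select(users))]
assert names == ["alice"]
```

Reflection means the Python side never restates the schema: columns and
types come from the database itself, which fits the "DB structure is
pre-set outside the program" plan.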
Getting used to the idea of an ORM can be tricky, but if you can get
reflection working [1], I think you will be surprised at how quickly
your application logic (at the business layer) comes together and you
can (mostly) stop worrying about things like connection logic and SQL
statements executing from your Python program [2].

There probably are a few people on this list who have used sqlalchemy
and are competent to answer it, but if you have questions specifically
about sqlalchemy, you might find better answers on their mailing list
[3].

Now, back to the beginnings... a SQLite DB is a fine place to start if
you have only one thread/user/program accessing the data at any time.
Don't host it on a network(ed) file system if you have the choice. If
your application grows so much in usage or volume that it needs a new
and different DB, consider it all a success and migrate accordingly.

Best of luck,
-Martin

[0] http://www.sqlalchemy.org/
[1] http://docs.sqlalchemy.org/en/rel_1_0/core/reflection.html
[2] Here, naturally, I'm assuming that you know your way around SQL,
since you are asserting that the DB already exists, is mai