[Tutor] Generating simple HTML from within Python
I'm updating an application at the moment that generates simple HTML files. However, it's all done using plain strings and brute force. It's hard to read, and isn't very robust in the face of special characters and matching start and end tags. I've been searching for a module to allow simple HTML generation, but most of them appear to be more concerned with *parsing* HTML and XML. The only one that looks promising from reviews dating back to 1998 or so is HTMLgen, but the link on starship.python.net appears long dead. So my question is, what is the preferred way of generating simple HTML or XHTML files these days? Is there a 'son of HTMLgen' or similar module? Cheers Duncan PS. Apologies if this shows up twice. My first post seems to be MIA. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Generating simple HTML from within Python
> I've been searching for a module to allow simple HTML generation, but
> most of them appear to be more concerned with *parsing* HTML and XML.
> The only one that looks promising from reviews dating back to 1998 or
> so is HTMLgen, but the link on starship.python.net appears long dead.
>
> So my question is, what is the preferred way of generating simple HTML
> or XHTML files these days? Is there a 'son of HTMLgen' or similar module?

OK, my thanks to everyone who replied, either on the list or privately. I followed the links, and the links from those links, and poked around in the various documentation. What I want to generate depends on the contents of directories, etc. and therefore has more variation than is handled easily in the templating systems. I chose the HyperText module (http://dustman.net/andy/python/HyperText/) because this fits more naturally with the code I'm updating, and also allows subclassing for customisation, etc.

To give you a flavour, one code snippet changes from:

    if len(theCase.inputData.relatedFiles) > 0:
        htmlFile.write('<h2>The related file(s) for this case are:</h2>\n')
        htmlFile.write('<ul>\n')
        for related_file in theCase.inputData.relatedFiles:
            htmlFile.write('<li><a href="%s">%s</a></li>\n' % (
                related_file, related_file))
        htmlFile.write('</ul>\n')

to the slightly more verbose, but much clearer and less error-prone:

    if len(testCaseData.inputData.relatedFiles) > 0:
        text = "The related file(s) for this case are:"
        heading = HTML.H2(text)
        document.append(heading)
        bulletList = HTML.UL()
        for fileName in testCaseData.inputData.relatedFiles:
            anchor = HTML.A(fileName, href=fileName)
            listItem = HTML.LI(anchor)
            bulletList.append(listItem)
        document.append(bulletList)

and the module handles all of the start and end tags, indentation and other pretty printing. My only concern is that it is almost as old as HTMLgen :-(

Cheers
Duncan
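For readers without the HyperText module, the same idea (build the output from data and let a helper take care of escaping and matching tags) can be sketched with the standard library alone. This is a minimal illustration in Python 3, not the HyperText API; the helper names element and related_files_section are hypothetical.

```python
from html import escape


def element(tag, content, **attrs):
    """Return '<tag attrs>content</tag>'.

    content is assumed to be HTML that is already safe; attribute
    values are escaped here.
    """
    attr_text = "".join(
        ' %s="%s"' % (name, escape(str(value), quote=True))
        for name, value in attrs.items()
    )
    return "<%s%s>%s</%s>" % (tag, attr_text, content, tag)


def related_files_section(file_names):
    """Build a heading plus a bullet list of links, escaping each name."""
    if not file_names:
        return ""
    items = [
        element("li", element("a", escape(name), href=name))
        for name in file_names
    ]
    heading = element("h2", escape("The related file(s) for this case are:"))
    return heading + "\n" + element("ul", "\n".join(items))
```

Because every start tag and its end tag come from the same call, they cannot get out of step, and a file name containing & or < is escaped rather than written raw.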
Re: [Tutor] newbie: Reading text file
> I have a text file (mylist.py actually), it contains exactly below:
> ---
> # file mylist.py
> jobs = [
>     'Lions',
>     'SysTest',
>     'trainDD',
>     'Cats',
>     'train',
>     'sharks',
>     'whale',
> ]
>
> I want to write another script and get the list "jobs" from the above
> script.

I assume you mean in another python script? Try:

    import mylist
    print mylist.jobs
[Tutor] converting a source package into a dll/shared library?
Is it possible to convert a Python package, with __init__.py and related python modules, into a single DLL or shared library that can be imported in the same way?

We have used py2exe and cx_freeze to create a complete executable, but we are curious whether there is a middle way between this single executable and distributing all of the source files.

I've been searching the documentation and web but haven't yet found the magic combination of keywords that throws up what we want.

Does such a possibility exist? If yes, can someone provide me a pointer?

The Background:

We have developed a demonstration tool in Python that parses input data for ToolA written by CompanyA, converts to our own internal neutral format, and can write in CompanyB's ToolB format. Now that the proof of concept has been shown, the companies want to integrate the conversion directly in their tools, but providing the code for ToolA to CompanyB raises some issues, and similarly the other way.

Providing a DLL of the ToolA reader to CompanyB, and a DLL of the ToolB writer to CompanyA might be one way around these issues, but it's not clear whether this is easy to achieve.

Cheers
Duncan
Re: [Tutor] converting a source package into a dll/shared library?
I asked:
> Is it possible to convert a Python package, with __init__.py and
> related python modules, into a single DLL or shared library that can
> be imported in the same way?
>
> We have used py2exe and cx_freeze to create a complete executable,
> but we are curious whether there is a middle way between this single
> executable and distributing all of the source files.

Kent suggested:
K> You can get a modest degree of obscurity by distributing the .pyc
K> bytecode files instead of the .py source. These can still be
K> decompiled and reverse engineered but it is more effort.

Yes, we really just want to protect ourselves from giving code relating to various companies' products to their competitors, even though we wrote the code ourselves based on the behaviour of our own test input files for their tools. We don't want any company to feel that we have exposed any internal details about how their tools might work.

We had already found the py_compile and compileall modules, and had even gone one further than .pyc files to create .pyo files because these have no docstrings either. We also found the dis module, but it is unlikely that anyone will go to all the effort to disassemble and reverse engineer the code.

K> I suppose you could rewrite some or all of the code into the Python
K> dialect supported by Pyrex and compile it that way.

That's something to remember for the future, but we already have 100K lines of code for the 6 different tool formats that we currently handle, so it would be non-trivial to change now.

Alan also suggested:
A> Since you refer to DLLs I'll assume a Windoze platform.
A> If so the answer is yes you can create an ActiveX/COM object.
A>
A> So if its accessibility to non Python code you are interested
A> in grab a copy of Mark Hammonds Win32 book for details
A> and examples. You can even go DCOM if thats significant.

It's a demonstrator project, so we've written the whole thing in Python for the rapid development side, and so far we have not had to worry about accessing non-Python code. Each company only needs to see how we've mapped their own tool formats into our neutral format. Actual integration in their non-Python code will be a completely different story involving an alternate code generator spitting out C/C++...

Thanks for confirming our ideas and the extra feedback.

Duncan
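The byte-compilation step described above can be scripted with the standard compileall module. The following is a minimal sketch in Python 3 syntax, so it differs from the Python 2 setup in the thread: current Python 3 releases no longer write .pyo files, and the optimisation level is passed as an argument instead. The function name build_bytecode is hypothetical.

```python
import compileall


def build_bytecode(package_dir):
    """Byte-compile every module under package_dir without docstrings.

    optimize=2 corresponds to running 'python -OO', which drops asserts
    and docstrings. legacy=True writes each .pyc next to its source file
    rather than into a __pycache__ directory.
    """
    ok = compileall.compile_dir(
        package_dir, optimize=2, legacy=True, quiet=1)
    return bool(ok)
```

As Kent points out, bytecode produced this way can still be decompiled, so this gives obscurity rather than protection.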
Re: [Tutor] converting a source package into a dll/shared library?
Alan asked me:
> As to access to the internal data would it be difficult to
> parameterise it so that each company's data is in a
> local config file? That way even if they disassemble
> the code it won't help?

Unfortunately no. The input files are human readable geometrical models. Each tool has a different way of defining the basic shapes, coordinate systems and transformations, material properties, etc. so there's a lot of knowledge embedded in each of the reader modules to allow conversion from each tool's format to our own neutral format. We don't want to expose knowledge of the internals of one tool to its competitors.

Anyway, this is now off-topic for this list so I'll stop here. Thanks for the help and suggestions.

Duncan
Re: [Tutor] Efficiency of Doxygen on Python vs C++?
> Is this not just evidence of a very bad Python coding style?
> Should we not always declare *all* class fields in the class definition
> by assigning to them, even if the assignment is token or dummy
> i.e. 'None', "", [], {} etc.

This is one of the many things that pylint can warn you about. It's like pychecker but customisable for your coding standards. See http://www.logilab.org/project/eid/857

Cheers
Duncan
[Tutor] pychecker: x is None or x == None
We've been programming in Python for about a year. Initially we had a lot of tests of the form

    if x == None:
        do_something()

but then someone thought that we should really change these to

    if x is None:
        do_something()

However, if you run pychecker on these two snippets of code, it complains about the second, and not the first:

    x.py:6: Using is None, may not always work

So the question is, which one should we really be using? If it is the second, how do I get pychecker to shut up?

I've hunted around in the documentation, and if there is a clear discussion about this issue, I must have missed it.

Cheers
Duncan
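One way to see the difference between the two tests: == can be redefined by the object being compared, while is cannot. The sketch below uses a hypothetical class, AlwaysEqual, purely for illustration. PEP 8 recommends is and is not for comparisons to singletons such as None. This does not explain the pychecker warning itself or how to silence it.

```python
class AlwaysEqual(object):
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


x = AlwaysEqual()

# The result of == is decided by AlwaysEqual.__eq__, so it is True
# even though x is clearly not None.
equal_to_none = (x == None)

# Identity cannot be overridden, so this is False.
is_none = (x is None)
```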
[Tutor] deriving class from file to handle input line numbers?
I was sure that this must be a frequently asked [homework?] question, but trying to search for 'file open line number' didn't throw up the sort of answers I was looking for, so here goes...

I regularly write parsers for simple input files, and I need to give helpful error messages if the input is invalid. The three most helpful pieces of information are file name, line number and line content. Of course I can use a global variable to track how many times f.readline() is called, but I was wondering whether there is a more OO or Python way of encapsulating this within a class derived from file.

What I have below is the minimal interface that I could come up with, but it is a file adaptor rather than a derived class, and it doesn't seem quite clean to me, because I have to open the file externally and then pass the file object into the constructor.

Is there a better way of doing it, or am I chasing rainbows?

Cheers
Duncan

    class LineCountedInputFile(object):
        """
        add input line count to minimal input File interface

        The file must be opened externally, and then passed into the
        constructor. All access should occur through the readLine method,
        and not using normal File methods on the external file variable,
        otherwise things will get out of sync and strange things could
        happen, including incorrect line number.
        """

        __slots__ = (
            '_inputFile',
            '_lineNumber')

        def __init__(self, inputFile):
            """
            create a LineCountedInputFile adaptor object

            :param inputFile: existing File object already open for reading only
            :type inputFile: `File`
            """
            assert isinstance(inputFile, file)
            assert not inputFile.closed and inputFile.mode == 'r'

            self._inputFile = inputFile
            self._lineNumber = 0

        #--

        def readLine(self):
            """
            call file.readline(), strip excess whitespace, increment line number

            :return: next line of text from file minus excess whitespace,
                     or None at end-of-file
            :rtype: str
            """
            line = self._inputFile.readline()
            if len(line) == 0:
                return None
            self._lineNumber += 1
            return line.strip()

        #--

        def _get_fileName(self):
            return self._inputFile.name

        fileName = property(_get_fileName, None, None, """(read-only)""")

        #--

        def _get_lineNumber(self):
            return self._lineNumber

        lineNumber = property(_get_lineNumber, None, None, """(read-only)""")
Re: [Tutor] deriving class from file to handle input line numbers?
Kent Johnson wrote:
> If you just want to keep track of line numbers as you read the file by lines,
> you could use enumerate():
>
> f = open('myfile.txt')
> for line_number, line in enumerate(f):
>     ...

This is neat, but not all of the parsers work on a 'line by line' basis, so sometimes there are additional calls to f.readline() or equivalent in other places in the code based on what has just been read.

> What problem did you have when deriving from file?

To be perfectly honest, I didn't try it because even if I had declared

    class MyFile(file):
        etc

I couldn't see how to have an instance of MyFile returned from the built-in 'open' function. I thought this was the crux of the problem.

So what I have done is provide a different interface completely,

    class MyFile(object):
        etc

I could extend this to take the file name in the constructor, and add a MyFile.open() method, but then I can no longer substitute any MyFile instances in places that expect 'file' instances.

Now that I've started to explain all of this to someone else, I'm starting to wonder whether it really matters. Do I really need it to be substitutable for 'file' after all?

I need to sit in a darkened room and think for a while...

Cheers
Duncan
Re: [Tutor] deriving class from file to handle input line numbers?
I wrote:
> > class MyFile(file):
> >     etc
> >
> > I couldn't see how to have an instance of MyFile returned from the
> > built-in 'open' function. I thought this was the crux of the problem.

Kent Johnson replied:
> open() is actually just an alias for file():
> >>> open is file
> True

Thank you very much! You have just provided me with the vital piece of information I needed and everything has just clicked into place.

Now that I know that, I've searched the documentation again and found:

    The file() constructor is new in Python 2.2 and is an alias for
    open(). Both spellings are equivalent. The intent is for open()
    to continue to be preferred for use as a factory function which
    returns a new file object. The spelling, file is more suited to
    type testing (for example, writing "isinstance(f, file)").

See http://docs.python.org/lib/built-in-funcs.html#l2h-25

Cheers
Duncan
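A note for readers on Python 3: the built-in file type discussed in this thread no longer exists there, so deriving from it is not an option. A wrapper that counts lines and forwards everything else is one possible substitute. The sketch below is an assumption-laden illustration, not code from the thread; the class name LineCountedReader is hypothetical.

```python
import io


class LineCountedReader:
    """Wrap a text stream and count the lines handed out by readline()."""

    def __init__(self, stream):
        self._stream = stream
        self.line_number = 0

    def readline(self):
        line = self._stream.readline()
        if line:                      # an empty string means end-of-file
            self.line_number += 1
        return line

    def __iter__(self):
        return self

    def __next__(self):
        line = self.readline()
        if not line:
            raise StopIteration
        return line

    def __getattr__(self, name):
        # Attributes not defined here (name, close, ...) are forwarded
        # to the wrapped stream.
        return getattr(self._stream, name)
```

Both explicit readline() calls and for loops go through the same counter, which covers the parsers that mix the two styles. Reading through forwarded methods such as read() bypasses the counter, so the same caveat as in the original adaptor applies.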
Re: [Tutor] is there any Python code for spatial tessellation?
This is off-topic for Python: I've seen this on the FLTK mailing list, but you might find some of the links under this page useful:

http://myweb.tiscali.co.uk/oaktree/nshea/tesselsphere/tesselsphere_index.html

"OpenGL spherical subdivision utility. Currently employs particle and geodesic modules. GUI morphers can split Delaunay and Voronoi hulls to create new cells in the lattice. Additionally, morphers can target individual cells to split or stellate. Can be used to generate vertices for geodesic spheres, pollen, radiolaria, viruses and other polyhedra. Saves VRML 1.0 and POV-Ray inc file."

Disclaimer: This isn't my field of expertise. I haven't used this software. Don't ask me any questions about it.
[Tutor] Handling function parameters of mixed object and basic types
I've taken over someone else's code (yes, honestly!) that has a complex class hierarchy on top of the main procedural code. This is unfortunate because it means that isinstance() is everywhere.

Profiling recently highlighted one particular formatted output function that has a cascade of isinstance() tests where the order of the tests is significant, as in the example:

    def oldOutput(x):
        if x is None:
            pass
        elif isinstance(x, Z):
            pass
        elif isinstance(x, Y):
            pass
        elif isinstance(x, X):
            pass
        elif isinstance(x, int):
            pass
        elif isinstance(x, float):
            pass
        elif isinstance(x, str):
            pass
        else:
            pass

    # NOTE:
    # In the real code, there are various enumeration classes
    # derived from int, so we can't even test for the built in
    # types before we test for particular classes.

I don't like this, because we are usurping Python's class handling, and suggested that we create methods in the classes and let Python do the work, and replace the above with something like:

    def newOutput(x):
        if x is None:
            pass
            return
        try:
            x.output()
        except AttributeError:
            if isinstance(x, int):
                pass
            elif isinstance(x, float):
                pass
            elif isinstance(x, str):
                pass
            else:
                pass

However, when I verified this example using timeit, the results were completely unexpected. The time to resolve the objects remains the same, but for the built-in types raising and catching the exception means that resolution of built-in types takes 3 or 4 times longer.

The improved robustness of the code for objects is obviously good, but not at the expense of killing performance for the built-in types.

Have I made a basic boo-boo in my test code? Is there a better way of speeding up the original function? I don't really want to spend hours (days?) implementing this in the real code if I'm barking up the wrong tree.

I attach the full example code below.

Cheers
Duncan

    #-
    class X(object):
        def __init__(self):
            self.x = 0

        def output(self):
            pass

    class Y(X):
        def __init__(self):
            X.__init__(self)
            self.y = 0

        def output(self):
            pass

    class Z(Y):
        def __init__(self):
            Y.__init__(self)
            self.z = 0

        def output(self):
            pass

    def oldOutput(x):
        if x is None:
            pass
        elif isinstance(x, Z):
            pass
        elif isinstance(x, Y):
            pass
        elif isinstance(x, X):
            pass
        elif isinstance(x, int):
            pass
        elif isinstance(x, float):
            pass
        elif isinstance(x, str):
            pass
        else:
            pass

    def newOutput(x):
        if x is None:
            pass
            return
        try:
            x.output()
        except AttributeError:
            if isinstance(x, int):
                pass
            elif isinstance(x, float):
                pass
            elif isinstance(x, str):
                pass
            else:
                pass

    if __name__ == '__main__':
        from timeit import Timer

        # first test that the functions 'work' before timing them
        #
        for i in (None, 1, 1.0, "one", X(), Y(), Z(), []):
            oldOutput(i)
            newOutput(i)

        # now time the functions
        #
        for i in ('None', '1', '1.0', '"one"', 'X()', 'Y()', 'Z()', '[]'):
            s = 'oldOutput(%s)' % i
            t = Timer(s, 'from __main__ import X, Y, Z, oldOutput, newOutput')
            print 'old', i, t.timeit()

            s = 'newOutput(%s)' % i
            t = Timer(s, 'from __main__ import X, Y, Z, oldOutput, newOutput')
            print 'new', i, t.timeit()

            print
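One variation that may be worth timing is to look the method up with getattr and a default, so that built-in types never raise and catch AttributeError. Whether this is actually faster has to be measured with timeit on the real classes; no timings are claimed here. The sketch below is Python 3 compatible and returns a label for each branch purely so the behaviour is visible. The name new_output and the return values are illustrative, not from the original code.

```python
class X(object):
    """Stand-in for a class in the hierarchy that knows how to output itself."""

    def output(self):
        return "X"


def new_output(x):
    """Dispatch to x.output() when it exists, without using exceptions."""
    if x is None:
        return "none"
    method = getattr(x, "output", None)   # None when the attribute is absent
    if method is not None:
        return method()
    if isinstance(x, int):
        return "int"
    elif isinstance(x, float):
        return "float"
    elif isinstance(x, str):
        return "str"
    return "other"
```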
Re: [Tutor] Handling function parameters of mixed object and basic types
I wrote:
> > def newOutput(x):
> >     if x is None:
> >         pass
> >         return
> >     try:
> >         x.output()
> >     except AttributeError:
> >         if isinstance(x, int):
> >             pass
> >         elif isinstance(x, float):
> >             pass
> >         elif isinstance(x, str):
> >             pass
> >         else:
> >             pass
> >
> > However, when I verified this example using timeit, the results
> > were completely unexpected. The time to resolve the objects
> > remains the same, but for the built-in types raising and catching
> > the exception means that resolution of built-in types takes 3 or
> > 4 times longer.

Alan replied:
> Yes, that's because when you call x.output when the method doesn't
> exist Python has to navigate the entire class hierarchy looking for
> the missing method before it can return the exception.
>
> The solution is to extend the logic you used for None - ie put all
> the non object cases first, then, only if it is an object, use
> isinstance.

That was my first thought when I looked at this last week, and just put all of the isinstance(built-in) tests at the top. However, as mentioned in the original post, but maybe not very clearly, there are some enumeration classes that derive from int, so I have to check for those before I check for int:

    class EnumerationType(int):
        pass

    if x is None:
        pass
    elif isinstance(x, EnumerationType):
        pass
    elif isinstance(x, int):
        pass
    ...

And of course, there may be specific Enumeration type classes to be tested before testing for the generic, so we end up back at square one, having to know the class hierarchy so that we can test in the correct order.

And that's why it was so disappointing to find that doing it the OO way to improve the code might give such poor performance.

My colleague suggested using x.__class__ and below as an index into a jump table, but this just perpetuates the unmaintainable instead of letting Python's class mechanisms do all this for us.

The EnumerationType class claims to have 'low memory usage and fast performance' but maybe we need to look at re-implementing it to derive from object and not int. Unfortunately it's one of the key abstractions in the code, and these enumerations are used just about everywhere.

I'll need to see if other classes derive from built-in types too...

Cheers
Duncan
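For readers on Python 3.4 or later (which did not exist when this thread was written), functools.singledispatch offers a way to avoid hand-ordering the tests: implementations are registered per type, and the most specific registered class in the argument's method resolution order is chosen. The sketch below assumes a stand-in EnumerationType and returns labels only to make the dispatch visible; it says nothing about performance, which would still need measuring.

```python
from functools import singledispatch


class EnumerationType(int):
    """Stand-in for the int-derived enumeration class described above."""


@singledispatch
def output(x):
    """Fallback used when no registered type matches."""
    return "other"


@output.register(type(None))
def _output_none(x):
    return "none"


@output.register(int)
def _output_int(x):
    return "int"


@output.register(float)
def _output_float(x):
    return "float"


@output.register(str)
def _output_str(x):
    return "str"


@output.register(EnumerationType)
def _output_enumeration(x):
    # Chosen in preference to the int handler for EnumerationType values,
    # regardless of the order in which the handlers were registered.
    return "enumeration"
```

New enumeration subclasses can register their own handler without anyone having to revisit the order of an if/elif chain.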
[Tutor] Handling large arrays/lists
One part of the application I'm dealing with handles a conceptual 'cube' of data, but is in fact implemented as a single python list. Items are stored and retrieved - no manipulation at the moment - by the classical approach of multiplying the appropriate x, y and z indices with the x, y and z extents to give the unique index into the single list.

However, now that requirements have moved to include much larger cubes of data than originally envisaged, this single list has just become far too slow.

Unfortunately the single list concept is currently a requirement in the tool, but how it is implemented is open to discussion.

I've been mulling over the documentation for Numeric/numpy to see whether it makes sense to replace the single standard Python list with an array or multiarray.

The first thing to strike me is that, in Numeric, the size of an array is fixed. To grow the 'cube' as each x,y slice is added, I can create a new array in one of three ways, but as far as I can see, each will require copying all of the data from the old to the new array, so I'm concerned that any speed benefit gained from replacing a standard list will be lost to repeated copying.

Have I correctly understood the Numeric array handling? Does anyone have any suggestions for a more efficient way of handling a large list of data? Other modules perhaps?

And yes, I know that Numeric has been replaced by numpy, but I understand that they are very similar, and it's been easier to find tutorial documentation for Numeric than for numpy.

Cheers
Duncan
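The index arithmetic described above can be sketched as follows. This is an illustration under stated assumptions, not the application's code: the class name FlatCube is hypothetical, x is assumed to vary fastest and z slowest, and no bounds checking is done. With that layout, adding a new x,y slice only extends the end of the list, so existing items keep their indices.

```python
class FlatCube:
    """A conceptual cube of values stored in one flat Python list."""

    def __init__(self, nx, ny, nz, fill=None):
        self.nx, self.ny, self.nz = nx, ny, nz
        # Allocate all cells up front rather than growing item by item.
        self._data = [fill] * (nx * ny * nz)

    def _index(self, x, y, z):
        # x varies fastest, z slowest; indices are assumed to be in range.
        return x + self.nx * (y + self.ny * z)

    def get(self, x, y, z):
        return self._data[self._index(x, y, z)]

    def set(self, x, y, z, value):
        self._data[self._index(x, y, z)] = value

    def append_slice(self, fill=None):
        """Add one x,y slice at the next z position."""
        self._data.extend([fill] * (self.nx * self.ny))
        self.nz += 1
```

Whether this addresses the slowdown depends on where the time is really going, which the post does not say; profiling the store and retrieve paths would be the first step.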
[Tutor] line number when reading files using csv module
If I have the following data file, data.csv:

    1 2 3
    2 3 4 5

then I can read it in Python 2.4 on linux using:

    import csv
    f = file('data.csv', 'rb')
    reader = csv.reader(f)
    for data in reader:
        print data

OK, that's all well and good, but I would like to record the line number in the file. According to the documentation, each reader object has a public 'line_num' attribute http://docs.python.org/lib/node265.html and http://docs.python.org/lib/csv-examples.html supports this.

If I now change the loop to read:

    for data in reader:
        print reader.line_num, data

I'm presented with the error:

    AttributeError: '_csv.reader' object has no attribute 'line_num'

This has floored me. I've even looked at the source code and I can see the line_num variable in the underlying _csv.c file. I can even see the test_csv.py code that checks it!

    def test_read_linenum(self):
        r = csv.reader(['line,1', 'line,2', 'line,3'])
        self.assertEqual(r.line_num, 0)

I suspect this is something so obvious that I just can't see the wood for the trees and I will kick myself.

Any ideas?

Cheers
Duncan
Re: [Tutor] line number when reading files using csv module
On Fri, 27 Oct 2006 11:35:40 +0200 Duncan Gibson <[EMAIL PROTECTED]> wrote:
>
> If I have the following data file, data.csv:
> 1 2 3
> 2 3 4 5
>
> then I can read it in Python 2.4 on linux using:
>
> import csv
> f = file('data.csv', 'rb')
> reader = csv.reader(f)
> for data in reader:
>     print data

Oops, mixing examples here. I forgot to say that I'm actually using

    reader = csv.reader(f, delimiter=' ')

so it will read the data correctly even if there isn't a comma in sight in the csv file, but that's a side issue to the line number problem.

Cheers
Duncan
Re: [Tutor] line number when reading files using csv module
Kent Johnson wrote:
> The line_num attribute is new in Python 2.5. This is a doc bug,
> it should be noted in the description of line_num.

Is there some way to create a wrapper around a 2.4 csv.reader to give me pseudo line number handling? I've been experimenting with:

    import csv

    class MyReader(object):

        def __init__(self, inputFile):
            self.reader = csv.reader(inputFile, delimiter=' ')
            self.lineNumber = 0

        def __iter__(self):
            self.lineNumber += 1
            return self.reader.__iter__()

        def next(self):
            self.lineNumber += 1  # do I need this one?
            return self.reader.next()

    if __name__ == '__main__':
        inputFile = file('data.csv', 'rb')
        reader = MyReader(inputFile)
        for data in reader:
            print reader.lineNumber, data

But that doesn't seem to do what I want. If I add some print statements to the methods, I can see that it calls __iter__ only once:

    __iter__
    1 ['1', '2', '3']
    1 ['2', '3', '4', '5']
    1 ['3', '4', '5', '6', '7']
    1 ['4', '5', '6', '7']
    1 ['5', '6', '7', '8', '9']

Is there some other __special__ method that I need to forward to the csv.reader, or have I lost all control once __iter__ has done its job?

Cheers
Duncan
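For reference, on Python 2.5 or later, where the line_num attribute that Kent mentions exists, the loop from the original question works without a wrapper. A minimal sketch in Python 3 syntax follows; the in-memory data_file is a stand-in for the data.csv in the thread. Note that line_num counts lines read from the source, which can differ from the number of records when a quoted field spans several lines.

```python
import csv
import io

# Stand-in for the space-delimited data.csv from the thread.
data_file = io.StringIO("1 2 3\n2 3 4 5\n")

reader = csv.reader(data_file, delimiter=' ')

# Pair each record with the reader's line count at the time it was read.
rows = [(reader.line_num, row) for row in reader]
```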
Re: [Tutor] line number when reading files using csv module
Duncan Gibson wrote:
> >
> > import csv
> >
> > class MyReader(object):
> >
> >     def __init__(self, inputFile):
> >         self.reader = csv.reader(inputFile, delimiter=' ')
> >         self.lineNumber = 0
> >
> >     def __iter__(self):
> >         self.lineNumber += 1
> >         return self.reader.__iter__()
> >
> > Is there some other __special__ method that I need to forward to the
> > csv.reader, or have I lost all control once __iter__ has done its job?

Kent Johnson wrote:
> __iter__() should return self, not self.reader.__iter__(), otherwise
> Python is using the actual csv.reader not your wrapper. And don't
> increment line number here.
>
> You lost control because you gave it away.

Thanks Kent. The penny has dropped and it makes a lot more sense now.

I was looking at __iter__ as a special function that *created* an iterator, but all it really does is signal that the returned object will implement the iterator interface, and the next() method in particular.

Cheers
Duncan
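Kent's two corrections (return self from __iter__, and count only in the next method) can be sketched as below. This is an illustration written for Python 3, where the method is spelled __next__; the alias on the last line keeps the Python 2 spelling used in the thread. It counts records returned, not physical lines in the file.

```python
import csv
import io


class MyReader(object):
    """Wrap csv.reader and keep a count of the records handed out."""

    def __init__(self, inputFile):
        self.reader = csv.reader(inputFile, delimiter=' ')
        self.lineNumber = 0

    def __iter__(self):
        # Return the wrapper itself so the for loop calls our next method.
        return self

    def __next__(self):
        row = next(self.reader)   # StopIteration propagates at end of input
        self.lineNumber += 1
        return row

    next = __next__               # Python 2 spelling of the same method
```

Used in the loop from the thread, reader.lineNumber now advances with each record instead of staying at 1.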
Re: [Tutor] import and unittest
> I wondered if it was possible to do something like this:
>
> src/
>    -a_module/
>        -sub_module/
> test/
>    -a_module/
>        -sub_module/

Why not really keep the test code with the main code?

    # module code here
    #
    if __name__ == '__main__':
        import unittest

        class TestModuleCode(unittest.TestCase):
            """ test harness for Module code """

            def setUp(self):
                """ boiler plate for multiple tests """
                pass

            def testSomething(self):
                """ ensure something happens as expected """
                pass

        unittest.main()

I even have some files where I test for command line parameters and if so I process those. If not, I run the unittests.

Cheers
Duncan