On 11/29/2011 04:40 PM, Mayo Adams wrote:
Apologies for my numerous offenses to protocol, and gratitude for the
suggestions all around. And yet... as to the notion of a tuple
existing in some non-Platonic sense, I suspect I will have to carry my
confusion away in order to dispel it by further reading.
"Wrong on both counts?" Hey, you win! I WAS wrong. But if it is not a
tuple that is in the file, it would be helpful to know what it is.
Presumably, a string representing a tuple. And as to the matter of
representation, I can't immediately see how anything in a script is
anything other than a representation of some kind, hence the
distinction between representamen and object does no work for me.

Yes, I should have been clearer as to what I was trying to achieve,
but I underestimated the good will of this community (in most cases) and
their willingness to help. For which, as I said, much thanks.

Nicely worded. We appreciate your confusion, but please realize that each of us comes from a different background, computing-wise.

In traditional (compiled) languages, there was a sharp and uncrossable division between the stuff in source files and the stuff in data files. The former got compiled (by multi-million dollar machines) into machine code, which used the latter as data, both input and output. The data the user saw never got near a compiler, so the rules of interpreting it were entirely different. A file was just a file, a bunch of bytes made meaningful only by the particular application that dealt with it. And the keyboard, screen and printer (and the card punch/reader) were in some sense just more files.

This had some (perhaps unforeseen) advantages in isolating the differing parts of the application. Source code was written by highly trained people, who tried to eliminate the bugs and crashes before releasing the executable to the public. Once debugged, this code was extremely well protected against malicious and careless users (both human and virus, both local and over the internet) who might manage to corrupt things. The operating systems and hardware were also designed to make this distinction complete, so that even buggy programs couldn't harm the system itself. I've even worked on systems where the code and the data didn't occupy the same kind of memory. When a program crashed the system, it was our fault, not the user's, because we didn't give him (at least not on purpose) the ability to run arbitrary code.

Then came the days of MS-DOS, which is a totally unprotected operating system, and scripting languages and interpreters, which could actually run code from data files, and the lines became blurry. People were writing malicious macros into word processing data, and the machines actually would crash from it. (Actually things were not this linear; interpreters have been around practically forever, but give me a little poetic license.)

So gradually, we've added protection back into our personal systems that previously was only on mainframes. But the need for these protections is higher today than ever before, mainly because of the internet.


So to Python. Notice that the rules for finding code are different from those for finding data files. No accident. We WANT to treat them differently. We want data files to be completely spec'ed out, as to what data is permissible and what is not. Without such a spec, programs are entirely unpredictable. The whole Y2K problem was caused because too many programmers made unwarranted assumptions about their data (in that case about the range of valid dates). And they were also saving space, in an era when a 10 megabyte hard disk cost maybe three thousand dollars.

Ever filled in a form which didn't leave enough room for your town name, or that assumed that a city name would not have spaces? Ever had software that thought that apostrophes weren't going to happen in a person's name? Or that zip codes are all numeric (true in the US, not elsewhere)? Or that names only consist of the 26 English letters? Or even that only the first letter would be capitalized?

Anyway, my post was pointing out that without constraining the data, it is impractical/unsafe/unwise to encode the data in a byte stream (file) and expect to decode it later.

In your example, you encoded a tuple that was a string and an integer. You used repr() to do it, and that can be fine. But the logic to transform that string back into a tuple is quite complex, if you have to cover all the strange cases. And impossible, if you don't know anything about the data.
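To make that concrete, here is a rough sketch in Python 3 syntax. The record is made up for illustration, and ast.literal_eval() is my suggestion rather than anything from your original script. It accepts only Python literals (strings, numbers, tuples, lists, dicts and so on), so it never runs code found in the file. The naive split at the end shows how quickly a hand-rolled parser goes wrong once the data contains a comma.

```python
import ast

# Made-up record: a string and an integer, with an awkward name on purpose.
record = ("O'Brien, Pat", 42)

# This text is what would actually be written to the file.
text = repr(record)

# literal_eval parses Python literals only; it does not execute code.
restored = ast.literal_eval(text)
print(restored == record)

# A hand-rolled parser that assumes "comma plus space separates the fields"
# trips over the comma inside the name.
naive = text.strip("()").split(", ")
print(len(naive), "pieces instead of 2")
```

The round trip only works because the data is constrained to plain literals. An arbitrary object's repr() generally cannot be read back this way.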

A tuple object in memory doesn't look anything like the text you save to that file. It's probably over a hundred bytes spread out over at least three independent memory blocks. But we try to pretend that the in-memory object is some kind of idealized tuple, and don't need to know the details. Much of the space is taken up by pointers, and pointers to pointers, that reference the code that knows how to manipulate it.
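If you want to see the difference for yourself, something like the following gives a rough idea. It is only a sketch: sys.getsizeof() reports the size of each object on its own, the numbers are specific to CPython and vary by version and platform, and the record is made up.

```python
import sys

# Made-up record: a string and an integer.
record = ("Mayo", 2011)

# Length of the text that would be saved to a file.
text = repr(record)

# Rough in-memory footprint: the tuple itself plus the two objects it
# points to. getsizeof() does not follow references, so add them by hand.
in_memory = sys.getsizeof(record) + sum(sys.getsizeof(item) for item in record)

print("characters in the file:", len(text))
print("approximate bytes in memory:", in_memory)
```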

You might want to look into shelve or pickle, which are designed to save objects to a file and restore them later. These have to be as general as possible, and the complexity of both the code and the resultant data file reflects that.
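A minimal pickle round trip might look like this. The file name and the record are made up, and I use a temporary directory only so the sketch cleans up after itself.

```python
import os
import pickle
import tempfile

# Made-up record: a string and an integer.
record = ("Mayo", 2011)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "record.pkl")  # hypothetical file name

    # Pickle files are binary, so open with "wb" and "rb".
    with open(path, "wb") as outfile:
        pickle.dump(record, outfile)

    with open(path, "rb") as infile:
        restored = pickle.load(infile)

print(restored == record, type(restored).__name__)
```

What comes back is a real tuple, not a string that looks like one. Keep the earlier security point in mind, though: unpickling can run arbitrary code, so only load pickle files that you trust.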

--

DaveA

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
