I have a question on program design. The program will be a catalog of portable media (DVD, CD, flash drive, floppy, etc.). I am in the preliminary stages of design and would like input on the best method for initially gathering the data and on what format it should take.
I plan to use the os.walk function to scan the media. On each iteration it yields a tuple of three values, which change on every pass until the scan completes. I don't know anything about database design, or about the most efficient way to gather the data and then insert it into the database, so I wanted to get some input from the masters before going further. It is better to get it right the first time than to go back and fix it later.

Since os.walk generates a different set of data on each iteration through the directory structure, I must append the data from each pass to a 'master' array or list. To identify the media later, I will need the disk label and the disk ID as the primary identifiers for the compiled data.

On each iteration over the directories on the media, I get values like this:

root  = current directory being scanned (a single directory)
dirs  = subdirectories under the current root (names of all directories)
files = filenames (all the files in the current working directory)

I need to generate the following additional information for each file during each iteration:

size = size of file
type = file extension

My initial test example is something like this:

    import os
    from os.path import join, getsize

    for root, dirs, files in os.walk(device):  # device name from config file

Then I would need to get the file size (but this is giving me an error at the moment):

        for name in files:
            s = sum(getsize(join(root, name)
            print s

(Syntax error here. I have not figured it out yet. There are spaces in the path/filename combo.)

(code to parse the file extension here)

Back to the data, though: should I create something like this while reading the media and prior to inserting it into the database?

[disk label, disk id [root[dir[file, size, type, permissions]]]]

in a dictionary or tuple? Or use a list? Or flat, in a dictionary, tuple, or list, like

[disk label, disk id, root, dir, filename, size, type, permissions]

When the scan has completed, I want to insert the data into the database with the following fields: disk label, disk id, root directory, path, filename, file size, file type, original file permissions, and a comment field. (Does anyone think I should have any other fields? Suggestions welcome.)

Thank you in advance. If this is off topic, please reply off the list and let me know.

-- 
Doug Glenn
FORUM Information Systems, LLC
http://foruminfosystems.com
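
For reference, here is a minimal sketch of how the scan and the flat layout described above might fit together. It is one possible approach under stated assumptions, not a definitive design. Assumptions: Python 3 syntax (the test above uses the Python 2 print statement), SQLite via the standard sqlite3 module as the database, and the names scan_media, store_records, the table name files, and the column names, all of which are hypothetical. The disk label and disk ID are passed in as arguments because obtaining them is platform-specific. As far as I can tell, the syntax error in the test above comes from the unbalanced parenthesis, and sum() expects an iterable rather than a single number; spaces in the path should not bother join() or getsize().

```python
import os
import sqlite3
import stat
from os.path import join, splitext


def scan_media(device, disk_label, disk_id):
    """Walk `device` and return one flat tuple per file found.

    Tuple layout (hypothetical, mirrors the fields listed in the post):
    (disk label, disk id, root directory, path, filename,
     size, type, permissions, comment)
    """
    records = []
    for root, dirs, files in os.walk(device):
        for name in files:
            full_path = join(root, name)  # spaces in names are fine here
            try:
                info = os.stat(full_path)
            except OSError:
                continue  # unreadable entry; skip it
            size = info.st_size  # same value os.path.getsize() returns
            ext = splitext(name)[1].lstrip(".").lower()  # "" if no extension
            perms = stat.S_IMODE(info.st_mode)  # permission bits only
            records.append(
                (disk_label, disk_id, device, root, name, size, ext, perms, "")
            )
    return records


def store_records(db_path, records):
    """Insert the flat tuples into a SQLite table (assumed schema)."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS files ("
            "disk_label TEXT, disk_id TEXT, root_dir TEXT, path TEXT, "
            "filename TEXT, size INTEGER, type TEXT, "
            "permissions INTEGER, comment TEXT)"
        )
        conn.executemany(
            "INSERT INTO files VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", records
        )
    conn.close()
```

A flat list of tuples is used here because each tuple maps directly onto one database row, so executemany() can insert the whole scan in one call. The nested layout would have to be flattened again before insertion.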
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor