Hi,
I've just started to learn programming and was told this was a good
place to ask questions :)
Where I work, we receive large quantities of data, all of which is
currently printed on large, obsolete dot matrix printers. This is a
problem because replacement parts will not be available for much longer.
So I'm trying to write a program that will capture the fixed-width
text file data, sort it (there are several different report types), and
convert it into a format that can be printed normally or viewed on a
computer.
I've been reading up on the regular expression module and on ways to
manipulate strings, but I'm finding it difficult to think of a way to
extract an address.
Here's an example of the raw text that I have to work with:
ADDRESS INFORMATION/RENSEIGNEMENTS SUR L'ADRESSE:
****************************
FOR/POUR AL/LA: 20
CORR TYP: A1B 2C3 P:3 CHNGD/CHANG
LANG: E CONS/REGR: #######
MRS XXX X XXXXXXX
### XXXXXXXXX ST DD TYP: P:6
CHNGD/CHANG
MONCTON NB LANG: E CONS/REGR:
#######
MRS XXX X XXXXXXX
#####
####
###-###-#
ADDRESS INFORMATION/RENSEIGNEMENTS SUR L'ADRESSE:
****************************
FOR/POUR AL/LA: 30
BOTH TYP: A1B 2D3 P:3 CHNGD/CHANG
LANG: E CONS/REGR: #######
MISS XXXX XXXXX
### XXXXXXXX ST
MONCTON NB
EARNINGS VITAL INFORMATION/RENSEIGNEMENTS ESSENTIELS SUR LES GAINS:
***********
(Each # stands for any digit, and the X's are just regular text.)
I would like to extract the address information, but the two different
blocks of text on the right-hand side are difficult to remove. I think
it would be easier if I could just extract a fixed square of text (a
set range of rows and columns), but I don't have a clue how to go
about it.
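To make the "fixed square" idea concrete, here is a minimal sketch of what I have in mind: slice the same column range out of a run of consecutive lines. The function name extract_block, the sample lines, and the column boundary of 30 are all made up for illustration; the real column positions would have to be read off the actual files.

```python
def extract_block(lines, first_row, last_row, first_col, last_col):
    """Return the rectangle of text bounded by the given rows and columns.

    Indices are zero-based and the end indices are exclusive, matching
    Python's slice conventions.
    """
    block = []
    for line in lines[first_row:last_row]:
        # Pad short lines so slicing behaves the same on every row.
        padded = line.rstrip("\n").ljust(last_col)
        # Cut out the column range and drop trailing padding.
        block.append(padded[first_col:last_col].rstrip())
    return block


# Hypothetical layout: address text on the left, other fields from column 30.
sample = [
    "MRS XXX X XXXXXXX".ljust(30) + "TYP: P:6",
    "### XXXXXXXXX ST".ljust(30) + "CHNGD/CHANG",
    "MONCTON NB".ljust(30) + "LANG: E CONS/REGR:",
]

address = extract_block(sample, 0, 3, 0, 30)
print(address)
```

This only works if the right-hand fields always start at the same column, which I am assuming from the fact that the files are fixed width.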
If anyone could suggest methods for sorting this type of data, it
would be appreciated.
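On the sorting side, each section in the sample seems to start with a title line followed by a row of asterisks, so a first pass might simply split the file on that pattern. The sketch below is only a guess at how that could look: the name split_sections is hypothetical, and it assumes every section title is immediately followed by a line made up entirely of asterisks.

```python
def split_sections(lines):
    """Group lines into (title, body_lines) pairs.

    Assumes a section starts with a title line that is immediately
    followed by a line consisting only of asterisks. Lines that appear
    before the first title are ignored.
    """
    sections = []
    i = 0
    while i < len(lines):
        line = lines[i].rstrip("\n")
        nxt = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if nxt and set(nxt) == {"*"}:
            # Title line: start a new section, dropping the trailing colon.
            sections.append((line.strip().rstrip(":"), []))
            i += 2  # skip the title and the row of asterisks
            continue
        if sections:
            sections[-1][1].append(line)
        i += 1
    return sections


report = [
    "ADDRESS INFORMATION/RENSEIGNEMENTS SUR L'ADRESSE:",
    "****************************",
    "FOR/POUR AL/LA: 20",
    "MRS XXX X XXXXXXX",
    "EARNINGS VITAL INFORMATION/RENSEIGNEMENTS ESSENTIELS SUR LES GAINS:",
    "***********",
]

sections = split_sections(report)
for title, body in sections:
    print(title, len(body))
```

Each section could then be handed to a routine written for that report type, chosen by its title.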