****DISCLAIMER*****
I have deliberately not read any of the other replies to this problem so
my answer may be totally redundant! (but here it is anyway...)
The first issue to address is that your "CSV" file is probably not in
the format you assume it is. Every line is the printed form of a Python
list, brackets and quotes included, not the plain "values separated by
commas" format that one normally expects in a CSV file. One way to deal
with that is to resave the file as a .txt file and treat each line as a
list, i.e. use list subscripting to reach each element and clean it up
with regex code. Having said that, in the spirit of minimalism, there
are ways to deal with it as a CSV file as well.
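As a rough sketch of the "treat each line as a list" route: since every
line looks like the printed form of a Python list, ast.literal_eval from
the standard library can turn it back into a real list, and each element
can then be cleaned up. The helper name strip_version and the two sample
lines are only for illustration, and this assumes every line in the file
is a well-formed list literal.

```python
import ast
import re

def strip_version(line):
    """Parse one "['id.N ', 'id.N']" line and drop the ".N" from each element."""
    items = ast.literal_eval(line.strip())  # text -> real Python list
    # Remove a dot followed by digits from each element; the trailing
    # space in the first element is left untouched.
    return [re.sub(r'\.\d+', '', item) for item in items]

sample = ["['uc002uvo.3 ', 'uc001mae.1']", "['uc010dya.2 ', 'uc001kko.2']"]
for line in sample:
    print(strip_version(line))
# ['uc002uvo ', 'uc001mae']
# ['uc010dya ', 'uc001kko']
```

Printing the cleaned list gives back the bracketed, comma-separated form
asked for in the original question.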
First, import the csv module and use its reader() function to properly
access the contents:

    import re
    import csv

    with open('non.csv', 'r') as p:
        f = csv.reader(p, delimiter=',')

Then, still inside the with block so the file is open, use a for loop to
access each line and put the regex calls in the print() call:

        for w in f:
            print(re.sub(r'(\.\d)', '', w[0]), re.sub(r'(\.\d)', '', w[1]))

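The regex calls access the list elements with subscripting. The "$"
was not necessary: it anchors the match to the very end of the string,
and the first element of each line most likely ends in a space (as in
your sample data), so the pattern never matched there. Without it you
get the desired results.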
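A small check of that point, using two values copied from the sample
data (the variable names are just for illustration):

```python
import re

first = 'uc002uvo.3 '   # note the trailing space, as in the data
second = 'uc001mae.1'

# With "$", the digit must be the very last character of the string.
with_anchor = [re.sub(r'\.\d$', '', s) for s in (first, second)]
# Without "$", the ".digit" is removed wherever it occurs.
without_anchor = [re.sub(r'\.\d', '', s) for s in (first, second)]

print(with_anchor)     # ['uc002uvo.3 ', 'uc001mae']
print(without_anchor)  # ['uc002uvo ', 'uc001mae']
```

Only the value without trailing whitespace is changed by the anchored
pattern, which matches the "one column fixed, one column not" output in
the original question.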
TO SUMMARIZE:
With the following contents of file named "non.csv":
['uc002uvo.3 ', 'uc001mae.1']
['uc010dya.2 ', 'uc001kko.2']
and the following code run in Eclipse:

    ## test.py
    import re
    import csv

    with open('non.csv', 'r') as p:
        f = csv.reader(p, delimiter=',')
        for w in f:
            print(re.sub(r'(\.\d)', '', w[0]), re.sub(r'(\.\d)', '', w[1]))

I get:
['uc002uvo '  'uc001mae']
['uc010dya '  'uc001kko']
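The brackets and quote characters are still present in that output
because csv.reader only splits on the comma; by default it treats the
double quote, not the single quote, as its quoting character. A quick
way to see what each field looks like, using io.StringIO in place of the
real file (an assumption made here so the snippet runs on its own):

```python
import csv
import io

# One line copied from the sample file, held in memory instead of on disk.
data = "['uc002uvo.3 ', 'uc001mae.1']\n"
rows = list(csv.reader(io.StringIO(data), delimiter=','))

for field in rows[0]:
    print(repr(field))
# "['uc002uvo.3 '"
# " 'uc001mae.1']"
```

Each field keeps its bracket, its single quotes and, for the second one,
the leading space, and the comma itself is consumed as the delimiter.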
On 7/16/14, 4:04 AM, jarod...@libero.it wrote:
Hi there!!!
I have a file with this data
['uc002uvo.3 ', 'uc001mae.1']
['uc010dya.2 ', 'uc001kko.2']
['uc003ejx.2 ', 'uc010yfr.1']
['uc001bhk.2 ', 'uc003eib.2']
['uc001znc.2 ', 'uc001efn.2']
['uc002ycq.2 ', 'uc001vnh.2']
['uc001odf.1 ', 'uc002mwd.2']
['uc010jkn.1 ', 'uc010luk.1']
['uc003uhf.3 ', 'uc010tqd.1']
['uc002rue.3 ', 'uc001tex.2']
['uc011dtt.1 ', 'uc001lkv.1']
['uc003yyt.2 ', 'uc003mkl.2']
['uc003pkv.2 ', 'uc003ytw.2']
['uc010bhz.2 ', 'uc002kbt.1']
['uc001wnj.2 ', 'uc009wtj.1']
['uc011lyh.1 ', 'uc003jvb.2']
['uc002awj.1 ', 'uc009znm.1']
['uc010bft.2 ', 'uc002cxz.1']
['uc011mar.1 ', 'uc001lvb.1']
['uc001oxl.2 ', 'uc002lvx.1']
I want to remove the part after the dot, so I want to have a file with
this output:
['uc002uvo ', 'uc001mae']
['uc010dya ', 'uc001kko']
...
I tried to use a regular expression but I get a strange output:

with open("non_annotati.csv") as p:
    for i in p:
        lines = i.rstrip("\n").split("\t")
        mit = re.sub(r'(\.\d$)','',lines[0])
        mit2 = re.sub(r'(\.\d$)','',lines[1])
        print mit,mit2

uc003klv.2 uc010lxj
uc001tzy.2 uc011kzk
uc010qdj.1 uc001iku
uc004coe.2 uc002vmf
uc002dvw.2 uc004bxn
uc001dmp.2 uc001dmo
uc002rqd.2 uc010ynl
uc010cvm.1 uc002qjc
uc003ewy.3 uc003hgx
uc002ejy.2 uc003mvb
uc002fou.1 uc010ilx
uc003vhf.2 uc010qlo
uc003mix.2 uc010tdt
uc002nez.1 uc003wxe
uc011cpu.1 uc002keg
uc001ovu.2 uc011dne
uc010zfg.1 uc001jvq
uc010jlf.2 uc011azi
uc001ors.3 uc001vzx
uc010tyt.1 uc003vih
uc010fde.2 uc002xgq
uc010bit.1 uc003zle
uc010xcb.1 uc010wsg
uc011acg.1 uc009wlp
uc002bnj.2 uc004ckd
Where is the error? what is wrong in my regular expression code?
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor