Re: [Tutor] R: Tutor Digest, Vol 125, Issue 49

Dan Janzen Thu, 17 Jul 2014 23:33:29 -0700

****DISCLAIMER*****

I have deliberately not read any of the other replies to this problem somy answer may be totally redundant! (but here it is anyway...)

One of the first issues that had to be addressed is the fact that your"CSV" file is probably not in the format you assume it is. Every line isa list, not the traditional "string separated by commas" format that onenormally expects in a CSV file. One way to deal with that is to resavethe file as a .txt file and deal with each line as one would normally dowith a list, i.e. use list subscripting to manipulate each list elementwith regex code. Having said that, in the spirit of minimalism, thereare ways to deal with it as a CSV file as well.

First, import the csv module and use the reader() method to properlyaccess the contents.


importre

importcsv

withopen(/'non.csv'/, /'r'/) asp:

f = csv.reader(p, delimiter = /','/)

Then use a for loop to access each line and put the regex statements inthe print statement


forw inf:

print(re.sub(r/'(\.\d)'/,/''/,w[0]), re.sub(r/'(\.\d)'/,/''/, w[1]))

The regex statements access the list elements with subscripting. The "$"was not necessary and without it you get the desired results.


TO SUMMARIZE:

With the following contents of file named "non.csv":

['uc002uvo.3 ', 'uc001mae.1']

['uc010dya.2 ', 'uc001kko.2']

and the following code run in Eclipse:

##test.py

importre

importcsv

withopen(/'non.csv'/, /'r'/) asp:

f = csv.reader(p, delimiter = /','/)

forw inf:

print(re.sub(r/'(\.\d)'/,/''/,w[0]), re.sub(r/'(\.\d)'/,/''/, w[1]))

I get:

['uc002uvo ''uc001mae']

['uc010dya ''uc001kko']





On 7/16/14, 4:04 AM, jarod...@libero.it wrote:

Hi there!!!
I have a file  with this data
['uc002uvo.3 ', 'uc001mae.1']
['uc010dya.2 ', 'uc001kko.2']
['uc003ejx.2 ', 'uc010yfr.1']
['uc001bhk.2 ', 'uc003eib.2']
['uc001znc.2 ', 'uc001efn.2']
['uc002ycq.2 ', 'uc001vnh.2']
['uc001odf.1 ', 'uc002mwd.2']
['uc010jkn.1 ', 'uc010luk.1']
['uc003uhf.3 ', 'uc010tqd.1']
['uc002rue.3 ', 'uc001tex.2']
['uc011dtt.1 ', 'uc001lkv.1']
['uc003yyt.2 ', 'uc003mkl.2']
['uc003pkv.2 ', 'uc003ytw.2']
['uc010bhz.2 ', 'uc002kbt.1']
['uc001wnj.2 ', 'uc009wtj.1']
['uc011lyh.1 ', 'uc003jvb.2']
['uc002awj.1 ', 'uc009znm.1']
['uc010bft.2 ', 'uc002cxz.1']
['uc011mar.1 ', 'uc001lvb.1']
['uc001oxl.2 ', 'uc002lvx.1']

I want to replace of the things after the dots, so I want to have  a file with
this output:

['uc002uvo ', 'uc001mae']
['uc010dya ', 'uc001kko']
...

I try to use regular expression but I have  a strange output

with open("non_annotati.csv") as p:
     for i in p:
         lines= i.rstrip("\n").split("\t")
         mit = re.sub(r'(\.\d$)','',lines[0])
         mit2 = re.sub(r'(\.\d$)','',lines[1])
         print mit,mit2


uc003klv.2  uc010lxj
uc001tzy.2  uc011kzk
uc010qdj.1  uc001iku
uc004coe.2  uc002vmf
uc002dvw.2  uc004bxn
uc001dmp.2  uc001dmo
uc002rqd.2  uc010ynl
uc010cvm.1  uc002qjc
uc003ewy.3  uc003hgx
uc002ejy.2  uc003mvb
uc002fou.1  uc010ilx
uc003vhf.2  uc010qlo
uc003mix.2  uc010tdt
uc002nez.1  uc003wxe
uc011cpu.1  uc002keg
uc001ovu.2  uc011dne
uc010zfg.1  uc001jvq
uc010jlf.2  uc011azi
uc001ors.3  uc001vzx
uc010tyt.1  uc003vih
uc010fde.2  uc002xgq
uc010bit.1  uc003zle
uc010xcb.1  uc010wsg
uc011acg.1  uc009wlp
uc002bnj.2  uc004ckd


Where is the error? what is wrong in my regular expression code?


_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] R: Tutor Digest, Vol 125, Issue 49

Reply via email to