Hi all,

I have a very large .csv (correlationfile, which is 16 million lines long)
which I want to split into smaller .csvs. Each smaller .csv should be
created by searching for a value and printing every line which contains that
value; all of these values are listed in another .csv (vertexfile). I
think that I have an indentation problem or have made a mistake with my
loops, because I only get data in one of the output .csvs (outputfile), the
one for the first value. The other .csvs are empty.


Can somebody help me please?


Thanks so much!


Emma


import os

path = os.getcwd()
vertexfile = open(os.path.join(path,'vertices1.csv'),'r')
correlationfile = open(os.path.join(path,'practice.csv'),'r')
x = ''

for v in vertexfile:
    vs = v.replace('\n','')
    outputfile = open(os.path.join(path,vs+'.csv'),'w')
    for c in correlationfile:
        cs = c.replace('\n','').split(',')
        if vs == cs[0]: print vs
    outputfile.write(x)

outputfile.close()
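
For comparison, here is a minimal sketch of one way the split described above might be done. Two things stand out in the script as posted: a file object can only be looped over once, so after the first vertex the inner loop over correlationfile has nothing left to read, and x is never changed from '', so outputfile.write(x) writes nothing. The sketch reads the big file a single time and sends each line to the output file whose name matches its first field. The function name split_by_vertex and its parameters are made up for illustration, the code is written for Python 3, and it assumes the number of vertices is small enough to keep one output file open per vertex.

```python
import os


def split_by_vertex(vertex_path, correlation_path, out_dir):
    """Copy each correlation line into <out_dir>/<vertex>.csv,
    where <vertex> is the first comma-separated field of the line."""
    # Read the vertex values once, ignoring blank lines.
    with open(vertex_path) as vertexfile:
        vertices = [line.strip() for line in vertexfile if line.strip()]

    # Open one output file per distinct vertex (assumption: the number
    # of vertices is below the operating system's open-file limit).
    outputs = {}
    for v in vertices:
        if v not in outputs:
            outputs[v] = open(os.path.join(out_dir, v + '.csv'), 'w')

    try:
        # Single pass over the large file, so it is never re-read.
        with open(correlation_path) as correlationfile:
            for line in correlationfile:
                first = line.split(',', 1)[0].strip()
                out = outputs.get(first)
                if out is not None:
                    out.write(line)
    finally:
        # Close every output file, not just the last one opened.
        for out in outputs.values():
            out.close()
```

With the file names from the script it could be called as split_by_vertex('vertices1.csv', 'practice.csv', os.getcwd()). Reading the 16 million lines once, rather than once per vertex, is the main design choice here; rewinding with correlationfile.seek(0) before each inner loop would also work, but it rereads the whole file for every vertex.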
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
