Tommy Kaas wrote:

> With Stevens help about writing and Peters help about import codecs - and
> when I used \r\n instead of \r to give me new lines everything worked. I
> just thought that \n would be necessary? Thanks.
> Tommy

Newline handling varies across operating systems. If you are on Windows and open a file in text mode, your program sees plain "\n", but the data stored on disk is "\r\n". Most other OSes don't translate newlines at all.

If you always want "\r\n" you can rely on the csv module to write your data, since its writer ends every row with "\r\n" by default. The drawback is that you have to encode the strings manually:

    import csv
    import urllib2
    from BeautifulSoup import BeautifulSoup

    html = urllib2.urlopen(
        'http://www.kaasogmulvad.dk/unv/python/tabeltest.htm').read()
    soup = BeautifulSoup(html)

    # Binary mode, so the "\r\n" written by csv reaches the disk unchanged.
    with open('tabeltest.txt', "wb") as f:
        writer = csv.writer(f, delimiter="#")
        rows = soup.findAll('tr')
        for tr in rows:
            cols = tr.findAll('td')
            # The Python 2 csv module wants byte strings, so encode each cell.
            writer.writerow([unicode(col.string).encode("utf-8")
                             for col in cols])

PS: It took me some time to figure out how to deal with BeautifulSoup's flavour of unicode:

    >>> import BeautifulSoup as bs
    >>> s = bs.NavigableString(u"älpha")
    >>> s
    u'\xe4lpha'
    >>> s.encode("utf-8")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/pymodules/python2.6/BeautifulSoup.py", line 430, in encode
        return self.decode().encode(encoding)
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
    >>> unicode(s).encode("utf-8") # eureka
    '\xc3\xa4lpha'
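
For comparison, here is a minimal sketch of the same idea on Python 3, where the csv module accepts text directly and the per-cell encode() step is no longer needed. It is only an illustration, not a replacement for the script above: the helper name rows_to_csv_bytes and the sample rows are made up, and the scraping part is left out. It shows that the csv writer emits "\r\n" on every platform as long as newline translation is switched off with newline="".

```python
import csv
import io


def rows_to_csv_bytes(rows, delimiter="#", encoding="utf-8"):
    """Return the rows as delimiter-separated bytes with "\\r\\n" line endings.

    Hypothetical helper for illustration only (Python 3 assumed).
    """
    # newline="" disables newline translation, so the csv writer's default
    # line terminator ("\r\n") is stored exactly as written.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=delimiter)
    writer.writerows(rows)
    # Encode once at the end instead of cell by cell.
    return buffer.getvalue().encode(encoding)


if __name__ == "__main__":
    # Made-up sample data standing in for the table cells.
    sample = [["älpha", "beta"], ["gamma", "delta"]]
    print(repr(rows_to_csv_bytes(sample)))
```

When writing to a real file on Python 3, the equivalent would be open(path, "w", newline="", encoding="utf-8"), which likewise keeps the "\r\n" untouched on every OS.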