"MRAB" <[email protected]> wrote in message
news:[email protected]...
> [email protected] wrote:
>> I need to parse some ASCII text into 'word' sized chunks of text AND
>> collect the whitespace that separates the split items. By 'word' I mean
>> any string of characters separated by whitespace (newlines, carriage
>> returns, tabs, spaces, soft-spaces, etc). This means that my split text
>> can contain punctuation and numbers - just not whitespace.
>> The split( None ) method works fine for returning the word sized chunks
>> of text, but destroys the whitespace separators that I need.
>> Is there a variation of split() that returns delimiters as well as
>> tokens?
>>
> I'd use the re module:
>
> >>> import re
> >>> re.split(r'(\s+)', "Hello world!")
> ['Hello', ' ', 'world!']
Also, str.partition works for a single split, though it returns a tuple
instead of a list and splits only at the first occurrence of the separator.
>>> s = 'hello world'
>>> s.partition(' ')
('hello', ' ', 'world')
>>>
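
For text with more than one separator, here is a rough sketch of both
approaches. The helper names split_keep_whitespace and partition_all are
made up for illustration and are not part of the standard library. The
first assumes you want empty strings dropped (re.split yields them when
the text starts or ends with whitespace); the second assumes a single
fixed separator, since partition does not accept a pattern.

```python
import re

def split_keep_whitespace(text):
    """Return words and whitespace runs in order, dropping empty strings."""
    # The capturing group makes re.split keep the separators in the result.
    return [piece for piece in re.split(r'(\s+)', text) if piece]

def partition_all(text, sep=' '):
    """Apply str.partition repeatedly, since it only splits at the first sep."""
    pieces = []
    while text:
        head, found, text = text.partition(sep)
        # head is '' when text starts with sep; found is '' when sep is absent.
        pieces.extend(p for p in (head, found) if p)
    return pieces

sample = "Hello\tworld!\n  bye"
print(split_keep_whitespace(sample))
print(partition_all("a b  c"))
```

In both cases ''.join(result) gives back the original string, which is a
quick way to check that no whitespace was lost. Note that partition_all
returns consecutive separators as separate items, while the re version
groups each run of whitespace into one item.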
--Tim Arnold
--
http://mail.python.org/mailman/listinfo/python-list