Hello R users,

I have a regex (see [1]) for Apache log lines. I tried using R to parse
some data with it (only because I wanted to stay in R).
A sample line is shown in [2].

I saved the pattern in [1] into "~/tmp/a.txt" and the sample line in [2] into "/tmp/a.txt":

pat <- readLines("~/tmp/a.txt")
test <- readLines("/tmp/a.txt")
test
grep(pat,test)

returns integer(0)

The same pattern works in Python via re.match(....) (i.e., it does return groups).
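
For reference, a minimal sketch of the Python check mentioned above. The pattern is [1] copied verbatim; the sample from [2] is rejoined into a single line, which is an assumption about how the unwrapped log line looked. The variable names are illustrative only.

```python
import re

# Pattern [1], copied verbatim into a raw string so backslashes stay as-is.
pat = r'^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s([^\s]*)\s([^\s]*)\s\[([^\]]+)\]\s"([A-Z]*)\s([^\s]*)\s([^\s]*)"\s([^\s]+)\s(\d+)\s"(.*)"\s"(.*)"\s"(.*)"$'

# Sample [2], rejoined into one line (assumption: the line breaks in the
# post come from mail wrapping, and the URL break after "Linux%" has no space).
line = (
    '220.213.119.925 addons.mozilla.org - [10/Jan/2001:01:55:07 -0800] "GET '
    '/blocklist/3/%8ce33983c0-fd0e-11dc-12aa-0800200c9a66%7D/4.0b5/Fennec/'
    '20110217140304/Android_arm-eabi-gcc3/'
    'chrome:%2F%2Fglobal%2Flocale%2Fintl.properties/beta/Linux%'
    '202.6.32.9/default/default/6/6/1/ HTTP/1.1" 200 3243 "-" "Mozilla/5.0 '
    '(Android; Linux armv7l; rv:2.0b12pre) Gecko/20110217 Firefox/4.0b12pre '
    'Fennec/4.0b5" "BLOCKLIST_v3=110.163.217.169.1299218425.9706"'
)

# re.match anchors at the start of the string; the pattern's $ anchors the end.
m = re.match(pat, line)
if m is None:
    print("no match")
else:
    # One entry per parenthesised group in the pattern.
    for i, g in enumerate(m.groups(), start=1):
        print(i, g)
```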

Because the pattern is read with readLines, I do not have to escape the
backslashes myself. Do Python and R use different regex styles?

Cheers
Saptarshi

[1]
 
^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s([^\s]*)\s([^\s]*)\s\[([^\]]+)\]\s"([A-Z]*)\s([^\s]*)\s([^\s]*)"\s([^\s]+)\s(\d+)\s"(.*)"\s"(.*)"\s"(.*)"$

[2]
220.213.119.925 addons.mozilla.org - [10/Jan/2001:01:55:07 -0800] "GET
/blocklist/3/%8ce33983c0-fd0e-11dc-12aa-0800200c9a66%7D/4.0b5/Fennec/20110217140304/Android_arm-eabi-gcc3/chrome:%2F%2Fglobal%2Flocale%2Fintl.properties/beta/Linux%
202.6.32.9/default/default/6/6/1/ HTTP/1.1" 200 3243 "-" "Mozilla/5.0
(Android; Linux armv7l; rv:2.0b12pre) Gecko/20110217 Firefox/4.0b12pre
Fennec/4.0b5" "BLOCKLIST_v3=110.163.217.169.1299218425.9706"


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.