Hi 

I am just beginning to learn sed so that I can split my squid log files by virtual 
host, and I am having trouble getting my head around how its regular expressions 
work. 

I can already split the log file by virtual host using grep, e.g. "grep '^test' 
access-sed.txt > test.wilderness.log", because I have got squid to write the 
virtual host at the front of every log entry, as illustrated below. What I am having 
problems with is using sed to strip that virtual host prefix from the front of the 
log entries once they have been split, so that I can then run webalizer on the logs 
and get a separate webalizer report for each virtual host.

www.sydney.wilderness.org.au/docs/node.php? 203.48.59.163 - - [26/Aug/2003 08:09:56] 
"GET http://www.sydney.wilderness.org.au/docs/node.php? HTTP/1.1" 200 8719 
"http://www.sydney.wilderness.org.au/docs/module.php?mod=book" "Mozilla/5.0 (X11; U; 
Linux ppc) Gecko/20030714 Galeon/1.3.7 Debian/1.3.7.20030723-1" TCP_MISS:DIRECT

I have tried things like the following:

sed -e 's/^w.*\s//' > log

thinking that it would delete from the beginning of the line up to the first 
whitespace, but it deletes far more than that, as if every whitespace on the line 
were matched. How can I get sed to match only up to the first whitespace, or is 
there a better way to do this? I am having a bit of trouble understanding exactly 
how regular expressions work in sed.
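
The behaviour described above is most likely greedy matching: ".*" matches as much 
as it can, so the pattern runs from the leading "w" to the last whitespace on the 
line rather than the first. A minimal sketch of one way around it, assuming the 
virtual host prefix never contains a space and is separated from the rest of the 
entry by a single space. The two log lines and the host name 
test.wilderness.org.au are made up for illustration, and a literal space is used 
instead of \s because \s is a GNU sed extension.

```shell
#!/bin/sh
# Two made-up, shortened entries in the same layout as the sample above.
log='www.sydney.wilderness.org.au/docs/node.php? 203.48.59.163 - - [26/Aug/2003 08:09:56] "GET /docs/node.php? HTTP/1.1" 200 8719
test.wilderness.org.au/index.php 10.0.0.1 - - [26/Aug/2003 08:10:02] "GET /index.php HTTP/1.1" 200 512'

first=$(printf '%s\n' "$log" | head -n 1)

# Greedy: ".*" runs to the LAST space on the line, leaving only the final field.
greedy=$(printf '%s\n' "$first" | sed -e 's/^w.* //')

# "[^ ]*" cannot cross a space, so only the first field and one space are removed.
stripped=$(printf '%s\n' "$first" | sed -e 's/^[^ ]* //')

# Filter and strip in one pass: print only entries whose prefix starts with "test".
test_only=$(printf '%s\n' "$log" | sed -n -e '/^test/s/^[^ ]* //p')

printf 'greedy:    %s\n' "$greedy"
printf 'stripped:  %s\n' "$stripped"
printf 'test_only: %s\n' "$test_only"
```

If the real prefix is followed by a tab rather than a space, the bracket expression 
and the separator would need to change to match.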

Thanks for any help

John


