Clojurists - I'm fairly new to Clojure and didn't realize how broken I've
become using imperative languages all my life. I'm stumped as to how to
parse a Varnish (www.varnish-cache.org) log file using Clojure. The main
problem is that Varnish writes several log lines for a single request, and
those lines are interleaved with lines from other threads. These log files
can be several gigabytes in size, so a stable sort of the entire log by
thread id is out of the question.
Below I've included a small example log file and an example output Clojure
data structure. Let me thank everyone in advance for any hints / help they
can provide on this seemingly simple problem.
*Rules of the Varnish Log File*
- The first number on each line is the thread id (not unique and gets
reused frequently)
- Each ReqStart marks the start of a request and the last number on the
line is the unique transaction id (e.g. 118591777)
- ReqEnd denotes the end of the processing of the request by the thread
- Each line is written atomically; however, many threads write at once, so
the lines of one request are interspersed with lines from other requests
(threads)
- These log files can be VERY large (10+ Gigabytes in the case of my
application) so using a stable sort by thread id or anything that loads the
entire file into memory is out of the question.
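
To make the rules above concrete, here is a minimal sketch of how a single line might be split into its parts. The name parse-line and the returned keys are made up for illustration, and the regex assumes every line has the shape "thread-id tag marker payload" seen in the sample, where the third column (c or -) is simply ignored.

```clojure
(defn parse-line
  "Splits one log line into thread id, tag and payload.
  Returns nil for lines that do not look like '<thread> <tag> <marker> <payload>'."
  [line]
  (when-let [[_ tid tag _marker payload]
             (re-matches #"\s*(\d+)\s+(\S+)\s+(\S+)\s*(.*)" line)]
    {:thread  (Long/parseLong tid)   ; first number on the line
     :tag     (keyword tag)          ; e.g. :ReqStart, :RxHeader, :ReqEnd
     :payload payload}))             ; rest of the line, kept as a string

(parse-line "15 ReqStart c 10.102.41.121 4187 118591777")
;; => {:thread 15, :tag :ReqStart, :payload "10.102.41.121 4187 118591777"}
```

Lines that do not fit the assumed shape come back as nil, so a caller can drop them with keep.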
*Example Varnish Log file*
40 ReqEnd c 118591771 1350759605.775758028 1350759611.249602079 5.866879225 5.473801851 0.000042200
15 ReqStart c 10.102.41.121 4187 118591777
15 RxRequest c GET
15 RxURL c /json/engagement
15 RxHeader c host: www.example.com
30 ReqStart c 10.102.41.121 3906 118591802
15 RxHeader c Accept: application/json
30 RxRequest c GET
30 RxURL c /ws/boxtops/user/
30 RxHeader c host: www.example.com
15 ReqEnd c 118591777 1350759605.775758028 1350759611.249602079 5.866879225 5.473801851 0.000042200
30 RxHeader c Accept: application/xml
30 ReqEnd c 118591802 1350759611.326084614 1350759611.329720259 0.005002737 0.003598213 0.000037432
15 ReqStart c 10.102.41.121 4187 118591808
15 RxRequest c GET
15 RxURL c /ws/boxtops/user/
30 ReqStart c 10.102.41.121 3906 118591810
15 RxHeader c host: www.example.com
15 RxHeader c Accept: application/xml
30 RxRequest c GET
30 RxURL c /registration/success
30 RxHeader c host: www.example.com
46 TxRequest - GET
30 RxHeader c Accept: text/html
46 TxURL - /registration/success
15 ReqEnd c 118591808 1350759611.442447424 1350759611.444925785 0.016906023 0.002441406 0.000036955
30 ReqEnd c 118591810 1350759611.521781683 1350759611.525400877 0.098322868 0.003532171 0.000087023
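
One possible way to regroup a log like the sample above without sorting or loading the whole file is to keep only the in-flight requests in a map keyed by thread id and emit a request when its ReqEnd arrives. The sketch below is just that, a sketch: the names parse-line, xid-of, step and completed-requests are invented here, and it assumes each physical line is one complete log record (not wrapped as in this email) and that ReqStart always ends in a numeric transaction id.

```clojure
(require '[clojure.string :as str]
         '[clojure.java.io :as io])

(defn parse-line
  "Returns {:thread :tag :payload} for a log line, or nil if it does not match."
  [line]
  (when-let [[_ tid tag _marker payload]
             (re-matches #"\s*(\d+)\s+(\S+)\s+(\S+)\s*(.*)" line)]
    {:thread (Long/parseLong tid) :tag (keyword tag) :payload payload}))

(defn xid-of
  "Transaction id: the last whitespace-separated token of a ReqStart payload.
  Assumes that token is always numeric, as in the sample."
  [payload]
  (Long/parseLong (last (str/split payload #"\s+"))))

(defn step
  "Folds one parsed line into the state [in-flight completed].
  in-flight maps thread id -> request under construction; completed is
  either nil or an [xid request] pair finished by this line."
  [[in-flight _] {:keys [thread tag payload]}]
  (case tag
    ;; A new request replaces whatever the thread had open before.
    :ReqStart [(assoc in-flight thread {:xid (xid-of payload)
                                        :ReqStart [payload]})
               nil]
    ;; Close the thread's request, if it has one, and hand it out.
    :ReqEnd   (if-let [req (get in-flight thread)]
                [(dissoc in-flight thread)
                 [(:xid req) (-> req
                                 (dissoc :xid)
                                 (update-in [:ReqEnd] (fnil conj []) payload))]]
                [in-flight nil])
    ;; Any other tag is appended only if the thread has an open request.
    (if (contains? in-flight thread)
      [(update-in in-flight [thread tag] (fnil conj []) payload) nil]
      [in-flight nil])))

(defn completed-requests
  "Lazy seq of [xid request] pairs, in the order their ReqEnd lines appear."
  [lines]
  (->> lines
       (keep parse-line)
       (reductions step [{} nil])
       (keep second)))

(comment
  ;; "varnish.log" is a placeholder path.
  (with-open [rdr (io/reader "varnish.log")]
    (doseq [[xid req] (completed-requests (line-seq rdr))]
      (println xid req))))
```

Because reductions and keep are lazy, only the requests that are currently open should need to be held in memory, as long as the results are consumed inside with-open and not collected into one big map. Lines for threads with no open request (such as the leading ReqEnd on thread 40 or the TxRequest on thread 46 in the sample) are dropped by this sketch.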
*Desired Output*
{
 118591802
 {:ReqStart  ["10.102.41.121 3906 118591802"]
  :RxRequest ["GET"]
  :RxURL     ["/ws/boxtops/user/"]
  :RxHeader  ["host: www.example.com" "Accept: application/xml"]
  ;; or better yet:
  ;; :RxHeader {:host "www.example.com" :Accept "application/xml"}
  :ReqEnd    ["118591802 1350759611.326084614 1350759611.329720259 0.005002737 0.003598213 0.000037432"]}

 118591777
 {:ReqStart  ["10.102.41.121 4187 118591777"]
  :RxRequest ["GET"]
  :RxURL     ["/json/engagement"]
  :RxHeader  ["host: www.example.com" "Accept: application/json"]
  :ReqEnd    ["118591777 1350759605.775758028 1350759611.249602079 5.866879225 5.473801851 0.000042200"]}

 118591808
 {:ReqStart  ["10.102.41.121 4187 118591808"]
  :RxRequest ["GET"]
  :RxURL     ["/ws/boxtops/user/"]
  :RxHeader  ["host: www.example.com" "Accept: application/xml"]
  :ReqEnd    ["118591808 1350759611.442447424 1350759611.444925785 0.016906023 0.002441406 0.000036955"]}

 118591810
 {:ReqStart  ["10.102.41.121 3906 118591810"]
  :RxRequest ["GET"]
  :RxURL     ["/registration/success"]
  :RxHeader  ["host: www.example.com" "Accept: text/html"]
  :ReqEnd    ["118591810 1350759611.521781683 1350759611.525400877 0.098322868 0.003532171 0.000087023"]}
}
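
For the "or better yet" form of :RxHeader, the header strings could be turned into a map roughly as follows. This is only a sketch: headers->map is a made-up name, it assumes every header has the "name: value" shape shown in the sample, and a header that appears more than once would keep only its last value.

```clojure
(require '[clojure.string :as str])

(defn headers->map
  "Turns [\"host: www.example.com\" ...] into {:host \"www.example.com\" ...}.
  Strings without a colon are skipped; later duplicates overwrite earlier ones."
  [header-lines]
  (into {}
        (for [h header-lines
              ;; Split on the first colon only, so values may contain colons.
              :let [[k v] (str/split h #":\s*" 2)]
              :when v]
          [(keyword k) v])))

(headers->map ["host: www.example.com" "Accept: application/xml"])
;; => {:host "www.example.com", :Accept "application/xml"}
```

Keys keep the capitalisation found in the log (for example :Accept), so lower-casing the name before calling keyword may be worth considering if lookups should be case-insensitive.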