Hey clojurians,

I am using a Java library that reads WARC files (an Internet Archive 
format) for use with Hadoop. 

I recently upgraded this project's Clojure from 1.6 to 1.8 (to be able to 
use the recent, wonderful CIDER), and I ran into some quite strange 
behavior, which I managed to reduce to a simple example (on GitHub: 
<https://github.com/vadali/warc-cc/blob/upgrade-clojure/src/warc_cc/example.clj>
):


;; plain-text? is a predicate on the MIME type (defined elsewhere).
(defn mapper-map [this ^Text key ^ArchiveReader warc-value ^MapContext context]
  (doseq [^ArchiveRecord r warc-value]
    (let [header (.getHeader r)
          mime   (.getMimetype header)]
      (if (plain-text? mime)
        (println "got " (.available r))))))




With any Clojure version prior to 1.7.0-alpha6 (that is, alpha5 and 
below), this code works great, and I get plenty of "got %d" lines 
printed to the console, each with a different size.

However, after upgrading to 1.7.0-alpha6 or above, I get a constant "got 0" 
for every record in the file, and (obviously) nothing gets computed.

I tried to find the culprit using 
https://github.com/clojure/clojure/compare/clojure-1.7.0-alpha5...clojure-1.7.0-alpha6
but couldn't find an obvious problem. I thought I'd ask the list for 
pointers before I dive any deeper into this. 

If you wish to help with this problem by checking it on your machine, you 
can clone https://github.com/vadali/warc-cc/tree/upgrade-clojure (use the 
upgrade-clojure branch) and download the example file into the root 
directory of the cloned project using 

           s3cmd get s3://aws-publicdatasets/common-crawl/crawl-data/CC-MAIN-2013-48/segments/1387345775423/wet/CC-MAIN-20131218054935-00092-ip-10-33-133-15.ec2.internal.warc.wet.gz

and then run it with lein test warc-cc.example.

Thanks!

