Hi,
I'm trying to figure out how to load a huge file that contains some 800k
pairs of integers (two integers per line), which represent the edges of a
directed graph.
So if the ith line has x and y, it means that there is a directed edge from
vertex x to vertex y in the graph.
The goal is to load it into an array-of-arrays representation, where the
kth array contains all the nodes that the kth node has a directed edge to.
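As a small illustration of that representation (a sketch on made-up data,
assuming vertices are numbered from 0 and the vertex count is known up
front; the names `edges` and `adjacency` are hypothetical, and a vector of
vectors stands in for the array of arrays):

```clojure
;; Hypothetical toy input: each pair [x y] is a directed edge x -> y.
(def edges [[0 1] [0 2] [2 1]])

(defn adjacency
  "Builds a vector of vectors: index k holds the targets of edges leaving k.
  n is the number of vertices (assumed known, numbered 0..n-1)."
  [n edges]
  (reduce (fn [g [x y]] (update-in g [x] conj y))
          (vec (repeat n []))
          edges))

(adjacency 3 edges)
;; => [[1 2] [] [1]]
```

Vertex 1 has no outgoing edges here, so its entry is an empty vector.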
I've attempted multiple variants of with-open, reader, line-seq etc., but
almost always ended up with an OutOfMemoryError or something VERY slow.
My latest attempt, which also does not work on the large input:
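(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn load-graph [input-f]
  (with-open [rdr (io/reader input-f)]
    (->> (line-seq rdr)
         ;; parse each "x y" line into a pair of ints
         (map (fn [row]
                (let [[v1str v2str] (str/split row #"\s")]
                  [(Integer/parseInt v1str) (Integer/parseInt v2str)])))
         ;; build a map from source vertex to a vector of target vertices
         (reduce (fn [G [v1 v2]]
                   (if-let [vs (get G v1)]
                     (update-in G [v1] #(conj % v2))
                     (assoc G v1 [v2])))
                 {}))))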
I'm getting a bit frustrated, as there are Python and Go implementations
that load the graph in less than 5 seconds.
What am I doing wrong?
Thanks
--
László Török