Wow, this is even stranger now! I removed the call to count from ngrams* and the same thing still happens, except that now all 4 CPUs are busy! I don't understand...

Jim


On 24/03/13 13:54, Marko Topolnik wrote:
May or may not be related, but calling /count/ on a lazy sequence eagerly consumes the entire sequence.
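
To make the point about /count/ concrete, here is a minimal sketch. The names `realized` and `numbers` are hypothetical and not from the thread; the counter is there only to make realization visible. Whether this is what slows down ngrams* depends on the concrete type of tt-pairs, which the thread does not show.

```clojure
;; Illustrative only: `realized` and `numbers` are hypothetical names.
(def realized (atom 0))

;; A lazy sequence that records how many of its elements have been computed.
;; The source is (take 10 (iterate ...)), which is not chunked, so elements
;; are produced one at a time.
(def numbers
  (map (fn [x] (swap! realized inc) x)
       (take 10 (iterate inc 0))))

(println @realized)        ; prints 0: nothing has been computed yet
(println (count numbers))  ; prints 10: count has to walk the whole sequence
(println @realized)        ; prints 10: every element is now realized

;; counted? reports whether count is a constant-time operation.
(println (counted? [1 2 3]))            ; prints true
(println (counted? (map inc [1 2 3])))  ; prints false
```

For a collection where `counted?` is false, each call to count is linear in the number of remaining elements, so calling it once per step of a recursion over a long seq adds up to quadratic work.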

On Sunday, March 24, 2013 2:35:35 PM UTC+1, Jim foo.bar wrote:

    the operation is 'ngrams*' which doesn't care about what objects
    it finds in the seq...Typically you'd have characters or word
    ngrams but that doesn't mean you can't have any type of
    object...it simply doesn't care...

    (defn ngrams*
      "Create ngrams from a seq s.
       Pass a single string for character n-grams or a seq of strings
       for word n-grams."
      [s n]
      (when (>= (count s) n)
        (lazy-seq
          (cons (take n s) (ngrams* (next s) n)))))
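
    For comparison, a sketch of a variant that never counts the whole
    remaining seq. The name `ngrams-windowed` is hypothetical and this is
    not the code from the thread; it assumes n is a positive integer. It
    may or may not cure the hang described here.

    ```clojure
    ;; Hypothetical variant of ngrams*; not the original author's code.
    (defn ngrams-windowed
      "Lazily yields the n-grams of s. Assumes n is a positive integer."
      [s n]
      (lazy-seq
        (let [window (take n s)]
          ;; Only the window of at most n elements is counted, so the cost
          ;; of each step does not depend on how long the rest of s is.
          (when (= n (count window))
            (cons window (ngrams-windowed (rest s) n))))))

    (ngrams-windowed ["DET" "N" "P"] 2) ; => (("DET" "N") ("N" "P"))
    (ngrams-windowed "abcd" 3)          ; => ((\a \b \c) (\b \c \d))
    ```

    The built-in (partition n 1 s) yields the same windows for these
    inputs.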

    I cannot get the ngrams from the second case but yes they should
    be different (e.g. not=) but the final coll should be of the same
    size in both cases and should terminate in the same time...

    Jim


    On 24/03/13 13:22, Marko Topolnik wrote:
    What do you mean by "performing the same operation"? How can you
    perform the same operation on completely different objects? Do
    you mean that you don't have the exact same /ngrams*/ in the
    first and second case?

    On Sunday, March 24, 2013 1:45:37 PM UTC+1, Jim foo.bar wrote:

        Hi everyone,

        I'm experiencing some odd behaviour that I cannot justify so
        I thought someone smarter can help here...

        I'm reading in a file with 397,226 lines, each one containing
        a token-tag pair (e.g. The/DET). This gives me 397,226
        fully-realized TokenTagPair objects (records ->
        TokenTagPair{:token "The", :tag "DET"}).
        Let's call them 'tt-pairs'. Now, I've got a function that
        takes n-grams from a seq lazily. If I first extract all the
        tags and pass them to ngrams* then all is fine:

         (def tags (mapv :tag tt-pairs)) ;;all the 397,226 tags
         (def ntags (doall (ngrams* tags 2))) ;;returns quickly
         => (last ntags)
            ("N" "P")

        *HOWEVER,*

        (def ntt-pairs (doall (ngrams* tt-pairs 2))) ;hangs forever!

        One of my CPUs is busy but nothing happens! How is this
        justified? _Both collections are of the same size and I'm
        performing the same operation on them_...
        The only difference is that 'tags' contains String objects
        whereas 'tt-pairs' contains TokenTagPair objects... weird
        stuff, yes?


        Any ideas, anyone?

        Jim

    --
    --
    You received this message because you are subscribed to the Google
    Groups "Clojure" group.
    To post to this group, send email to [email protected]
    Note that posts from new members are moderated - please be
    patient with your first post.
    To unsubscribe from this group, send email to
    [email protected]
    For more options, visit this group at
    http://groups.google.com/group/clojure?hl=en
    ---
    You received this message because you are subscribed to the
    Google Groups "Clojure" group.
    To unsubscribe from this group and stop receiving emails from it,
    send an email to [email protected].
    For more options, visit https://groups.google.com/groups/opt_out.


