Here's a somewhat-optimized version of the code:

#lang racket/base
(require racket/string racket/vector racket/port)

(define h (make-hash))

(time
 (for* ([l (in-lines)]
        [w (in-list (string-split l))]
        [w* (in-value (string-downcase w))])
   (hash-update! h w* add1 0)))

(define v
  (time
   (for/vector #:length (hash-count h)
       ([(k v) (in-hash h)])
     (cons k v))))
(time (vector-sort! v > #:key cdr))
(define p (current-output-port) #;(open-output-nowhere))
(time
 (for ([pair (in-vector v)])
   (write-string (car pair) p)
   (write-string (number->string (cdr pair)) p)
   (newline p)))

It's much more imperative, but also pretty nice and compact. The
`printf` optimization is significant for that portion of the program,
but that isn't much of the running time. The overall running time for
10 copies of the KJV is about 9 seconds on my laptop.

I think the remaining difference between Racket and other languages is
likely the `string-split` and `string-downcase` functions, plus the
relatively-inefficient string representation that Racket uses.

Sam


On Thu, Mar 18, 2021 at 10:28 AM Pawel Mosakowski <[email protected]> wrote:
>
> Hi David,
>
> Yes, the 21 seconds includes the interpreter startup time. I have done a 
> simple test to see how long it takes:
>
> $ time racket -e '(displayln "Hello, world")'
> Hello, world
>
> real    0m0.479s
> user    0m0.449s
> sys    0m0.030s
>
> I have also put my code inside a main function and profiled it:
>
> Profiling results
> -----------------
>   Total cpu time observed: 20910ms (out of 20970ms)
>   Number of samples taken: 382 (once every 55ms)
>   (Hiding functions with self<1.0% and local<2.0%: 1 of 12 hidden)
>
> ==============================================================
>                                   Caller
>  Idx    Total         Self      Name+src                Local%
>         ms(pct)       ms(pct)     Callee
> ==============================================================
>  [1] 20910(100.0%)     0(0.0%)  [running body] 
> ...word-occurences-profile.rkt":##f
>                                   profile-thunk [2]     100.0%
> --------------------------------------------------------------
>                                   [running body] [1]    100.0%
>  [2] 20910(100.0%)     0(0.0%)  profile-thunk 
> ...ket/pkgs/profile-lib/main.rkt:9:0
>                                   run [3]               100.0%
> --------------------------------------------------------------
>                                   profile-thunk [2]     100.0%
>  [3] 20910(100.0%)     0(0.0%)  run 
> ...share/racket/pkgs/profile-lib/main.rkt:39:2
>                                   main [4]              100.0%
> --------------------------------------------------------------
>                                   run [3]               100.0%
>  [4] 20910(100.0%)    50(0.2%)  main 
> ...cket/count-word-occurences-profile.rkt:5:0
>                                   read-from-stdin-it [5] 98.5%
>                                   ??? [6]                 0.2%
> --------------------------------------------------------------
>                                   main [4]              100.0%
>  [5] 20606(98.5%)  11796(56.4%) read-from-stdin-it 
> ...-occurences-profile.rkt:19:6
>                                   internal-split [7]     42.8%
> --------------------------------------------------------------
>                                   main [4]              100.0%
>  [6]    51(0.2%)       0(0.0%)  ??? 
> ...cket/collects/racket/private/sort.rkt:369:3
>                                   generic-sort/key [8]  100.0%
> --------------------------------------------------------------
>                                   read-from-stdin-it [5]100.0%
>  [7]  8810(42.1%)   3528(16.9%) internal-split 
> ...collects/racket/string.rkt:117:0
>                                   regexp-split [9]       59.9%
> --------------------------------------------------------------
>                                   ??? [6]               100.0%
>  [8]    51(0.2%)       0(0.0%)  generic-sort/key 
> .../racket/private/sort.rkt:156:2
>                                   copying-mergesort [10]100.0%
> --------------------------------------------------------------
>                                   internal-split [7]    100.0%
>  [9]  5282(25.3%)   2810(13.4%) regexp-split 
> ...ts/racket/private/string.rkt:338:2
>                                   loop [11]              46.8%
> --------------------------------------------------------------
>                                   generic-sort/key [8]   10.0%
>                                   copying-mergesort [10] 90.0%
> [10]    51(0.2%)      51(0.2%)  copying-mergesort 
> ...racket/private/sort.rkt:129:8
>                                   copying-mergesort [10] 90.0%
> --------------------------------------------------------------
>                                   regexp-split [9]      100.0%
> [11]  2471(11.8%)   2471(11.8%) loop 
> ...t/collects/racket/private/string.rkt:169:7
> --------------------------------------------------------------
>
> Kind regards,
> Pawel
>
>
> On Thursday, March 18, 2021 at 2:09:35 PM UTC [email protected] wrote:
>>
>> Hi Pawel,
>>
>> I'll take a look at the code later, but did that 21 seconds include startup 
>> time for the interpreter?
>>
>> On Thu, Mar 18, 2021, 9:24 AM Pawel Mosakowski <[email protected]> wrote:
>>>
>>> Hello,
>>>
>>> I am a Racket beginner and I have come across this article:
>>>
>>> https://benhoyt.com/writings/count-words/
>>>
>>> This is my attempt at solving the challenge:
>>>
>>> https://pastebin.com/kL16w5Hc
>>>
>>> However when I have benchmarked it, it takes ~21 seconds to run compared to 
>>> the Python and Ruby versions which take around 3-4 seconds.
>>>
>>> I understand that both Ruby and Python probably have the string operations 
>>> and hash tables implemented in optimized C but is there anything I can do 
>>> to improve performance of my program?
>>>
>>> Many thanks for all help and suggestions.
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups 
>>> "Racket Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to [email protected].
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/racket-users/118c1340-66d1-421d-92a4-6b66c56cb88fn%40googlegroups.com.
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/racket-users/09c58a34-bd2d-49e7-bfbd-d3253c1e6dd1n%40googlegroups.com.

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/CAK%3DHD%2BaA8%3D_mMp8s_DL5D3vNJEmmW9v7FP9sRXqdGbM9i0mASw%40mail.gmail.com.

Reply via email to