This is a slightly different result, as this time I measure elapsed time
(see Appendix 1, and please excuse the not-so-nice code) as opposed to
clock time. Results are similar (unless you have more processes than
cores). I am planning to release the code on GitHub soon.
+----------+--------+-------+
| # of | seq | rdm |
|processes | | |
+----------+--------+-------+
|1 | 1.00 | 1.00 |
+----------+--------+-------+
|2 | 1.96 | 1.75 |
+----------+--------+-------+
|4 | 3.20 | 1.83 |
+----------+--------+-------+
|8 | 3.78 | 1.83 |
+----------+--------+-------+
|16 | 3.61 | 1.81 |
+----------+--------+-------+
|32 | 3.56 | 1.81 |
+----------+--------+-------+
This is great stuff.
> Let me make sure I read it correctly.
> Having 2 processes makes a value 1.96 times higher than with 1 process in
> the sequential case, and 1.75 times higher in the random case, but what is that
> value being measured?
> Some form of throughput I suppose and not time, right?
>
I think you could call that a normalized throughput. Here are more details.
The first column is the number of separate O/S test processes running
concurrently in the background, started by a shell script at virtually the
same time. Then I collected the output, which simply logs how long it takes
to iterate through 40 MB of memory in a sequential or random manner. The
second and third columns are
number_of_processes / elapsed_time * elapsed_time_from_first_row_process
for sequential and random access respectively. Under ideal conditions,
elapsed_time would stay constant as we use more and more cores/CPUs.
> Indeed. It also means single threaded linear access isn't going to be very
> much faster if you add more threads.
> BTW, are you sure the threads were running in parallel on separate cores
> and not just concurrently on a smaller number of cores?
> As you said, this should be dependent on hardware, and running this on an
> actual server machine would be just as interesting.
>
I wanted to see the worst case, separate processes and memory, which was
simplest to implement. Yes, I am sure the cores were utilized by the OS as
the number of processes increased: I watched the MBP Activity Monitor and
CPU History, which was hitting 100%. Also, I did not optimize the output
(it should not matter).
One interesting thing I heard somewhere: O/Ss bounce long-lasting,
CPU-intensive threads between cores in order to equalize the heat generated
in the silicon. I did not observe that, but the longest-running test took
only about 15 sec on a single core.
Thx,
Andy
Appendix 1:
/* i_array, len, des, m, and tdiff() are defined in the surrounding code. */
{
    /* Random access: 100 passes of len stores at rand()-chosen indices. */
    int n;
    int j;
    struct timeval begin, end;

    gettimeofday(&begin, NULL);
    for (n = 0; n < 100; n++)
        for (j = 0; j < len; j++)
            i_array[rand() % len] = j;
    gettimeofday(&end, NULL);
    printf("[:rdm %s %d %d %f]\n", des, len, m, tdiff(&begin, &end));
}
{
    /* Sequential access: 100 passes writing the array front to back. */
    int n;
    int j;
    struct timeval begin, end;

    gettimeofday(&begin, NULL);
    for (n = 0; n < 100; n++)
        for (j = 0; j < len; j++)
            i_array[j] = j;
    gettimeofday(&end, NULL);
    printf("[:seq %s %d %d %f]\n", des, len, m, tdiff(&begin, &end));
}
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en