Yes, that is possible.
The simulated cores are already generated functions in C.
It’s my experience that if you can leverage an existing concurrency framework
then life is better for everyone; Go’s is fairly robust, so this is an
experiment to see how close I can get. A real simulation system has to have a
good inner engine, but it also needs visualisation, logging, stats
gathering, and so on, all of which are simpler to write and get correct if one
simply leverages a working infrastructure.
It would be very simple to generate Go functions instead of C functions.
Plus I get the other advantages of Go over C (better library, more concise,
etc.).
So, if it works fast enough, life is good.
If it doesn’t, then I can try a custom barrier (as Robert has pointed out) and
measure again.
And if that isn’t fast enough, then back to C. Which would be regrettable.
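To make the thing being measured concrete, here is a rough sketch of the baseline shape: one goroutine per simulated core, all meeting at a reusable barrier at the end of every simulated cycle. This is an assumption about the structure, not the real generated code, and it is not the custom barrier Robert suggested. The names Barrier, NewBarrier and runCycles are hypothetical, and the barrier is built only from the standard sync.Mutex and sync.Cond.

```go
package main

import (
	"fmt"
	"sync"
)

// Barrier is a hypothetical reusable (cyclic) barrier for n goroutines.
type Barrier struct {
	mu    sync.Mutex
	cond  *sync.Cond
	n     int // number of participants
	count int // how many have arrived in the current generation
	gen   int // generation number, bumped each time the barrier opens
}

func NewBarrier(n int) *Barrier {
	b := &Barrier{n: n}
	b.cond = sync.NewCond(&b.mu)
	return b
}

// Wait blocks until all n participants have called Wait for this generation.
func (b *Barrier) Wait() {
	b.mu.Lock()
	defer b.mu.Unlock()
	gen := b.gen
	b.count++
	if b.count == b.n {
		// Last arrival: reset for the next cycle and release everyone.
		b.count = 0
		b.gen++
		b.cond.Broadcast()
		return
	}
	for gen == b.gen {
		b.cond.Wait()
	}
}

// runCycles runs `cores` goroutines in lockstep for `cycles` cycles and
// returns how many steps each one executed.
func runCycles(cores, cycles int) []int {
	steps := make([]int, cores)
	b := NewBarrier(cores)
	var wg sync.WaitGroup
	for i := 0; i < cores; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for c := 0; c < cycles; c++ {
				steps[id]++ // stand-in for one cycle of simulated work
				b.Wait()    // nobody starts cycle c+1 until all finish cycle c
			}
		}(i)
	}
	wg.Wait()
	return steps
}

func main() {
	fmt.Println(runCycles(4, 1000))
}
```

The mutex-and-condition-variable version is the easy one to get right. A spinning barrier built on sync/atomic would be the obvious next thing to measure if this one proves too slow.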
As to linked lists:
They’re a wonderful thing in a single-core implementation. Not mentioned in
the problem statement was that although the population of simulatable objects
(“agents”) doesn’t change during runtime, some may do a lot less work than
others. A memory bank only does something when a miss has filtered through the
intervening caches, for example. Just ‘polling’ an agent to see whether it’s
active takes appreciable time, so cluttering up the collection of agents with
agents that have a 0.1% duty cycle is a Bad Thing for performance; you need to
remove them from the collection.
In a single-processor sequential simulation, a linked list lets you remove an
agent very cheaply. In Go, the built-in slice lets me swap an agent I no longer
want with the last one and shrink the length by one. This is no quicker than
cutting an agent out of a linked list, and walking down an array of pointers
still requires me to fetch a pointer before playing with the agent. (Experiments
in C with slice-like data structures showed no performance difference
compared with the linked list.) (Agents removed from the active collection are
re-inserted when something arrives for them, of course.)
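The slice trick described above can be sketched as follows. This is a minimal illustration only: the Agent type and the removeAt and wake helpers are hypothetical names, not taken from the real simulator, and the order of the active set is assumed not to matter.

```go
package main

import "fmt"

// Agent is a hypothetical stand-in for a simulatable object.
type Agent struct {
	Name string
}

// removeAt drops active[i] by overwriting it with the last element and
// shrinking the slice by one. It is O(1) but does not preserve order.
func removeAt(active []*Agent, i int) []*Agent {
	last := len(active) - 1
	active[i] = active[last]
	active[last] = nil // clear the vacated slot so it holds no stale pointer
	return active[:last]
}

// wake puts an idle agent back into the active set when work arrives for it.
func wake(active []*Agent, a *Agent) []*Agent {
	return append(active, a)
}

func main() {
	cpu0 := &Agent{Name: "cpu0"}
	mem0 := &Agent{Name: "mem0"}
	cpu1 := &Agent{Name: "cpu1"}

	active := []*Agent{cpu0, mem0, cpu1}
	active = removeAt(active, 1) // mem0 has gone idle
	active = wake(active, mem0)  // a miss arrives for mem0

	for _, a := range active {
		fmt.Println(a.Name)
	}
}
```

Because removeAt moves the last agent into the vacated slot, a loop that removes agents while iterating has to revisit index i after a removal rather than advancing past it.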
— P
> On Jan 17, 2021, at 10:46 AM, Bakul Shah <[email protected]> wrote:
>
> I’d be tempted to just use C for this. That is, generate C code from a
> register level description of your N simulation cores and run that. That is
> more or less what “cycle based” verilog simulators (used to) do. The code gen
> can also split it M ways to run on M physical cores. You can also generate
> optimal synchronization code.
>
> With linked lists you’re wasting half of the memory bandwidth and potentially
> the cache. Your # of elements are not going to change so a linked list
> doesn’t buy you anything. An array is ideal from a performance PoV.