In the database space, sim testing is more widely known. Databases are 
complex, concurrent, and required to be correct (yet many aren't; see Jepsen 
<https://jepsen.io/>). A challenging experience for a database developer is 
seeing a rare bug in a continuous integration test suite that only happens 
about once per month while maxing out all cores 24x7. It is difficult to 
fix a bug that is not readily reproducible. Sim testing is a solution to 
this problem. Until this announcement I thought the only option was 
Antithesis, which starts at $168,000/year <https://antithesis.com/pricing/> (as 
of 2025-09). Antithesis also has the advantage of being coverage guided, and 
its fuzzing is more powerful than what Go currently has. The Go fuzzer is 
coverage guided too, but it can't usefully be run on non-deterministic code. 
The idea here is that non-deterministic code can be made deterministic (via 
simulation) so that it can be fuzzed. And this can be done in a way that 
abstracts that complexity away from user code (very much in line with Go's 
design philosophy).
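
To make that concrete, here is a minimal sketch of the idea. It is not how 
gosim or Antithesis work internally. It assumes a toy scheduler 
(`runInterleaved`, a name I made up) that decides which of two tasks runs next 
using only a seeded random number generator, so an interleaving that loses an 
update can be replayed exactly from its seed:

```go
package main

import (
	"fmt"
	"math/rand"
)

// step is one atomic action of a simulated task.
type step func()

// runInterleaved (hypothetical helper) runs the steps of two tasks in an
// order chosen only by the seed, so the same seed always produces the same
// interleaving.
func runInterleaved(seed int64, a, b []step) {
	rng := rand.New(rand.NewSource(seed))
	for len(a) > 0 || len(b) > 0 {
		// Only consult the RNG when both tasks still have work to do.
		pickA := len(b) == 0 || (len(a) > 0 && rng.Intn(2) == 0)
		if pickA {
			a[0]()
			a = a[1:]
		} else {
			b[0]()
			b = b[1:]
		}
	}
}

// lostUpdate simulates two tasks that each do a non-atomic increment (read,
// then write) of a shared counter and returns the final value. 2 is correct;
// 1 means one update was lost.
func lostUpdate(seed int64) int {
	counter := 0
	var tmpA, tmpB int
	a := []step{func() { tmpA = counter }, func() { counter = tmpA + 1 }}
	b := []step{func() { tmpB = counter }, func() { counter = tmpB + 1 }}
	runInterleaved(seed, a, b)
	return counter
}

func main() {
	for seed := int64(0); seed < 10; seed++ {
		fmt.Printf("seed=%d counter=%d\n", seed, lostUpdate(seed))
	}
}
```

Because the seed is the only source of non-determinism, a native Go fuzz 
target could take the seed as an int64 input and call lostUpdate; any failing 
seed it finds would then fail the same way on every rerun.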

I hope gosim has a future, because I'd like to use it with new Go versions. 
There will definitely be some tension between gosim and the desire to stop 
people from depending on internal details of the runtime 
<https://github.com/golang/go/issues/67401>. That feels like an engineering 
discussion worth having. Maybe the proposal process 
<https://github.com/golang/proposal#readme> could be useful for building 
consensus, but it's not completely clear to me what the proposal would be.

Here are the uses of sim testing in industry that I'm aware of:

   - FoundationDB <https://apple.github.io/foundationdb/testing.html>
   - TigerBeetle 
   <https://tigerbeetle.com/blog/2023-07-06-simulation-testing-for-liveness/>
   - FrostDB <https://www.polarsignals.com/blog/posts/2025/07/08/dst-rust>
   - Paxos Made Live 
   <https://www.cs.utexas.edu/~lorenzo/corsi/cs380d/papers/paper2-1.pdf>

Seth
On Wednesday, August 27, 2025 at 10:41:06 AM UTC-7 Jason E. Aten wrote:

> Hi Jelle,
>
> Gosim is all the more impressive now that I've tried my hand at writing
> tests with synctest. It is indeed very, very hard to get strict determinism
> out of the Go runtime. The fact that Gosim emulates Linux at the system
> call level is beyond impressive. I fully get now why Gosim has to
> translate all Go source to use the Gosim deterministic runtime.
>
> Gosim is obviously the result of a lot of hard and painstaking work.
> Thank you, and congratulations on getting it to this point.
>
> I'm able to run `gosim test` when built under Go 1.23.5, but Go 1.24 and
> 1.25 seem to have difficulties. I think this is because of the linkname
> restrictions (or changes?) that make some things inaccessible.
>
> ~/go/src/github.com/jellevandenhooff/gosim/examples/etcd (main) $ gosim test -v
>
> 2025/08/27 12:28:46 ERROR missing function body pkg=internal/runtime/sys name=GetCallerPC
>
> Does gosim strictly need linkname magic? Is there some approach to fixing
> Gosim to work with either of the last two Go versions, given the new
> linkname restrictions and/or updates? Have you been able to make Gosim work
> with Go 1.25, for instance?
>
> Thanks!
>
> Jason
>
>
> On Tuesday, December 10, 2024 at 11:41:15 PM UTC Jelle van den Hooff wrote:
>
>> Hi Roger, thanks for the compliment.
>>
>> Yes, there is quite some overlap with the new "testing/synctest" package. 
>> The tests you can write with synctest I think you can also write with 
>> gosim, as gosim's scheduler does what synctest does: If threads are paused, 
>> synctest and gosim both advance an internal clock, and so tests that take a 
>> long wall-clock time can be fast in both.
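
A rough sketch of that clock-advancing idea, heavily simplified: it assumes 
callback-style timers instead of real goroutines, and `VirtualClock`, 
`AfterFunc`, and `Run` are hypothetical names for this example, not the API of 
synctest or gosim. Whenever nothing is runnable, the clock jumps straight to 
the next deadline, so 24 simulated hours finish without any real waiting:

```go
package main

import (
	"container/heap"
	"fmt"
	"time"
)

// timer is a callback scheduled at a virtual deadline.
type timer struct {
	at  time.Duration // virtual time since the start of the run
	seq int           // insertion order, breaks ties deterministically
	fn  func()
}

// timerHeap is a min-heap of timers ordered by (at, seq).
type timerHeap []timer

func (h timerHeap) Len() int { return len(h) }
func (h timerHeap) Less(i, j int) bool {
	if h[i].at != h[j].at {
		return h[i].at < h[j].at
	}
	return h[i].seq < h[j].seq
}
func (h timerHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *timerHeap) Push(x any)   { *h = append(*h, x.(timer)) }
func (h *timerHeap) Pop() any {
	old := *h
	t := old[len(old)-1]
	*h = old[:len(old)-1]
	return t
}

// VirtualClock (hypothetical) keeps simulated time and pending timers.
type VirtualClock struct {
	now    time.Duration
	timers timerHeap
	seq    int
}

// Now returns the current virtual time.
func (c *VirtualClock) Now() time.Duration { return c.now }

// AfterFunc schedules fn to run d after the current virtual time.
func (c *VirtualClock) AfterFunc(d time.Duration, fn func()) {
	c.seq++
	heap.Push(&c.timers, timer{at: c.now + d, seq: c.seq, fn: fn})
}

// Run fires timers in deadline order, advancing the virtual clock to each
// deadline instead of sleeping, until no timers remain.
func (c *VirtualClock) Run() {
	for c.timers.Len() > 0 {
		t := heap.Pop(&c.timers).(timer)
		c.now = t.at
		t.fn()
	}
}

// simulateDay runs an hourly tick for 24 virtual hours.
func simulateDay() (ticks int, elapsed time.Duration) {
	clock := &VirtualClock{}
	var tick func()
	tick = func() {
		ticks++
		if ticks < 24 {
			clock.AfterFunc(time.Hour, tick)
		}
	}
	clock.AfterFunc(time.Hour, tick)
	clock.Run()
	return ticks, clock.Now()
}

func main() {
	start := time.Now()
	ticks, elapsed := simulateDay()
	fmt.Println("ticks:", ticks, "virtual:", elapsed, "real:", time.Since(start))
}
```

The hard part that synctest and gosim solve, and this sketch does not, is 
knowing when every goroutine is blocked so that it is safe to advance the 
clock.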
>>
>> I think synctest is an interesting point in the design space. In Go tests 
>> you can use interfaces to mock the OS, the network, etc., but time and 
>> scheduling are impossible to mock because you don't know when goroutines are 
>> blocked. Synctest fixes that, and once you have synctest, you can test 
>> almost all the same scenarios as in Gosim _if_ you mock all interactions 
>> with the OS and avoid using any shared global state.
>>
>> The trade-off is where the complexity is: With mocks and synctest you do 
>> not need significant changes to the runtime, but none of your code (or your 
>> dependencies) can use standard OS calls. With Gosim, the program under test 
>> does not need to change, but you rely on a more complicated mocking and 
>> rewriting mechanism. Practically this means Gosim can test programs using 
>> Bolt (https://pkg.go.dev/go.etcd.io/bbolt, 
>> https://github.com/jellevandenhooff/gosim/blob/main/examples/bolt/bolt_test.go) 
>> and test how Bolt behaves when a machine restarts without having to change 
>> any of the code in Bolt.
>>
>> You could perhaps reuse the underlying mocks (for a network that drops 
>> packets, etc.) between Gosim and synctest. However, Gosim currently 
>> integrates at the syscall layer, so the interface exposed is quite 
>> different than the high-level mocks you would need to replace os.File, 
>> net.Conn, etc. In an earlier version of Gosim I tried mocking those 
>> higher-level interfaces, but I found it quite difficult: The API surface is 
>> broad and not nearly as well-defined as POSIX. Simulating that API 
>> accurately is important for testing error handlers that match error types 
>> returned by a net.Conn.
>>
>> Gosim also adds determinism (running the same test twice results in the 
>> same output), which is helpful if you are trying to debug rare failures. You 
>> can imagine future Antithesis-like tricks to test behavior: run with the same 
>> seed up to an interesting simulated time, and then change the seed. I think 
>> adding that to synctest would be quite difficult.
>>
>> This blog post 
>> https://www.polarsignals.com/blog/posts/2024/05/28/mostly-dst-in-go 
>> describes yet another approach: running Go with the faketime build tag (as 
>> used on the Go playground) inside of wasm to get deterministic execution 
>> and standard OS calls by interposing at the wasm-syscall boundary, which 
>> means the program needs to build under wasm.
>>
>> Jelle
>> On Tuesday, December 10, 2024 at 2:53:22 PM UTC-8 roger peppe wrote:
>>
>> Impressive stuff! Some potentially interesting overlap with the new 
>> "synctest" package. Do you have any thoughts on that?
>>
>>
>> On Tue, 10 Dec 2024 at 17:41, Jelle van den Hooff <[email protected]> 
>> wrote:
>>
>> Hi golang-nuts,
>>
>> I am excited to share Gosim: simulation testing for Go (
>> https://github.com/jellevandenhooff/gosim). Gosim is a project I have 
>> been working on for quite a while that aims to make testing distributed 
>> systems easier. It implements simulation testing as popularized by 
>> FoundationDB (https://www.youtube.com/watch?v=4fFDFbi3toc).
>>
>> Gosim runs mostly-standard Go code in its simulated environment. It 
>> supports standard packages like `os`, `net`, gRPC, protobuf, and more; the 
>> largest real-world program I have successfully run is etcd. Inside of the 
>> simulation, Gosim implements fake time, network, disks, and machines. Tests 
>> can manipulate the network to e.g. partition a host, or restart a machine, 
>> and verify that code still behaves as it should -- and all that without 
>> needing to manage real VMs or containers.
>>
>> Gosim works by source-translating Go to redirect all references to 
>> concurrency primitives, the operating system, and non-deterministic code to 
>> its own runtime. So `go foo()` becomes `gosimruntime.Go(foo)`, etc. Then, 
>> Gosim implements (a subset of) the Linux system call interface to simulate 
>> disk and network. More details on the design are in 
>> https://github.com/jellevandenhooff/gosim/blob/main/docs/design.md. 
>> Gosim's system call implementations are (currently) in 
>> https://github.com/jellevandenhooff/gosim/blob/main/internal/simulation/os_linux.go
>> .
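
As an illustration of that translation step, here is a small sketch using the 
standard go/ast, go/parser, and go/printer packages. It is not gosim's 
translator: it only handles `go f()` statements with no arguments that sit 
directly in a block, it does not add the import the output would need, and 
`rewriteGoStmts` is a name invented for this example. The `gosimruntime.Go` 
target is taken from the description above.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
)

// rewriteGoStmts (hypothetical helper) parses src and replaces each
// zero-argument `go f()` statement found directly in a block with the call
// `gosimruntime.Go(f)`. It returns the rewritten source and the number of
// statements it replaced. Statements with arguments are left untouched.
func rewriteGoStmts(src string) (string, int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, parser.ParseComments)
	if err != nil {
		return "", 0, err
	}

	count := 0
	ast.Inspect(file, func(n ast.Node) bool {
		block, ok := n.(*ast.BlockStmt)
		if !ok {
			return true
		}
		for i, stmt := range block.List {
			g, ok := stmt.(*ast.GoStmt)
			if !ok || len(g.Call.Args) != 0 {
				continue
			}
			// Build the expression statement gosimruntime.Go(<fun>).
			block.List[i] = &ast.ExprStmt{X: &ast.CallExpr{
				Fun: &ast.SelectorExpr{
					X:   ast.NewIdent("gosimruntime"),
					Sel: ast.NewIdent("Go"),
				},
				Args: []ast.Expr{g.Call.Fun},
			}}
			count++
		}
		return true
	})

	var buf bytes.Buffer
	if err := printer.Fprint(&buf, fset, file); err != nil {
		return "", 0, err
	}
	return buf.String(), count, nil
}

func main() {
	src := `package demo

func foo() {}

func start() {
	go foo()
}
`
	out, n, err := rewriteGoStmts(src)
	if err != nil {
		panic(err)
	}
	fmt.Printf("rewrote %d statement(s):\n%s", n, out)
}
```

A real translator also has to cover calls with arguments, `go` statements in 
case clauses, channels, `select`, and so on; the sketch only shows the shape 
of the rewrite.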
>>
>> To give you a taste of the kinds of tests Gosim can write, below is a 
>> snippet of a test running Etcd (taken from 
>> https://github.com/jellevandenhooff/gosim/blob/main/examples/etcd/etcd_test.go). 
>> The test creates several Gosim machines that have their own network stack, 
>> disk, global variables, and more, and lets them run and communicate. From 
>> the point of view of the code, each Etcd instance runs on its own machine 
>> and is its own independent process. The simulation however runs all 
>> machines in the same Go process so that you can easily debug what happens, 
>> the test is reproducible, and overhead is low.
>>
>> I have tried to make Gosim easy to use. To get started you can run a test 
>> by replacing `go test ...` with `gosim test`. If Gosim might be useful for 
>> you, I would be happy to chat and prioritize future features. Some things I 
>> would certainly like to add are support for running main() functions; 
>> simulating clock drift; support for running different versions of code; and 
>> built-in simulation of common cloud APIs like S3.
>>
>> Gosim is experimental, so it will change and break, and only runs Go 
>> code. So it can test systems that are written in Go, but it will not work 
>> with external dependencies. I have some ideas on using e.g. Wazero to run 
>> SQLite or Postgres inside of the Go process but those are, well, still 
>> ideas.
>>
>> Jelle
>>
>> // TestEtcd runs a 3 node etcd cluster, partitions the network between the
>> // nodes, and makes sure key-value puts and gets work.
>> func TestEtcd(t *testing.T) {
>> 	gosim.SetSimulationTimeout(2 * time.Minute)
>>
>> 	// run machines:
>> 	gosim.NewMachine(gosim.MachineConfig{
>> 		Label: "etcd-1",
>> 		Addr:  netip.MustParseAddr("10.0.0.1"),
>> 		MainFunc: func() {
>> 			runEtcdNode("etcd-1", "10.0.0.1")
>> 		},
>> 	})
>> 	gosim.NewMachine(gosim.MachineConfig{
>> 		Label: "etcd-2",
>> 		Addr:  netip.MustParseAddr("10.0.0.2"),
>> 		MainFunc: func() {
>> 			time.Sleep(100 * time.Millisecond)
>> 			runEtcdNode("etcd-2", "10.0.0.2")
>> 		},
>> 	})
>> 	gosim.NewMachine(gosim.MachineConfig{
>> 		Label: "etcd-3",
>> 		Addr:  netip.MustParseAddr("10.0.0.3"),
>> 		MainFunc: func() {
>> 			time.Sleep(200 * time.Millisecond)
>> 			runEtcdNode("etcd-3", "10.0.0.3")
>> 		},
>> 	})
>>
>> 	// mess with the network in the background
>> 	go nemesis.Sequence(
>> 		nemesis.Sleep{
>> 			Duration: 10 * time.Second,
>> 		},
>> 		nemesis.PartitionMachines{
>> 			Addresses: []string{
>> 				"10.0.0.1",
>> 				"10.0.0.2",
>> 				"10.0.0.3",
>> 			},
>> 			Duration: 30 * time.Second,
>> 		},
>> 	).Run()
>>
>>  
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion visit 
>> https://groups.google.com/d/msgid/golang-nuts/CAP%3DJquaBu1O5rN6aR6fMs03q4O92cPAc9DfGQZ9fck9zB2sEkw%40mail.gmail.com.
>>
>>

