I am keeping an `allb` slice, and with that I did see it occasionally succeed.
I am using the binarytree <https://gitlab.com/AbelThar/go.batch/blob/b10ef431c29b01fa7568a7bf9712a0286033266f/batching/src/runnables/binarytree.go> test, since this is an issue involving the GC. In fact, running it with GOGC=off, or keeping a slice of pointers in the program, also succeeds consistently.

What I do know is that when I allocate a batch, I keep a raw pointer to it in the slice, and it is never popped or removed from there at any point. Each P keeps its own batch through a uintptr, and the other batches are stored in either a global batch queue or a queue of empty batches, both built the same way as gQueue but using the batch type `buintptr`; none of these are traced by the GC.

Now I modified the batch allocation to show me the pointer of `allb` and the new batch allocated:

    // Allocate a new batch
    //go:nosplit
    //go:yeswritebarrierrec
    func allocb() *b {
        // Break the cycle by doing acquirem/releasem around new(b).
        // The acquirem/releasem increments m.locks during new(b),
        // which keeps the garbage collector from being invoked.
        mp := acquirem()
        var bp *b
        bp = new(b)
        allb = append(allb, bp)
        print("allb: ", allb, ", bp:", bp, "\n")
        releasem(mp)
        return bp
    }

With GOGC=off I see that 6 batches are created:

    GOGC=off GODEBUG=gccheckmark=1 gobatch run ./binarytree.go
    allb: [1/1]0xc000010010, bp:0xc0000160f0
    allb: [2/2]0xc000012010, bp:0xc000016100
    allb: [3/4]0xc00000e020, bp:0xc000016110
    allb: [4/4]0xc00000e020, bp:0xc000016120
    allb: [5/8]0xc000062000, bp:0xc000060000
    allb: [6/8]0xc000062000, bp:0xc000060010

When it does succeed with the GC on, it consistently takes 13 batches, which I find rather odd.

    GODEBUG=gccheckmark=1 gobatch run ./binarytree.go
    allb: [1/1]0xc000010010, bp:0xc0000160f0
    allb: [2/2]0xc000012010, bp:0xc000016100
    allb: [3/4]0xc00000e020, bp:0xc000016110
    allb: [4/4]0xc00000e020, bp:0xc000016120
    allb: [5/8]0xc000062000, bp:0xc000060000
    allb: [6/8]0xc000062000, bp:0xc000060010
    allb: [7/8]0xc000062000, bp:0xc000016150
    allb: [8/8]0xc000062000, bp:0xc0004b8000
    allb: [9/16]0xc000510000, bp:0xc00044a040
    allb: [10/16]0xc000510000, bp:0xc000514000
    allb: [11/16]0xc000510000, bp:0xc000540000
    allb: [12/16]0xc000510000, bp:0xc0004b8010
    allb: [13/16]0xc000510000, bp:0xc0004b8020

When it crashes, it prints the following (full stack trace on pastebin <https://pastebin.com/40iYNQrh>):

    GODEBUG=gccheckmark=1 gobatch run ./binarytree.go
    allb: [1/1]0xc000010010, bp:0xc0000160f0
    allb: [2/2]0xc000012010, bp:0xc000016100
    allb: [3/4]0xc00000e020, bp:0xc000016110
    allb: [4/4]0xc00000e020, bp:0xc000016120
    allb: [5/8]0xc00006a000, bp:0xc000068000
    allb: [6/8]0xc00006a000, bp:0xc000068010
    allb: [7/8]0xc00006a000, bp:0xc000016170
    allb: [8/8]0xc00006a000, bp:0xc0004b4000
    allb: [9/16]0xc0004ea000, bp:0xc0004b4010
    allb: [10/16]0xc0004ea000, bp:0xc0004ee000
    allb: [11/16]0xc0004ea000, bp:0xc000448040
    runtime: marking free object 0xc000448040 found at *(0xc0004ea000+0x50)
    base=0xc0004ea000 s.base()=0xc0004ea000 s.limit=0xc0004ec000 s.spanclass=18 s.elemsize=128 s.state=mSpanInUse
     *(base+0) = 0xc0000160f0
     *(base+8) = 0xc000016100
     *(base+16) = 0xc000016110
     *(base+24) = 0xc000016120
     *(base+32) = 0xc000068000
     *(base+40) = 0xc000068010
     *(base+48) = 0xc000016170
     *(base+56) = 0xc0004b4000
     *(base+64) = 0xc0004b4010
     *(base+72) = 0xc0004ee000
     *(base+80) = 0xc000448040 <==
     *(base+88) = 0x0
     *(base+96) = 0x0
     *(base+104) = 0x0
     *(base+112) = 0x0
     *(base+120) = 0x0
    obj=0xc000448040 s.base()=0xc000448000 s.limit=0xc00044a000 s.spanclass=5 s.elemsize=16 s.state=mSpanInUse
     *(obj+0) = 0x0
     *(obj+8) = 0xc0004ee000
    fatal error: marking free object

At this point I'm assuming the damage has already been done, and the trace only shows where it was detected. What I do notice is that the error is not always detected at the same level. In the full stack trace it was at depth 3:

    ...
    goroutine 1 [runnable]:
    runtime.newobject(0x464820, 0x2)
        /go.batch/src/runtime/malloc.go:1067 +0x51 fp=0xc000084b20 sp=0xc000084b18 pc=0x40a701
    main.bottomUpTree(0xffffffffffffbefb, 0x3, 0xc0008bb840)
        /go.batch/batching/src/runnables/binarytree.go:33 +0x91 fp=0xc000084b60 sp=0xc000084b20 pc=0x44f011
    ...

and in another stack trace <https://pastebin.com/AukWCxAe> it occurred at depth 1. I ran it some more times; it always seemed to crash after batch 11, and the depth ranged from 0 to 3.

The stack trace for depth 0 <https://pastebin.com/ZMG02MM3> is more interesting: goroutine 1 did not stop at `newobject` there.

    goroutine 1 [GC assist marking]:
    runtime.systemstack_switch()
        /go.batch/src/runtime/asm_amd64.s:311 fp=0xc000086930 sp=0xc000086928 pc=0x446d30
    runtime.gcAssistAlloc(0xc000000180)
        /go.batch/src/runtime/mgcmark.go:422 +0x15c fp=0xc000086990 sp=0xc000086930 pc=0x416e5c
    runtime.mallocgc(0x18, 0x464820, 0x1, 0x18)
        /go.batch/src/runtime/malloc.go:843 +0x8e6 fp=0xc000086a30 sp=0xc000086990 pc=0x40a456
    runtime.newobject(0x464820, 0xc00045e000)
        /go.batch/src/runtime/malloc.go:1068 +0x38 fp=0xc000086a60 sp=0xc000086a30 pc=0x40a6e8
    main.bottomUpTree(0xfffffffffffdec85, 0x0, 0x20)
        /go.batch/batching/src/runnables/binarytree.go:29 +0xfc fp=0xc000086aa0 sp=0xc000086a60 pc=0x44f07c

I think at this point I may be overthinking it a bit, and my lack of experience is showing. If there is something else I should be looking into, I am open to ideas.
On Monday, 8 April 2019 19:42:56 UTC+2, Ian Lance Taylor wrote:
>
> On Sun, Apr 7, 2019 at 12:30 PM Tharen Abela <[email protected]> wrote:
> >
> > The gist of the problem is that I am allocating an object in the runtime, (which I refer to as the batch struct), and the GC is deallocating the object, even though a reference is being kept in a slice (similar to allp and allm).
> > While allocating, I call acquirem to prevent the GC being triggered, during which I append the batch pointer to the slice.
> >
> > From running `GODEBUG=gccheckmark=1` I know that the batch object allocated, was being freed, yet when it crashes it says the object is being marked (hence marking a freed object).
> >
> > Now my intention is to keep the batch allocation till the end of the program, keeping it in an extra batch queue, so it should not be freed.
> >
> > Thinking about it now, I am not sure if the deallocation occurs after the work of the program is finished and is winding down, by de-allocating everything, but a reference is still kept in allb, so a double free will occur, OR,
> > what I have been assuming so far, that this takes place while work is incomplete so the GC is incorrectly de-allocating a batch object still in use.
> >
> > Another thing to take note of, is that the batch in P is referenced by a uintptr, I'm not sure how that might affect it.
>
> That is going to be your problem. The GC only tracks values with live
> pointers. A value of type `uintptr` can not be a live pointer. The
> runtime can only get away with the `guintptr`, `muintptr` and
> `puintptr` types because it knows that there are existing other
> pointers to all G and P values (in the allgs and allp slices and the
> allm linked list). If there is ever any moment that your batch
> objects are only referenced by `uintptr` values and not by a value of
> pointer type, then the garbage collector can collect it.
>
> Ian
