Re: [go-nuts] Block-based data structures and the Type Parameters - Draft Design

Andrew Werner Sun, 21 Jun 2020 09:44:49 -0700

Thanks for the reply! If I read it correctly, it is already possible to 
specify a type constraint as an Array of a given size and so the first part 
of the "What I might have expected" actually does work. I noticed that it 
works in that it works as a type constraint but I found that types 
constrained that way do not support indexing or slicing. Perhaps that's 
just a limitation of the prototype and not the proposal.


The other thing that I think I'm proposing (thanks for the correction) is a 
way to represent all arrays of a given type:

type Array(type T) interface {
    type [...]T
}

Thanks for dealing with my very long-winded way of saying that. 

On Sunday, June 21, 2020 at 12:35:53 PM UTC-4, David Finkel wrote:
>
>
>
> On Sat, Jun 20, 2020 at 1:22 AM Andrew Werner <[email protected] 
> <javascript:>> wrote:
>
>> [The body of this email is a duplication of the README in 
>> https://github.com/ajwerner/go2dequeue/ 
>> which also contains the sample implementation]
>>
>> Exercise building a dequeue with the go2 Type Parameter Draft 
>>
>> This project is an exploration with the go2go / Type Parameters - Design 
>> Draft 
>> <https://go.googlesource.com/proposal/+/refs/heads/master/design/go2draft-type-parameters.md>
>>  
>> with an eye towards block-based data structures which promote access 
>> locality.
>>
>> Such data structures traditionally utilize blocks of objects laid out 
>> contiguously in memory. In go, today, one generally specializes such data 
>> structures which are performance critical. This is especially important in 
>> the case of reducing allocations which occur on critical paths. Using a 
>> sync.Pool can help with allocations in some cases but can be clunky to use 
>> and, well, do create more pointers for GC to worry about and don't have the 
>> access locality.
>>
>> Whether it's really important to have data structures which include 
>> arrays of other data structures laid out contiguously in RAM is really 
>> important is sort of orthogonal. Let's just assume that it is for now and 
>> look at how we'd do it. We only need to consider rejecting the importance 
>> of this case if we can't find a good solution or idiom to support it. The 
>> google/btree <https://github.com/google/btree> library README links this 
>> Google 
>> Open Source Blog Post on block-based container 
>> <https://opensource.googleblog.com/2013/01/c-containers-that-save-memory-and-time.html>
>>  
>> performance indicating it does matter. The interesting thing about that 
>> library is it hardly gets the access locality of the C++ library in the 
>> blog post it references.
>>
>> The challenge this library explores is to layout blocks in structures 
>> contiguously in RAM while allowing the user to have some control over that 
>> block size.
>> This example 
>>
>> The approach this experiment takes is to allow the users to specify the 
>> block size by providing a function to map an array type to a slice. The 
>> allows the blocks to contain a slice which references the array. The 
>> overhead to maintain the slice header is 3 words but the cost is probably 
>> less in memory and more in the optimizations the compiler will be less able 
>> to perform. In particular, it probably will need to do bounds checking on 
>> that slice and there are probably some opportunities to avoid the pointer 
>> interactions. The interface ends up being pretty reasonable though a little 
>> bit ... opaque?
>>
>> What this looked like in this demo is*
>>
>> // Say we want to create a dequeue of things.type Thing struct { Foo int64, 
>> Bar int64, ID [16]byte }// The below incantation will create one with blocks 
>> of 37 elements.d := NewWithArray(func(a *[37]Thing) []Thing { return (*a)[:] 
>> })// The nodes in the linked list of this structure are 1240 bytes, not 
>> that// that's a great number or anything like that, just saying it's pretty 
>> big.// Maybe there'd be something good to get out of aligning things are 
>> cache line// sizes. See TestNodeSize.
>>
>> Other things you could do Use a slice for the buffer 
>>
>> A different approach would be to just have the array in the node, take a 
>> buffer size and then keep a sync.Pool or freelist of slices. Today's 
>> github.com/google/btree does this but way worse in that it just keeps a 
>> slice of btree.Item. So fine, it's be two objects, the nodes and their item 
>> buffers. Maybe that is the answer but it's pretty lame. My feeling is once 
>> somebody is optimizing a data structure for access locality they are trying 
>> to optimize as far as they can go.
>> Ignore or augment the language generics with manual or automatic 
>> specialization 
>>
>> Today, in performance critical cases, people use a variety of automatic 
>> or manual specialization. There are tools like 
>> https://github.com/cheekybits/genny or 
>> https://github.com/mmatczuk/go_generics. Note that the latter has been 
>> used to good effect in CockroachDB for building specialized copy-on-write, 
>> interval B-trees.
>>
>> One answer to this problem might just be that go's generics solution 
>> doesn't solve for the problem of specializing block-based data structures. 
>> That's not a particularly satisfying answer.
>> What I might have expected / guess I'm proposing 
>>
>> Type lists are a thing for utilizing language features which only apply 
>> to certain kinds of types. There should be a way to create a type list for 
>> arrays of types. 
>>
> As a first pass I would have anticipated that there would be a way to 
>> inform this package of a buffer size maybe drawn from a fixed number of 
>> sizes. For example:
>>
>> type BlockSize int
>> const (
>>     _ BlockSize = 1 << (4*iota)
>>     Block16
>>     Block256
>> )
>> // Array here allowstype Array(type T) interface {
>>     type [Block16]T, [Block256]T
>> }
>> func New(type T)(blockSize BlockSize) Dequeue(T) {
>>
>> Hmm, this function signature would require being able to return different 
> types based on a non-type-parameter, which would require Dequeue to be an 
> interface (I don't see a definition, so I'm not sure)
>
> From what I can tell, though, if you make the array-type a type-parameter 
> directly, though, this becomes workable in the current proposal (although 
> rather awkward for users).
> e.g.
> https://go2goplay.golang.org/p/UTnElflTH27
>
> type ArrayT(type T) interface {
>         type [3]T, [4]T
> }
>
> func NewArray(type T ArrayT(V), V interface{})() T {
>          var v T
>          return v
> }
>

Hmm, maybe I was just holding it wrong and the proposal in "What I might 
have expected" already is the proposal.
 

>
>     switch blockSize {
>>     case Block16: 
>>         return &(dequeue(T, [Block16]T){ ... })
>>     case Block256:
>>         return &(dequeue(T, [Block16]T){})
>>
>> nit: I assume the second type-parameter for this case statement should be 
> Block256.
>
>>     }
>> }
>> type dequeue(type T interface{}, ArrayT Array(T)) struct {
>>     len
>> }
>> type node(type T interface{}, ArrayT Array(T)) struct {
>>     left, right *node(T, ArrayT)
>>     rb ringBuf(T, ArrayT)
>> }
>> type ringBuf(type T interface{}, ArrayT Array(T)) struct {
>>     head, len int
>>     // buf is known to be an array type with a size of either 16 or 256 so
>>     // I'd expect to be able to interact with this like an array. I don't 
>> think
>>     // that that works today.
>>     buf ArrayT 
>> }
>>
>> The above also seems needlessly clunky. I'd prefer if I could just 
>> represent all array types like this:
>>
>> type Array(type T) interface {
>>     [...]T
>> }
>>
>> I like that approach to distinguishing between slices and arrays in 
> type-lists.
> Nit: I think it would still need the type keyword, so it would look like:
>
> type Array(type T) interface {
>         type [...]T
> }
>
Yes, definitely, my bad.  

>
> I feel like there must be an interaction with something else that I'm 
> missing, though. Since there is some discussion about limitations with 
> respect to the lack of non-type parameters in the omissions section 
> <https://go.googlesource.com/proposal/+/refs/heads/master/design/go2draft-type-parameters.md#omissions>.
>  
> (although not about arrays in type-list constraints, specifically)
>
>> Then I could do the following and everything else should just fall out.
>>
>> func New(type T interface{}, ArrayT Array(T))() Dequeue(T) {
>>     return &(dequeue(T, ArrayT){})
>> }
>>
>> The above feels to me to be compatible with the goals and idioms of the 
>> proposal but I could be missing out on some major complexity. Hope y'all 
>> find this helpful.
>> Aside 
>>
>> It would be cool to be able to embed a generic type in another generics 
>> type. I can see why it's annoying because the name of the field will get 
>> totally mangled in the specialization but maybe that's not so bad? Maybe 
>> you'd actually call the embedded type by its unspecialized type name even 
>> in the specialized version.
>>
>> I would have liked to embed the ringBuf in the node. Given it's not 
>> embedded it's almost not worth abstracting.
>>
> I think this is already addressed in the current design 
> <https://go.googlesource.com/proposal/+/refs/heads/master/design/go2draft-type-parameters.md#embedding-an-instantiated-interface-type>.
>  
> You are allowed to directly embed type-parameters, and they have the name 
> of the template argument (rather than its instantiated value). 
>
Cool, I'll try it out. I found it not compiling with `go tool go2go` in my 
first pass. Maybe I was holding it wrong. 

> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected] <javascript:>.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/golang-nuts/CA%2BvRuzOYXz1t6CsWLd6gxS4AErRNCwA_ctdSYiu3Mq-YH_TEDA%40mail.gmail.com
>>  
>> <https://groups.google.com/d/msgid/golang-nuts/CA%2BvRuzOYXz1t6CsWLd6gxS4AErRNCwA_ctdSYiu3Mq-YH_TEDA%40mail.gmail.com?utm_medium=email&utm_source=footer>
>> .
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/2c712c9d-2a5b-44a9-b783-15585a6654b1o%40googlegroups.com.

Re: [go-nuts] Block-based data structures and the Type Parameters - Draft Design

Reply via email to