Not a direct answer, but one simple improvement you could make is to allocate
all the rows in a single backing slice, then split it up by slicing. So
instead of your:
    T := make([][]byte, n)
    for i := 0; i < n; i++ {
        T[i] = make([]byte, m)
    }
Do something like:
    T := make([][]byte, n)
    Q := make([]byte, m*n)
    for i := 0; i < n; i++ {
        T[i] = Q[i*m : (i+1)*m]
    }
In my benchmarks, with your matrix size, this gives a 20-30 percent
speedup, in part because it replaces n row allocations with a single one,
but also (I suspect) because the data is contiguous in memory.
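
For completeness, here is roughly what the whole transpose might look like
with that change. This is only a sketch: transposeContiguous is a name I made
up, it assumes M is non-empty and rectangular, and the three-index slice
expression is an extra precaution of mine (it caps each row so an append on
one row cannot spill into the next), not something the speedup depends on.

```go
package main

import "fmt"

// transposeContiguous returns the transpose of M. All rows of the result
// share one backing array, so only two allocations are made in total.
// Assumption: M is non-empty and every row has len(M[0]) elements.
func transposeContiguous(M [][]byte) [][]byte {
	m := len(M)
	n := len(M[0])

	T := make([][]byte, n)
	Q := make([]byte, m*n) // single backing array for all n rows
	for i := 0; i < n; i++ {
		// Limit the capacity of each row to m elements.
		T[i] = Q[i*m : (i+1)*m : (i+1)*m]
	}

	for i := 0; i < n; i++ {
		row := T[i] // a row in T
		for j := 0; j < m; j++ {
			row[j] = M[j][i]
		}
	}
	return T
}

func main() {
	M := [][]byte{
		{1, 2, 3},
		{4, 5, 6},
	}
	fmt.Println(transposeContiguous(M)) // prints [[1 4] [2 5] [3 6]]
}
```

Note that the rows of T alias Q, so they all stay alive as long as any one of
them is referenced. For a matrix that is used as a whole that should not
matter.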
On Wednesday, August 5, 2020 at 2:30:28 PM UTC-4 [email protected] wrote:
> Hi all,
>
> Matrix transpose in pure Go is slow in HPC cases, and using the gonum
> package requires converting the data structure, which costs extra time.
> So an assembly version may be a better solution.
>
> The size of the matrix may vary ([][]byte) or be fixed (for example,
> [64][512]byte), and the element type may be int32 or int64 for general
> scenarios.
>
> Below is a golang version:
>
> m := 64
> n := 512
>
> // original matrix
> M := make([][]byte, m)
> for i := 0; i < m; i++ {
>     M[i] = make([]byte, n)
> }
>
> func transpose(M [][]byte) [][]byte {
>     m := len(M)
>     n := len(M[0])
>
>     // transposed matrix
>     T := make([][]byte, n)
>     for i := 0; i < n; i++ {
>         T[i] = make([]byte, m)
>     }
>
>     var row []byte // a row in T
>     for i := 0; i < n; i++ {
>         row = T[i]
>         for j := 0; j < m; j++ {
>             row[j] = M[j][i]
>         }
>     }
>
>     return T
> }
>
>