Address alignment in Go

So I am using c2go to link C code with Go. The C code requires certain arguments of a function called from Go to be 256-bit aligned (the function arguments are all pointers to Go variables). Is there a way to achieve this in Go (i.e. to specify 256-bit alignment for a variable in Go)?

In Go, "unsafe.Alignof(f)" reports 8 (for "var f [8]float32"), i.e. Go only guarantees that f is 8-byte aligned. I need it to be 32-byte aligned somehow.
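To see the gap between what the type guarantees and where a variable actually lands, here is a minimal sketch. The helper isAligned is a hypothetical name, not part of any library, and the values printed depend on the platform and on where the runtime happens to place f, so no output is shown.

```go
package main

import (
	"fmt"
	"unsafe"
)

// isAligned reports whether addr is a multiple of align.
// Assumption: align is a power of two (32 for AVX).
func isAligned(addr, align uintptr) bool {
	return addr&(align-1) == 0
}

func main() {
	var f [8]float32
	addr := uintptr(unsafe.Pointer(&f[0]))

	// What the type guarantees.
	fmt.Println("Alignof(f):", unsafe.Alignof(f))
	// What this particular variable happens to get; it may be true by luck.
	fmt.Println("32-byte aligned:", isAligned(addr, 32))
}
```

A "true" on the second line is not a guarantee, which is why a plain variable cannot safely be handed to "vmovaps".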

For the curious: the C code uses SIMD instructions (AVX, to be specific). I was using the "vmovaps" instruction, which requires 256-bit alignment of its operands. I can get away with using "vmovups", which doesn't require alignment, but I suspect that has a performance penalty.

For example, you can over-allocate the slice and re-slice it at the first aligned element, trading more memory for less CPU time:

package main

import (
    "fmt"
    "unsafe"
)

// Float32Align32 returns make([]float32, n) 32-byte aligned.
func Float32Align32(n int) []float32 {
    // align >= size && align%size == 0
    const align = 32 // SIMD AVX byte alignment
    const size = unsafe.Sizeof(float32(0))
    const pad = int(align/size - 1)
    if n <= 0 {
        return nil
    }
    s := make([]float32, n+pad)
    p := uintptr(unsafe.Pointer(&s[0]))
    i := int(((p+align-1)/align*align - p) / size)
    j := i + n
    return s[i:j:j]
}

func main() {
    f := Float32Align32(8) // SIMD AVX
    fmt.Printf(
        "SIMD AVX: %T %d %d %p %g\n",
        f, len(f), cap(f), &f[0], f,
    )
    CFuncArg := &f[0]
    fmt.Println("CFuncArg:", CFuncArg)
}

Playground: https://play.golang.org/p/mmFnHEwGKt

Output:

SIMD AVX: []float32 8 8 0x10436080 [0 0 0 0 0 0 0 0]
CFuncArg: 0x10436080
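The index arithmetic in Float32Align32 can be checked on its own. The sketch below pulls it into a pure function, alignOffset (a hypothetical name, not in the original code), and feeds it made-up addresses. It assumes, like the original, that align is a multiple of size and that the start address is already a multiple of size, which holds for a []float32.

```go
package main

import "fmt"

// alignOffset returns how many elements of the given size must be skipped,
// starting at address p, to reach the next multiple of align.
// Assumptions: align%size == 0 and p%size == 0.
func alignOffset(p, align, size uintptr) int {
	aligned := (p + align - 1) / align * align // round p up to a multiple of align
	return int((aligned - p) / size)
}

func main() {
	// Illustrative addresses only; real ones come from &s[0].
	for _, p := range []uintptr{0x10436064, 0x10436080, 0x1043609c} {
		fmt.Printf("p=%#x skip=%d\n", p, alignOffset(p, 32, 4))
	}
}
```

This prints skips of 7, 0 and 1 elements. The worst case is 7, which is why the function allocates n+7 elements. Note that the alignment only holds as long as the backing array stays where it is. The current Go garbage collector does not move heap objects, but as far as I know the language does not promise that.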

The only reasonable way to achieve this is to prototype the functions in Go and then write the (Go) assembly using BYTE and WORD directives, as is done in the Go libraries themselves and as outlined in the Go 1.9.1 documentation:

Unsupported opcodes

The assemblers are designed to support the compiler so not all hardware instructions are defined for all architectures: if the compiler doesn't generate it, it might not be there. If you need to use a missing instruction, there are two ways to proceed. One is to update the assembler to support that instruction, which is straightforward but only worthwhile if it's likely the instruction will be used again. Instead, for simple one-off cases, it's possible to use the BYTE and WORD directives to lay down explicit data into the instruction stream within a TEXT.

For example,

blake2b implementation at line 115 does this for AVX2