Type safety vs. loose typing: Go vs. C

The following sniffer code is written in C:

void Handle_IP(char *buf)
{
    struct iphdr *ip_hdr; // declaring pointer of type ip header 

    struct in_addr in; // declaring a structure which holds ip address 

    FILE *fp;
    int ctl, len;

    /* In the following statement we're adjusting the offset so that
       ip pointer can point to correct location */

    ip_hdr = (struct iphdr *)(buf + 14);
    ....
}

Here, the line

ip_hdr = (struct iphdr *)(buf + 14) casts a char * to a struct iphdr *.

Go, being a type-safe language, does not allow this kind of cast. Go has no void *; the only comparable construct is the interface, which follows structural typing.

How should such code be approached in Go?

Foreword: Even though what you want is possible, try to avoid using package unsafe as much as possible. Do not treat it as a first choice; think of it as a last resort (or something that comes after that).


Go does give you support for this in package unsafe.

Even the spec has a dedicated section for it. Spec: Package unsafe:

The built-in package unsafe, known to the compiler, provides facilities for low-level programming including operations that violate the type system. A package using unsafe must be vetted manually for type safety and may not be portable.

The package was intentionally named unsafe, which gives you fair warning: if something goes wrong, don't blame the compiler or the language.

What you need is the unsafe.Pointer type. Pointer is a special pointer type that may not be dereferenced, but any pointer type can be converted to Pointer, and Pointer can be converted to any (other) pointer type. So this is your "gateway" between different types.

For example, if you have a value of type float64 (which is 8 bytes in memory), you can interpret those 8 bytes as an int64 (which is also 8 bytes in memory) like this:

var f float64 = 1
var i int64

ip := (*int64)(unsafe.Pointer(&f))
i = *ip

fmt.Println(i)

Output (try it on the Go Playground):

4607182418800017408

The key is this line:

(*int64)(unsafe.Pointer(&f))

It means take the address of f (which will be of type *float64), convert it to unsafe.Pointer (any pointer type can be converted to Pointer), then convert this unsafe.Pointer value to another pointer type *int64. If we dereference this pointer, we get a value of type int64.
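For this particular float64-to-bits case, the standard library already offers a route that does not need package unsafe. The sketch below uses math.Float64bits; the helper name float64Bits is made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// float64Bits (hypothetical helper) returns the IEEE 754 bit pattern
// of f as an int64, without importing package unsafe.
func float64Bits(f float64) int64 {
	return int64(math.Float64bits(f))
}

func main() {
	// Prints the same number as the unsafe.Pointer version above.
	fmt.Println(float64Bits(1)) // 4607182418800017408
}
```

When a safe conversion like this exists, it is usually the better choice, which is in line with the foreword.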

In your example you want to "place" a variable on an address with an offset applied. Go does not have pointer arithmetic. You can get around this in 2 ways:

  1. use uintptr which may hold an address, but you can treat it as an int and add values to it

  2. or use a pointer to a "buffer" type, e.g. *[1<<31]byte; you may convert the address to this pointer, and you can apply the offset by indexing the buffer at the given offset, and take the address of that element, e.g. &buf[14]

Method #1 could look like this:

type X [16]byte
var x = X{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}

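// Convert to uintptr, add the offset, and convert back in a single expression;
// a uintptr stored in a variable is not kept alive by the garbage collector.
p := unsafe.Pointer(uintptr(unsafe.Pointer(&x[0])) + 14) // APPLY OFFSET

// Only 2 bytes remain past offset 14, so read them as a [2]byte.
var x2 [2]byte = *(*[2]byte)(p)
fmt.Println(x2[0], x2[1])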

Output (try it on the Go Playground):

14 15

Method #2 could look like this:

type X [16]byte
var x = X{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}

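xx := (*X)(unsafe.Pointer(&x[0]))
p := unsafe.Pointer(&xx[14]) // APPLY OFFSET

// Only 2 bytes remain past offset 14, so view them as a [2]byte;
// dereferencing a *X here would read beyond the end of x.
var x2 [2]byte = *(*[2]byte)(p)
fmt.Println(x2[0], x2[1])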

Output is the same. Try it on the Go Playground.
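Newer Go versions (1.17 and later) also provide unsafe.Add, which applies an offset to an unsafe.Pointer directly. A minimal sketch of the same 14-byte offset, assuming such a Go version is available; the helper name tail2 is made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// tail2 (hypothetical helper) returns the bytes at offsets 14 and 15 of buf.
// It relies on unsafe.Add, which exists since Go 1.17.
func tail2(buf *[16]byte) [2]byte {
	p := unsafe.Add(unsafe.Pointer(buf), 14) // APPLY OFFSET
	// Only 2 bytes remain past offset 14, so read exactly 2 bytes.
	return *(*[2]byte)(p)
}

func main() {
	x := [16]byte{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
	t := tail2(&x)
	fmt.Println(t[0], t[1]) // 14 15
}
```

This is still unsafe code: nothing checks that the offset stays inside the array, so the caller has to guarantee it.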

Let's look at this from a different angle: what does typecasting do in C?

Well, in some sense, it doesn't do anything! It just takes some bit pattern at some point in memory, and blindly assumes that this particular bit pattern actually represents a value of some type. But there's no validation of this, no sort of parsing. And that's the problem.

But I have already used the word that solves our problem: parsing. Instead of blindly assuming that some random bit pattern in memory is the same bit pattern that the C compiler would generate for a value of some type, we actually look at the bit pattern, check whether it conforms to the rules for how bit patterns of that datatype are constructed, and deconstruct the bit pattern into its constituent parts according to those rules.

In other words: we write a parser.
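As a rough sketch of what such a parser could look like in Go for the IP header from the question: the type IPv4Header and the function parseIPv4 are made-up names, only a handful of fields are decoded, and the frame is assumed to start with a plain 14-byte Ethernet header (the offset used in the C code).

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
)

// IPv4Header is a hypothetical type holding a few IPv4 header fields.
type IPv4Header struct {
	Version  uint8
	IHL      uint8 // header length in 32-bit words
	TotalLen uint16
	TTL      uint8
	Protocol uint8
	Src, Dst net.IP
}

const ethHeaderLen = 14 // assumption: Ethernet header without VLAN tag

// parseIPv4 validates and decodes the IPv4 header that follows the
// Ethernet header in frame. It copies the data instead of casting.
func parseIPv4(frame []byte) (*IPv4Header, error) {
	if len(frame) < ethHeaderLen+20 {
		return nil, errors.New("frame too short for an IPv4 header")
	}
	b := frame[ethHeaderLen:]
	h := &IPv4Header{
		Version:  b[0] >> 4,
		IHL:      b[0] & 0x0f,
		TotalLen: binary.BigEndian.Uint16(b[2:4]), // network byte order
		TTL:      b[8],
		Protocol: b[9],
		Src:      net.IP(append([]byte(nil), b[12:16]...)),
		Dst:      net.IP(append([]byte(nil), b[16:20]...)),
	}
	if h.Version != 4 {
		return nil, errors.New("not an IPv4 packet")
	}
	if h.IHL < 5 {
		return nil, errors.New("invalid IPv4 header length")
	}
	return h, nil
}

func main() {
	// Build a minimal fake frame: 14 zero bytes plus a 20-byte IPv4 header.
	frame := make([]byte, ethHeaderLen+20)
	ip := frame[ethHeaderLen:]
	ip[0] = 0x45 // version 4, IHL 5
	binary.BigEndian.PutUint16(ip[2:4], 20)
	ip[8] = 64 // TTL
	ip[9] = 6  // protocol number for TCP
	copy(ip[12:16], []byte{192, 168, 0, 1})
	copy(ip[16:20], []byte{10, 0, 0, 2})

	h, err := parseIPv4(frame)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(h.Version, h.IHL, h.TotalLen, h.TTL, h.Protocol, h.Src, h.Dst)
}
```

Every field is read at an explicit byte offset with an explicit byte order, and malformed input produces an error instead of a struct full of garbage.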

Note that this isn't specific to Go. It is a good idea even in C. In fact, the code you posted is broken in at least three, maybe four different ways:

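  • platforms where a char is not 8 bits wide
  • alignment (buf + 14 is not guaranteed to be suitably aligned for a struct iphdr)
  • padding (a C compiler may insert padding between struct members)
  • endianness (multi-byte header fields arrive in network byte order, which may differ from the host's)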
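To make the endianness point concrete, here is a small sketch that decodes the same two bytes in both byte orders using encoding/binary. The helper name decodeBoth is made up for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeBoth (hypothetical helper) interprets the first two bytes of raw
// as a 16-bit integer in big-endian and in little-endian byte order.
func decodeBoth(raw []byte) (big, little uint16) {
	return binary.BigEndian.Uint16(raw), binary.LittleEndian.Uint16(raw)
}

func main() {
	big, little := decodeBoth([]byte{0x01, 0x02})
	fmt.Println(big)    // 258 (0x0102), network byte order
	fmt.Println(little) // 513 (0x0201), what a little-endian host sees after a raw cast
}
```

A raw cast silently picks whatever order the host CPU uses, while a parser states the byte order explicitly.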

AFAIK, there are existing IP, UDP, and TCP parsers written in Go that you can either use or learn from. Also, the current system from Alan Kay's Viewpoints Research Institute has a really cool TCP/IP stack written in just 120 lines or so, which is very easy to read. (In fact, the source code for the parser actually just consists of ASCII diagrams cut and pasted from the RFCs.)