Writing to a closed net.Conn returns a nil error

Talk is cheap, so here is the simple code:

package main

import (
    "fmt"
    "time"
    "net"
)

func main() {
    addr := "127.0.0.1:8999"

    // Server
    go func() {
        tcpaddr, err := net.ResolveTCPAddr("tcp4", addr)
        if err != nil {
            panic(err)
        }
        listen, err := net.ListenTCP("tcp", tcpaddr)
        if err != nil {
            panic(err)
        }
        for  {
            if conn, err := listen.Accept(); err != nil {
                panic(err)
            } else if conn != nil {
                go func(conn net.Conn) {
                    buffer := make([]byte, 1024)
                    n, err := conn.Read(buffer)
                    if err != nil {
                        fmt.Println(err)
                    } else {
                        fmt.Println(">", string(buffer[0 : n]))
                    }
                    conn.Close()
                }(conn)
            }
        }
    }()

    time.Sleep(time.Second)

    // Client
    if conn, err := net.Dial("tcp", addr); err == nil {
        for i := 0; i < 2; i++ {
            _, err := conn.Write([]byte("hello"))
            if err != nil {
                fmt.Println(err)
                conn.Close()
                break
            } else {
                fmt.Println("ok")
            }
            // sleep 10 seconds and re-send
            time.Sleep(10*time.Second)
        }
    } else {
        panic(err)
    }

}

Output:

> hello
ok
ok

The Client writes to the Server twice. After the first read, the Server closes the connection immediately, but the Client sleeps for 10 seconds and then writes to the Server again using the same, already closed connection object (conn).

Why can the second write succeed (returned error is nil)?

Can anyone help?

PS:

In order to check if the buffering feature of the system affects the result of the second write, I edited the Client like this, but it still succeeds:

// Client
if conn, err := net.Dial("tcp", addr); err == nil {
    _, err := conn.Write([]byte("hello"))
    if err != nil {
        fmt.Println(err)
        conn.Close()
        return
    } else {
        fmt.Println("ok")
    }
    // sleep 10 seconds and re-send
    time.Sleep(10*time.Second)

    b := make([]byte, 400000)
    for i := range b {
        b[i] = 'x'
    }
    n, err := conn.Write(b)
    if err != nil {
        fmt.Println(err)
        conn.Close()
        return
    } else {
        fmt.Println("ok", n)
    }
    // sleep 10 seconds and re-send
    time.Sleep(10*time.Second)
} else {
    panic(err)
}

And here is the screenshot: attachment

There are several problems with your approach.

Sort-of a preface

The first one is that you do not wait for the server goroutine to complete. In Go, once main() exits for whatever reason, all other goroutines that are still running are simply torn down forcibly.

You're trying to "synchronize" things using timers, but this only works in toy situations, and even then it does so only from time to time.

Hence let's fix your code first:

package main

import (
    "fmt"
    "log"
    "net"
    "time"
)

func main() {
    addr := "127.0.0.1:8999"

    tcpaddr, err := net.ResolveTCPAddr("tcp4", addr)
    if err != nil {
        log.Fatal(err)
    }
    listener, err := net.ListenTCP("tcp", tcpaddr)
    if err != nil {
        log.Fatal(err)
    }

    // Server
    done := make(chan error)
    go func(listener net.Listener, done chan<- error) {
        for {
            conn, err := listener.Accept()
            if err != nil {
                done <- err
                return
            }
            go func(conn net.Conn) {
                var buffer [1024]byte
                n, err := conn.Read(buffer[:])
                if err != nil {
                    log.Println(err)
                } else {
                    log.Println(">", string(buffer[0:n]))
                }
                if err := conn.Close(); err != nil {
                    log.Println("error closing server conn:", err)
                }
            }(conn)
        }
    }(listener, done)

    // Client
    conn, err := net.Dial("tcp", addr)
    if err != nil {
        log.Fatal(err)
    }
    for i := 0; i < 2; i++ {
        _, err := conn.Write([]byte("hello"))
        if err != nil {
            log.Println(err)
            err = conn.Close()
            if err != nil {
                log.Println("error closing client conn:", err)
            }
            break
        }
        fmt.Println("ok")
        time.Sleep(2 * time.Second)
    }

    // Shut the server down and wait for it to report back
    err = listener.Close()
    if err != nil {
        log.Fatal("error closing listener: ", err)
    }
    err = <-done
    if err != nil {
        log.Println("server returned:", err)
    }
}

I've sprinkled in a couple of minor fixes: using log.Fatal (which is log.Print followed by os.Exit(1)) instead of panicking, removing useless else clauses to adhere to the coding convention of keeping the main flow where it belongs, and shortening the client's pause between writes. I have also added checks for the errors that Close on sockets may return.

The interesting part is that we now properly shut the server down by closing the listener and then waiting for the server goroutine to report back (unfortunately Go does not return an error of a custom type from net.Listener.Accept in this case so we can't really check that Accept exited because we've closed the listener). Anyway, our goroutines are now properly synchronized, and there is no undefined behaviour, so we can reason about how the code works.

Remaining problems

Some problems still remain.

The more glaring one is your wrong assumption that TCP preserves message boundaries, that is, that if you write "hello" to the client end of the socket, the server reads back "hello". This is not true: TCP treats both ends of the connection as producing and consuming opaque streams of bytes. This means that when the client writes "hello", the client's TCP stack is free to deliver "he" and postpone sending "llo", and the server's stack is free to yield "hell" to the read call on the socket and only return "o" (and possibly some other data) in a later read.

So, to make the code "real" you'd need to somehow introduce these message boundaries into the protocol above TCP. In this particular case the simplest approach would be to use "messages" consisting of a fixed-length prefix, in an agreed-upon byte order, that indicates the length of the following data, followed by the string data itself. The server would then use a sequence like

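var msg [4100]byte
// Read the 4-byte big-endian length prefix.
_, err := io.ReadFull(sock, msg[:4])
if err != nil { ... }
mlen := int(binary.BigEndian.Uint32(msg[:4]))
if mlen < 0 || mlen > len(msg)-4 {
  // length is bogus or too large for the buffer; handle error
}
if mlen == 0 {
  // empty message; go back to reading the next length prefix
}
// The payload starts right after the prefix, at offset 4.
_, err = io.ReadFull(sock, msg[4:4+mlen])
if err != nil { ... }
s := string(msg[4 : 4+mlen])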

Another approach is to agree that messages do not contain newlines and to terminate each message with a newline (ASCII LF, \n, 0x0a). The server side would then use something like the usual bufio.Scanner loop to get full lines from the socket.

The remaining problem with your approach is that it does not deal with what Read on a socket returns: note that io.Reader.Read (which sockets implement, among other things) is allowed to return an error together with some data it has read from the underlying stream. In your toy example this might rightfully be unimportant, but suppose you're writing a wget-like tool that is able to resume downloading a file: even if reading from the server returned some data and an error, you have to deal with the returned chunk first and only then handle the error.

Back to the problem at hand

The problem presented in the question, I believe, happens simply because in your setup you hit some TCP buffering problem due to the tiny length of your messages.

On my box which runs Linux 4.9/amd64 two things reliably "fix" the problem:

  • Sending messages of 4000 bytes in length: the second call to Write "sees" the problem immediately.
  • Doing more Write calls.

For the former, try something like

msg := make([]byte, 4000)
for i := range msg {
    msg[i] = 'x'
}
for {
    _, err := conn.Write(msg)
    ...

and for the latter—something like

for {
    _, err := conn.Write([]byte("hello"))
    ...
    fmt.Println("ok")
    time.Sleep(time.Second / 2)
}

(it's sensible to lower the pause between sending stuff in both cases).

It's interesting to note that the former example hits the write: connection reset by peer (ECONNRESET in POSIX) error while the second one hits write: broken pipe (EPIPE in POSIX).

This is because when we're sending chunks of about 4k bytes, some of the packets generated for the stream manage to become "in flight" before the server's side of the connection manages to propagate the information about its closure to the client; those packets hit an already closed socket and are rejected with a segment that has the RST TCP flag set. In the second example, an attempt to send another chunk of data sees that the client side already knows that the connection has been torn down and fails the send without "touching the wire".

TL;DR, the bottom line

Welcome to the wonderful world of networking. ;-)

I'd recommend buying a copy of "TCP/IP Illustrated", reading it, and experimenting. TCP (like IP and the other protocols above IP) sometimes does not work the way people expect when they apply their "common sense".