"missing word in phrase: charset not supported" when using the mail package

I'm trying to parse emails and I get errors like the ones below from the mail package. Is this a bug in the mail package, or something I should handle myself?

missing word in phrase: charset not supported: "gb18030"

charset not supported: "koi8-r"

missing word in phrase: charset not supported: "ks_c_5601-1987"

How can I fix them? I think I should use a charset package, but I'm not sure how. Here is what an email header looks like:

In-Reply-To: <trinity-b7c6d611-52fd-4afa-b739-2deb243532a6-1402761364579@3capp-mailcom-lxa05>
References: <97e07dab7c2d1a005ed928c4350690e0@hotels-desk.co.uk>,
 <tencent_105D3DC11702F53465C0025D@qq.com>
    <trinity-b7c6d611-52fd-4afa-b739-2deb243532a6-1402761364579@3capp-mailcom-lxa05>
From: "=?gb18030?B?08bTzg==?=" <38438nx@qq.com>
To: "=?gb18030?B?V2lsaGVsbSBLdW1tZXI=?=" <sormester@lobbyist.com>
Subject: =?gb18030?B?u9i4tKO6ILvYuLSjulBhbGFjZSBXZXN0bWluc3Rl?=
 =?gb18030?B?cjogMDEtMDctMjAxNCAtIDA0LTA3LTIwMTQ=?=
Mime-Version: 1.0
Content-Type: multipart/alternative;
    boundary="----=_NextPart_539C743F_08A07490_0157E268"
Content-Transfer-Encoding: 8Bit
Date: Sun, 15 Jun 2014 00:11:43 +0800
Message-ID: <tencent_573A737E73016B9F5A3D10C1@qq.com>
Envelope-To: <sormester@lobbyist.com>

Edit:

I've tried using the charset package, but it has no effect. I still get the same error on the same messages.

import "code.google.com/p/go-imap/go1/imap"

header := imap.AsBytes(rsp.MessageInfo().Attrs["RFC822.HEADER"])

// Wrap the raw header bytes in a charset reader before parsing.
r, err := charset.NewReader("UTF-8", bytes.NewReader(header))
if err != nil {
        log.Fatal(err)
}
fmt.Printf("new char is %v", r)

msg, err := mail.ReadMessage(r)
if err != nil {
        log.Fatal(err) // exits, so no return is needed here
}

// This is the call that fails with "charset not supported".
mg.From, err = msg.Header.AddressList("From")
if err != nil {
        log.Errorf("NO FROM msg %s, err %v", header, err)
        return
}

The mail package seems to be able to decode only RFC 2047 encoded-words, but the charset package doesn't accept "rfc2047" as a character set:

character set "rfc2047" not found
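
Note that "rfc2047" is not a character set, which is why the charset package cannot find it. RFC 2047 defines the =?charset?encoding?text?= wrapper seen in the headers above, and the character set is the first field inside it. A minimal sketch of pulling one encoded-word apart, where splitEncodedWord is a hypothetical helper written for illustration and not part of any package:

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// splitEncodedWord is a hypothetical helper (not from any library). It splits
// an RFC 2047 encoded-word of the form =?charset?encoding?text?= into fields.
func splitEncodedWord(w string) (charset, enc, text string, ok bool) {
	if len(w) < 4 || !strings.HasPrefix(w, "=?") || !strings.HasSuffix(w, "?=") {
		return "", "", "", false
	}
	parts := strings.Split(w[2:len(w)-2], "?")
	if len(parts) != 3 {
		return "", "", "", false
	}
	return parts[0], parts[1], parts[2], true
}

func main() {
	// The display name from the From header in the question.
	charset, enc, text, ok := splitEncodedWord("=?gb18030?B?08bTzg==?=")
	if !ok {
		panic("not an encoded-word")
	}
	// Encoding "B" means the text field is base64.
	raw, err := base64.StdEncoding.DecodeString(text)
	if err != nil {
		panic(err)
	}
	// raw now holds bytes in the named charset; turning them into UTF-8
	// still requires a decoder for that charset.
	fmt.Printf("charset=%s encoding=%s raw=% x\n", charset, enc, raw)
}
```

So the failing step is not the RFC 2047 unwrapping itself but the conversion from gb18030 (or koi8-r, ks_c_5601-1987) to UTF-8 afterwards.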

Could the mahonia package fix the issue?

I hope this helps anyone considering Go for processing email (e.g. developing client apps). It seems the Go standard library is not mature enough for email processing: it doesn't handle multipart messages, different character sets, and so on. After almost a day of trying different hacks and packages, I've decided to throw the Go code away and use a good old JavaMail solution.
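
For readers on newer Go versions: since Go 1.5 the standard mime package has a WordDecoder type with a CharsetReader hook, which is where support for extra character sets can be plugged in. The sketch below only shows where the hook sits. The decoders registry and the asciiOnly reader are hypothetical stand-ins written for this example; asciiOnly is not a real GB18030 decoder and only works here because GB18030 is ASCII-compatible and the sample To header contains 7-bit text. A real decoder would have to come from an encoding package, for example golang.org/x/text.

```go
package main

import (
	"fmt"
	"io"
	"mime"
)

// asciiOnly is a placeholder decoder (hypothetical, illustration only): it
// passes 7-bit bytes through and fails on anything else.
type asciiOnly struct{ r io.Reader }

func (a asciiOnly) Read(p []byte) (int, error) {
	n, err := a.r.Read(p)
	for _, b := range p[:n] {
		if b >= 0x80 {
			return 0, fmt.Errorf("byte 0x%02x needs a real charset decoder", b)
		}
	}
	return n, err
}

// decoders is a hypothetical registry keyed by lower-cased charset name.
// With golang.org/x/text the gb18030 entry could instead be
// simplifiedchinese.GB18030.NewDecoder().Reader.
var decoders = map[string]func(io.Reader) io.Reader{
	"gb18030": func(r io.Reader) io.Reader { return asciiOnly{r} },
}

// charsetReader has the signature mime.WordDecoder.CharsetReader expects.
func charsetReader(charset string, input io.Reader) (io.Reader, error) {
	if newReader, ok := decoders[charset]; ok {
		return newReader(input), nil
	}
	return nil, fmt.Errorf("charset not supported: %q", charset)
}

// decodeHeader decodes every RFC 2047 encoded-word found in s.
func decodeHeader(s string) (string, error) {
	dec := &mime.WordDecoder{CharsetReader: charsetReader}
	return dec.DecodeHeader(s)
}

func main() {
	// The To header from the question.
	to, err := decodeHeader(`"=?gb18030?B?V2lsaGVsbSBLdW1tZXI=?=" <sormester@lobbyist.com>`)
	if err != nil {
		panic(err)
	}
	fmt.Println(to)
}
```

The decoded string can then be handed to the usual address parsing. net/mail also gained an AddressParser type with a WordDecoder field in the same release, so the same hook can be reused there.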

Alexey Vasiliev's MIT-licensed http://github.com/le0pard/go-falcon/ includes a parser package that applies whichever encoding package is needed to decode the headers (the meat is in utils.go).

package main

import (
        "bufio"
        "bytes"
        "fmt"
        "net/textproto"
        "github.com/le0pard/go-falcon/parser"
)

var msg = []byte(`Subject: =?gb18030?B?u9i4tKO6ILvYuLSjulBhbGFjZSBXZXN0bWluc3Rl?=
 =?gb18030?B?cjogMDEtMDctMjAxNCAtIDA0LTA3LTIwMTQ=?=

`)


func main() {
        tpr := textproto.NewReader(bufio.NewReader(bytes.NewBuffer(msg)))
        mh, err := tpr.ReadMIMEHeader()
        if err != nil {
                panic(err)
        }
        for name, vals := range mh {
                for _, val := range vals {
                        val = parser.MimeHeaderDecode(val)
                        fmt.Print(name, ": ", val, "\n")
                }
        }
}

It looks like the package also uses parser.FixEncodingAndCharsetOfPart to decode and convert body content, though at the cost of a couple of extra allocations from converting the []byte body to a string and back. If the API doesn't work for you, you can at least read the code to see how it can be done.

Found via godoc.org's "...and is imported by 3 packages" link from encoding/simplifiedchinese -- hooray godoc.org!

I've been using github.com/jhillyerd/enmime which seems to have no trouble with this. It'll parse out both headers and body content. Given an io.Reader r:

// Parse message body
env, _ := enmime.ReadEnvelope(r)
// Headers can be retrieved via Envelope.GetHeader(name).
fmt.Printf("From: %v\n", env.GetHeader("From"))
// Address-type headers can be parsed into a list of decoded mail.Address structs.
alist, _ := env.AddressList("To")
for _, addr := range alist {
  fmt.Printf("To: %s <%s>\n", addr.Name, addr.Address)
}
fmt.Printf("Subject: %v\n", env.GetHeader("Subject"))

// The plain text body is available as env.Text.
fmt.Printf("Text Body: %v chars\n", len(env.Text))

// The HTML body is stored in env.HTML.
fmt.Printf("HTML Body: %v chars\n", len(env.HTML))

// env.Inlines is a slice of inlined attachments.
fmt.Printf("Inlines: %v\n", len(env.Inlines))

// env.Attachments contains the non-inline attachments.
fmt.Printf("Attachments: %v\n", len(env.Attachments))