Encoded email subject (RFC 2047): decoding error

I'm writing an app in Go and I need to decode email subjects.

Original subject:

Raport z eksportu ogłoszeń nieruchomości

Encoded subject:

=?utf-8?B?RG9tLmV1IC0gcmFwb3J0IHogZWtzcG9ydHUgb2fFgm9zemXF?=  =?utf-8?B?hCBuaWVydWNob21vxZtjaQ==?=

Decoded subject: "Raport z eksportu ogłosze▒ ▒ nieruchomości"

I use github.com/famz/RFC2047 to decode email subjects.

My code is simple:

RFC2047.Decode(msg.Header.Get("Subject"))

Why is the subject broken after decoding? Other subjects are decoded correctly. Is this subject badly encoded?

If you're using Go 1.5 or later, you can use the new functions of the mime package.

If you're using an older version of Go, you can use my drop-in replacement (gopkg.in/alexcesaro/quotedprintable.v3).

Example:

package main

import (
    "fmt"
    "mime" // Go 1.5 or later
    // For older Go versions, use this import instead of the standard "mime":
    // mime "gopkg.in/alexcesaro/quotedprintable.v3"
)

func main() {
    s := "=?utf-8?B?RG9tLmV1IC0gcmFwb3J0IHogZWtzcG9ydHUgb2fFgm9zemXF?=  =?utf-8?B?hCBuaWVydWNob21vxZtjaQ==?="
    // DecodeHeader decodes every encoded-word found in the header value.
    dec := new(mime.WordDecoder)
    subject, err := dec.DecodeHeader(s)
    if err != nil {
        panic(err)
    }
    fmt.Println(subject)
    // Output:
    // Dom.eu - raport z eksportu ogłoszeń nieruchomości
}

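That subject is incorrectly encoded. The sender split it into two MIME encoded-words, because a header line containing an encoded-word may not exceed 76 characters, but made the cut in the middle of the ń character, which takes two bytes in UTF-8 (0xC5 0x84). RFC 2047 requires each encoded-word to represent a whole number of characters, so a decoder that handles each encoded-word on its own sees an invalid byte sequence at the end of the first part and at the start of the second.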

If you join the two parts into a single encoded string, you get back the original subject:

s := "=?utf-8?B?RG9tLmV1IC0gcmFwb3J0IHogZWtzcG9ydHUgb2fFgm9zemXFhCBuaWVydWNob21vxZtjaQ==?="
fmt.Println(RFC2047.Decode(s))

// Dom.eu - raport z eksportu ogłoszeń nieruchomości

In my free time I wrote this function. It joins the parts of the subject into a single encoded-word and returns it as one string.

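// encodedWordPrefix matches the "=?charset?encoding?" prefix of an encoded-word.
var encodedWordPrefix = regexp.MustCompile(`=\?.*?\?.*?\?`)

// encodedWordGlue matches every prefix, every "?=" terminator and every whitespace run.
var encodedWordGlue = regexp.MustCompile(`=\?.*?\?.*?\?|\s+|\?=`)

// parseSubject merges all encoded-words in s into a single encoded-word.
// It assumes that s contains only encoded-words sharing one charset and
// encoding, and that no Base64 part except the last one ends with "=" padding.
func parseSubject(s string) string {
   prefix := encodedWordPrefix.FindString(s)
   if prefix == "" {
      return s // no encoded-word found, return the input unchanged
   }
   payload := encodedWordGlue.ReplaceAllString(s, "")
   return prefix + payload + "?="
}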