返回给定URL内容的Web服务器

I wrote a simple web sever that gets the content of a given URL and writes it out using a http.ResponseWriter. But the problem is that it does not display any images and CSS on the page. The code is given bellow:

func factHandler(w http.ResponseWriter, r *http.Request) {
    res, err := http.Get("http://www.meaningfultype.com/")
    if err != nil {
        log.Fatal(err)
    }
    robots, err := ioutil.ReadAll(res.Body)

    res.Body.Close()
    if err != nil {
        log.Fatal(err)
    }   
    w.Write(robots)
}

What do I need to change so that the the whole page content as seen in the web browser is returned?

TL;DR

  • You are only serving the root HTML page, you need to respond to requests to other resources too (you can see what resource is being requested with the URL variable of the *http.Request variable)

  • When serving the resources, you need to write the Header's Content-Type to let the browser know what's the type of the resource (like image/png)

Full Answer

What you are doing in your request handler is getting http://www.meaningfultype.com/, the HTML page, then the browser finds an image like /images/header-logo.png and makes the request to get it but your server in localhost doesn't know how to respond to http://localhost/images/header-logo.png.

Assuming your handler function is handling requests at the root of the server ("/"), you could get the requested URL r.URL and use it to get the required resource:

url := "http://www.meaningfultype.com/"
if r.URL.Host == "" {
    url += r.URL.String()
} else {
    url = r.URL.String()
}
res, err := http.Get(url)

The problem is that, even after doing this, all the resources are sent as plain/text because you are not setting the Header's Content-Type. That's why you need to specify the type of the resource before writing. In order to know what the type of the resource is, just get it from the Content-Type you just received in the Header of the response of http.Get:

contentType := res.Header.Get("Content-Type")
w.Header().Set("Content-Type", contentType)
w.Write(robots)

The final result:

package main

import (
    "io/ioutil"
    "log"
    "net/http"
)

func main() {
    http.HandleFunc("/", factHandler)
    http.ListenAndServe(":8080", nil)
}

func factHandler(w http.ResponseWriter, r *http.Request) {
    url := "http://www.meaningfultype.com/"
    if r.URL.Host == "" {
        url += r.URL.String()
    } else {
        url = r.URL.String()
    }
    res, err := http.Get(url)
    if err != nil {
        log.Fatal(err)
    }
    robots, err := ioutil.ReadAll(res.Body)

    res.Body.Close()
    if err != nil {
        log.Fatal(err)
    }
    contentType := res.Header.Get("Content-Type")
    w.Header().Set("Content-Type", contentType)
    w.Write(robots)
}

The problem here is presumably that the website you refer is using relative paths for the images and stylesheets, e.g. "/css/main.css". The local website that you deliver to the browser with your Go service has another domain (for example localhost) and thus the relative path cannot be resolved by the browser (there is no http://localhost/css/main.css).

So what you need to do is either set a base URL in the document you deliver or, alternatively, rewrite every relative path to an absolute path (/css/main.css to http://originalwebsite.com/css/main.css).

For adding the base tag or rewriting the URLs in the document I suggest employing something like goquery which lets you manipulate the HTML structure similar to how jQuery does it.

Example of adding a <base> tag:

import (
    "github.com/PuerkitoBio/goquery"
    "fmt"
)

func main() {
    doc,err := goquery.NewDocument("http://www.meaningfultype.com/")

    if err != nil {
        panic(err)
    }

    m := doc.Find("head")

    m.Children().First().BeforeHtml(`<base url="http://localhost/">`)

    fmt.Println(doc.Html())
}