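I've got way too much time on my hands, so let's read some Go source code together. My plan is to read through the implementation from http to https and then down to the underlying TLS and TCP code (the Go source is very well commented, which makes it quite easy to follow), and to write it all up in a series of posts.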

Let's start with the simplest piece, the HTTP client:

resp, err := http.Get("https://moonlab.top")
resp, err := http.Post("https://moonlab.top", "text/plain", nil) // Post also takes a content type and a body (placeholder values here)

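resp is a pointer to a Response. According to the official documentation, Get returns a non-nil err in only two cases: when there were too many redirects, or when there was an HTTP protocol error. A non-2xx response does not cause an error.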

Now let's start reading the source. The Go version is v1.22.6.

Get Func

The package-level Get function simply calls the Get method of DefaultClient:

// DefaultClient is the default [Client] and is used by [Get], [Head], and [Post].
var DefaultClient = &Client{}

Next, find the definition of Client's Get method:

// net/http/client.go 482
func (c *Client) Get(url string) (resp *Response, err error) {
	req, err := NewRequest("GET", url, nil)
	if err != nil {
		return nil, err
	}
	return c.Do(req)
}

The first line calls NewRequest, which creates a context with context.Background() and passes it to NewRequestWithContext. That returns a Request object, which is then handed to Client.Do.

// net/http/request.go 882
func NewRequestWithContext(ctx context.Context, method, url string, body io.Reader) (*Request, error)

The check on the method parameter, however, is not what I expected:

if !validMethod(method) {
	return nil, fmt.Errorf("net/http: invalid method %q", method)
}

method is really just a string. Next, look at the definition of validMethod:

func validMethod(method string) bool {
    return len(method) > 0 && strings.IndexFunc(method, isNotToken) == -1
}

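strings.IndexFunc walks the string rune by rune and passes each rune to the function given as its second argument, which here is isNotToken. It returns the index of the first rune for which that function returns true, or -1 if there is none. (A rune is simply an alias for int32 and represents one Unicode code point.) In other words, any non-empty string made up solely of token characters passes; there is no fixed list of methods.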
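
To see what this check means in practice, here is a small sketch that goes through the public http.NewRequest rather than the unexported validMethod. The helper name methodAccepted is made up for this example.

```go
package main

import (
	"fmt"
	"net/http"
)

// methodAccepted (a hypothetical helper) reports whether http.NewRequest
// accepts the given method string. No network traffic is involved.
func methodAccepted(method string) bool {
	_, err := http.NewRequest(method, "https://moonlab.top", nil)
	return err == nil
}

func main() {
	// "FOO" is not a standard HTTP method, but it is a valid token.
	// "GET " contains a space and "GÉT" contains a non-ASCII rune.
	for _, m := range []string{"GET", "FOO", "GET ", "GÉT"} {
		fmt.Printf("%q accepted: %v\n", m, methodAccepted(m))
	}
}
```

The first two methods should be accepted and the last two rejected with an "invalid method" error.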

// net/http/http.go
func isNotToken(r rune) bool {
	return !httpguts.IsTokenRune(r)
}

// vendor/golang.org/x/net/http/httpguts/httplex.go
func IsTokenRune(r rune) bool {
	return r < utf8.RuneSelf && isTokenTable[byte(r)]
}

Now look at the last part, r < utf8.RuneSelf, and find the definition of utf8.RuneSelf:

const (
	RuneSelf  = 0x80         // characters below RuneSelf are represented as themselves in a single byte.
)

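0x80 is 10000000 in binary, i.e. 128, so r < utf8.RuneSelf restricts the rune to the 128 ASCII characters; isTokenTable then decides which of those are allowed in a token.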
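
The two-step check (ASCII range first, then a table lookup) can be sketched without the vendored package. This is not the real code: isTokenRuneSketch and validMethodSketch are names invented here, and the table is replaced by the token character set from the HTTP specification as I understand it.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// isTokenRuneSketch is a hypothetical stand-in for httpguts.IsTokenRune.
// It assumes the HTTP token characters are ASCII letters, digits and
// the punctuation listed below.
func isTokenRuneSketch(r rune) bool {
	if r >= utf8.RuneSelf {
		return false // not ASCII, so never a token character
	}
	switch {
	case 'a' <= r && r <= 'z', 'A' <= r && r <= 'Z', '0' <= r && r <= '9':
		return true
	}
	return strings.ContainsRune("!#$%&'*+-.^_`|~", r)
}

// validMethodSketch has the same shape as validMethod, built on the sketch above.
func validMethodSketch(method string) bool {
	notToken := func(r rune) bool { return !isTokenRuneSketch(r) }
	return len(method) > 0 && strings.IndexFunc(method, notToken) == -1
}

func main() {
	fmt.Println(validMethodSketch("PATCH")) // every rune is a token character
	fmt.Println(validMethodSketch("GET /")) // space and slash are not
	fmt.Println(validMethodSketch("GÉT"))   // É is >= utf8.RuneSelf
}
```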

NewRequest Func

// net/http/request.go 882
func NewRequestWithContext(ctx context.Context, method, url string, body io.Reader) (*Request, error) {
	...
	req := &Request{
		ctx:        ctx,
		Method:     method,
		URL:        u,
		Proto:      "HTTP/1.1",
		ProtoMajor: 1,
		ProtoMinor: 1,
		Header:     make(Header),
		Body:       rc,
		Host:       u.Host,
	}
	...
}

A Request can represent either a request sent by a client or one received by a server, so some parts of it only make sense on one side. A few of the important fields:

Body io.ReadCloser

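Body is the body you would send with a Post. Because it is an io.ReadCloser, it has to be closed when the client stops early, for example on an error; internally this is done with the unexported closeBody() method.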

Proto      string // "HTTP/1.1 or HTTP/2"
ProtoMajor int
ProtoMinor int

(Note: ProtoMajor and ProtoMinor are the major and minor version numbers of the protocol.)

Client.do Func

// net/http/client.go 595
func (c *Client) do(req *Request) (retres *Response, reterr error) 

There is a lot going on here, so let's go through it piece by piece. All of the code below is inside Client.do.

if testHookClientDoResult != nil {
	defer func() { testHookClientDoResult(retres, reterr) }()
}

There is an internal hook that fires when Client.do finishes, but it is unexported, so it cannot be set from outside the package.

var (
	deadline      = c.deadline() 
	reqs          []*Request
	resp          *Response
	copyHeaders   = c.makeHeadersCopier(req)
	reqBodyClosed = false // have we closed the current req.Body?

	// Redirect behavior:
	redirectMethod string
	includeBody    bool
)

The deadline method:

func (c *Client) deadline() time.Time {
	if c.Timeout > 0 {
		return time.Now().Add(c.Timeout)
	}
	return time.Time{}
}

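Next comes a for loop that handles redirects. The top of the loop body picks up where the previous request left off: if len(reqs) > 0 there is a redirect to process; otherwise this is the first request.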

for {
	// For all but the first request, create the next
	// request hop and replace req.
	if len(reqs) > 0 {
		// Ignore this :)
		...
	}
}

I'll skip the redirect logic for now and look directly at the code for the first request.

reqs = append(reqs, req)
var err error
var didTimeout func() bool
if resp, didTimeout, err = c.send(req, deadline); err != nil {
	// c.send() always closes req.Body
	reqBodyClosed = true
	if !deadline.IsZero() && didTimeout() {
		err = &httpError{
			err:     err.Error() + " (Client.Timeout exceeded while awaiting headers)",
			timeout: true,
		}
	}
	return nil, uerr(err)
}

var shouldRedirect bool
redirectMethod, shouldRedirect, includeBody = redirectBehavior(req.Method, resp, reqs[0])
if !shouldRedirect {
	return resp, nil
}

req.closeBody()

Right away we see c.send(), the key function that actually performs the HTTP request.

Whether to redirect, and how, is decided by redirectBehavior().

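Keep these two functions in mind; we'll analyze them shortly. Before digging into redirects, a quick primer on 3xx status codes. A 3xx response covers the cases where the client has to take further action itself, and such responses are expected to carry a Location header.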

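But that is only the definition; actual implementations differ between HTTP clients and even between servers. Take issue #17773, mentioned in the comment below: Google's GCS invented its own meaning for 308, "308 Resume Incomplete", which is used to resume file uploads and is not a redirect at all. The biggest difference between people and machines is that people never play by the rules.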

// For all but the first request, create the next
// request hop and replace req.
if len(reqs) > 0 {
	loc := resp.Header.Get("Location")
	if loc == "" {
		// While most 3xx responses include a Location, it is not
		// required and 3xx responses without a Location have been
		// observed in the wild. See issues #17773 and #49281.
		return resp, nil
	}
	u, err := req.URL.Parse(loc)
	if err != nil {
		resp.closeBody()
		return nil, uerr(fmt.Errorf("failed to parse Location header %q: %v", loc, err))
	}
	host := ""
	if req.Host != "" && req.Host != req.URL.Host {
		// If the caller specified a custom Host header and the
		// redirect location is relative, preserve the Host header
		// through the redirect. See issue #22233.
		if u, _ := url.Parse(loc); u != nil && !u.IsAbs() {
			host = req.Host
		}
	}
	ireq := reqs[0]
	req = &Request{
		Method:   redirectMethod,
		Response: resp,
		URL:      u,
		Header:   make(Header),
		Host:     host,
		Cancel:   ireq.Cancel,
		ctx:      ireq.ctx,
	}
	if includeBody && ireq.GetBody != nil {
		req.Body, err = ireq.GetBody()
		if err != nil {
			resp.closeBody()
			return nil, uerr(err)
		}
		req.ContentLength = ireq.ContentLength
	}

	// Copy original headers before setting the Referer,
	// in case the user set Referer on their first request.
	// If they really want to override, they can do it in
	// their CheckRedirect func.
	copyHeaders(req)

	// Add the Referer header from the most recent
	// request URL to the new one, if it's not https->http:
	if ref := refererForURL(reqs[len(reqs)-1].URL, req.URL, req.Header.Get("Referer")); ref != "" {
		req.Header.Set("Referer", ref)
	}
	err = c.checkRedirect(req, reqs)

	// Sentinel error to let users select the
	// previous response, without closing its
	// body. See Issue 10069.
	if err == ErrUseLastResponse {
		return resp, nil
	}

	// Close the previous response's body. But
	// read at least some of the body so if it's
	// small the underlying TCP connection will be
	// re-used. No need to check for errors: if it
	// fails, the Transport won't reuse it anyway.
	const maxBodySlurpSize = 2 << 10
	if resp.ContentLength == -1 || resp.ContentLength <= maxBodySlurpSize {
		io.CopyN(io.Discard, resp.Body, maxBodySlurpSize)
	}
	resp.Body.Close()

	if err != nil {
		// Special case for Go 1 compatibility: return both the response
		// and an error if the CheckRedirect function failed.
		// See https://golang.org/issue/3795
		// The resp.Body has already been closed.
		ue := uerr(err)
		ue.(*url.Error).URL = loc
		return resp, ue
	}
}

Client.send Func

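This method adds the cookies from c.Jar to the request, hands the request to the next send function, checks for an error once the response comes back, stores the returned cookies in c.Jar, and finally passes resp back up.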

A CookieJar is created like this:

jar, err := cookiejar.New(nil)
if err != nil {
	return
}
client := &http.Client{
	Jar: jar,
}

// net/http/client.go 174
func (c *Client) send(req *Request, deadline time.Time) (resp *Response, didTimeout func() bool, err error) {
	if c.Jar != nil {
		for _, cookie := range c.Jar.Cookies(req.URL) {
			req.AddCookie(cookie)
		}
	}
	resp, didTimeout, err = send(req, c.transport(), deadline)
	if err != nil {
		return nil, didTimeout, err
	}
	if c.Jar != nil {
		if rc := resp.Cookies(); len(rc) > 0 {
			c.Jar.SetCookies(req.URL, rc)
		}
	}
	return resp, nil, nil

}
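
The two Jar branches can be watched from the outside with a short sketch: one request stores a cookie in the jar, and the next request sends it back. It runs against a local httptest server; the helper name cookieRoundTrip, the /set and /echo paths, and the cookie itself are made up for the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/cookiejar"
	"net/http/httptest"
)

// cookieRoundTrip (a hypothetical helper) lets a test server set a cookie on
// /set and returns the cookie value that the server then sees on /echo.
func cookieRoundTrip() string {
	mux := http.NewServeMux()
	mux.HandleFunc("/set", func(w http.ResponseWriter, r *http.Request) {
		http.SetCookie(w, &http.Cookie{Name: "session", Value: "abc", Path: "/"})
	})
	mux.HandleFunc("/echo", func(w http.ResponseWriter, r *http.Request) {
		if c, err := r.Cookie("session"); err == nil {
			io.WriteString(w, c.Value)
		}
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	jar, err := cookiejar.New(nil)
	if err != nil {
		panic(err)
	}
	client := &http.Client{Jar: jar}

	// First request: the response cookies go into the jar.
	resp, err := client.Get(srv.URL + "/set")
	if err != nil {
		panic(err)
	}
	resp.Body.Close()

	// Second request: the jar's cookies are added to the request.
	resp, err = client.Get(srv.URL + "/echo")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	return string(body)
}

func main() {
	fmt.Println(cookieRoundTrip())
}
```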

RoundTripper

RoundTripper is an interface, and Transport is an implementation of it.

// DefaultTransport is the default implementation of [Transport] and is
// used by [DefaultClient]. It establishes network connections as needed
// and caches them for reuse by subsequent calls. It uses HTTP proxies
// as directed by the environment variables HTTP_PROXY, HTTPS_PROXY
// and NO_PROXY (or the lowercase versions thereof).
var DefaultTransport RoundTripper = &Transport{
	Proxy: ProxyFromEnvironment,
	DialContext: defaultTransportDialContext(&net.Dialer{
		Timeout:   30 * time.Second,
		KeepAlive: 30 * time.Second,
	}),
	ForceAttemptHTTP2:     true,
	MaxIdleConns:          100,
	IdleConnTimeout:       90 * time.Second,
	TLSHandshakeTimeout:   10 * time.Second,
	ExpectContinueTimeout: 1 * time.Second,
}
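
Because RoundTripper is an interface, anything with a RoundTrip method can be plugged into Client.Transport. The sketch below uses a stub that returns a canned response and never touches the network, which is handy in tests. The names roundTripFunc, cannedClient and fetch are made up for this example and are not part of net/http.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// roundTripFunc is a hypothetical adapter that lets a plain function
// satisfy the http.RoundTripper interface.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(req *http.Request) (*http.Response, error) {
	return f(req)
}

// cannedClient returns a client whose transport answers every request
// with a 200 response carrying the given body.
func cannedClient(body string) *http.Client {
	return &http.Client{
		Transport: roundTripFunc(func(req *http.Request) (*http.Response, error) {
			return &http.Response{
				Status:     "200 OK",
				StatusCode: http.StatusOK,
				Proto:      "HTTP/1.1",
				ProtoMajor: 1,
				ProtoMinor: 1,
				Header:     make(http.Header),
				Body:       io.NopCloser(strings.NewReader(body)),
				Request:    req,
			}, nil
		}),
	}
}

// fetch performs a GET with the given client and returns the response body.
func fetch(c *http.Client, url string) (string, error) {
	resp, err := c.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	body, err := fetch(cannedClient("hello"), "http://example.invalid/")
	fmt.Println(body, err)
}
```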

Send

Now we reach the other send, the package-level function, and this is the most important part.

func send(ireq *Request, rt RoundTripper, deadline time.Time) (resp *Response, didTimeout func() bool, err error) {
	req := ireq // req is either the original request, or a modified fork

	if rt == nil {
		req.closeBody()
		return nil, alwaysFalse, errors.New("http: no Client.Transport or DefaultTransport")
	}

	if req.URL == nil {
		req.closeBody()
		return nil, alwaysFalse, errors.New("http: nil Request.URL")
	}

	if req.RequestURI != "" {
		req.closeBody()
		return nil, alwaysFalse, errors.New("http: Request.RequestURI can't be set in client requests")
	}
	...
}
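
These early checks run before any connection is attempted, so they are easy to trigger from user code. A minimal sketch that sets RequestURI on a client request follows; the helper names sendWithRequestURI and rejectedForRequestURI are made up for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// sendWithRequestURI (a hypothetical helper) sets RequestURI, a field meant
// for server-side requests, and returns the error from the client.
func sendWithRequestURI() error {
	req, err := http.NewRequest("GET", "https://moonlab.top", nil)
	if err != nil {
		return err
	}
	req.RequestURI = "/"
	resp, err := http.DefaultClient.Do(req)
	if err == nil {
		resp.Body.Close()
	}
	return err
}

// rejectedForRequestURI reports whether err mentions the RequestURI check.
func rejectedForRequestURI(err error) bool {
	return err != nil && strings.Contains(err.Error(), "RequestURI")
}

func main() {
	err := sendWithRequestURI()
	fmt.Println(rejectedForRequestURI(err), err)
}
```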