Go Web in Practice and pprof Performance Analysis

Published 2017-09-13 · 6146 words in total · roughly 18 minutes to read

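How the Web Works

Flow:

DNS Resolution

In a recursive query the party that submits the query changes at each step, because every server that cannot answer forwards the query on the requester's behalf. In an iterative query the submitter stays the same: the same requester follows each referral and asks the next server itself.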
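The difference in who submits each query can be sketched with a toy model. This is a minimal simulation, not real DNS: the server type, the three-level chain (root, tld, auth), the name example.com, and the address 192.0.2.1 are all hypothetical and exist only to show how the list of submitters differs between the two modes.

```go
package main

import "fmt"

// server is a hypothetical in-memory name server used only for illustration.
// It either knows the answer for a name or points at the next server.
type server struct {
	name    string
	records map[string]string // names this server can answer itself
	next    *server           // server to refer or forward to; nil if none
}

// resolveIterative models an iterative lookup: the client itself asks each
// server in turn, following referrals. It returns the answer and the list of
// parties that submitted a query.
func resolveIterative(client string, start *server, host string) (string, []string) {
	var submitters []string
	for s := start; s != nil; s = s.next {
		submitters = append(submitters, client) // always the same submitter
		if addr, ok := s.records[host]; ok {
			return addr, submitters
		}
	}
	return "", submitters
}

// resolveRecursive models a recursive lookup: a server that cannot answer
// forwards the query itself, so the submitter changes at every hop.
func resolveRecursive(submitter string, s *server, host string) (string, []string) {
	if s == nil {
		return "", nil
	}
	submitters := []string{submitter}
	if addr, ok := s.records[host]; ok {
		return addr, submitters
	}
	addr, rest := resolveRecursive(s.name, s.next, host)
	return addr, append(submitters, rest...)
}

// newChain builds a three-level chain: root -> tld -> auth.
func newChain() *server {
	auth := &server{name: "auth", records: map[string]string{"example.com": "192.0.2.1"}}
	tld := &server{name: "tld", next: auth}
	return &server{name: "root", next: tld}
}

func main() {
	addr, who := resolveIterative("client", newChain(), "example.com")
	fmt.Println("iterative:", addr, who)
	addr, who = resolveRecursive("client", newChain(), "example.com")
	fmt.Println("recursive:", addr, who)
}
```

In the iterative run every entry in the submitter list is the client, while in the recursive run the list moves from the client to root to tld.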

The HTTP Protocol

See the networking section.

HTTP server example

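package main

import (
    "log"
    "net/http"
)

// response writes a fixed greeting for every request.
func response(rw http.ResponseWriter, request *http.Request) {
    rw.Write([]byte("Hello world."))
}

func main() {
    // Route every path to response, then listen on port 8001.
    http.HandleFunc("/", response)
    // ListenAndServe only returns on failure, so report the error.
    log.Fatal(http.ListenAndServe(":8001", nil))
}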

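Writing a web server is simple: it takes just two calls into the http package.

PHP: don't you need a server such as nginx or Apache? Go does not, because the program listens on the TCP port itself and does the job nginx would do. The response function is the logic we write ourselves, similar to a controller function in PHP.

Python: Go has some of the feel of a dynamic language such as Python, which makes writing web applications convenient.

Ruby: it is somewhat like starting a server with /script/server in ROR.

With just a few lines of code Go has a web service up and running, and that service supports high concurrency out of the box.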

net/http source code analysis

// Serve accepts incoming connections on the Listener l, creating a
// new service goroutine for each. The service goroutines read requests and
// then call srv.Handler to reply to them.
//
// HTTP/2 support is only enabled if the Listener returns *tls.Conn
// connections and they were configured with "h2" in the TLS
// Config.NextProtos.
//
// Serve always returns a non-nil error and closes l.
// After Shutdown or Close, the returned error is ErrServerClosed.
func (srv *Server) Serve(l net.Listener) error {
	if fn := testHookServerServe; fn != nil {
		fn(srv, l) // call hook with unwrapped listener
	}

	origListener := l
	l = &onceCloseListener{Listener: l}
	defer l.Close()

	if err := srv.setupHTTP2_Serve(); err != nil {
		return err
	}

	if !srv.trackListener(&l, true) {
		return ErrServerClosed
	}
	defer srv.trackListener(&l, false)

	var tempDelay time.Duration // how long to sleep on accept failure

	baseCtx := context.Background()
	if srv.BaseContext != nil {
		baseCtx = srv.BaseContext(origListener)
		if baseCtx == nil {
			panic("BaseContext returned a nil context")
		}
	}

	ctx := context.WithValue(baseCtx, ServerContextKey, srv)
	for {
		rw, e := l.Accept()
		if e != nil {
			select {
			case <-srv.getDoneChan():
				return ErrServerClosed
			default:
			}
			if ne, ok := e.(net.Error); ok && ne.Temporary() {
				if tempDelay == 0 {
					tempDelay = 5 * time.Millisecond
				} else {
					tempDelay *= 2
				}
				if max := 1 * time.Second; tempDelay > max {
					tempDelay = max
				}
				srv.logf("http: Accept error: %v; retrying in %v", e, tempDelay)
				time.Sleep(tempDelay)
				continue
			}
			return e
		}
		if cc := srv.ConnContext; cc != nil {
			ctx = cc(ctx, rw)
			if ctx == nil {
				panic("ConnContext returned nil")
			}
		}
		tempDelay = 0
		c := srv.newConn(rw)
		c.setState(c.rwc, StateNew) // before Serve can return
		go c.serve(ctx)
	}
}

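Serve first accepts a connection from the Listener, then creates a conn for it, and finally starts a dedicated goroutine to serve that connection: go c.serve(ctx). Every connection is served in its own new goroutine, so connections do not affect one another.

To achieve high concurrency and high performance, Go uses goroutines to handle the read and write events of each Conn. Each connection stays independent, none blocks another, and the server can respond to network events efficiently. This is what underpins Go's efficiency.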
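The accept-then-spawn pattern described above can be sketched with a plain TCP echo server. This is a simplified illustration, not the net/http implementation: it leaves out the retry backoff, contexts, and connection state tracking, and the names serveConn, acceptLoop, and demo are made up for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"net"
)

// serveConn handles one connection: it echoes each line back to the client.
func serveConn(c net.Conn) {
	defer c.Close()
	scanner := bufio.NewScanner(c)
	for scanner.Scan() {
		fmt.Fprintf(c, "echo: %s\n", scanner.Text())
	}
}

// acceptLoop has the same shape as Server.Serve: accept a connection, then
// hand it to its own goroutine so a slow client cannot block the others.
// It returns when Accept fails, for example after the listener is closed.
func acceptLoop(l net.Listener) error {
	for {
		conn, err := l.Accept()
		if err != nil {
			return err
		}
		go serveConn(conn)
	}
}

// demo starts the loop on a free local port, opens one idle connection and
// one active connection, and returns the reply received on the active one.
func demo() (string, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0") // port 0: the OS picks a free port
	if err != nil {
		return "", err
	}
	defer l.Close() // makes Accept fail, which ends acceptLoop
	go acceptLoop(l)

	// This connection stays open and silent for the whole demo.
	idle, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return "", err
	}
	defer idle.Close()

	// A second connection is still served, because the idle one only
	// occupies its own goroutine.
	active, err := net.Dial("tcp", l.Addr().String())
	if err != nil {
		return "", err
	}
	defer active.Close()

	if _, err := fmt.Fprintln(active, "hello"); err != nil {
		return "", err
	}
	return bufio.NewReader(active).ReadString('\n')
}

func main() {
	reply, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(reply)
}
```

If serveConn were called directly instead of with go, the idle connection would hold the loop and the second client would never get a reply.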

Enabling pprof

The net/http/pprof package essentially registers the following handlers on the default mux in its init function:

http.HandleFunc("/debug/pprof/", Index)
http.HandleFunc("/debug/pprof/cmdline", Cmdline)
http.HandleFunc("/debug/pprof/profile", Profile)
http.HandleFunc("/debug/pprof/symbol", Symbol)
http.HandleFunc("/debug/pprof/trace", Trace)

{
	"allocs":       "A sampling of all past memory allocations",
	"block":        "Stack traces that led to blocking on synchronization primitives",
	"cmdline":      "The command line invocation of the current program",
	"goroutine":    "Stack traces of all current goroutines",
	"heap":         "A sampling of memory allocations of live objects. You can specify the gc GET parameter to run GC before taking the heap sample.",
	"mutex":        "Stack traces of holders of contended mutexes",
	"profile":      "CPU profile. You can specify the duration in the seconds GET parameter. After you get the profile file, use the go tool pprof command to investigate the profile.",
	"threadcreate": "Stack traces that led to the creation of new OS threads",
	"trace":        "A trace of execution of the current program. You can specify the duration in the seconds GET parameter. After you get the trace file, use the go tool trace command to investigate the trace.",
}

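Enable pprof with a blank import of "net/http/pprof"

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // imported for its side effect: registers /debug/pprof/*
)

func main() {
	// A nil handler means http.DefaultServeMux, where pprof registered itself.
	log.Fatal(http.ListenAndServe("localhost:8001", nil))
}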
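The blank import only helps when the server uses http.DefaultServeMux. For a program that builds its own mux, the same five handlers can be registered by hand. The sketch below is one way to do it; newDebugMux and statusOf are names invented for this example, and the check runs in-process through httptest so that it does not need a free port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux registers the pprof handlers on a private mux instead of
// relying on http.DefaultServeMux.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// statusOf sends an in-process GET to the handler and returns the status code.
func statusOf(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	mux := newDebugMux()
	for _, path := range []string{"/debug/pprof/", "/debug/pprof/cmdline", "/not-registered"} {
		fmt.Println(path, statusOf(mux, path))
	}
}
```

In a real service the mux would be passed to the server, for example http.ListenAndServe("localhost:8001", newDebugMux()). Note that importing net/http/pprof by name still runs its init function, so the handlers also end up on the default mux.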

Viewing in the web page

http://127.0.0.1:8001/debug/pprof/

/debug/pprof/

Types of profiles available:
Count	Profile
4	allocs
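0	block         // stack traces that led to blocking on synchronization primitives
0	cmdline
6	goroutine     // stack traces of all current goroutines
4	heap          // memory allocations of live objects
0	mutex         // stack traces of holders of contended mutexes
0	profile       // CPU profiling, 30s by default; produces a profile file for analysis
13	threadcreate  // stack traces that led to the creation of new OS threads
0	trace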
full goroutine stack dump
Profile Descriptions:

allocs: A sampling of all past memory allocations
block: Stack traces that led to blocking on synchronization primitives
cmdline: The command line invocation of the current program
goroutine: Stack traces of all current goroutines
heap: A sampling of memory allocations of live objects. You can specify the gc GET parameter to run GC before taking the heap sample.
mutex: Stack traces of holders of contended mutexes
profile: CPU profile. You can specify the duration in the seconds GET parameter. After you get the profile file, use the go tool pprof command to investigate the profile.
threadcreate: Stack traces that led to the creation of new OS threads
trace: A trace of execution of the current program. You can specify the duration in the seconds GET parameter. After you get the trace file, use the go tool trace command to investigate the trace.
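The GET parameters mentioned in these descriptions can also be used from code. The following is a rough sketch that fetches the heap profile with gc=1 over HTTP. The helper names fetchProfile, fetchLocalHeap, and isGzip are assumptions made for this example, and it starts its own in-process test server rather than talking to the service on port 8001.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

// fetchProfile downloads the named profile from baseURL with the given query
// string (for example name "heap", query "gc=1") and returns the raw bytes.
func fetchProfile(baseURL, name, query string) ([]byte, error) {
	resp, err := http.Get(baseURL + "/debug/pprof/" + name + "?" + query)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// fetchLocalHeap serves the default mux from an in-process test server and
// fetches a heap profile from it, asking for a GC run first.
func fetchLocalHeap() ([]byte, error) {
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()
	return fetchProfile(srv.URL, "heap", "gc=1")
}

// isGzip reports whether data starts with the gzip magic bytes.
func isGzip(data []byte) bool {
	return len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b
}

func main() {
	data, err := fetchLocalHeap()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("heap profile: %d bytes, gzip: %v\n", len(data), isGzip(data))
}
```

The bytes returned are the same kind of compressed profile that go tool pprof saves to disk, so they could be written to a file and opened with go tool pprof later.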

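Viewing in the interactive terminal

Viewing CPU time

go tool pprof http://127.0.0.1:8001/debug/pprof/profile

Sample output (captured from a different service, so the host and path differ from the command above):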

Fetching profile over HTTP from http://172.16.30.7:62032/debug/profile
Saved profile in xxxxx/pprof/pprof.my_app.samples.cpu.001.pb.gz
File: my_app
Type: cpu
Time: Jun 3, 2020 at 1:14pm (CST)
Duration: 30.01s, Total samples = 20ms (0.067%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) tree
Showing nodes accounting for 20ms, 100% of 20ms total
----------------------------------------------------------+-------------
      flat  flat%   sum%        cum   cum%   calls calls% + context 	 	 
----------------------------------------------------------+-------------
                                              10ms   100% |   reflect.(*rtype).PkgPath
      10ms 50.00% 50.00%       10ms 50.00%                | reflect.name.name
----------------------------------------------------------+-------------
                                              10ms   100% |   reflect.mapaccess
      10ms 50.00%   100%       10ms 50.00%                | runtime.mapaccess2
----------------------------------------------------------+-------------
                                              10ms   100% |   encoding/json.Marshal
         0     0%   100%       10ms 50.00%                | encoding/json.(*encodeState).marshal
                                              10ms   100% |   encoding/json.(*encodeState).reflectValue
----------------------------------------------------------+-------------
(pprof)
(pprof) top 20
Showing nodes accounting for 20ms, 100% of 20ms total
Showing top 20 nodes out of 31
      flat  flat%   sum%        cum   cum%
      10ms 50.00% 50.00%       10ms 50.00%  reflect.name.name
      10ms 50.00%   100%       10ms 50.00%  runtime.mapaccess2
         0     0%   100%       10ms 50.00%  encoding/json.(*encodeState).marshal
         0     0%   100%       10ms 50.00%  encoding/json.(*encodeState).reflectValue
         0     0%   100%       10ms 50.00%  encoding/json.Marshal
         0     0%   100%       10ms 50.00%  encoding/json.MarshalIndent
         0     0%   100%       10ms 50.00%  encoding/json.arrayEncoder.encode

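flat: time spent in the given function itself

flat%: flat as a percentage of the total CPU time

sum%: running total of flat% for this row and all the rows above it

cum: time spent in the function plus the functions it calls

cum%: cum as a percentage of the total CPU time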
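How the percentage columns follow from flat and cum can be shown with a small calculation. This sketch takes three rows and the 20ms total from the sample output above and recomputes flat%, sum%, and cum%. The types row and derived and the function percentages are invented for this example and are not part of pprof.

```go
package main

import (
	"fmt"
	"time"
)

// row holds the two measured columns of one line of `top` output.
type row struct {
	name string
	flat time.Duration // time in the function itself
	cum  time.Duration // time in the function plus its callees
}

// derived holds the three percentage columns computed from a row.
type derived struct {
	flatPct, sumPct, cumPct float64
}

// sampleTotal is the total sample time reported in the output above.
const sampleTotal = 20 * time.Millisecond

// sampleRows returns three rows copied from the `top 20` output above.
func sampleRows() []row {
	return []row{
		{"reflect.name.name", 10 * time.Millisecond, 10 * time.Millisecond},
		{"runtime.mapaccess2", 10 * time.Millisecond, 10 * time.Millisecond},
		{"encoding/json.Marshal", 0, 10 * time.Millisecond},
	}
}

// percentages derives flat%, sum% and cum% for each row. The rows must be in
// display order, because sum% is a running total of flat over the rows so far.
func percentages(rows []row, total time.Duration) []derived {
	out := make([]derived, len(rows))
	var running time.Duration
	for i, r := range rows {
		running += r.flat
		out[i] = derived{
			flatPct: 100 * float64(r.flat) / float64(total),
			sumPct:  100 * float64(running) / float64(total),
			cumPct:  100 * float64(r.cum) / float64(total),
		}
	}
	return out
}

func main() {
	rows := sampleRows()
	for i, d := range percentages(rows, sampleTotal) {
		fmt.Printf("%-24s flat%%=%.2f sum%%=%.2f cum%%=%.2f\n",
			rows[i].name, d.flatPct, d.sumPct, d.cumPct)
	}
}
```

The third row shows why flat and cum differ: encoding/json.Marshal spends no time in its own body (flat 0) but 10ms in the functions it calls (cum 10ms).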