Debugging simple memory leaks in go #sre

Memory leaks are a class of bugs where memory is not released even after it is no longer needed. They are often explicit and highly visible, which makes them a great candidate to begin learning debugging. Go is a language particularly well suited to identifying memory leaks because of its powerful toolchain, which ships with amazingly capable tools (pprof) that make pinpointing memory usage easy.

I’m hoping this post will illustrate how to visually identify memory leaks, narrow them down to a specific process, correlate the process leak with work, and finally find the source of the memory leak using pprof. This post is intentionally contrived in order to allow for simple identification of the root cause of the memory leak. The pprof overview is intentionally brief; it aims to illustrate what the tool is capable of and is not an exhaustive overview of its features.

What is a memory leak?

If memory grows unbounded and never reaches a steady state then there is probably a leak. The key here is that memory grows without ever reaching a steady state, and eventually causes problems through explicit crashes or by impacting system performance.

Memory leaks can happen for any number of reasons. There can be logical leaks where data structures grow unbounded, leaks from the complexities of poor object reference handling, or leaks from countless other causes. Regardless of the source, many memory leaks elicit a visually noticeable pattern: the "sawtooth".
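To make the idea of a logical leak concrete, here is a minimal sketch (my own illustration, not code from the original post) of a package-level data structure that only ever grows. The garbage collector is working correctly the whole time; the program simply never lets go of what it has accumulated:

package main

import (
	"fmt"
	"runtime"
)

// seen is never pruned, so it grows for the lifetime of the process:
// a "logical" leak even though the GC is collecting everything it can.
var seen = map[string][]byte{}

func remember(key string, payload []byte) {
	buf := make([]byte, len(payload))
	copy(buf, payload)
	seen[key] = buf
}

func main() {
	for i := 0; i < 100000; i++ {
		remember(fmt.Sprintf("req-%d", i), make([]byte, 1024))
	}
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("entries=%d heap_inuse=%d MiB\n", len(seen), m.HeapInuse>>20)
}

Every process restart resets the map, which is exactly what produces the dips in the sawtooth.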

Debug Process

This blog post is focused on exploring how to identify and pinpoint the root cause of a Go memory leak. We'll focus primarily on the characteristics of memory leaks, how to identify them, and how to determine their root cause using Go's tooling. Because of this, our actual debug process will be relatively superficial and informal.

The goal of our analysis is to progressively narrow the scope of the problem by whittling away possibilities until we have enough information to form and propose a hypothesis. After we have enough data and a reasonable scope for the cause, we should form a hypothesis and try to invalidate it with data.

Each step will try to either pinpoint a cause of an issue or invalidate a non-cause. Along the way we'll be forming a series of hypotheses; they will necessarily be general at first and then progressively more specific. This is loosely based on the scientific method. Brendan Gregg does an amazing job of covering different methodologies for system investigation (primarily focused on performance).

Just to reiterate, we'll try to:

  • Ask a question
  • Form a Hypothesis
  • Analyze the hypothesis
  • Repeat until the root cause is found

Identification

How do we even know if there is a problem (i.e. a memory leak)? Explicit errors are direct indicators of an issue. For memory leaks, common errors are OOM errors or explicit system crashes.

OOM errors

Errors are the most explicit indicator of a problem. While user-generated errors have the potential to produce false positives if their logic is off, an OOM error is the OS literally indicating that something is using too much memory. In the error below, this manifests as cgroup limits being reached and the container being killed.

dmesg

Question: Is the error a regular, repeating issue?

Hypothesis: OOM errors are significant enough that they should rarely occur. There is a memory leak in one of the processes.

Prediction: Either the process memory limit has been set too low and there was an uncharacteristic bump, or there is a larger issue.

Test: Upon further inspection there are quite a few OOM errors, suggesting this is a serious issue and not a one-off. Check the system memory for a historic view into memory usage.

System Memory

The next step after identifying a potential problem is to get an idea of system-wide memory usage. Memory leaks frequently display a "sawtooth" pattern: the spikes correspond to the application running while the dips correspond to a service restart.

A sawtooth characterizes a memory leak, especially one corresponding with a service deploy. I'm using a test project to illustrate memory leaks, but even a slow leak would look like a sawtooth if the time range is zoomed out far enough. With a smaller time range it would look like a gradual rise and then a drop-off on process restart.

The graph above shows an example of sawtooth memory growth. Memory continually grows without flatlining. This is a smoking gun for memory issues.

Question: Which process (or processes) is responsible for the memory growth?

Test: Analyze per-process memory. There could also be information in the dmesg logs indicating a process, or class of processes, that is the target of OOM kills.

Per Process Memory

Once a memory leak is suspected, the next step is to identify the process that is contributing to, or causing, the system memory growth. Having per-process historical memory metrics is a crucial requirement (for container-based systems, per-process resources are available through container monitoring tools). Go's Prometheus client provides per-process memory metrics by default, which is where the graph below gets its data.

The graph below shows a process whose memory usage is very similar to the system sawtooth memory-leak graph above: continual growth until the process restarts.

Memory is a critical resource; it can be used to indicate abnormal resource usage, or as a dimension for scaling. Additionally, having memory stats helps inform how to set container-based (cgroup) memory limits. The source for the graph values above can be found in the original article (linked at the end of this post). After the process has been identified, it's time to dig in and find out which specific part of the code is responsible for this memory growth.
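For reference, exposing those per-process metrics with the Go Prometheus client takes only a few lines; this is a minimal sketch (the port and the metrics path are my assumptions, not details from the original post):

package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// The default registry already includes the Go and process collectors,
	// which export go_memstats_* and process_resident_memory_bytes --
	// the series behind per-process memory graphs like the one above.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9102", nil))
}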

Root Cause Analysis / Source Analysis

Go Memory Analysis

Once again, Prometheus gives us detailed information about the Go runtime and what our process is doing. The chart shows that bytes are continually allocated to the heap until a restart; each dip corresponds to a service process restart.

Question: Which part of the application is leaking memory?

Hypothesis: There's a memory leak in a routine which is continually allocating memory to the heap (a global variable or pointer, potentially visible through heap profiling).

Test: Correlate the memory usage with an event.

Correlation With Work

Establishing a correlation will help to partition the problem space by answering: is this happening online (in relation to transactions) or in the background?

One way to determine this could be to start the service and let it idle without applying any transactional load. Is the service leaking? If so it could be the framework or a shared library. Our example happens to have a strong correlation with transactional workload.

HTTP Request Count

The above graph shows the count of HTTP requests. These directly match the system memory growth in both shape and timing, and establish HTTP request handling as a good place to start digging.

Question: Which parts of the application are responsible for the heap allocations?

Hypothesis: There is an HTTP handler that is continually allocating memory that is never released.

Test: Periodically analyze heap allocations while the program is running in order to track memory growth.

Go Memory Allocations

In order to inspect how much memory is being allocated, and the source of those allocations, we'll use pprof. pprof is an absolutely amazing tool and one of the main reasons that I personally use Go. In order to use it we'll have to first enable it, and then take some snapshots. If you're already using HTTP, enabling it is literally as easy as:

import _ "net/http/pprof"
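The blank import registers the /debug/pprof/* handlers on http.DefaultServeMux, so the process also needs an HTTP server listening on that mux. A minimal sketch (the localhost:8080 address is chosen here only to match the curl commands below):

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* on the default mux
)

func main() {
	// Application handlers would normally be registered here as well;
	// pprof rides along on the same server via http.DefaultServeMux.
	log.Fatal(http.ListenAndServe("localhost:8080", nil))
}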

Once pprof is enabled we’ll periodically take heap snapshots throughout the life of process memory growth. Taking a heap snapshot is just as trivial:

curl http://localhost:8080/debug/pprof/heap > heap.0.pprof
sleep 30
curl http://localhost:8080/debug/pprof/heap > heap.1.pprof
sleep 30
curl http://localhost:8080/debug/pprof/heap > heap.2.pprof
sleep 30
curl http://localhost:8080/debug/pprof/heap > heap.3.pprof

The goal is to get an idea of how memory is growing throughout the life of the program. Let's inspect the most recent heap snapshot:
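The snapshots are opened with go tool pprof; the output described below comes from its interactive top command. A typical session looks something like this, and passing an earlier snapshot via -base restricts the report to what grew between the two profiles:

go tool pprof heap.3.pprof
(pprof) top

go tool pprof -base heap.0.pprof heap.3.pprof
(pprof) top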

This is absolutely amazing. pprof defaults to Type: inuse_space, which displays all the objects that are currently in memory at the time of the snapshot. We can see here that bytes.Repeat is directly responsible for 98.60% of all of our memory!

The line below that shows:

1.28MB  0.31% 98.91%   410.25MB 98.91%  main.(*RequestTracker).Track 

This is really interesting: it shows that Track itself holds 1.28MB, or 0.31%, but is responsible for 98.91% of all in-use memory! Furthermore, we can see that the http functions hold even less memory themselves but are cumulatively responsible for even more than Track (since Track is called from them).

pprof exposes many ways to introspect and visualize memory (in-use memory size, in-use number of objects, allocated memory size, allocated number of objects). It allows listing the Track method and showing how much memory each line is responsible for:
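A sketch of that invocation (the sample type can also be chosen up front with flags such as -inuse_space, -inuse_objects, -alloc_space, and -alloc_objects):

go tool pprof -inuse_space heap.3.pprof
(pprof) list Track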

This directly pinpoints the culprit:

1.28MB   410.25MB     24:   rt.requests = append(rt.requests, bytes.Repeat([]byte("a"), 10000)) 

pprof can also generate a visual representation of the textual information above:

(pprof) svg
Generating report in profile003.svg 

This clearly shows the current objects occupying the process memory. Now that we have the culprit, Track, we can inspect its source and fix the root issue.

Resolution: Memory was being continually allocated to a global variable on each HTTP request, and that variable was allowed to grow without bound.
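Based on the pprof output above, the leaking code most likely looks roughly like the sketch below. The RequestTracker type and its requests field are reconstructed from the profile lines; the mutex, the handler wiring, and the exact shapes are my assumptions rather than the original repository's code. The fix is to stop retaining per-request data in a process-lifetime value, or to bound and evict it:

package main

import (
	"bytes"
	"net/http"
	"sync"
)

// RequestTracker retains data for every request it has ever seen. Because
// the tracker lives for the life of the process, nothing appended to
// requests is ever garbage collected: unbounded growth on every request.
type RequestTracker struct {
	mu       sync.Mutex
	requests [][]byte
}

func (rt *RequestTracker) Track(req *http.Request) {
	rt.mu.Lock()
	defer rt.mu.Unlock()
	// The line flagged by `list Track`: ~10KB retained per request, forever.
	rt.requests = append(rt.requests, bytes.Repeat([]byte("a"), 10000))
}

var tracker RequestTracker

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		tracker.Track(r)
		w.Write([]byte("hello"))
	})
	http.ListenAndServe("localhost:8080", nil)
}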

I hope that this post was able to illustrate the power of visually identifying memory leaks and a loose process for progressively narrowing down their source. Finally, I hope it was able to touch on the power of pprof for Go memory introspection and analysis. As always, I would love any feedback. Thank you for reading.

Original article: https://medium.com/dm03514-tech-blog/sre-debugging-simple-memory-leaks-in-go-e0a9e6d63d4d
