Valgrind, the debugging power tool: analyzing memcheck reports
How to run memcheck
valgrind --log-file=valgrind.log --tool=memcheck --leak-check=full --show-reachable=no --workaround-gcc296-bugs=yes ./mcsample arg1 arg2
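--log-file writes the report to a file; the path may be relative or absolute.
--tool=memcheck selects memcheck, the memory checker. Valgrind is a suite of tools, and memcheck is one of them.
--leak-check=full reports each leak in full detail.
--show-reachable=no controls whether reachable blocks are listed (see the memory leak section). It is usually no, but can be set to yes.
--workaround-gcc296-bugs=yes should be set to yes if your gcc has the corresponding bug; otherwise you will get false positives.
The program under test and its arguments come last.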
How to read a memcheck report
Let's start with an accidental typo
int main(int argc, char *argv[])
{
char* bigBuff = (char*)malloc[1024];
free(bigBuff);
}
==3498== Invalid free() / delete / delete[] / realloc()
==3498== at 0x402B06C: free (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==3498== by 0x8048444: main (main.cpp:19)
==3498== Address 0x40c0500 is in the Text segment of /lib/i386-linux-gnu/libc-2.15.so
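The code mistakenly writes malloc[] instead of malloc(), which amounts to taking an address at an offset past the malloc function pointer. The report tells us that this address lies in the .text segment.
From this we can see that the basic format of a report entry is: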
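{problem description}
at {address, function name, module or source line}
by {address, function name, source line}
by ... {the call stack, shown frame by frame}
Address 0x... {where the address lies relative to a known block}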
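Because every entry follows this shape, a log can be scanned mechanically. The sketch below is only an illustration: LineKind, stripPidPrefix and classify are hypothetical names, not part of valgrind, and it assumes the ==PID== prefix style seen in the report above (lines starting with --PID-- are not handled).

```cpp
#include <cstddef>
#include <string>

// Hypothetical classification of one report line.
enum class LineKind { Description, At, By, Address, Other };

// Removes the "==PID== " prefix if the line has one.
std::string stripPidPrefix(const std::string& line) {
    if (line.compare(0, 2, "==") != 0) return line;
    std::size_t end = line.find("== ", 2);
    if (end == std::string::npos) return line;
    return line.substr(end + 3);
}

// Decides which part of the entry template a line belongs to.
LineKind classify(const std::string& rawLine) {
    std::string s = stripPidPrefix(rawLine);
    std::size_t first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return LineKind::Other;  // blank line
    s.erase(0, first);
    if (s.compare(0, 3, "at ") == 0) return LineKind::At;
    if (s.compare(0, 3, "by ") == 0) return LineKind::By;
    if (s.compare(0, 8, "Address ") == 0) return LineKind::Address;
    return LineKind::Description;  // anything else starts a new entry
}
```

Feeding the log line by line through classify and starting a new record at each Description line is enough to group an error with its stack.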
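The overall layout of the report file can be summarized as:
1. Copyright notice
2. Invalid read/write reports
2.1 Invalid reads/writes in the main thread
2.2 Invalid reads/writes in thread A
2.3 Invalid reads/writes in thread B
... other threads
3. Heap leak report
3.1 Heap usage overview (HEAP SUMMARY)
3.2 Confirmed leaks (definitely lost)
3.3 Suspicious memory reports (hidden with show-reachable=no)
3.4 Leak overview (LEAK SUMMARY)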
Common error reports
Memory leak
int main(int argc, char *argv[])
{
char* bigBuff = (char*)malloc(1024);
}
1,024 bytes in 1 blocks are definitely lost in loss record 1 of 1
at 0x402BE68: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
by 0x8048414: main (main.cpp:17)
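definitely lost: the memory was not freed and no pointer refers to it any more, so it has certainly leaked. The stack shown is the call stack at allocation time, which usually makes it clear which piece of business logic created the block.
still reachable: the memory was not freed, but a pointer to it still exists, so the program could still use it. This need not count as a leak. (Perhaps an asynchronous system call still in progress when the program exits?)
possibly lost: there may be a leak. This generally appears in cases that are hard to track, such as pointers to pointers; memcheck reports it when the only pointers it can find point into the middle of a block rather than to its start.
suppressed: leaks that were deliberately hidden, for example errors from specific libraries suppressed through valgrind options, are counted here.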
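To make the categories concrete, here is a minimal sketch that is expected to leave one block in each of the first three categories when the functions are called in a program run under memcheck. All names are made up for illustration, and the exact classification can vary with compiler optimization and valgrind version, so assume a build with -O0 -g.

```cpp
#include <cstdlib>

// Hypothetical globals, used only to keep pointers alive until exit.
static char* g_start = nullptr;     // points at the start of a block
static char* g_interior = nullptr;  // points into the middle of a block

// No pointer to the block survives the call: expected "definitely lost".
void leakDefinitely() {
    char* p = static_cast<char*>(std::malloc(64));
    if (p) p[0] = 'x';  // touch the block so the allocation is really used
}

// A global still points at the block start: expected "still reachable".
void leakStillReachable() {
    g_start = static_cast<char*>(std::malloc(64));
}

// Only an interior pointer survives: expected "possibly lost".
void leakPossibly() {
    char* p = static_cast<char*>(std::malloc(64));
    g_interior = p ? p + 16 : nullptr;
}
```

Calling the three functions from main and exiting without freeing anything should show all three categories in the LEAK SUMMARY; the still reachable block is only listed in detail with --show-reachable=yes.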
Invalid free
int main(int argc, char *argv[])
{
char* bigBuff = (char*)malloc(1024);
char* offsetBuff = bigBuff + 888;
free(offsetBuff);
}
Invalid free() / delete / delete[] / realloc()
at 0x402B06C: free (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
by 0x8048461: main (main.cpp:24)
Address 0x41f23a0 is 888 bytes inside a block of size 1,024 alloc'd
at 0x402BE68: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
by 0x8048444: main (main.cpp:17)
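The title lists free() / delete / delete[] / realloc(); the error is in one of the four, here an invalid free. To describe where the address lies, the report uses a sentence of the form: Address 0x... is {x} bytes {inside/before/after} a block of size {y} {alloc'd/free'd}
It gives the position of the freed address relative to a block of size y: before if the address lies in front of the block, inside if it lies within it, and after if it lies past its end. The trailing alloc'd means the block of size y is still live, and the stack that follows is where it was allocated; free'd means the block has already been released, and the stack that follows is where it was freed.
So the report above reads: address 0x41f23a0 lies at offset 888 inside a live block of size 1024, allocated at the call stack that follows.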
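The usual way to avoid this error is to keep the pointer that malloc returned and pass exactly that pointer to free, while a second pointer does the moving. A minimal sketch, using a hypothetical Buffer struct that is not part of the original program:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: remembers the pointer malloc returned, so the
// cursor can move without losing the address that free() needs.
struct Buffer {
    char* base;        // exactly what malloc returned
    char* cursor;      // working pointer, anywhere inside the block
    std::size_t size;  // size of the block in bytes
};

Buffer makeBuffer(std::size_t size, std::size_t offset) {
    Buffer b;
    b.base = static_cast<char*>(std::malloc(size));
    b.size = size;
    // Clamp the offset so the cursor never leaves the block.
    b.cursor = b.base ? b.base + (offset < size ? offset : size) : nullptr;
    return b;
}

// Distance of the cursor from the block start, i.e. the {x} in
// "{x} bytes inside a block of size {y}".
std::ptrdiff_t cursorOffset(const Buffer& b) { return b.cursor - b.base; }

void releaseBuffer(Buffer& b) {
    std::free(b.base);  // always free the base, never the cursor
    b.base = b.cursor = nullptr;
    b.size = 0;
}
```

With makeBuffer(1024, 888) the cursor sits where the pointer in the example pointed, but releaseBuffer still hands the original address to free, so memcheck has nothing to report.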
Invalid read/write
int main(int argc, char *argv[])
{
char* bigBuff = (char*)malloc(1024);
uint64_t* bigNum = (uint64_t*)(bigBuff+1020);
*bigNum = 0x12345678AABBCCDD;
printf("bigNum is %llu\n",*bigNum);
free(bigBuff);
}
Invalid write of size 4
at 0x8048490: main (main.cpp:19)
Address 0x41f2428 is 0 bytes after a block of size 1,024 alloc'd
at 0x402BE68: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
by 0x8048474: main (main.cpp:17)
Invalid read of size 4
at 0x804849B: main (main.cpp:20)
Address 0x41f2428 is 0 bytes after a block of size 1,024 alloc'd
at 0x402BE68: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
by 0x8048474: main (main.cpp:17)
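Accessing memory beyond the size that was allocated triggers Invalid write/read, and the report states the size of the access. In this example uint64_t is 8 bytes long, so the access overruns the block by 4 bytes. If you change bigBuff+1020 to bigBuff-20, the report will tell you precisely: Address xxx is 20 bytes before a block of ...
Another interesting observation: an invalid access to a uint64_t is reported as 4-byte invalid accesses, two of them. What does this tell us? A likely explanation is that on this 32-bit x86 build each 64-bit access is compiled into two 4-byte accesses, and memcheck checks each one separately, so each half that falls outside the block gets its own report.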
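One way to rule out this class of error is to check the bounds before touching the buffer. The sketch below is an assumption about how such a check could look; storeU64 and loadU64 are hypothetical helpers, not functions from the original program.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Writes value at buf[offset, offset+8) only if all 8 bytes fit in the block.
bool storeU64(char* buf, std::size_t size, std::size_t offset, std::uint64_t value) {
    if (offset > size || size - offset < sizeof(value)) return false;
    std::memcpy(buf + offset, &value, sizeof(value));  // memcpy also tolerates unaligned offsets
    return true;
}

// Reads 8 bytes from buf[offset, offset+8) only if they lie inside the block.
bool loadU64(const char* buf, std::size_t size, std::size_t offset, std::uint64_t* out) {
    if (offset > size || size - offset < sizeof(*out)) return false;
    std::memcpy(out, buf + offset, sizeof(*out));
    return true;
}
```

For a 1024-byte block, offset 1020 is rejected because only 4 bytes remain, while offset 1016 is the last position where a uint64_t fits.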
Mismatched deallocation
int main(int argc, char *argv[])
{
int unused;
char* bigBuff = (char*)malloc(1024);
delete[] bigBuff;
printf("unused=%d",unused);
}
Mismatched free() / delete / delete []
at 0x402A8DC: operator delete[](void*) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
by 0x80484FB: main (main.cpp:19)
Address 0x4323028 is 0 bytes inside a block of size 1,024 alloc'd
at 0x402BE68: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
by 0x80484E4: main (main.cpp:18)
Use of uninitialised value of size 4
at 0x416E0DB: _itoa_word (_itoa.c:195)
by 0x417221A: vfprintf (vfprintf.c:1629)
by 0x4178B2E: printf (printf.c:35)
by 0x41454D2: (below main) (libc-start.c:226)
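Whether memory from malloc is released with delete or delete[], or memory from new[] is carelessly released with delete, you get a Mismatched free() / delete / delete [] report, and the body of the report is essentially the same.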
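A common way to keep each allocation paired with the right release is to let a smart pointer carry the deleter. This is a sketch under the assumption that C++11 is available; FreeDeleter, MallocBuffer and the two factory functions are hypothetical names introduced here.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// Deleter that pairs malloc with free.
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};

using MallocBuffer = std::unique_ptr<char, FreeDeleter>;

// malloc'd memory, released with free when the owner goes out of scope.
MallocBuffer makeMallocBuffer(std::size_t size) {
    return MallocBuffer(static_cast<char*>(std::malloc(size)));
}

// new[]'d memory; unique_ptr<char[]> releases it with delete[].
// The trailing () zero-initialises the array.
std::unique_ptr<char[]> makeArrayBuffer(std::size_t size) {
    return std::unique_ptr<char[]>(new char[size]());
}
```

Since the release call is fixed by the type, there is no place left in the calling code to write delete[] on a malloc'd pointer by accident.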
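Using uninitialised values
In the example above, the int variable is used without ever being assigned, which produces the Use of uninitialised value of size 4 report. Problems like this are usually not fatal, but they should still be eliminated.
One interesting detail: the last frame of the stack shows (below main) for the first time. It indicates code that runs outside the main function, and not in another thread either (as far as I can tell, it is the label valgrind gives to the C library startup code that calls main). I could not fully explain this at first, so I ran the following test: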
Static construction and destruction
class GlobalClass
{
public:
GlobalClass()
{
char* buf = (char*)malloc(10);
*(int*)(buf+8) = 100;
free(buf);
}
~GlobalClass()
{
char* buf = (char*)malloc(10);
*(int*)(buf+8) = 100;
free(buf);
}
void fake(){}
} g_globalClass;
int main(int argc, char *argv[])
{
g_globalClass.fake();
}
Invalid write of size 4
at 0x804857B: GlobalClass::GlobalClass() (main.cpp:21)
by 0x804850F: __static_initialization_and_destruction_0(int, int) (main.cpp:31)
by 0x8048551: _GLOBAL__sub_I_g_globalClass (main.cpp:55)
by 0x8048631: __libc_csu_init (in /home/jinzeyu/codelocal/build-mcsample-Desktop_Qt_5_3_GCC_32bit-Debug/mcsample)
by 0x4060469: (below main) (libc-start.c:185)
Address 0x41f2030 is 8 bytes inside a block of size 10 alloc'd
at 0x402BE68: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
by 0x8048571: GlobalClass::GlobalClass() (main.cpp:20)
by 0x804850F: __static_initialization_and_destruction_0(int, int) (main.cpp:31)
by 0x8048551: _GLOBAL__sub_I_g_globalClass (main.cpp:55)
by 0x8048631: __libc_csu_init (in /home/jinzeyu/codelocal/build-mcsample-Desktop_Qt_5_3_GCC_32bit-Debug/mcsample)
by 0x4060469: (below main) (libc-start.c:185)
Invalid write of size 4
at 0x80485B9: GlobalClass::~GlobalClass() (main.cpp:27)
by 0x4079B80: __run_exit_handlers (exit.c:78)
by 0x4079C0C: exit (exit.c:100)
by 0x40604DA: (below main) (libc-start.c:258)
Address 0x41f2070 is 8 bytes inside a block of size 10 alloc'd
at 0x402BE68: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
by 0x80485AF: GlobalClass::~GlobalClass() (main.cpp:26)
by 0x4079B80: __run_exit_handlers (exit.c:78)
by 0x4079C0C: exit (exit.c:100)
by 0x40604DA: (below main) (libc-start.c:258)
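Construction and destruction of the static object both happen outside main, which is why (below main) appears in both stacks; the function names in the stacks confirm the two phases nicely. This reminds me of another issue: the order in which static objects are constructed is not necessarily what you expect, so I strongly recommend that static objects not depend on one another.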
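When a dependency between static objects cannot be avoided, one widely used workaround is the construct-on-first-use idiom: the shared object becomes a function-local static, which is created the first time the function is called. The sketch below uses hypothetical names (registry, Plugin) and is only one possible arrangement.

```cpp
#include <string>
#include <vector>

// Function-local static: constructed on the first call to registry(),
// so any caller, even another global's constructor, sees a live object.
std::vector<std::string>& registry() {
    static std::vector<std::string> instance;
    return instance;
}

// Hypothetical global type whose constructor depends on the registry.
struct Plugin {
    explicit Plugin(const std::string& name) { registry().push_back(name); }
};

// These run before main; they are safe because registry() builds its
// vector on demand instead of relying on static initialization order.
static Plugin g_pluginA("a");
static Plugin g_pluginB("b");
```

Under memcheck, the construction of these globals would still show up beneath (below main), but no constructor can observe a registry that has not been built yet.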
Crashes
If your program crashes while running under memcheck, the report still provides some useful information.
--16198-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--16198-- si_code=1; Faulting address: 0x74207972; sp: 0x6564ca5c
valgrind: the 'impossible' happened:
Killed by fatal signal
==16198== at 0x380C0AD4: (in /usr/lib/valgrind/memcheck-x86-linux)
==16198== by 0x380C12C5: (in /usr/lib/valgrind/memcheck-x86-linux)
==16198== by 0x38040A63: (in /usr/lib/valgrind/memcheck-x86-linux)
==16198== by 0x38040B36: (in /usr/lib/valgrind/memcheck-x86-linux)
==16198== by 0x3803EA4B: (in /usr/lib/valgrind/memcheck-x86-linux)
==16198== by 0x20202E78:
sched status:
running_tid=3
The report then lists, thread by thread, the stack and the scheduling state of each thread at the time of the crash.
Thread 1: status = VgTs_WaitSys
...
Thread 2: status = VgTs_WaitSys
...
Thread 3: status = VgTs_Runnable
==16198== at 0x402C9B4: operator new(unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==16198== by 0x437D7D3: std::string::_Rep::_S_create(unsigned int, unsigned int, std::allocator<char> const&) (in /usr/lib/i386-linux-gnu/libstdc++.so.6.0.16)
==16198== by 0x437FBB5: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (in /usr/lib/i386-linux-gnu/libstdc++.so.6.0.16)
==16198== by 0x82A76A3: DataChecker::handle_data_check_resp_msg(void*) (data_checker.c:55)
==16198== by 0x8144411: main_thread(void*) (main_thread.c:198)
==16198== by 0x82839CF: thread_manager_start_routine(void*) (thread_manager.c:72)
==16198== by 0x42D3D4B: start_thread (pthread_create.c:308)
==16198== by 0x450BFDD: clone (clone.S:130)
Thread 4: status = VgTs_WaitSys
...
The running thread is naturally the prime suspect, so we can extract its stack for further analysis.