Behind new and delete in C++

I wrote an earlier post on this topic, "Understanding the behavior behind new in C++", but it stayed at a general level and offered no hard evidence. This time we look at the generated assembly to see what the C++ compiler actually does behind the scenes. The test code is simple:

    #include <iostream>
    #include <tchar.h>

    class A
    {
    public:
        virtual void print() { std::cout << 10; }
        virtual ~A() { std::cout << "~A()"; }
    };

    class B : public A
    {
    public:
        virtual void print() { std::cout << 100; }
    };

    int _tmain(int argc, _TCHAR* argv[])
    {
        A* p = new B();
        p->print();
        delete p;
        return 0;
    }

In WinDbg, the assembly generated for the main function looks like this:

    NewTest!wmain:
    00aa1020 56              push    esi
    00aa1021 6a04            push    4
    00aa1023 e8b4030000      call    NewTest!operator new (00aa13dc)   // call operator new to allocate 4 bytes
    00aa1028 83c404          add     esp,4
    00aa102b 85c0            test    eax,eax
    00aa102d 740a            je      NewTest!wmain+0x19 (00aa1039)
    00aa102f c7005421aa00    mov     dword ptr [eax],offset NewTest!B::`vftable' (00aa2154)   // write the vtable address into the first 4 bytes of the object (the vptr)
    00aa1035 8bf0            mov     esi,eax
    00aa1037 eb02            jmp     NewTest!wmain+0x1b (00aa103b)
    00aa1039 33f6            xor     esi,esi
    00aa103b 8b06            mov     eax,dword ptr [esi]
    00aa103d 8b10            mov     edx,dword ptr [eax]
    00aa103f 8bce            mov     ecx,esi
    00aa1041 ffd2            call    edx   // call the first function in the vtable: print
    00aa1043 8b06            mov     eax,dword ptr [esi]
    00aa1045 8b5004          mov     edx,dword ptr [eax+4]
    00aa1048 6a01            push    1
    00aa104a 8bce            mov     ecx,esi
    00aa104c ffd2            call    edx   // call the second function in the vtable (the destructor)
    00aa104e 33c0            xor     eax,eax
    00aa1050 5e              pop     esi
    00aa1051 c3              ret
    00aa1052 cc              int     3

The B object we construct is only 4 bytes in total, and those 4 bytes hold the object's vtable pointer. For more on C++ object memory layout, see my post [探索C++对象模型](http://www.cppblog.com/weiym/archive/2012/09/21/191543.html) (Exploring the C++ Object Model). The listing also confirms that polymorphism is implemented through the vtable: both print and the destructor are called indirectly through it.

The same code hints at why calling a virtual function inside a constructor does not give polymorphic behavior: the vptr is set to the final class's vtable only by the most derived class's constructor. While a base-class constructor runs, the vptr does not yet point to the derived class's vtable.

Next, let us see what operator new does:

    0:000> u 00aa13dc
    NewTest!operator new:
    00aa13dc ff25cc20aa00    jmp     dword ptr [NewTest!_imp_??2YAPAXIZ (00aa20cc)]

It is just a direct jump through the import table:

    0:000> u poi(00aa20cc) L10
    MSVCR90!operator new:
    74603e99 8bff            mov     edi,edi
    74603e9b 55              push    ebp
    74603e9c 8bec            mov     ebp,esp
    74603e9e 83ec0c          sub     esp,0Ch
    74603ea1 eb0d            jmp     MSVCR90!operator new+0x17 (74603eb0)
    74603ea3 ff7508          push    dword ptr [ebp+8]
    74603ea6 e859dcfbff      call    MSVCR90!_callnewh (745c1b04)
    74603eab 59              pop     ecx
    74603eac 85c0            test    eax,eax
    74603eae 740f            je      MSVCR90!operator new+0x26 (74603ebf)
    74603eb0 ff7508          push    dword ptr [ebp+8]
    74603eb3 e887feffff      call    MSVCR90!malloc (74603d3f)
    74603eb8 59              pop     ecx
    74603eb9 85c0            test    eax,eax
    74603ebb 74e6            je      MSVCR90!operator new+0xa (74603ea3)
    74603ebd c9              leave

operator new ends up calling malloc; if malloc returns null, it calls _callnewh (the new handler) and retries when the handler reports success. Digging further, malloc calls Kernel32!HeapAlloc, which in turn calls ntdll!RtlAllocateHeap. For the heap layout and allocation algorithm, see Zhang Yinkui's [软件调试](http://book.douban.com/subject/3088353/) (Software Debugging).

This confirms what the new operator does behind the scenes:

(1) It first calls operator new to allocate space. We can overload operator new to plug in our own allocation algorithm.
(2) It then calls the constructor on the allocated space to create the object. The constructor may assign the vtable pointer.

Now for delete. We saw that delete calls the second function in the vtable, so let us look at the vtable first:

    0:000> dps 00aa2154
    00aa2154 00aa1010 NewTest!B::print [f:\test\newtest\newtest\newtest.cpp @ 26]
    00aa2158 00aa1060 NewTest!B::`scalar deleting destructor'
    00aa215c 00000000
    00aa2160 00000048
    00aa2164 00000000

The vtable holds two functions: print and a destructor. Here is the second one:

    0:000> u 00aa1060 L10
    NewTest!B::`scalar deleting destructor':
    00aa1060 56              push    esi
    00aa1061 8bf1            mov     esi,ecx
    00aa1063 c7064821aa00    mov     dword ptr [esi],offset NewTest!A::`vftable' (00aa2148)
    00aa1069 a15820aa00      mov     eax,dword ptr [NewTest!_imp?coutstd (00aa2058)]
    00aa106e 50              push    eax
    00aa106f e84c010000      call    NewTest!std::operator<< > (00aa11c0)
    00aa1074 83c404          add     esp,4
    00aa1077 f644240801      test    byte ptr [esp+8],1
    00aa107c 7409            je      NewTest!B::`scalar deleting destructor'+0x27 (00aa1087)
    00aa107e 56              push    esi
    00aa107f e806030000      call    NewTest!operator delete (00aa138a)
    00aa1084 83c404          add     esp,4
    00aa1087 8bc6            mov     eax,esi
    00aa1089 5e              pop     esi
    00aa108a c20400          ret     4

The vtable entry is B's scalar deleting destructor, which contains two parts: the code of the destructor we actually wrote, and a call to operator delete, which is made only when bit 0 of the flag argument is set (wmain pushed 1). operator delete in turn calls free, and free calls kernel32!HeapFree. This scalar deleting destructor is clearly not B's destructor ~B() itself. It is a function the compiler generates for us, used when deleting an object of type B.

Next, let us see how the compiler handles a pointer to an array. Change the code to:

    int _tmain(int argc, _TCHAR* argv[])
    {
        A* p = new A[10];
        delete[] p;
        return 0;
    }

The generated assembly:

    NewTest!wmain:
    01181030 6a2c            push    2Ch
    01181032 e8c4030000      call    NewTest!operator new[] (011813fb)   // allocate 44 bytes through operator new[]
    01181037 83c404          add     esp,4
    0118103a 85c0            test    eax,eax
    0118103c 7444            je      NewTest!wmain+0x52 (01181082)
    0118103e 56              push    esi
    0118103f 6810101801      push    offset NewTest!A::~A (01181010)   // A's destructor
    01181044 6800111801      push    offset NewTest!A::A (01181100)   // A's constructor
    01181049 6a0a            push    0Ah   // 10
    0118104b 8d7004          lea     esi,[eax+4]   // skip the first 4 bytes
    0118104e 6a04            push    4   // object size
    01181050 56              push    esi   // esi holds the start address of the object array (past the first 4 bytes)
    01181051 c7000a000000    mov     dword ptr [eax],0Ah   // write the object count (10) into the first 4 bytes
    01181057 e812040000      call    NewTest!`eh vector constructor iterator' (0118146e)
    0118105c 85f6            test    esi,esi
    0118105e 7421            je      NewTest!wmain+0x51 (01181081)
    01181060 837efc00        cmp     dword ptr [esi-4],0   // is the object count 0?
    01181064 8d46fc          lea     eax,[esi-4]   // save the address that holds the object count in eax
    01181067 740f            je      NewTest!wmain+0x48 (01181078)
    01181069 8b06            mov     eax,dword ptr [esi]   // load A's vtable address
    0118106b 8b5004          mov     edx,dword ptr [eax+4]   // second function in the vtable
    0118106e 6a03            push    3
    01181070 8bce            mov     ecx,esi
    01181072 ffd2            call    edx
    01181074 5e              pop     esi
    01181075 33c0            xor     eax,eax
    01181077 c3              ret

Focus on the instructions from 0118103f through the call at 01181057 (highlighted in red in the original post). When we new an array, the compiler does the following for us:

(1) It calls the array form operator new[] to allocate memory of size 4 + sizeof(object) * count (here 4 + 4 * 10 = 44 = 2Ch), where the first 4 bytes store the object count.
(2) It calls NewTest!`eh vector constructor iterator'(pArrayAddress, sizeof(object), object_count, pFunConstructor, pFunDestructor), where pArrayAddress is the start address, sizeof(object) is the object size, object_count is the number of objects, pFunConstructor is the constructor, and pFunDestructor is the destructor.

Now disassemble NewTest!`eh vector constructor iterator':

    0:000> u 0118146e L50
    NewTest!`eh vector constructor iterator':
    0118146e 6a10            push    10h
    01181470 6890221801      push    offset NewTest!__rtc_tzz+0x8 (01182290)
    01181475 e8d2040000      call    NewTest!__SEH_prolog4 (0118194c)
    0118147a 33c0            xor     eax,eax
    0118147c 8945e0          mov     dword ptr [ebp-20h],eax
    0118147f 8945fc          mov     dword ptr [ebp-4],eax
    01181482 8945e4          mov     dword ptr [ebp-1Ch],eax
    01181485 8b45e4          mov     eax,dword ptr [ebp-1Ch]   // loop counter, initially 0
    01181488 3b4510          cmp     eax,dword ptr [ebp+10h]   // compare the counter with the object count
    0118148b 7d13            jge     NewTest!`eh vector constructor iterator'+0x32 (011814a0)   // leave the loop once the counter is greater than or equal to the count
    0118148d 8b7508          mov     esi,dword ptr [ebp+8]   // load the first argument (current address) into esi
    01181490 8bce            mov     ecx,esi   // pass it as the this pointer in ecx
    01181492 ff5514          call    dword ptr [ebp+14h]   // call the constructor
    01181495 03750c          add     esi,dword ptr [ebp+0Ch]   // advance the pointer by the object size
    01181498 897508          mov     dword ptr [ebp+8],esi   // store the next object's address back into the first argument
    0118149b ff45e4          inc     dword ptr [ebp-1Ch]   // increment the counter
    0118149e ebe5            jmp     NewTest!`eh vector constructor iterator'+0x17 (01181485)
    011814a0 c745e001000000  mov     dword ptr [ebp-20h],1
    011814a7 c745fcfeffffff  mov     dword ptr [ebp-4],0FFFFFFFEh
    011814ae e808000000      call    NewTest!`eh vector constructor iterator'+0x4d (011814bb)
    011814b3 e8d9040000      call    NewTest!__SEH_epilog4 (01181991)
    011814b8 c21400          ret     14h

NewTest!`eh vector constructor iterator' is a compiler-generated function whose job is to call the constructor for every object in the array.

Next, what does the array form delete[] do behind the scenes? Focus on this part of wmain (highlighted in purple in the original post):

    NewTest!wmain:
    ....
    01181060 837efc00        cmp     dword ptr [esi-4],0   // is the object count 0?
    01181064 8d46fc          lea     eax,[esi-4]   // save the address that holds the object count in eax
    01181067 740f            je      NewTest!wmain+0x48 (01181078)
    01181069 8b06            mov     eax,dword ptr [esi]   // load A's vtable address
    0118106b 8b5004          mov     edx,dword ptr [eax+4]   // second function in the vtable
    0118106e 6a03            push    3
    01181070 8bce            mov     ecx,esi
    01181072 ffd2            call    edx
    ....

It puts the start address of the object array in ecx, then calls the second function in the object's vtable with the argument 3. Here is the vtable:

    0:000> dps 01182148
    01182148 01181000 NewTest!A::print [f:\test\newtest\newtest\newtest.cpp @ 11]
    0118214c 01181090 NewTest!A::`vector deleting destructor'

And here is what that function does:

    0:000> u 01181090 L40
    NewTest!A::`vector deleting destructor':
    01181090 53              push    ebx
    01181091 8a5c2408        mov     bl,byte ptr [esp+8]
    01181095 56              push    esi
    01181096 8bf1            mov     esi,ecx
    01181098 f6c302          test    bl,2   // array form? if so, run the destructor for every element
    0118109b 742b            je      NewTest!A::`vector deleting destructor'+0x38 (011810c8)
    0118109d 8b46fc          mov     eax,dword ptr [esi-4]
    011810a0 57              push    edi
    011810a1 6810101801      push    offset NewTest!A::~A (01181010)
    011810a6 8d7efc          lea     edi,[esi-4]
    011810a9 50              push    eax
    011810aa 6a04            push    4
    011810ac 56              push    esi
    011810ad e87f040000      call    NewTest!`eh vector destructor iterator' (01181531)
    011810b2 f6c301          test    bl,1   // does the memory need to be freed?
    011810b5 7409            je      NewTest!A::`vector deleting destructor'+0x30 (011810c0)
    011810b7 57              push    edi
    011810b8 e85f030000      call    NewTest!operator delete[] (0118141c)
    011810bd 83c404          add     esp,4
    011810c0 8bc7            mov     eax,edi
    011810c2 5f              pop     edi
    011810c3 5e              pop     esi
    011810c4 5b              pop     ebx
    011810c5 c20400          ret     4

Internally it calls NewTest!`eh vector destructor iterator', passing the count read from [esi-4]. If you trace into that function, you will see it call the destructor for every object in the array. Finally operator delete[] is called with edi, that is esi-4, the real start of the allocation, to release all the memory.

The key to array new[] and delete[] is that the compiler stores the object count N in the 4 bytes just before the array's start address. new[] writes N there and constructs N objects; delete[] reads N back and destroys N objects.

One last caveat: the analysis above applies only to VS2008 (a 32-bit build here). Within the limits of the C++ standard, each C++ compiler has its own implementation.

As we have seen, the C++ compiler does a lot behind our backs: it may inline our functions, and it may modify or generate additional functions. This is something many C developers cannot tolerate, which is why many people working at the kernel level prefer C, to reduce what the compiler does behind the scenes.

A final question to think about: what happens if we write the code like this?

    int _tmain(int argc, _TCHAR* argv[])
    {
        A* p = new B[10];
        delete[] p;
        return 0;
    }

The answer is given elsewhere (see the original post for the link).

Original link: https://www.cnblogs.com/eagleknight/p/3437414.html

Original articles are protected by copyright. When reposting, please credit the source: https://www.ccppcoding.com/archives/113017
