Linux Kernel: the container_of trick (and Strict Aliasing)

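#include <stdio.h>
#include <stdint.h>

/* Distance in bytes (offset) from the start of a struct TYPE to its member MEMBER.
 * The #undef only guards against a standard header having defined offsetof already. */
#undef offsetof
#define offsetof(TYPE,MEMBER)  ((size_t)&((TYPE*)0)->MEMBER)

  /* @ptr     pointer to the member
  *  @type    type name of the enclosing struct
  *  @member  name of the member inside the struct
  */
#define container_of(ptr,type,member) ({      \
    const typeof( ((type*)0)->member ) *__mptr = (ptr);  \
    (type *)( (uint8_t *)__mptr - offsetof(type,member) );})
/* Line 1: declare a pointer to the member's type and initialize it with ptr, so a ptr of the wrong type draws a compiler diagnostic. */
/* Line 2: subtract the member's offset to get the address where the enclosing struct starts, then cast that address to a pointer to the struct. */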
struct MyStruct {
    int i_1;
    int i_2;
    int i_3;
    int i_4;
    int i_5;
    int i_6;
    int i_7;
    int i_8;
    int i_9;
    int i_a;
    int i_b;
    int i_c;
    int i_d;
    int i_e;
    int i_f;
    int i_g;
    int i_h;
    int i_j;
};

int main()
{

    struct MyStruct Test = { 0,1,2,3,4,5,6,7,8,9,11,12,13,14,15,16,17,18};
    printf("Just a test: Test.i_1: %d\n",Test.i_1);
    int *ptr = &(Test.i_2);
    printf("Now we got i_2: %d \n",*ptr);
    struct MyStruct *struct_ptr = container_of(ptr,struct MyStruct,i_2);   /* a function-like macro can take a type, not just a value, as an argument */
    printf("Now we got struct\n");
    printf("Test.i_3 is: %d \n",struct_ptr->i_3);

    return 0;
}

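Given a pointer to a member of a struct, container_of yields a pointer to the struct itself, and through it every other member of that struct.

With this technique, Linux does not need a dedicated doubly linked list of task_struct. It only links together one member embedded in each task_struct, and the effect is the same as a doubly linked list of task_struct.

What is the benefit? It gives you something like generics!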

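For example:

Without this technique:

To keep a linked list of task_struct, you would have to write a whole set of list operations for task_struct...

To keep a task_queue as well, you would have to write another set of operations...

And if a task_xxxx came along, it would be game over...

With this technique, things get much easier: the list operations are written once, for the embedded member, and every struct that embeds that member reuses them.

(task_struct is the struct the Linux kernel uses to describe a process/thread. It is kept in doubly linked lists because the kernel needs to get from a process to its children, parent, siblings, ...)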

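As you can see, the macro as written leans on GCC extensions: the typeof operator and the ({ ... }) statement expression. typeof is there only for the type check on ptr; the actual work is done by offsetof and pointer arithmetic.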

Links (cleaner code):

Definition of offsetof: http://lxr.free-electrons.com/source/include/linux/stddef.h

Definition of container_of: http://lxr.free-electrons.com/source/include/linux/kernel.h#L840

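What does the ((type*)0) used above mean?

The cast produces a pointer of type type* whose address is 0x00000000. After the cast, the compiler "believes" that an object of type type sits at that address (even though none does). Nothing is ever read through this pointer: the macros only take a member's address with & or name its type inside typeof, so address 0 is never accessed.

It is worth pointing out that a pointer only points at a single address; it is the pointer's type that determines how many bytes after that address "belong to it".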

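But! But! One more caveat: do not naively assume that you can point a pointer of any type t at an arbitrary piece of memory and then pretend that the memory holds an object of that type.

In the world of C/C++, there is a rule called Strict Aliasing.

The story begins like this:

Because C/C++ have pointers, before the Strict Aliasing rule appeared many people wrote code like this: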

uint32_t foo;
uint16_t *bar = (uint16_t *)&foo;  /* the cast makes it compile; the accesses below are still the problem */
bar[0] = 1;
bar[1] = 2;

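This looks natural enough. After all, it just splits one piece of memory into two halves and operates on each half...

But it causes a lot of problems:

First, it completely ignores the difference between big-endian and little-endian machines.

Second, it seriously gets in the way of compiler optimization. When optimizing, if the compiler sees that some value X does not appear to be modified, or is not referenced at all, during a stretch of code, it assumes X really has not changed. If X is already in a register, the next use reads the register instead of going back to memory. With code like the above, X is modified behind the compiler's back through a pointer of a different type, so the optimized result is wrong.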

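In ancient times many programmers did this. Compiler writers had a hard time: they worked hard on optimizations, the results came out wrong through no fault of their own, and they still took the blame.

So, to solve this problem, the C/C++ standards specify:

From ISO/IEC 9899:201x, 6.5 Expressions: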

7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

— a type compatible with the effective type of the object,

— a qualified version of a type compatible with the effective type of the object,

— a type that is the signed or unsigned type corresponding to the effective type of the object,

— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

— a character type.

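What does this mean? Suppose you have a variable of type long. If you access it through a pointer to int, the behavior is undefined, and the compiler guarantees nothing about the result of your code. Accessing it through a pointer to unsigned long or volatile unsigned long, on the other hand, is still legal. This is Strict Aliasing.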

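There is one exception: a pointer to a character type (char, signed char, or unsigned char) may point at anything and access it (see the last item in the standard's list).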

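There is another exception, concerning unions. It is part of the C99 standard but not of the C++ standard (though I believe most compilers still treat it as correct by default). I won't explain it; here is a wrong example and a correct one, so you can spot the difference yourself:

Wrong example (the result is undefined):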

#include <cstdint>
#include <iomanip>
#include <iostream>
using namespace std;

uint32_t
swaphalves(uint32_t a)
{
    uint32_t acopy=a;
    uint16_t *ptr=(uint16_t*)&acopy;// can't use static_cast<>, not legal.
                                    // you should be warned by that.
    uint16_t tmp=ptr[0];
    ptr[0]=ptr[1];
    ptr[1]=tmp;
    // acopy is a uint32_t accessed through a uint16_t pointer, so the compiler may
    // assume acopy was never modified and simply return its initial value
    // (see the assembly in link [1] below the post).
    return acopy;
}

int main()
{
    uint32_t a;
    a=32;
    cout << hex << setfill('0') << setw(8) << a << endl;
    a=swaphalves(a);
    cout << setw(8) << a << endl;
}

Change swaphalves as follows and it becomes correct:

uint32_t
swaphalves(uint32_t a)
{
    typedef union { 
        uint32_t as32bit; 
        uint16_t as16bit[2]; 
    } swapem;

    swapem s={a};
    uint16_t tmp;
    tmp=s.as16bit[0];
    s.as16bit[0]=s.as16bit[1];
    s.as16bit[1]=tmp;
    return s.as32bit;
}

Some links about Strict Aliasing:

[1]: http://dbp-consulting.com/tutorials/StrictAliasing.html (the best explanation, in my opinion)

[2]: http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html (frequently cited)

[3]: http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule (there has to be a Stack Overflow link...)

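One last thing, about the restrict keyword:

void foo(int * restrict i1, int * restrict i2);

With restrict, the programmer promises the compiler that i1 and i2 do not point to the same memory, so the compiler can optimize boldly. Of course, if the two really do point to the same memory, you bear the consequences.

(C++ has no restrict keyword, but many compilers offer extensions such as __restrict__ or __restrict.)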

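Link:

[1]: http://cellperformance.beyond3d.com/articles/2006/05/demystifying-the-restrict-keyword.html

(In fact, all of these things, including the volatile keyword from the previous post, owe their odd rationale and uses to compiler optimization. Have you noticed?)

:)

Original article: https://www.cnblogs.com/walkerlala/p/5451090.html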

