| yu's profile表说偶无趣:)PhotosBlogLists | Help |
|
July 21 The story of "container_of()" and "likely() ,unlikely()"(1)container_of()
What exactly does the weird Linux kernel
To help explain pointer mess, let us go through the steps of what it is doing. In the
and is used in code as:
where Replacing the above code with the first macro expansion produces:
Then, the next level of expansion creates:
To simplify this, remember that So with the type of the
To show this using real numbers, say the i2c_driver structure looks like:
The location of the driver variable usually is 32 bytes into the structure, depending on the packing rules of the compiler at the time. So, if the Putting these values into the macro, it then is reduced to:
which is then evaluated at runtime by the processor as a simple subtraction on a pointer, which should be quite fast. In order to do this kind of pointer manipulation, the code has to know the type of pointer being passed to it. The driver core only passes in the type of driver structure registered with it, so this type of manipulation is safe. This also prevents other parts of the kernel from modifying the unique fields of the structure used to control the subsystem's code.
(2)likely() ,unlikely() What are they ?In Linux kernel code, one often find calls to likely() and unlikely(), in conditions, like : bvl = bvec_alloc(gfp_mask, nr_iovecs, &idx); if (unlikely(!bvl)) { mempool_free(bio, bio_pool); bio = NULL; goto out; } In fact, these functions are hints for the compiler that allows it to correctly optimize the branch, by knowing which is the likeliest one. The definitions of these macros, found in include/linux/compiler.h are the following : #define likely(x) __builtin_expect(!!(x), 1) #define unlikely(x) __builtin_expect(!!(x), 0) The GCC documentation explains the role of __builtin_expect() : -- Built-in Function: long __builtin_expect (long EXP, long C) You may use `__builtin_expect' to provide the compiler with branch prediction information. In general, you should prefer to use actual profile feedback for this (`-fprofile-arcs'), as programmers are notoriously bad at predicting how their programs actually perform. However, there are applications in which this data is hard to collect. The return value is the value of EXP, which should be an integral expression. The value of C must be a compile-time constant. The semantics of the built-in are that it is expected that EXP == C. For example: if (__builtin_expect (x, 0)) foo (); would indicate that we do not expect to call `foo', since we expect `x' to be zero. Since you are limited to integral expressions for EXP, you should use constructions such as if (__builtin_expect (ptr != NULL, 1)) error (); when testing pointer or floating-point values. How does it optimize things ?It optimizes things by ordering the generated assembly code correctly, to optimize the usage of the processor pipeline. To do so, they arrange the code so that the likeliest branch is executed without performing any jmp instruction (which has the bad effect of flushing the processor pipeline). To see how it works, let's compile the following simple C user space program with gcc -O2 : #define likely(x) __builtin_expect(!!(x), 1) #define unlikely(x) __builtin_expect(!!(x), 0) int main(char *argv[], int argc) { int a; /* Get the value from somewhere GCC can't optimize */ a = atoi (argv[1]); if (unlikely (a == 2)) a++; else a--; printf ("%d\n", a); return 0; } Now, disassemble the resulting binary using objdump -S (comments added by me) : 080483b0 <main>: // Prologue 80483b0: 55 push %ebp 80483b1: 89 e5 mov %esp,%ebp 80483b3: 50 push %eax 80483b4: 50 push %eax 80483b5: 83 e4 f0 and $0xfffffff0,%esp // Call atoi() 80483b8: 8b 45 08 mov 0x8(%ebp),%eax 80483bb: 83 ec 1c sub $0x1c,%esp 80483be: 8b 48 04 mov 0x4(%eax),%ecx 80483c1: 51 push %ecx 80483c2: e8 1d ff ff ff call 80482e4 <atoi@plt> 80483c7: 83 c4 10 add $0x10,%esp // Test the value 80483ca: 83 f8 02 cmp $0x2,%eax // -------------------------------------------------------- // If 'a' equal to 2 (which is unlikely), then jump, // otherwise continue directly, without jump, so that it // doesn't flush the pipeline. // -------------------------------------------------------- 80483cd: 74 12 je 80483e1 <main+0x31> 80483cf: 48 dec %eax // Call printf 80483d0: 52 push %edx 80483d1: 52 push %edx 80483d2: 50 push %eax 80483d3: 68 c8 84 04 08 push $0x80484c8 80483d8: e8 f7 fe ff ff call 80482d4 <printf@plt> // Return 0 and go out. 80483dd: 31 c0 xor %eax,%eax 80483df: c9 leave 80483e0: c3 ret Now, in the previous program, replace the unlikely() by a likely(), recompile it, and disassemble it again (again, comments added by me) : 080483b0 <main>: // Prologue 80483b0: 55 push %ebp 80483b1: 89 e5 mov %esp,%ebp 80483b3: 50 push %eax 80483b4: 50 push %eax 80483b5: 83 e4 f0 and $0xfffffff0,%esp // Call atoi() 80483b8: 8b 45 08 mov 0x8(%ebp),%eax 80483bb: 83 ec 1c sub $0x1c,%esp 80483be: 8b 48 04 mov 0x4(%eax),%ecx 80483c1: 51 push %ecx 80483c2: e8 1d ff ff ff call 80482e4 <atoi@plt> 80483c7: 83 c4 10 add $0x10,%esp // -------------------------------------------------- // If 'a' equal 2 (which is likely), we will continue // without branching, so without flusing the pipeline. The // jump only occurs when a != 2, which is unlikely. // --------------------------------------------------- 80483ca: 83 f8 02 cmp $0x2,%eax 80483cd: 75 13 jne 80483e2 <main+0x32> // Here the a++ incrementation has been optimized by gcc 80483cf: b0 03 mov $0x3,%al // Call printf() 80483d1: 52 push %edx 80483d2: 52 push %edx 80483d3: 50 push %eax 80483d4: 68 c8 84 04 08 push $0x80484c8 80483d9: e8 f6 fe ff ff call 80482d4 <printf@plt> // Return 0 and go out. 80483de: 31 c0 xor %eax,%eax 80483e0: c9 leave 80483e1: c3 ret How should I use it ?You should use it only in cases when the likeliest branch is very very very likely, or when the unlikeliest branch is very very very unlikely. July 15 7-15放几篇好玩的文章地址吧:) http://lwn.net/Articles/187336/ http://lwn.net/Articles/188056/ http://lwn.net/Articles/188059/ 第一篇是说ext4的,后2篇是一个系列,承认第3篇我看的时候比较晕 http://www.ntu.edu.sg/home/ascpfu/veno/veno.html http://www-ece.rice.edu/networks/TCP-LP/ 关于tcp的小东西,已经merge进入2.6.18 http://developer.osdl.org/dev/robustmutexes/ 看标题就知道是什么的吧,MS这个现在还只是patch,不过对于高压力多线程服务器程序还是很有帮助 http://sourceforge.net/mailarchive/forum.php?thread_id=23836054&forum_id=2697 woo,一个完美的GPL的NTFS实现,由于采用了FUSE架构,这个patch无需进入内核,不过让人ft的是,它为优化过的效率竟然是这样的... > Version 1.03 ------Sequential Create------ --------Random Create-------- > -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete-- > files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP > reiserfs 16k 21459 99 +++++ +++ 17856 96 20172 98 +++++ +++ 16414 96 > jfs 16k 7015 13 +++++ +++ 5868 10 3068 14 +++++ +++ 1075 3 > ntfs-3g 16k 3021 99 14291 99 5226 99 3548 99 16149 99 5223 99 > xfs 16k 2401 17 +++++ +++ 2095 15 2301 20 +++++ +++ 347 2 > ext3 16k 1862 96 +++++ +++ +++++ +++ 1914 96 +++++ +++ 9695 98 > minix 16k 1450 97 +++++ +++ 18148 94 1694 97 +++++ +++ 4847 98 > fat32 16k 366 97 +++++ +++ 1809 97 428 97 +++++ +++ 1361 97 > paragon ntfs 16k 58 98 1259 99 245 99 55 99 +++++ +++ 832 99 > captive ntfs Always crashes early on file creations and loses files. > This was also confirmed and fix denied by its developer. |
|
|