Text content
1. Advanced Python Memory Management - xiaorui.cc
2. Memory layers, from object-specific allocators down to physical memory:

    Object-specific allocators
    _____   ______   ______       ________
   [ int ] [ dict ] [ list ] ... [ string ]       Python core         |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
    _______________________________       |                           |
   [   Python's object allocator   ]      |                           |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
    ______________________________________________________________    |
   [          Python's raw memory allocator (PyMem_ API)          ]   |
+1 | <----- Python memory (under PyMem manager's control) ------> |   |
    __________________________________________________________________
   [    Underlying general-purpose allocator (ex: C library malloc)   ]
 0 | <------ Virtual memory allocated for the python process -------> |

   =========================================================================
    _______________________________________________________________________
   [                OS-specific Virtual Memory Manager (VMM)               ]
-1 | <--- Kernel dynamic storage allocation & management (page-based) ---> |
    __________________________________   __________________________________
   [                                  ] [                                  ]
-2 | <-- Physical memory: ROM/RAM --> | | <-- Secondary storage (swap) --> |
3. Size classes for small requests:

    Request in bytes    Size of allocated block    Size class idx
    -------------------------------------------------------------
    1-8                 8                          0
    9-16                16                         1
    17-24               24                         2
    25-32               32                         3
    33-40               40                         4
    41-48               48                         5
    49-56               56                         6
    57-64               64                         7
    ...                 ...                        ...
    497-504             504                        62
    505-512             512                        63
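The mapping in the size-class table can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not CPython's allocator code; the names `ALIGNMENT`, `SMALL_REQUEST_THRESHOLD`, and `size_class` are chosen here for the example, with values taken from the table (8-byte steps, 512-byte upper limit).

```python
ALIGNMENT = 8                    # block sizes in the table grow in 8-byte steps
SMALL_REQUEST_THRESHOLD = 512    # largest request listed in the table


def size_class(nbytes):
    """Return (block_size, size_class_index) for a small request."""
    if not 1 <= nbytes <= SMALL_REQUEST_THRESHOLD:
        raise ValueError("request is outside the small-object range")
    idx = (nbytes - 1) // ALIGNMENT          # 1-8 -> 0, 9-16 -> 1, ...
    return (idx + 1) * ALIGNMENT, idx


for n in (1, 8, 9, 24, 505, 512):
    print(n, size_class(n))
```

Requests outside the 1-512 range raise `ValueError` in this sketch; in the layered diagram they would be passed on to the general-purpose allocator instead.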
4. Glossary: process, heap, Arenas, Pool, UsedPools, FreePools
5. Methods: POSIX malloc, Python memory pool, object buffer pool
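6. Arena (diagram). Process address space: stack, heap, bss, init data, text; malloc serves the heap and the pools. Arena -> UsedPools, indexed by size class (1-8, …, 249-256); each Pool holds Free Blocks and Used Blocks. Arena -> FreePools: Pool Headers whose Pools hold no Blocks.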
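7. UsedPools design (diagram). UsedPools is indexed by size class (1-8, 9-16, 17-24, …, 249-256); each entry leads to Pool Headers, and each Pool holds Free Blocks and Used Blocks. Allocation takes a Free Block; reclamation returns it as a Free Block. All Blocks in the same Pool have the same size. A single Pool is 4 KB. Blocks and Pools are both kept in singly linked lists.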
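The block handling described on this slide can be modelled with a toy pool. This is a simplified sketch, not CPython's implementation: `ToyPool` and its attributes are hypothetical names, blocks are represented by byte offsets, and the space taken by the pool header is ignored.

```python
POOL_SIZE = 4096               # 4 KB per pool, as stated on the slide


class ToyPool:
    """One pool serving a single block size; free blocks form a singly linked list."""

    def __init__(self, block_size):
        self.block_size = block_size
        count = POOL_SIZE // block_size      # simplification: no header space
        # next_free[offset] is the offset of the next free block, or None.
        self.next_free = {i * block_size: (i + 1) * block_size
                          for i in range(count - 1)}
        self.next_free[(count - 1) * block_size] = None
        self.free_head = 0

    def alloc(self):
        if self.free_head is None:
            raise MemoryError("pool is full")
        offset = self.free_head
        self.free_head = self.next_free.pop(offset)   # unlink the head block
        return offset

    def free(self, offset):
        # Push the block back onto the head of the free list.
        self.next_free[offset] = self.free_head
        self.free_head = offset


pool = ToyPool(16)
first = pool.alloc()
second = pool.alloc()
pool.free(first)
reused = pool.alloc()          # the most recently freed block is handed out again
```

Pushing and popping at the head keeps both operations constant-time, which is the usual reason for a singly linked free list.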
8. FreePools design (diagram). FreePools -> Pool Header (Pool, no Blocks) -> Pool Header (Pool, no Blocks) -> … Each Pool is 4 KB in size. The Pool's Header is cleared.
9. Where are variables stored? Run-time stack vs. heap: list [1, 2, 3], dict {"n": "1"}, int 1
10. Why?

    In [1]: a = 123
    In [2]: b = 123
    In [3]: a is b
    Out[3]: True
    In [4]: a = 1000
    In [5]: b = 1000
    In [6]: a is b
    Out[6]: False

    In [7]: a = 'n'
    In [8]: b = 'n'
    In [9]: a is b
    Out[9]: True
    In [10]: a = "python"
    In [11]: b = "python"
    In [12]: a is b
    Out[12]: True
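The integer half of this session can be reproduced in a script, with one caveat: literals that are compiled together may be merged by the compiler, so the sketch below builds the integers at run time with `int(...)`. It is written in Python 3 syntax and the results are CPython implementation details, not language guarantees; a value far outside the cached range is used for the "large" case.

```python
# Build the integers at run time so the compiler cannot merge equal constants.
small_a, small_b = int("123"), int("123")
large_a, large_b = int("1000000"), int("1000000")

same_small = small_a is small_b   # CPython: both names refer to the cached 123
same_large = large_a is large_b   # CPython: two separate int objects

print(same_small, same_large)
print(large_a == large_b)         # == compares values, `is` compares identity
```

This is why `is` should only be used for identity checks such as `x is None`, and `==` for comparing numbers or strings.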
11. Why? Are there only references?

    In [10]: a = b = 'nima'
    In [11]: b = a
    In [12]: a is b
    Out[12]: True
    In [13]: b = 'hehe'
    In [14]: a is b
    Out[14]: False

    In [1]: def go(var):
       ...:     print id(var)
       ...:
    In [2]: id(a)
    Out[2]: 4401335072
    In [3]: go(a)
    4401335072
12. How are Python objects stored in memory? Python has names, not variables!!! (diagram: names -> object)
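A minimal sketch of the "names, not variables" idea, in Python 3 syntax: a function parameter is just another name bound to the caller's object, and rebinding one name leaves the other untouched. The variable names `same_inside`, `shared_before`, and `shared_after` are introduced here for illustration.

```python
def go(var):
    # The parameter is another name bound to the caller's object.
    return id(var)


a = b = "nima"
same_inside = go(a) == id(a)   # same object inside and outside the function
shared_before = a is b         # both names are bound to one object

b = "hehe"                     # rebinding b does not touch the object a refers to
shared_after = a is b

print(same_inside, shared_before, shared_after)
```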
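13. Integer object pool. Small integers (-5, -4, …, 0, …, 256) are created when the interpreter initializes, so var_1 and var_2 holding the same small value share the same address! Large integers (257, …) are allocated on demand, so var_3 and var_4 holding the same value do not share the same address! 28 bytes per int object.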
14. Integer object pool. Block List: PyIntBlock -> PyIntBlock. Free List: PyIntBlock -> PyIntBlock. This memory is never returned to the Arena or the OS!!!
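15. Character object pool. Single characters (a, b, c, d, …) are pooled, so var_1 and var_2 holding the same character share the same address! A single character takes 38 bytes. Initialized by the interpreter.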
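16. String object pool (diagram). A hash table with slots 0, 1, 2, 3, … holding strings such as aa, en, cao, oh, woyao, buyao, kuai, feile; var_1 and var_2 reference the same entry. Strings are stored by hash, equal strings share one address, and the reference count is recorded.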
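The shared-address behaviour of the string pool can be observed explicitly with `sys.intern`. The sketch below uses Python 3 syntax (Python 2 exposed the same feature as the built-in `intern`); the strings are built at run time so that they start out as two distinct objects, which is CPython behaviour rather than a language guarantee.

```python
import sys

# Build equal strings at run time so they start as distinct objects.
parts = ["wo", "yao"]
s1 = "".join(parts)
s2 = "".join(parts)
distinct_before = s1 is s2     # CPython: False, two separate str objects

i1 = sys.intern(s1)            # enter the string into the interned table
i2 = sys.intern(s2)            # an equal string maps to the same entry
shared_after = i1 is i2

print(distinct_before, shared_after)
```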
17. PyObject_GC_TRACK. func: PyList_New (the new list is linked into the tracked list: PyGC_Head -> Node -> Node). func: list_dealloc (the list is unlinked again). ref: https://svn.python.org/projects/python/trunk/Objects/listobject.c
18. Reference count of the object 300:

    x = 300        ref += 1
    y = x          ref += 1
    z = [x, y]     ref += 2

    References -> 4!
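The counting on this slide can be checked with `sys.getrefcount`. The sketch below (Python 3 syntax) uses a fresh `object()` instead of the literal 300, an assumption made so that no hidden references from compiled constants distort the numbers, and it reports the change in the count rather than the absolute value, because `getrefcount` itself holds a temporary reference.

```python
import sys

x = object()                   # a fresh object with no hidden references
base = sys.getrefcount(x)      # includes the call's own temporary reference

y = x                          # +1: a second name
z = [x, y]                     # +2: one reference per list slot
after = sys.getrefcount(x)

print(after - base)            # number of references added since `base`
```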
19. What does del do?

    x = 300
    y = x
    del x          ref -= 1

    The del statement doesn't delete objects:
    • it removes that name as a reference to that object
    • it reduces the ref count by 1

    References -> 1! (only y still refers to 300)
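A small check of the two bullet points, in Python 3 syntax and again with a fresh `object()` standing in for 300 (an assumption for clean counts): after `del x` the name is gone, the count drops by one, and the object is still reachable through `y`.

```python
import sys

x = object()
y = x
before = sys.getrefcount(y)

del x                          # unbinds the name x; the object lives on via y
after = sys.getrefcount(y)

try:
    x
    name_still_bound = True
except NameError:              # also covers UnboundLocalError inside functions
    name_still_bound = False

print(before - after, name_still_bound)
```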
20. Reference count cases:

    def go():
        w = 300        # ref count +1
    go()               # w is out of scope; ref count -1
    a = "fuc . "
    del a              # del a; ref count -1
    b = "en, a"
    b = None           # reassignment; ref count -1
21. Cyclic reference

    class Node:
        def __init__(self, va):
            self.va = va
        def next(self, next):
            self.next = next

    mid = Node('root')
    left = Node('left')
    right = Node('right')
    mid.next(left)
    left.next(right)
    right.next(left)

    Diagram: mid -> left, and left <-> right refer to each other. If the mid node is deleted, how are left and right reclaimed?
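The question on this slide can be answered experimentally. The sketch below is written in Python 3 syntax and simplifies the slide's class: the attribute name `link` is hypothetical and replaces the `next` method. A weak reference serves as a probe to see whether the object is still alive; automatic collection is switched off during the check so that only the explicit `gc.collect()` call can free the cycle.

```python
import gc
import weakref


class Node:
    def __init__(self, value):
        self.value = value
        self.link = None       # hypothetical attribute, stands in for next()


left = Node("left")
right = Node("right")
left.link = right
right.link = left              # left <-> right now form a cycle

probe = weakref.ref(left)      # observes the object without keeping it alive
gc.disable()                   # keep automatic collection out of the check
del left, right                # the cycle keeps both reference counts above zero
alive_after_del = probe() is not None

collected = gc.collect()       # the cycle detector finds and frees the pair
alive_after_collect = probe() is not None
gc.enable()

print(alive_after_del, alive_after_collect)
```

Reference counting alone never frees the pair; the cycle detector does.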
22. Mark & sweep (diagram: the GC root and an object graph with nodes a, b, c, w, R, K, G)
23. Generational collection (diagram). PyGC_Head lists: Young -> node -> node -> node; Old -> node -> node -> node -> node; Permanent -> node -> node -> … Divide and conquer; better efficiency; based on object lifetime; trading space for time.
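24. When does GC run?

    import gc
    gc.set_threshold(700, 10, 5)

    What do the counters 700, 10, and 5 mean? The PyMem API allocation counter drives it: generation 0 is collected when the counter exceeds 700; generation 1 is collected on every 10th generation 0 collection (N % 10); generation 2 is collected on every 5th generation 1 collection (N % 5).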
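The thresholds and the live counters can be inspected from Python with the `gc` module. A minimal sketch in Python 3 syntax follows; it restores the previous settings at the end. How the third value is used or reported may differ between CPython versions, so the sketch prints it without relying on it.

```python
import gc

original = gc.get_threshold()      # tuple of three integers
gc.set_threshold(700, 10, 5)       # the values from the slide
current = gc.get_threshold()
counts = gc.get_count()            # current counters for generations 0, 1, 2

print(current, counts)
gc.set_threshold(*original)        # put the interpreter's settings back
```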
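25. Summary

    allocate memory
    -> the threshold is found to be exceeded
    -> garbage collection is triggered
    -> all collectable object lists are merged together
    -> traverse them and compute the effective reference counts
    -> split into two sets: effective reference count = 0 and effective reference count > 0
    -> objects with a count > 0 are moved into the older generation
    -> objects with a count = 0 are reclaimed
    -> reclamation walks the elements of each container and decrements their reference counts (breaking the cycles)
    -> the normal decrement logic runs; when an object's reference count reaches 0, memory reclamation is triggered
    -> Python's low-level memory management takes the memory back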
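The "effective reference count" step of this summary can be illustrated with a toy model. This is a simplified sketch, not CPython's collector: `Obj`, `link`, and `find_unreachable` are hypothetical names, and reference counts are plain integers maintained by hand. It also spells out one detail the summary compresses: an object whose effective count drops to 0 still survives if it is reachable from an object whose count stays above 0.

```python
class Obj:
    """Toy container object: a name, a reference count, outgoing references."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.refs = []


def link(src, dst):
    src.refs.append(dst)
    dst.refcount += 1


def find_unreachable(tracked):
    # Step 1: copy each real reference count into a working counter.
    gc_refs = {o: o.refcount for o in tracked}
    # Step 2: subtract references that come from other tracked objects.
    for o in tracked:
        for target in o.refs:
            if target in gc_refs:
                gc_refs[target] -= 1
    # Step 3: a counter still above zero means an external reference exists;
    # everything reachable from such an object must survive as well.
    stack = [o for o in tracked if gc_refs[o] > 0]
    reachable = set(stack)
    while stack:
        for target in stack.pop().refs:
            if target in gc_refs and target not in reachable:
                reachable.add(target)
                stack.append(target)
    return [o for o in tracked if o not in reachable]


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
link(a, b)
link(b, a)                     # a <-> b: an isolated cycle
c.refcount += 1                # c is held by a name outside the tracked set
link(c, d)                     # d is only referenced by c, so it survives too
garbage = [o.name for o in find_unreachable([a, b, c, d])]
print(garbage)
```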
26. weakref (weak references)

    import weakref

    class Expensive(object):
        def __del__(self):
            print '(Deleting %s)' % self

    obj = Expensive()
    r = weakref.ref(obj)
    del obj
    print 'r():', r()

    A weak reference does not take part in reference counting, which makes it a way to resolve circular references:

    class Parent(object):
        def __init__(self):
            self.children = [Child(self)]

    class Child(object):
        def __init__(self, parent):
            self.parent = weakref.proxy(parent)
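The Parent/Child pattern from this slide can be checked directly. The sketch below restates the two classes in Python 3 syntax and disables the cycle collector during the check, so a freed parent shows that reference counting alone was enough. The immediate release after `del` is CPython behaviour; `probe`, `same_children`, and `freed_without_gc` are names introduced for the example.

```python
import gc
import weakref


class Child:
    def __init__(self, parent):
        self.parent = weakref.proxy(parent)   # weak back-reference, no cycle


class Parent:
    def __init__(self):
        self.children = [Child(self)]


gc.disable()                   # rule out the cycle collector for this check
parent = Parent()
probe = weakref.ref(parent)
# The proxy forwards attribute access to the real parent object.
same_children = parent.children[0].parent.children is parent.children

del parent                     # CPython: count hits zero, freed right away
freed_without_gc = probe() is None
gc.enable()

print(same_children, freed_without_gc)
```

Had `Child` stored `parent` directly, the two objects would form a cycle and only the cycle collector could free them.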
27. Mutable vs. immutable objects. Mutable: list, dict. Immutable: string, int, tuple.
28. Container objects: a = [10, 10, 11]; b = a. Diagram: one PyListObject (type list, rc 1, items, size, …) whose items reference a PyObject (type integer, rc 2, value 10) twice and a PyObject (type integer, rc 1, value 11) once.
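What the diagram implies can be verified in a few lines of Python 3: `b = a` creates a second name for the same list rather than a second list, and both `10` slots point at one integer object. The flag names are introduced for illustration.

```python
a = [10, 10, 11]
b = a                          # a second name for the same list object

same_list = b is a
same_element = a[0] is a[1]    # both slots reference one int object (10)

b.append(12)                   # a mutation through b is visible through a
print(same_list, same_element, a)
```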
29. copy.copy: a = [10, 10, [10, 11]]; b = copy.copy(a). Diagram: two outer PyListObjects (type list, rc 1, items, size, …), one for a and one for b. The items of both reference the same PyObject (type integer, rc 2, value 10) and the same inner PyListObject [10, 11], which in turn references the PyObject (type integer, rc 1, value 11).
30. copy.deepcopy: a = [10, [10, 11]]; b = copy.deepcopy(a). Diagram: a and b each have their own outer PyListObject (type list, rc 1, items, size, …) and their own inner PyListObject [10, 11]; the integer objects shown are PyObject (type integer, rc 2, value 10) and PyObject (type integer, rc 1, value 11).
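The difference between the two copy diagrams shows up as soon as the nested list is mutated. A minimal Python 3 sketch, with `shallow`, `deep`, and the two flags as names introduced for the example:

```python
import copy

a = [10, [10, 11]]
shallow = copy.copy(a)         # new outer list, same inner list
deep = copy.deepcopy(a)        # new outer list and new inner list

shallow_shares_inner = shallow[1] is a[1]
deep_shares_inner = deep[1] is a[1]

a[1].append(12)                # mutate the nested list through the original
print(shallow[1], deep[1])
```

The shallow copy sees the appended element because it shares the inner list; the deep copy does not.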
31. DIY gc

    import gc
    import sys

    gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_LEAK)
    a = []
    b = []
    a.append(b)
    print 'a refcount:', sys.getrefcount(a)  # 2
    print 'b refcount:', sys.getrefcount(b)  # 3
    del a
    del b
    print gc.collect()  # 0
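32. Garbage collector optimization. Memory bound: lower the threshold, trading time for space. CPU bound: raise the threshold, trading space for time. Alternatively, pause the gc and introduce a master-worker design.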
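"Pausing the gc" can be packaged as a small context manager. This is one possible sketch in Python 3 syntax, not a recommendation from the slides: `gc_paused` is a hypothetical helper built on `gc.disable()`, `gc.enable()`, and `gc.isenabled()`, and the list comprehension only stands in for allocation-heavy work.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable automatic collection inside the block, then restore it."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()


gc.enable()                    # start from a known state for the demonstration
with gc_paused():
    inside = gc.isenabled()
    data = [[i] for i in range(10000)]   # allocation-heavy work, no GC runs
after = gc.isenabled()

print(inside, after)
```

Reference counting keeps working while the collector is paused, so only cyclic garbage accumulates during the block.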
33. Q&A. How do reference counting and the GIL affect each other? Is gc atomic? Does gc stop the world? …
34. “ END – xiaorui.cc ”