multiprocessing.shared_memory
--- Shared memory for direct access across processes
Source code: Lib/multiprocessing/shared_memory.py
Added in version 3.8.
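This module provides a class, SharedMemory, for the allocation and management of shared memory to be accessed by one or more processes on a multicore or symmetric multiprocessor (SMP) machine. To assist with the life-cycle management of shared memory, especially across distinct processes, a BaseManager subclass, SharedMemoryManager, is also provided in the multiprocessing.managers module.
In this module, shared memory refers to "POSIX style" shared memory blocks (though it is not necessarily implemented explicitly as such) and does not refer to "distributed shared memory". This style of shared memory permits distinct processes to potentially read and write to a common (or shared) region of volatile memory. Processes are conventionally limited to their own process memory space, but shared memory permits the sharing of data between processes, avoiding the need to send messages between processes containing that data. Sharing data directly via memory can provide significant performance benefits compared to sharing data via disk, socket, or other communication channels that require serialization/deserialization and copying of data.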
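- class multiprocessing.shared_memory.SharedMemory(name=None, create=False, size=0, *, track=True)
Create an instance of the SharedMemory class for either creating a new shared memory block or attaching to an existing shared memory block. Each shared memory block is assigned a unique name. In this way, one process can create a shared memory block with a particular name and a different process can attach to that same shared memory block using that same name.
As a resource for sharing data across processes, shared memory blocks may outlive the original process that created them. When one process no longer needs access to a shared memory block that might still be needed by other processes, the close() method should be called. When a shared memory block is no longer needed by any process, the unlink() method should be called to ensure proper cleanup.
- Parameters: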
name (str | None) -- The unique name for the requested shared memory, specified as a string. When creating a new shared memory block, if None (the default) is supplied for the name, a novel name will be generated.
create (bool) -- Controls whether a new shared memory block is created (True) or an existing shared memory block is attached (False).
size (int) -- The requested number of bytes when creating a new shared memory block. Because some platforms choose to allocate chunks of memory based upon that platform's memory page size, the exact size of the shared memory block may be larger than or equal to the size requested. When attaching to an existing shared memory block, the size parameter is ignored.
track (bool) -- When True, register the shared memory block with a resource tracker process on platforms where the OS does not do this automatically. The resource tracker ensures proper cleanup of the shared memory even if all other processes with access to the memory exit without doing so. Python processes created from a common ancestor using multiprocessing facilities share a single resource tracker process, and the lifetime of shared memory segments is handled automatically among these processes. Python processes created in any other way will receive their own resource tracker when accessing shared memory with track enabled. This will cause the shared memory to be deleted by the resource tracker of the first process that terminates. To avoid this issue, users of subprocess or standalone Python processes should set track to False when there is already another process in place that does the bookkeeping. track is ignored on Windows, which has its own tracking and automatically deletes shared memory when all handles to it have been closed.
Changed in version 3.13: Added the track parameter.
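As a rough illustration of the parameters above, the following sketch creates a block, attaches to it by name, and passes track only when the interpreter accepts it. The variable names (owner, guest, attach_kwargs) are illustrative, and attaching from within the same process is done only to keep the example self-contained; in practice the second handle would usually live in another process.

```python
import sys
from multiprocessing import shared_memory

# Creator side: no name is given, so one is generated automatically.
owner = shared_memory.SharedMemory(create=True, size=64)
try:
    # The platform may round the allocation up, so only ">=" can be relied on.
    actual_size = owner.size

    # The track parameter exists only on Python 3.13+, so pass it conditionally.
    attach_kwargs = {"track": False} if sys.version_info >= (3, 13) else {}

    # Attaching side: create defaults to False and size is ignored.
    guest = shared_memory.SharedMemory(name=owner.name, **attach_kwargs)
    try:
        owner.buf[0] = 42          # write through one handle
        first_byte = guest.buf[0]  # read through the other
    finally:
        guest.close()
finally:
    owner.close()
    owner.unlink()  # exactly one unlink for the whole block
```

Here the creating handle keeps the default tracking and the attaching handle opts out, which mirrors the advice above that only one party should do the bookkeeping.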
- close()
Close the file descriptor/handle to the shared memory from this instance. close() should be called once access to the shared memory block from this instance is no longer needed. Depending on the operating system, the underlying memory may or may not be freed even if all handles to it have been closed. To ensure proper cleanup, use the unlink() method.
- unlink()
Delete the underlying shared memory block. This should be called only once per shared memory block regardless of the number of handles to it, even in other processes. unlink() and close() can be called in any order, but trying to access data inside a shared memory block after unlink() may result in memory access errors, depending on platform.
This method has no effect on Windows, where the only way to delete a shared memory block is to close all handles.
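- buf
A memoryview of the contents of the shared memory block.
- name
Read-only access to the unique name of the shared memory block.
- size
Read-only access to the size in bytes of the shared memory block.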
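Because buf is an ordinary writable memoryview, fixed-layout records can be written into it with the standard struct module. The sketch below is one possible approach; the record layout "<id" (a little-endian 32-bit integer followed by a double) and the variable names are assumptions chosen for illustration.

```python
import struct
from multiprocessing import shared_memory

RECORD_FORMAT = "<id"  # hypothetical layout: int32 counter + float64 value
record_size = struct.calcsize(RECORD_FORMAT)

shm = shared_memory.SharedMemory(create=True, size=record_size)
try:
    # Pack the two fields directly into the shared block at offset 0.
    struct.pack_into(RECORD_FORMAT, shm.buf, 0, 7, 2.5)
    # Any process attached to the same block could decode them the same way.
    counter, value = struct.unpack_from(RECORD_FORMAT, shm.buf, 0)
    block_is_big_enough = shm.size >= record_size
finally:
    shm.close()
    shm.unlink()
```

An explicit byte order in the format string keeps the layout free of padding, so both sides agree on the offsets.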
The following example demonstrates low-level use of SharedMemory instances:
>>> from multiprocessing import shared_memory
>>> shm_a = shared_memory.SharedMemory(create=True, size=10)
>>> type(shm_a.buf)
<class 'memoryview'>
>>> buffer = shm_a.buf
>>> len(buffer)
10
>>> buffer[:4] = bytearray([22, 33, 44, 55]) # Modify multiple at once
>>> buffer[4] = 100 # Modify single byte at a time
>>> # Attach to an existing shared memory block
>>> shm_b = shared_memory.SharedMemory(shm_a.name)
>>> import array
>>> array.array('b', shm_b.buf[:5]) # Copy the data into a new array.array
array('b', [22, 33, 44, 55, 100])
>>> shm_b.buf[:5] = b'howdy' # Modify via shm_b using bytes
>>> bytes(shm_a.buf[:5]) # Access via shm_a
b'howdy'
>>> shm_b.close() # Close each SharedMemory instance
>>> shm_a.close()
>>> shm_a.unlink() # Call unlink only once to release the shared memory
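The close-then-unlink sequence at the end of the example above is easy to skip when an exception occurs in between. One way to make it harder to forget, sketched here under the assumption that the creating process also owns the cleanup, is a small context manager. The helper name owned_block is hypothetical and is not part of the module.

```python
from contextlib import contextmanager
from multiprocessing import shared_memory


@contextmanager
def owned_block(size):
    """Hypothetical helper: create a block, then close and unlink it on exit."""
    shm = shared_memory.SharedMemory(create=True, size=size)
    try:
        yield shm
    finally:
        shm.close()   # release this instance's handle
        shm.unlink()  # remove the block itself, exactly once


with owned_block(16) as shm:
    shm.buf[:5] = b"hello"
    snapshot = bytes(shm.buf[:5])  # copy out before the block disappears
```

Processes that merely attach to a block created elsewhere should call only close(), so a helper like this belongs on the creating side.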
The following example demonstrates a practical use of the SharedMemory class with NumPy arrays, accessing the same numpy.ndarray from two distinct Python shells:
>>> # In the first Python interactive shell
>>> import numpy as np
>>> a = np.array([1, 1, 2, 3, 5, 8]) # Start with an existing NumPy array
>>> from multiprocessing import shared_memory
>>> shm = shared_memory.SharedMemory(create=True, size=a.nbytes)
>>> # Now create a NumPy array backed by shared memory
>>> b = np.ndarray(a.shape, dtype=a.dtype, buffer=shm.buf)
>>> b[:] = a[:] # Copy the original data into shared memory
>>> b
array([1, 1, 2, 3, 5, 8])
>>> type(b)
<class 'numpy.ndarray'>
>>> type(a)
<class 'numpy.ndarray'>
>>> shm.name # We did not specify a name so one was chosen for us
'psm_21467_46075'
>>> # In either the same shell or a new Python shell on the same machine
>>> import numpy as np
>>> from multiprocessing import shared_memory
>>> # Attach to the existing shared memory block
>>> existing_shm = shared_memory.SharedMemory(name='psm_21467_46075')
>>> # Note that a.shape is (6,) and a.dtype is np.int64 in this example
>>> c = np.ndarray((6,), dtype=np.int64, buffer=existing_shm.buf)
>>> c
array([1, 1, 2, 3, 5, 8])
>>> c[-1] = 888
>>> c
array([ 1, 1, 2, 3, 5, 888])
>>> # Back in the first Python interactive shell, b reflects this change
>>> b
array([ 1, 1, 2, 3, 5, 888])
>>> # Clean up from within the second Python shell
>>> del c # Unnecessary; merely emphasizing the array is no longer used
>>> existing_shm.close()
>>> # Clean up from within the first Python shell
>>> del b # Unnecessary; merely emphasizing the array is no longer used
>>> shm.close()
>>> shm.unlink() # Free and release the shared memory block at the very end
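Where NumPy is not available, a similar typed view over the block can be sketched with the standard library alone, using memoryview.cast() and the array module. This is an alternative illustration rather than something the example above relies on; the variable names are arbitrary.

```python
import array
from multiprocessing import shared_memory

values = array.array("q", [1, 1, 2, 3, 5, 8])  # signed 64-bit integers
nbytes = len(values) * values.itemsize

shm = shared_memory.SharedMemory(create=True, size=nbytes)
try:
    # Slice to the exact byte count first, since the block may be larger
    # than requested, then reinterpret those bytes as 64-bit integers.
    with shm.buf[:nbytes].cast("q") as view:
        view[:] = values   # copy the original data into shared memory
        view[-1] = 888     # element writes go straight to the shared block
        result = view.tolist()
    # Leaving the with block releases the view, which close() requires.
finally:
    shm.close()
    shm.unlink()
```

Releasing every derived view before calling close() matters here, because a block with outstanding views cannot be closed cleanly.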
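- class multiprocessing.managers.SharedMemoryManager([address[, authkey]])
A subclass of multiprocessing.managers.BaseManager which can be used for the management of shared memory blocks across processes.
A call to start() on a SharedMemoryManager instance causes a new process to be started. This new process's sole purpose is to manage the life cycle of all shared memory blocks created through it. To trigger the release of all shared memory blocks managed by that process, call shutdown() on the instance. This triggers an unlink() call on all of the SharedMemory objects managed by that process and then stops the process itself. By creating SharedMemory instances through a SharedMemoryManager, we avoid the need to manually track and trigger the freeing of shared memory resources.
This class provides methods for creating and returning SharedMemory instances and for creating a list-like object (ShareableList) backed by shared memory.
Refer to BaseManager for a description of the inherited address and authkey optional input arguments and how they may be used to connect to an existing SharedMemoryManager service from other processes.
- SharedMemory(size)
Create and return a new SharedMemory object with the specified size in bytes.
- ShareableList(sequence)
Create and return a new ShareableList object, initialized by the values from the input sequence.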
The following example demonstrates the basic mechanisms of a SharedMemoryManager:
>>> from multiprocessing.managers import SharedMemoryManager
>>> smm = SharedMemoryManager()
>>> smm.start() # Start the process that manages the shared memory blocks
>>> sl = smm.ShareableList(range(4))
>>> sl
ShareableList([0, 1, 2, 3], name='psm_6572_7512')
>>> raw_shm = smm.SharedMemory(size=128)
>>> another_sl = smm.ShareableList('alpha')
>>> another_sl
ShareableList(['a', 'l', 'p', 'h', 'a'], name='psm_6572_12221')
>>> smm.shutdown() # Calls unlink() on sl, raw_shm, and another_sl
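The following example depicts a potentially more convenient pattern for using SharedMemoryManager objects via the with statement to ensure that all shared memory blocks are released after they are no longer needed: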
>>> from multiprocessing import Process
>>> with SharedMemoryManager() as smm:
...     sl = smm.ShareableList(range(2000))
...     # Divide the work among two processes, storing partial results in sl
...     # (do_work is a user-defined worker function that is not shown here)
...     p1 = Process(target=do_work, args=(sl, 0, 1000))
...     p2 = Process(target=do_work, args=(sl, 1000, 2000))
...     p1.start()
...     p2.start()  # A multiprocessing.Pool might be more efficient
...     p1.join()
...     p2.join()   # Wait for all work to complete in both processes
...     total_result = sum(sl)  # Consolidate the partial results now in sl
When using a SharedMemoryManager in a with statement, the shared memory blocks created using that manager are all released when the with statement's code block finishes execution.
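The with-statement pattern above leaves do_work undefined. A self-contained variant might look like the sketch below, where do_work is a hypothetical worker that squares each index in its assigned range; the main() wrapper and the __main__ guard are additions so that the script also behaves under start methods that re-import the main module.

```python
from multiprocessing import Process
from multiprocessing.managers import SharedMemoryManager


def do_work(shared_list, start, stop):
    """Hypothetical worker: store i*i at each index i in [start, stop)."""
    for i in range(start, stop):
        shared_list[i] = i * i


def main():
    with SharedMemoryManager() as smm:
        sl = smm.ShareableList(range(2000))
        # Each worker receives the list by name and attaches to the same block.
        workers = [
            Process(target=do_work, args=(sl, 0, 1000)),
            Process(target=do_work, args=(sl, 1000, 2000)),
        ]
        for p in workers:
            p.start()
        for p in workers:
            p.join()
        # Read the results while the block still exists; it is released
        # as soon as the with block exits.
        return sum(sl)


if __name__ == "__main__":
    print(main())
```

The total is computed inside the with block on purpose, since the shared list is no longer usable once the manager has shut down.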
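- class multiprocessing.shared_memory.ShareableList(sequence=None, *, name=None)
Provide a mutable list-like object where all values stored within are stored in a shared memory block. This constrains storable values to the following built-in data types:
- int (signed 64-bit)
- float
- bool
- str (less than 10M bytes each when encoded as UTF-8)
- bytes (less than 10M bytes each)
- None
It also notably differs from the built-in list type in that these lists cannot change their overall length (i.e. no append(), insert(), etc.) and do not support the dynamic creation of new ShareableList instances via slicing.
sequence is used in populating a new ShareableList full of values. Set it to None to instead attach to an already existing ShareableList by its unique shared memory name.
name is the unique name for the requested shared memory, as described in the definition for SharedMemory. When attaching to an existing ShareableList, specify its shared memory block's unique name while leaving sequence set to None.
Note
A known issue exists for bytes and str values. If they end with \x00 nul bytes or characters, those may be silently stripped when fetching them by index from the ShareableList. This .rstrip(b'\x00') behavior is considered a bug and may go away in the future. See gh-106939.
For applications where rstripping of trailing nulls is a problem, work around it by always unconditionally appending an extra non-0 byte to the end of such values when storing and unconditionally removing it when fetching: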
>>> from multiprocessing import shared_memory
>>> nul_bug_demo = shared_memory.ShareableList(['?\x00', b'\x03\x02\x01\x00\x00\x00'])
>>> nul_bug_demo[0]
'?'
>>> nul_bug_demo[1]
b'\x03\x02\x01'
>>> nul_bug_demo.shm.unlink()
>>> padded = shared_memory.ShareableList(['?\x00\x07', b'\x03\x02\x01\x00\x00\x00\x07'])
>>> padded[0][:-1]
'?\x00'
>>> padded[1][:-1]
b'\x03\x02\x01\x00\x00\x00'
>>> padded.shm.unlink()
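The padding workaround can be wrapped in a pair of small helpers so that the append and the removal always happen together. The function names pad and unpad and the choice of \x07 as the extra byte are assumptions made for this sketch; any non-zero byte or character would serve.

```python
from multiprocessing import shared_memory

SENTINEL_STR = "\x07"     # assumed sentinel; any non-nul character works
SENTINEL_BYTES = b"\x07"  # assumed sentinel; any non-zero byte works


def pad(value):
    """Append the sentinel to str/bytes values; leave other types alone."""
    if isinstance(value, str):
        return value + SENTINEL_STR
    if isinstance(value, bytes):
        return value + SENTINEL_BYTES
    return value


def unpad(value):
    """Remove the sentinel that pad() unconditionally appended."""
    if isinstance(value, (str, bytes)):
        return value[:-1]
    return value


original = ["?\x00", b"\x03\x02\x01\x00\x00\x00", 42]
sl = shared_memory.ShareableList([pad(v) for v in original])
try:
    recovered = [unpad(v) for v in sl]
finally:
    sl.shm.close()
    sl.shm.unlink()
```

Because the sentinel is added and removed unconditionally, the round trip gives the same result whether or not the stripping behavior is present in a given Python version.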
- count(value)
Return the number of occurrences of value.
- index(value)
Return the first index position of value. Raise ValueError if value is not present.
- shm
The SharedMemory instance where the values are stored.
The following example demonstrates basic use of a ShareableList instance:
>>> from multiprocessing import shared_memory
>>> a = shared_memory.ShareableList(['howdy', b'HoWdY', -273.154, 100, None, True, 42])
>>> [ type(entry) for entry in a ]
[<class 'str'>, <class 'bytes'>, <class 'float'>, <class 'int'>, <class 'NoneType'>, <class 'bool'>, <class 'int'>]
>>> a[2]
-273.154
>>> a[2] = -78.5
>>> a[2]
-78.5
>>> a[2] = 'dry ice' # Changing data types is supported as well
>>> a[2]
'dry ice'
>>> a[2] = 'larger than previously allocated storage space'
Traceback (most recent call last):
...
ValueError: exceeds available storage for existing str
>>> a[2]
'dry ice'
>>> len(a)
7
>>> a.index(42)
6
>>> a.count(b'howdy')
0
>>> a.count(b'HoWdY')
1
>>> a.shm.close()
>>> a.shm.unlink()
>>> del a # Use of a ShareableList after call to unlink() is unsupported
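The ValueError in the example above arises because each str or bytes slot keeps the storage it was given when the list was created. If longer values are expected later, one possible approach is to initialize the slot with a placeholder of the maximum expected length. The constant MAX_CHARS and the placeholder character are assumptions for this sketch, which also assumes ASCII text so that characters and encoded bytes coincide.

```python
from multiprocessing import shared_memory

MAX_CHARS = 32  # hypothetical upper bound for the ASCII strings stored later

# Create the slot with a placeholder of the maximum length so that
# enough bytes are reserved for it up front.
sl = shared_memory.ShareableList(["-" * MAX_CHARS])
try:
    sl[0] = "dry ice"          # shorter values fit in the reserved space
    short_value = sl[0]
    sl[0] = "x" * MAX_CHARS    # values up to the reserved length fit too
    try:
        sl[0] = "x" * 1000     # far beyond the reserved space
        overflowed = False
    except ValueError:
        overflowed = True
    final_value = sl[0]        # the rejected assignment left the slot intact
finally:
    sl.shm.close()
    sl.shm.unlink()
```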
The following example depicts how one, two, or many processes may access the same ShareableList by supplying the name of the shared memory block behind it:
>>> b = shared_memory.ShareableList(range(5)) # In a first process
>>> c = shared_memory.ShareableList(name=b.shm.name) # In a second process
>>> c
ShareableList([0, 1, 2, 3, 4], name='...')
>>> c[-1] = -999
>>> b[-1]
-999
>>> b.shm.close()
>>> c.shm.close()
>>> c.shm.unlink()
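The following example demonstrates that ShareableList (and underlying SharedMemory) objects can be pickled and unpickled if needed. Note that it will still be the same shared object. This happens because the deserialized object has the same unique name and is simply attached to the existing object with that name (if the object is still alive):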
>>> import pickle
>>> from multiprocessing import shared_memory
>>> sl = shared_memory.ShareableList(range(10))
>>> list(sl)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> deserialized_sl = pickle.loads(pickle.dumps(sl))
>>> list(deserialized_sl)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> sl[0] = -1
>>> deserialized_sl[1] = -2
>>> list(sl)
[-1, -2, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(deserialized_sl)
[-1, -2, 2, 3, 4, 5, 6, 7, 8, 9]
>>> sl.shm.close()
>>> sl.shm.unlink()