Rusts type system requires that there only ever is one mutable reference to a valueorone or more shared references. What happens when you need multiple references to some value, but also need to mutate through them? We use a trick calledinteror mutability: to the outside world you act like a value is immutable so multiple references are allowed. But internally the type is actually mutable.
All types that provide interior mutability have anUnsafeCell
at their core.UnsafeCell
is the only primitive that allows multiple mutable pointers to its interior, without violating aliasing rules. The only way to use it safely is to only mutate the wrapped value when there are no other readers. No, the garantee has to be even stronger:we can not mutate it and can not create a mutable reference to the wrapped value while there are shared references to its value.
Boththe bookand thestd :: cell
modulegive a good alternative explanation of interor mutability.
We are going to look at four patterns:
One way to make sure there are no shared references when mutating the wrapped value is to never hand out any references at all. TheCell
type in the standard library is the prime example of this pattern.
Cell
Basic API ( see documentation
) :
(impl) (T) ***************************
>********************** (Cell) ************************** ((({ pubfnset (&
self(********************, **************************** val ************** val: (T
pubfnreplace
(&
self(********************, **************************** val ************** val: (T
->T
impl: (Copy
Cell******************
********************** (T)>
{*************************) pubfn (get the (&self****************************
>
T
impl: (Default
Cell******************
********************** (T)>
{*************************) pubfn (take take (&self****************************
>
T
******************************
The idea of Cell
is that when reading its contents you always eitherset
orgetthe entire contents of theCell
. With a single thread there can only ever be one such operation in progress, so there is no change of multiple mutable accesses: a safe API for interior mutability.
Before Rust 1. (**************************** (RFC) (Cell) was only implemented forCopy
types with theset
andgetmethods. Crates like movecelland
That brings us to the flexible API above:get
is the normal way to read the value of aCell
, but is only available forCopy
types. If a type can’t be copied, you can alwaysreplace
it with some other value. And if the type comes with aDefault
, you cantake
out the value and replace it with the default.
Cell
does not work with multiple threads. It has no way to coordinate with another thread thatset
is not used while the other usesget
. Enter the synchronization primitives for aCell
- like API below.
Did you know that Rustsatomic integersare just anUnsafeCell
around an integer?
(# [repr(C,align(4))]pub
struct AtomicU **************
{
: ******************** UnsafeCell
>
,******************************
With atomic operations the processor ensures any read or write will be synchronized across threads. What can be done for integers, can be done for any type that is small enough to fit in an integer. Usualy this means the types can be up to 8 bytes, and most - bit platforms support atomic operations on up to bytes.
Multiple crates allow using anUnsafeCell
with atomic operations on various small types:
****************************** (Crossbeam) ************ with itsAtomicCell
atomic
withAtomic (requires types areCopy
).
atomig
, withAtomic (requires a# [derive(atom)]
).
If you look past the names of the methods onAtomicCell
, just notice how close the atomic operations resemble the methods onCell
. Basic API ( see documentation
):
(impl) (T) ***************************
>********************** (AtomicCell) ********************** ((({ pubfnstore (&
self(********************, **************************** val ************** val: (T
pubfn swap(&self(********************, **************************** val ************** val: (T->T
impl: (Copy
AtomicCell******************
********************** (T)>
{*************************) pubfn load (&self****************************
>
T
impl: (Default
AtomicCell******************
********************** (T)>
{*************************) pubfn (take take (&self****************************
>
T
impl: (Copy
Eq******************
********************** (AtomicCell) T**************************** pubfn (************************** compare_and_swap
(
&self(********************, ****************************current: (T(****************************,new:********************** (T) )->T pubfncompare_exchange
(&
self(********************, ****************************
current: (T
(****************************,new:********************** (T) )-> Result,************************ (Tunsafe
impl
********************** (T) ****************************************:
Send**************************** Sync********************** (for AtomicCell{}******************************
The methodsget,setandreplace
are nowload
,store
andswap
.AtomicCell
also offerscompare_and_swap
, one of the essential tools for concurrent algorithms. Currently this operation isquestionable, because transmuting small types with padding to an atomic integer and comparing it is probably Undefined Behavior. But I am confident the maintainers will figure out a solution.
In my opinion every of one of those crates has some disadvantage. Crossbeam has the potential problem with padding bytes.atomic
has the same problem, and requires its types to beCopy
.atomigcan’t be derived for types from other crates. Still, as long as you are careful to use it on small types without padding, you are good with all of them.
Atomic operations are definitely the best concurrent solution for interior mutability on this list.
Locks
Exactly the same API as the one above forAtomicCell
can be provided for larger types, by using some locking scheme.
ASeqLock. orsequence lockprotects its wrapped value with a sequence counter. When writing withset
the sequence counter is incremented. Any read withgetwill first check the sequence counter, read the value, and check the sequence counter again. If the sequence counter changed, it will know there was a data race while reading and try again.This is technically UB in Rust and therefore a questionable locking method.The property that it does not have to do any atomic writes when just reading the value gives a very good performance. Note however that when a thread keeps writing to a seqlock, it can starve readers.
A spinlock will set and clear a flag on every read or write. It doesn’t allow for multiple concurrent reads.
atomic :: Atomic
does not include a lock with its type, but uses an global array of 41 spinlocks. A hash of the address of the data is used as index to get a lock. This gives a nice little space efficiency.
crossbeam :: AtomicCell
does the same thing with a set of (****************************************************************************************************************** global seqlocks
. For every read it will make one attempt to use a seqlock as seqlock, and if that fails use it as spinlock.
With a concurrentCell
- like API it is best to use it like the initial version ofCell
: only with types that are (Copy. The trick to move non -Copy
values in and out of aCell
withreplaceandtakerequires that you can always replace it with some other value that remains useful to other threads, which seems often impossible impossible to me. For non -Copy
types, use one of the reference based solutions below.
Note that with aCell
- like API locks are held for only a very short time, about as long as it takes to do amemcpy. So you usually don't need to ask the OS for assistance to wait until a thread is done (likeMutex
,RwLock
and co. Below).
There is currently no good crate that offers a standaloneSeqLock
(see also my previous post on (writing a seqlock in Rust
). The the seqlock
crate hands out a mutable reference while reads may be in progress, which violates the aliasing rules. I also don’t think there is a crate that offers a spinlock with the API described here. But the fallback for larger types inatomic :: Atomic
and especially incrossbeam :: AtomicCell**************************
are pretty cool. Ensuring at runtime a mutable reference is unique
It is easily possible to hand out references, we just have to keep track of the number of shared and / or mutable references that are currently handed out.Smart Pointers
give us the tools to do just that.
We register that a reference is handed out when creating a smart pointer. The smart pointer has a customDropimplementation that unregisters itself when it goes out of scope. With aDeref
implementation it acts just like a regular shared reference. And with aDerefMut
it can also be used just like a mutable reference .
Basic API (************ see documentation:
(impl) (T) ***************************
:?************************** (Sized)************************** RefCell********************** (T) **************** ()>{************************** pubfnborrow
(
&
self
****************************
>
Ref
************************** T********** pubfnborrow_mut (&self****************************
>
********** (RefMut) ************************** T**********impl:?************************** (Sized) ************************************>
Deref for********************** Ref******************{ / * ... * /(**************************}impl:?Sized****************************Drop********************** (for************************ (Ref__,,********* (T) ******************************************>{
/ * ... * /(**************************}
impl:?
Sized**************************** DerefMut********************** (for RefMut{/ * ... * /
************************
impl:?
Sized**************************** Deref********************** (for RefMut{/ * ... * /
************************
impl:?
Sized****************************Drop********************** (for RefMut__,,********* (T) ******************************************>{/ * ... * /
(**************************}******************************
ARefCell
makes use of a simple signed integerborrow
to keep track of the number of borrows at runtime. Normal references are counted as positive numbers, and a mutable reference as -1. When you try to callborrow_mut
while there are any open borrows, i.e. the integer is not 0, it panics: the essential ingredient to make this a safe API.
The methods onRefCellreturn theRef
andRefMut
types. Those have aDerefMut
and / orDeref
implementation, though which the content of theRefCellcan be accessed like they are a regular reference. When those ‘smart references’ are dropped, they decrement the reference count of theRefCell
again.
The standard library also offers atry_borrow
andtry_borrow_mut
method, which return an error instead of panicing the thread.
There currently seems to be Undefined Behavior in (some corner cases
ofRef
andRefMut
because they internally convert a raw pointer to the value inside theRefCell
too early to a normal reference. But we can be confident this is fixed in the standard library soon (ish).Mutex
Simplified API ( see documentation
, the actual API is a bit more complex because it allows the mutex to be poisoned):
(impl) (T) ***************************
:?************************** (Sized)************************ Mutex) ***************************{************************** pubfn
lock (
&self****************************
>
MutexGuard************************** T**********unsafeimpl
********************** (T) ****************************************: (****************************?
Sized******************
********************** (Send)>
Syncfor Mutex {}
impl:?Sized**************************** DerefMut********************** (for MutexGuard{/ * ... * /
impl:?
Sized**************************** Deref********************** (for MutexGuard{/ * ... * /
************************impl:?Sized****************************Drop********************** (for MutexGuard__,,********* (T) ******************************************>{/ * ... * /
******************************************************
A mutex providesmutably exclusiveaccess to its value. In a way this is synchronization primitive with only a limited part ofRefCell
s API: all borrows are mutable. LikeRefCell
all accesses happen through a smart pointer, in this caseMutexGuard
.
In modern implementations a mutex atomically sets a flag when the mutex gets locked, and unset it when done. If another thread sees the mutex is locked while it tries to access it, it will wait. The waiting can be a spin loop (see the spin crate
), but it is much better to ask the operating system for help: “notify me know when it gets unlocked ”. Because the fast path involves just an atomic write, this makes it a very efficient primitive.
Sadly not all implementations are modern, and some come with restrictions. For example Posix requires the mutex to be immovable even when not locked, forcing the standard library to put any mutex in an allocation on the heap any not allowing its use in statics. A custom implementation such as the one inparking_lotfixes these issues.
ARefCell
panics when you try to take two mutable borrows in the same thread. AMutex
useally does not detect this case, and will just start waiting to acquire the lock. But because it is the current thread that holds the lock, it will never wake up again (and block any thread that tries to lock theMutex
after that).
Simplified API ( (see documentation
), the actual API is a bit more complex because it allows theRwLock
to be poisoned):
(impl) (T) ***************************
:?************************** (Sized)************************** (RwLock) ***************************{************************** pubfn read **************(&
self
****************************
>
RwLockReadGuard
************************** T********** pubfnwrite (&self****************************
>
RwLockWriteGuard
************************** T**********unsafeimpl********************** (T) ****************************************: (****************************?Sized******************
********************** (Send)
Sync**************************** Sync********************** (for************************ (RwLock************************ ({**************) {} (**************************** impl:?Sized**************************** Deref********************** (for RwLockReadGuard{/ * ... * /
impl:
?
Sized****************************Drop********************** (for RwLockReadGuard__,,********* (T) ******************************************>{ / * ... * /
impl:?
Sized**************************** DerefMut********************** (for RwLockWriteGuard{/ * ... * /
impl:?
Sized**************************** Deref********************** (for RwLockWriteGuard{/ * ... * /
impl:
?
Sized****************************Drop********************** (for RwLockWriteGuard__,,********* (T) ******************************************>{/ * ... * /******************************** / * optional * /
impl:?
Sized**************************** (RwLockWriteGuard) {**************************** pubfn downgrade (************************ (self) **************************->RwLockReadGuard
******************************
Areader-writer lockis the full-featured multithreaded alternative toRefCell
. AnRwLock
will allow any number of readers to acquire the lock as long as a writer is not holding the lock.
There can be several strategies to implement an RW lock. Imagine this case: multiple threads keep takingreadlocks, and one thread wants to take awrite
lock. Should the reads be able to block the write indefinitely? Wikipedialists three priority policies:
************** Read-preferring RW locksallow for maximum concurrency, but can lead to write-starvation if contention is high.
Write-preferring RW locksavoid the problem of writer starvation by preventing any new readers from acquiring the lock if there is a writer queued and waiting for the lock ... The downside is that write-preferring locks allows for less concurrency in the presence of writer threads.
- Unspecified priority RW locksdoes not provide any guarantees with regards read vs. write access.
-
(************************************************************
The standard library explicitly does not specify the priority policy, it is just a thin wrapper around what the operating system provides.parking_lot
(documents) *********** It offers awrite-preferring RW lock. Thespinlock-based implementationinspin
is an example of aread-preferring RW lock.
A nice extra thatparking_lot
offers is adowngrade
method on itsRwLockWriteGuard
to turn it into aRwLockReadGuard
, without releasing the lock in the meantime.
Reentrant Mutex
Basic API (************************** (documentation):
(impl) (T) ***************************
:?************************** (Sized)**************************ReentrantMutex{************************** pubfn
lock (
&self****************************
>
ReentrantMutexGuard
************************** T**********unsafeimpl********************** (T) ****************************************: (****************************?Sized******************
********************** (Send)>
************ (Sync) ************************** for
ReentrantMutex
************ () ************T
(**************************>{}impl:?Sized**************************** Deref
********************** (for ReentrantMutexGuard{
/ * ... * /
impl:?Sized****************************Drop********************** (for ReentrantMutexGuard__,,********* (T) ******************************************>{ / * ... * /******************************************************************
A reentrant mutex may be locked more than once by the same thread, typically because of recursion. The smart pointer returned when locking theReentrantMutex
can’t be mutable, because it may not be unique. So it does not provide interior mutability by itself. But it does give the guarantees that allow using it with a single-threaded one, likeRefCell
, because:
it can hand out multiple shared references,
its wrapped type does not have to besync
, because the references are all handed out within the same thread.
Reentrant mutexes are considered a code smell, as they typically indicate you are holding a lock too long. But there are valid uses, likeStdout
which may be locked by a thread, whileprintln!
will also lock it for each write.
A reentrant mutex requires a little more bookkeeping than a normal mutex. It has to keep track of the id of the thread that currently holds a lock, and it has to keep track of the number of locks held. The standard library does not expose its implementation ofReentrantMutex
(remutex
does for an old version), butparking_lot
provides one.
Mutability only during initialization
There is a common pattern where you need to do something at runtime to initialize a variable, but can’t do so before sharing it with other threads. Statics are the prime example. lazy_staticUsed to be the go-to solution, but multiple crates created and reinvented theOnceCell
Lazy
solution, of which theonce_cell
crate seems to have arrived at the cleanest API. OnceCell (unsync)
Basic API (********************************** (documentation) :
(impl) (T) ***************************
>********************** (OnceCell) ************************** () ************************** ((({ pubfn (get the (&
self
****************************
>
********* (Option
************************** T******************************** pubfnset (&
self(********************, ****************************
value: (T->Result
T>
pubfn get_or_init
FnOnce******************
)
->
************************** (T)>(**************************************&&
********** (self) **************************
,f (**********************************************:F****************************
>
&
************************ (T)
impl
,************************ F
Lazy******************
********************** (T) ,
F**************************** pub
const fn********************** (new) **************************(
init
: ****************** F))
->
Lazy T
,
************************* (F)
impl,************************ F (****************************:
FnOnce
******************
)
->
************************** (T)>Deref******************** (for(Lazy) ********************************************** ( (F)******************************
For the singlethreaded case aOnceCell
can be in two states (which by the way allows a size optimization using niche filling likeOption:
uninitialized, which allow mutating it once withset
;
initialized, which allow taking multiple shared references withget
.
Any timeget
is called while uninitialized it will returnNone, and any timeset
is called while initialized it will returnErr
(with the provided value).get_or_init
is the nice combination: it can always return a reference, because it will initialize theOnceCellif necessary by running the provided closure.
The key insight that makesOnceCell
safe is that a mutation may really be done onlyonce. Recursive initialization (possibly accidental) within the closure ofget_or_init
therefore
must cause a panic.
Lazy
Acts like a smart pointer build on top ofOnceCell
. The initialization closure is provided once withnew, and run on the first use ofLazy
throughDeref.
mutate_once
provides an variation based on the same principles that can hand out a mutable smart pointer first. But as soon as one normal reference is returned, it can never be mutated again. OnceCell (sync)
Basic API (************************************ (documentation) ):
(impl) (T) ***************************
>********************** (OnceCell) ************************** () ************************** ((({ pubfn (get the (&
self
****************************
>
********* (Option
************************** T******************************** pubfnset (&
self(********************, ****************************
value: (T->Result
T>
pubfn get_or_init
FnOnce******************
)
->
************************** (T)>(**************************************&&
********** (self) **************************
,f (**********************************************:F****************************
>
&
************************ (T)
unsafe
impl
********************** (T) ****************************************:
Send**************************** Sync>************************ SyncforOnceCell******************
********************** (T)>{}
impl,************************ F
Lazy******************
********************** (T) ,
F**************************** pub
const fn********************** (new) **************************(
init
: ****************** F))
->
Lazy T
,
************************* (F)
impl,************************ F (****************************:
FnOnce
******************
)
->
************************** (T)>Deref******************** (for(Lazy) ********************************************** ( (F) unsafeimpl********************** (T) ,F: ****************** (Send Send
>************************ SyncforLazy******************
********************** (T) ,Fwhere OnceCell () **************************************T
(**************************:
Sync**************** ({**************) {} ******************************
The multi-threaded variant ofOnceCell
is almost the same. The state has to be kept in an atomic, and it has a third state: other threads may observe one thread isrunningthe initialization clusure. At that point they should wait, either using a spinloop but preferably with assistance from the OS, just likeMutex
.Using a seperate 'owner' or 'token' to track references (*********************
A somewhat experimental concept is to bind aCell
to an ‘owner’ or ‘token’ type. If you have a shared reference to the owner type, you can create a shared reference to the inside of the boundCell
. And if you have a mutable reference to the owner type, you can also get one to theCell
.
The owner can be shared freely between threads and passed through channels. I suppose that if you already need to do some communication through channels, this is a nice solution to avoid also taking a mutex (or passing large amounts of data through the channel).Still, when and how to use this pattern is not really clear to me, this description may not do it right.
The qcell
crate
explores this technique (see also the (initial post on Reddit
and theon the Rust Programming Language Forum.
A problem is how to conjure up a unique owner. One option is to just use a unique integer (QCell
). Another is to use your own zero-sized marker type (TCell
). And a third is to use the unique lifetime of a closure (LCell
).
(QCell)
Basic API ofQCell
:
(impl)
QCellOwner****************{ pubfnnew ()************************** ->
Self pubfn
cell**************************************** (self) ****************************
,
: ****************** T **************)->QCell************************ pubfnro********** (T ************ T)>
(
************************** &aself,,
********** (qc) ************************** (**************************:&'a
************** (QCell) ************************** ((()->&
'a T **************
pubfnrw********** (T ************ T) **************************>(&
'a************ mut
self
************, (qc) (**************************:
******************************************
'a
************ (QCell) ************************** (**********************************
)************ ->****************************'a
*********** (mut) ******************************************** (T) pubfnrw2********** (T ************ T),,********* (U) **************************>
&'a
mut********* (self
**************************,
qc1: ******************
'a(QCell) >
,
qc2: ******************
'a(QCell) > ->(&************************** (a)
mut (****************** T) ***************************
,
,
&
************************** ('a) mut (U **********************) pubfnrw3
********** (T ************ T)
,
,********* (U) **************************,(
**********
(
&
'a
mut********* (self**************************, qc1: ******************
'a(QCell) >
,
qc2: ******************
'a(QCell) >
,
qc3: ******************
'a(QCell) > ->(&************************** (a)
mut (****************** T) ***************************
,
,
&
************************** ('a)
mut (U **********************,
************ & 'amut (********************** (V) )
impl>
************************ QCell{ pubconst fn********************** (new) **************************(owner: ******************
QCellOwner,: ****************** T **************)->QCell>
impl: Send
Sync
******************
********************** (Sync) **************************
for
QCell>
******************************
A newQCellneeds a reference to aQCellOwner
when it is created by eitherQCellOwner :: cell
orQCell :: new
. One ugly part of the API is that the methods to take a reference to theQCell
,ro
, and to take a mutable reference,rw
, live on theQCellOwner
.
While one cell is borrowed mutably, it is as if the owner is mutably borrowed. No other cells bound to the sameQCellOwner
can be borrowed at the same time. That is why if you need multiple references with at least one mutable, you need to take them all at once withrw2
orrw3
.
(TCell and LCell) **************************
The main difference betweenQCell
,TCell
andLCell
is how the owner is constructed. ForQCell
it is as simple asQCellOwner :: New ()
.
ForTCell
You need to provide a marker type, that must be unique within the process, which is double-checked at runtime (there also isTLCell
, which must be unique per thread):
(struct)
Marker1
******************************
let
owner1
TCellOwner
::
******************)>
:: (new) ();******************************
AnLCellOwner
can’t be simply constructed. You have to run the code that uses it in a closure, where the owner is an argument to the closure. (GhostCell) initially explored this approach. The lifetime of the closure is the id of the owner:
(LCellOwner)
::
****************************(|mutowner**************************** / * use `owner` * /);
******************************
QCellis the most flexible.TCell
andLCell
move more checks to compile time, but require more annotations or a certain code structure:
(**************************************************************************************** (*************************************************************************************** (Cell) (Owner ID) (Cell overhead) (Owner check) Owner creation check (************************************************************************************************************************************************************************** (******************************************************************************************** QCell (integer) ********************************************************************************************** u*********************************************************************************************** (Runtime) ********************************************************************************************** (Runtime) ********************************************************************************************** (**************************************************************************************** (TCell ) orTLCell
marker type
(none) ********************************************************************************************** (Compile-time) ********************************************************************************************** (Runtime) ********************************************************************************************** (**************************************************************************************** (LCell (lifetime) ********************************************************************************************** (none) ********************************************************************************************** (Compile-time) ********************************************************************************************** (Compile-time) ********************************************************************************************** (********************************************************************************************** (************************************************************************************************
You made it to the end ;-). How do the various API’s compare?
Cell
- like, allowing no references inside:
(****************************************************************************************
type
(requirements) ********************************************************************************** (******** (Sync?
(restrictions) ***************************************************************************************** (************************************************************************************************************************************************************************** (******************************************************************************************Cell**************************************************************************************
Copy
(no) ********************************************************************************************** value must be copied or moved (****************************************************************************************Cell
**************************** (none) (no) ********************************************************************************************** value must be moved withreplace
ortake
(**************************************************************************************** Atomic**************************************************************************************************************************
Send
, integer-sized
(yes) ********************************************************************************************** (like (Cell
) ********************************************************************************************** (**************************************************************************************** AtomicCell**************************************************************************************************************************
Send
(yes) ********************************************************************************************** (like (Cell) ********************************************************************************************** (**********************************************************************************************
* Note: I usedAtomic
to describe atomic operations, andAtomicCellto describe a lock-based fallback.
RefCell
- like, ensuring at runtime there is only one& mut
reference:
(****************************************************************************************
type
(requirements) ********************************************************************************** (******** (Sync?
&
& mut
(restrictions) ***************************************************************************************** (************************************************************************************************************************************************************************** (******************************************************************************************** RefCell****************************************************************************************************** (none) (no) ********************************************************************************************** (yes) ********************************************************************************************** (yes) ********************************************************************************************** (may panic) ********************************************************************************************** (**************************************************************************************** Mutex****************************
Send
(yes) ********************************************************************************************** (no) ********************************************************************************************** (yes) ********************************************************************************************** may block or deadlock (**************************************************************************************** (RwLock **************************** SendSync
************************************************************************************** (yes) ********************************************************************************************** (yes) ********************************************************************************************** (yes) ********************************************************************************************** may block or deadlock (**************************************************************************************** ReentrantMutex****************************************************************************************************************************
Send
(yes) ********************************************************************************************** (yes) ********************************************************************************************** (yes) ********************************************************************************************** may block or panic (**********************************************************************************************
OnceCell, only mutable during initialization:
(****************************************************************************************
type
(requirements) ********************************************************************************** (******** (Sync?
&
(restrictions) ***************************************************************************************** (************************************************************************************************************************************************************************** (******************************************************************************************unsync :: OnceCell**************************************************************************************** (none) (no) ********************************************************************************************** (yes) ********************************************************************************************** (only mutable once) ********************************************************************************************** (****************************************************************************************sync :: OnceCell**********************************************************************************************
Send
Sync
************************************************************************************** (yes) ********************************************************************************************** (yes) ********************************************************************************************** may block, only mutable once (**********************************************************************************************
This post concentrated on the possible API choices, and only touched upon the choices that can be made for the synchronization primitives.
Do you know of any other patterns? Or found a mistake? Please let me know.
********************************************************************************************** (****************************************************************************************************Read More
(****************************************************************************************************
(impl) (T) ***************************
>********************** (Cell) ************************** ((({ pubfnset (&
self(********************, **************************** val ************** val: (T
pubfnreplace
(&
>********************** (Cell) ************************** ((({pubfnset (
&self(********************, **************************** val ************** val: (T
pubfnreplace
(&
self(********************, **************************** val ************** val: (T
->impl: (CopyT
Cell******************
{*************************) pubfn (get the (&self****************************
impl: (Default
Cell******************
{*************************) pubfn (take take (&self****************************
******************************
The idea of Cell
is that when reading its contents you always eitherset
orgetthe entire contents of theCell
. With a single thread there can only ever be one such operation in progress, so there is no change of multiple mutable accesses: a safe API for interior mutability.
Before Rust 1. (**************************** (RFC) (Cell) was only implemented forCopy
types with theset
andgetmethods. Crates like movecelland
Cell
is that when reading its contents you always eitherset
orgetthe entire contents of theCell
. With a single thread there can only ever be one such operation in progress, so there is no change of multiple mutable accesses: a safe API for interior mutability.
Copy
types with theset
andgetmethods. Crates like movecelland
That brings us to the flexible API above:get
is the normal way to read the value of aCell
, but is only available forCopy
types. If a type can’t be copied, you can alwaysreplace
it with some other value. And if the type comes with aDefault
, you cantake
out the value and replace it with the default.
Cell
does not work with multiple threads. It has no way to coordinate with another thread thatset
is not used while the other usesget
. Enter the synchronization primitives for aCell
- like API below.
Did you know that Rustsatomic integersare just anUnsafeCell
around an integer?
(# [repr(C,align(4))]pub
struct AtomicU **************
{
: ******************** UnsafeCell
>
,******************************
With atomic operations the processor ensures any read or write will be synchronized across threads. What can be done for integers, can be done for any type that is small enough to fit in an integer. Usualy this means the types can be up to 8 bytes, and most - bit platforms support atomic operations on up to bytes.
Multiple crates allow using an
UnsafeCell
with atomic operations on various small types:****************************** (Crossbeam) ************ with its ):AtomicCell
withatomic
Atomic (requires types are
, withCopy
).atomig
Atomic (requires a
# [derive(atom)]
).If you look past the names of the methods on
AtomicCell
, just notice how close the atomic operations resemble the methods onCell
. Basic API ( see documentation
(impl) (T) ***************************
>********************** (AtomicCell) ********************** ((({pubfnstore (&self(********************, **************************** val ************** val: (T
pubfn swap(&self(********************, **************************** val ************** val: (T->impl: (CopyT
AtomicCell******************
********************** (T)> {*************************) pubfn load (&self****************************
> T
impl: (Default
AtomicCell******************
********************** (T)> {*************************) pubfn (take take (&self****************************
> T
impl: (Copy
Eq******************
********************** (AtomicCell) T****************************(pubfn (************************** compare_and_swap
&self(********************, ****************************current: (T(****************************,new:********************** (T) )
->
T pubfn
compare_exchange
(&self(********************, ****************************
current: (T
(****************************,new:********************** (T) )
crate hands out a mutable reference while reads may be in progress, which violates the aliasing rules. I also don’t think there is a crate that offers a spinlock with the API described here. But the fallback for larger types in-> Result,************************ (T
unsafe
impl
********************** (T) ****************************************: Send**************************** Sync********************** (for AtomicCell{}******************************The methods
get
,
set
and
replace
are nowload
,store
andswap
.AtomicCell
also offerscompare_and_swap
, one of the essential tools for concurrent algorithms. Currently this operation isquestionable, because transmuting small types with padding to an atomic integer and comparing it is probably Undefined Behavior. But I am confident the maintainers will figure out a solution.In my opinion every of one of those crates has some disadvantage. Crossbeam has the potential problem with padding bytes.
atomic
has the same problem, and requires its types to beCopy
.atomig
can’t be derived for types from other crates. Still, as long as you are careful to use it on small types without padding, you are good with all of them.
Atomic operations are definitely the best concurrent solution for interior mutability on this list.
Locks
Exactly the same API as the one above for
AtomicCell
can be provided for larger types, by using some locking scheme.A
SeqLock. orsequence lockprotects its wrapped value with a sequence counter. When writing with
set
the sequence counter is incremented. Any read withget
will first check the sequence counter, read the value, and check the sequence counter again. If the sequence counter changed, it will know there was a data race while reading and try again.This is technically UB in Rust and therefore a questionable locking method.The property that it does not have to do any atomic writes when just reading the value gives a very good performance. Note however that when a thread keeps writing to a seqlock, it can starve readers.
A spinlock will set and clear a flag on every read or write. It doesn’t allow for multiple concurrent reads. ). The the seqlock atomic :: Atomic
does not include a lock with its type, but uses an global array of 41 spinlocks. A hash of the address of the data is used as index to get a lock. This gives a nice little space efficiency.
. For every read it will make one attempt to use a seqlock as seqlock, and if that fails use it as spinlock. crossbeam :: AtomicCell
does the same thing with a set of (****************************************************************************************************************** global seqlocks
With a concurrent
Cell
- like API it is best to use it like the initial version ofCell
: only with types that are (Copy. The trick to move non -Copy
values in and out of aCell
withreplace
and
take
requires that you can always replace it with some other value that remains useful to other threads, which seems often impossible impossible to me. For non -
Copy
types, use one of the reference based solutions below.Note that with a
Cell
- like API locks are held for only a very short time, about as long as it takes to do amemcpy
. So you usually don't need to ask the OS for assistance to wait until a thread is done (like
Mutex
,RwLock
and co. Below).There is currently no good crate that offers a standalone
SeqLock
(see also my previous post on (writing a seqlock in Rustatomic :: Atomic
and especially incrossbeam :: AtomicCell**************************
are pretty cool. Ensuring at runtime a mutable reference is uniqueIt is easily possible to hand out references, we just have to keep track of the number of shared and / or mutable references that are currently handed out.Smart Pointers
give us the tools to do just that.We register that a reference is handed out when creating a smart pointer. The smart pointer has a custom
Drop
implementation that unregisters itself when it goes out of scope. With a
Deref
implementation it acts just like a regular shared reference. And with aDerefMut
it can also be used just like a mutable reference .Basic API (************ see documentation:
(impl) (T) ***************************
:?************************** (Sized)************************** RefCell********************** (T) **************** ()>{************************** pubfnborrow(&self
****************************
> Ref ************************** T********** pubfnborrow_mut (&self****************************
> ********** (RefMut) ************************** T**********impl:?************************** (Sized) ************************************> Deref for********************** Ref******************{ / * ... * /(**************************}impl:?Sized****************************Drop********************** (for************************ (Ref
__,
,********* (T) ******************************************>{
/ * ... * /(**************************}impl:?
Sized**************************** DerefMut********************** (for RefMut
{/ * ... * /************************impl:?
Sized**************************** Deref********************** (for RefMut{/ * ... * /************************impl:?
Sized****************************ofDrop********************** (for RefMut
__,
,********* (T) ******************************************>{/ * ... * /
(**************************}******************************
A
RefCell
makes use of a simple signed integerborrow
to keep track of the number of borrows at runtime. Normal references are counted as positive numbers, and a mutable reference as -1. When you try to callborrow_mut
while there are any open borrows, i.e. the integer is not 0, it panics: the essential ingredient to make this a safe API.The methods on
RefCell
return the
Ref
andRefMut
types. Those have aDerefMut
and / orDeref
implementation, though which the content of theRefCell
can be accessed like they are a regular reference. When those ‘smart references’ are dropped, they decrement the reference count of the
RefCell
again.The standard library also offers a
try_borrow
andtry_borrow_mut
method, which return an error instead of panicing the thread.There currently seems to be Undefined Behavior in (some corner cases
Ref
andRefMut
because they internally convert a raw pointer to the value inside theRefCell
too early to a normal reference. But we can be confident this is fixed in the standard library soon (ish).MutexSimplified API ( see documentation
, the actual API is a bit more complex because it allows the mutex to be poisoned):
(impl) (T) ***************************
:?************************** (Sized)************************ Mutex) ***************************{************************** pubfn
lock (
&self****************************
> MutexGuard************************** T**********unsafeimpl ********************** (T) ****************************************: (****************************? Sized**************************************** (Send)> Syncfor Mutex {}
impl:?Sized**************************** DerefMut********************** (for MutexGuard
{/ * ... * /impl:?
Sized**************************** Deref********************** (for MutexGuard{/ * ... * /************************impl:?Sized), the actual API is a bit more complex because it allows the****************************), but it is much better to ask the operating system for help: “notify me know when it gets unlocked ”. Because the fast path involves just an atomic write, this makes it a very efficient primitive.Drop********************** (for MutexGuard
__,
,********* (T) ******************************************>{/ * ... * /
******************************************************A mutex providesmutably exclusiveaccess to its value. In a way this is synchronization primitive with only a limited part of
RefCell
s API: all borrows are mutable. LikeRefCell
all accesses happen through a smart pointer, in this caseMutexGuard
.In modern implementations a mutex atomically sets a flag when the mutex gets locked, and unset it when done. If another thread sees the mutex is locked while it tries to access it, it will wait. The waiting can be a spin loop (see the spin crate
Sadly not all implementations are modern, and some come with restrictions. For example Posix requires the mutex to be immovable even when not locked, forcing the standard library to put any mutex in an allocation on the heap any not allowing its use in statics. A custom implementation such as the one inparking_lotfixes these issues.
A
RefCell
panics when you try to take two mutable borrows in the same thread. AMutex
useally does not detect this case, and will just start waiting to acquire the lock. But because it is the current thread that holds the lock, it will never wake up again (and block any thread that tries to lock theMutex
after that).Simplified API ( (see documentation
RwLock
to be poisoned):
(impl) (T) ***************************
:?************************** (Sized)************************** (RwLock) ***************************{************************** pubfn read **************(&self
****************************
> RwLockReadGuard ************************** T********** pubfnwrite (&self****************************
> RwLockWriteGuard ************************** T**********unsafeimpl********************** (T) ****************************************: (****************************?Sized**************************************** (Send) Sync**************************** Sync********************** (for************************ (RwLock************************ ({**************) {} (**************************** impl:?Sized**************************** Deref********************** (for RwLockReadGuard{
/ * ... * /impl:
?Sized****************************Drop********************** (for RwLockReadGuard
__
,
,********* (T) ******************************************>{ / * ... * /
impl:?
Sized**************************** DerefMut********************** (for RwLockWriteGuard
{
/ * ... * /impl:?
Sized**************************** Deref********************** (for RwLockWriteGuard{
/ * ... * /impl:
?Sized****************************Drop********************** (for RwLockWriteGuard
__
,
,********* (T) ******************************************>{/ * ... * /******************************** / * optional * /
impl:?
Sized**************************** (RwLockWriteGuard) {**************************** pubfn downgrade (************************ (self) **************************->******************************RwLockReadGuard
Areader-writer lockis the full-featured multithreaded alternative to
RefCell
. AnRwLock
will allow any number of readers to acquire the lock as long as a writer is not holding the lock.There can be several strategies to implement an RW lock. Imagine this case: multiple threads keep taking
read
locks, and one thread wants to take a
write
lock. Should the reads be able to block the write indefinitely? Wikipedialists three priority policies:************** Read-preferring RW locksallow for maximum concurrency, but can lead to write-starvation if contention is high. Write-preferring RW locksavoid the problem of writer starvation by preventing any new readers from acquiring the lock if there is a writer queued and waiting for the lock ... The downside is that write-preferring locks allows for less concurrency in the presence of writer threads.
- Unspecified priority RW locksdoes not provide any guarantees with regards read vs. write access.
(************************************************************
The standard library explicitly does not specify the priority policy, it is just a thin wrapper around what the operating system provides.
parking_lot
(documents) *********** It offers awrite-preferring RW lock. Thespinlock-based implementationinspin
is an example of aread-preferring RW lock.A nice extra that
parking_lot
offers is adowngrade
method on its
RwLockWriteGuard
to turn it into aRwLockReadGuard
, without releasing the lock in the meantime.Reentrant Mutex
Basic API (************************** (documentation):
(impl) (T) ***************************
:?************************** (Sized)**************************ReentrantMutex{************************** pubfn
lock (
&self****************************
> ReentrantMutexGuard ************************** T**********unsafeimpl********************** (T) ****************************************: (****************************?Sized**************************************** (Send)> ************ (Sync) ************************** for ReentrantMutex************ () ************T (**************************>{}impl:?Sized**************************** Deref********************** (for ReentrantMutexGuard{/ * ... * /impl:?Sized****************************Drop********************** (for ReentrantMutexGuard__
,
,********* (T) ******************************************>{ / * ... * /******************************************************************
A reentrant mutex may be locked more than once by the same thread, typically because of recursion. The smart pointer returned when locking the
ReentrantMutex
can’t be mutable, because it may not be unique. So it does not provide interior mutability by itself. But it does give the guarantees that allow using it with a single-threaded one, likeRefCell
, because:it can hand out multiple shared references,
its wrapped type does not have to be sync
, because the references are all handed out within the same thread.crate seems to have arrived at the cleanest API. OnceCell (unsync) Reentrant mutexes are considered a code smell, as they typically indicate you are holding a lock too long. But there are valid uses, like
Stdout
which may be locked by a thread, whileprintln!
will also lock it for each write.A reentrant mutex requires a little more bookkeeping than a normal mutex. It has to keep track of the id of the thread that currently holds a lock, and it has to keep track of the number of locks held. The standard library does not expose its implementation of
ReentrantMutex
(remutex
does for an old version), butparking_lot
provides one.Mutability only during initialization
There is a common pattern where you need to do something at runtime to initialize a variable, but can’t do so before sharing it with other threads. Statics are the prime example. lazy_static
Used to be the go-to solution, but multiple crates created and reinvented the
OnceCell
Lazy
solution, of which theonce_cell
Basic API (********************************** (documentation) :
(impl) (T) ***************************
>********************** (OnceCell) ************************** () ************************** ((({pubfn (get the (&self
****************************
> ********* (Option ************************** T******************************** pubfnset (
&self(********************, ****************************
value: (T
->Result
T>
pubfn get_or_init
FnOnce****************** ) -> ************************** (T)>(**************************************&
&********** (self) ************************** ,f (**********************************************:F
****************************
> & ************************ (T)impl
,************************ FLazy******************
********************** (T) , F
****************************pub
const fn********************** (new) **************************
(
init
: ****************** F))
->
Lazy T,
************************* (F)
impl,************************ F (****************************:
FnOnce
******************
) -> ************************** (T)>Deref******************** (for(Lazy) ********************************************** ( (F)******************************For the singlethreaded case a
OnceCell
can be in two states (which by the way allows a size optimization using niche filling likeOption
:
uninitialized, which allow mutating it once with set
;initialized, which allow taking multiple shared references with get
.Any time
get
is called while uninitialized it will returnNone
, and any time
set
is called while initialized it will returnErr
(with the provided value).get_or_init
is the nice combination: it can always return a reference, because it will initialize theOnceCellif necessary by running the provided closure.
The key insight that makes
OnceCell
safe is that a mutation may really be done onlyonce. Recursive initialization (possibly accidental) within the closure ofget_or_init
thereforemust cause a panic. provides an variation based on the same principles that can hand out a mutable smart pointer first. But as soon as one normal reference is returned, it can never be mutated again. OnceCell (sync)
Lazy
Acts like a smart pointer build on top ofOnceCell
. The initialization closure is provided once withnew
, and run on the first use of
Lazy
throughDeref
.
mutate_once
Basic API (************************************ (documentation) ):
(impl) (T) ***************************
>********************** (OnceCell) ************************** () ************************** ((({pubfn (get the (&self
****************************
> ********* (Option ************************** T******************************** pubfnset (
&self(********************, ****************************
value: (T
->Result
T>
pubfn get_or_init
FnOnce****************** ) -> ************************** (T)>(**************************************&
&********** (self) ************************** ,f (**********************************************:F
****************************
> & unsafe************************ (T)impl
********************** (T) ****************************************:Send**************************** Sync>
************************ SyncforOnceCell******************
********************** (T)>{}impl,************************ F
Lazy******************
********************** (T) , F
****************************pub
const fn********************** (new) **************************
(
init
: ****************** F))
->
Lazy T,
************************* (F)
impl,************************ F (****************************:
FnOnce
******************
) -> ************************** (T)>Deref******************** (for(Lazy) ********************************************** ( (F)unsafeimpl********************** (T) , explores this technique (see also the (initial post on RedditF
: ****************** (Send Send
>************************ SyncforLazy******************
********************** (T) ,F
where OnceCell () **************************************T
(**************************: crateSync**************** ({**************) {} ******************************The multi-threaded variant of
OnceCell
is almost the same. The state has to be kept in an atomic, and it has a third state: other threads may observe one thread isrunningthe initialization clusure. At that point they should wait, either using a spinloop but preferably with assistance from the OS, just likeMutex
.Using a seperate 'owner' or 'token' to track references (*********************
A somewhat experimental concept is to bind a
Cell
to an ‘owner’ or ‘token’ type. If you have a shared reference to the owner type, you can create a shared reference to the inside of the boundCell
. And if you have a mutable reference to the owner type, you can also get one to theCell
.The owner can be shared freely between threads and passed through channels. I suppose that if you already need to do some communication through channels, this is a nice solution to avoid also taking a mutex (or passing large amounts of data through the channel).Still, when and how to use this pattern is not really clear to me, this description may not do it right.
The qcell
and theon the Rust Programming Language Forum.
A problem is how to conjure up a unique owner. One option is to just use a unique integer (
QCell
). Another is to use your own zero-sized marker type (TCell
). And a third is to use the unique lifetime of a closure (LCell
).(QCell)
Basic API of
QCell
:
(impl)
QCellOwner****************{ pubfnnew ()
************************** ->Self pubfn
cell**************************************** (self) ****************************
,
: ****************** T **************)
->
QCell************************ pubfnro********** (T ************ T)>
( ************************** &aself,
,********** (qc) ************************** (**************************:&'a ************** (QCell) ************************** ((()->&'a T **************
pubfnrw********** (T ************ T) **************************
>
(&
'a************ mut self
************, (qc) (**************************: ******************************************
'a
************ (QCell) ************************** (********************************** )************ ->**************************** 'a
*********** (mut) ******************************************** (T) pubfn rw2********** (T ************ T),
,********* (U) **************************>
&'a **************************,mut
********* (selfqc1: ******************
'a(QCell) >, qc2: ******************
'a(QCell) >->(
&
************************** (a)mut (****************** T) ***************************
,
,& ********** (T ************ T)************************** ('a) mut (U **********************)pubfnrw3,
,********* (U) **************************,
(
**********
(
&'a
mut
********* (self**************************,qc1: ******************
'a(QCell) >, qc2: ******************
'a(QCell) >, qc3: ******************
'a(QCell)******************************
>
->(
&
************************** (a)mut (****************** T) ***************************
,
,& ************************** ('a)mut (U **********************,
************ & 'amut (********************** (V) )
impl>************************ QCell{pubconst fn********************** (new) **************************(
******************owner: ******************
QCellOwner,: ****************** T **************)
->
QCell>impl: Send
Sync
********************** (Sync) **************************
for
QCell>A new
QCell
needs a reference to a
QCellOwner
when it is created by eitherQCellOwner :: cell
orQCell :: new
. One ugly part of the API is that the methods to take a reference to theQCell
,ro
, and to take a mutable reference,rw
, live on theQCellOwner
.While one cell is borrowed mutably, it is as if the owner is mutably borrowed. No other cells bound to the same
QCellOwner
can be borrowed at the same time. That is why if you need multiple references with at least one mutable, you need to take them all at once withrw2
orrw3
.(TCell and LCell) **************************
The main difference between
QCell
,TCell
andLCell
is how the owner is constructed. ForQCell
it is as simple asQCellOwner :: New ()
.For
TCell
You need to provide a marker type, that must be unique within the process, which is double-checked at runtime (there also isTLCell
, which must be unique per thread):
(struct)
Marker1****************************** let
owner1TCellOwner
::
******************)>
:: (new) ();****************************** An
LCellOwner
can’t be simply constructed. You have to run the code that uses it in a closure, where the owner is an argument to the closure. (GhostCell) initially explored this approach. The lifetime of the closure is the id of the owner:
(LCellOwner)
******************************::
****************************(|mutowner
****************************
/ * use `owner` * /);
QCell
is the most flexible.
TCell
andLCell
move more checks to compile time, but require more annotations or a certain code structure:(**************************************************************************************** (*************************************************************************************** (Cell) (Owner ID) (Cell overhead) (Owner check) Owner creation check (************************************************************************************************************************************************************************** (******************************************************************************************** QCell (integer) ********************************************************************************************** u*********************************************************************************************** (Runtime) ********************************************************************************************** (Runtime) ********************************************************************************************** (**************************************************************************************** (TCell ) or
(none) ********************************************************************************************** (Compile-time) ********************************************************************************************** (Runtime) ********************************************************************************************** (**************************************************************************************** (LCell (lifetime) ********************************************************************************************** (none) ********************************************************************************************** (Compile-time) ********************************************************************************************** (Compile-time) ********************************************************************************************** (********************************************************************************************** (************************************************************************************************TLCell
marker type) ********************************************************************************************** (**************************************************************************************** AtomicCell
You made it to the end ;-). How do the various API’s compare?
Cell
- like, allowing no references inside:(****************************************************************************************
type(requirements) ********************************************************************************** (******** (Sync? (restrictions) ***************************************************************************************** (************************************************************************************************************************************************************************** (******************************************************************************************
Cell**************************************************************************************
Copy(no) ********************************************************************************************** value must be copied or moved (**************************************************************************************** Cell
**************************** (none) (no) ********************************************************************************************** value must be moved withreplace
ortake
(**************************************************************************************** Atomic****************************
, integer-sized**********************************************************************************************
Send(yes) ********************************************************************************************** (like (Cell
****************************
**********************************************************************************************
Send(yes) ********************************************************************************************** (like (Cell) ********************************************************************************************** (********************************************************************************************** * Note: I used
Atomic
to describe atomic operations, andAtomicCellto describe a lock-based fallback.
RefCell
- like, ensuring at runtime there is only one& mut
reference:(****************************************************************************************
type(requirements) ********************************************************************************** (******** (Sync?
&
& mut
(restrictions) ***************************************************************************************** (************************************************************************************************************************************************************************** (******************************************************************************************** RefCell****************************************************************************************************** (none) (no) ********************************************************************************************** (yes) ********************************************************************************************** (yes) ********************************************************************************************** (may panic) ********************************************************************************************** (**************************************************************************************** Mutex ****************************
Send(yes) ********************************************************************************************** (no) ********************************************************************************************** (yes) ********************************************************************************************** may block or deadlock (**************************************************************************************** (RwLock **************************** Send Sync
************************************************************************************** (yes) ********************************************************************************************** (yes) ********************************************************************************************** (yes) ********************************************************************************************** may block or deadlock (**************************************************************************************** ReentrantMutex
Send**************************************************************************************************************************** (yes) ********************************************************************************************** (yes) ********************************************************************************************** (yes) ********************************************************************************************** may block or panic (********************************************************************************************** OnceCell, only mutable during initialization:
(****************************************************************************************
type(requirements) ********************************************************************************** (******** (Sync?
&
(restrictions) ***************************************************************************************** (************************************************************************************************************************************************************************** (****************************************************************************************** unsync :: OnceCell**************************************************************************************** (none) (no) ********************************************************************************************** (yes) ********************************************************************************************** (only mutable once) ********************************************************************************************** (****************************************************************************************
sync :: OnceCell**********************************************************************************************
SendSync
************************************************************************************** (yes) ********************************************************************************************** (yes) ********************************************************************************************** may block, only mutable once (**********************************************************************************************
(****************************************************************************************************This post concentrated on the possible API choices, and only touched upon the choices that can be made for the synchronization primitives.
Do you know of any other patterns? Or found a mistake? Please let me know.
********************************************************************************************** (****************************************************************************************************Read More
GIPHY App Key not set. Please check settings