MapSlim - From DictionarySlim to Fsion
14 Dec 2018PLEASE NOTE SOURCE CODE IS NO LONGER AVAILABLE AND THE POST IS HERE JUST FOR INFORMATION
This post is part of the F# Advent Calendar 2018 series. Many thanks again to Sergey Tihon for organizing these.
Recently while working on DictionarySlim and Fsion it became clear it is possible to create useful lock-free for read collections.
DictionarySlim
is a Dictionary
replacement that can be up to 2.5 times faster for the common TryGetValue
& Add
pattern.
To do this it returns ref
values.
It is also more efficient with hash codes using &
instead of the more costly %
.
It came from ideas used in the Benchmarks Game.
Fsion is a [WIP] bi-temporal database for F#. It stores a compressed set of historic values (with audit) for each entity-attribute. The idea is that functional techniques can be used to translate unstructured data to fully type safe representations. Compression and using attribute functions like Excel aims to keep database sizes minimal and possibly in memory.
Fsion locking model
For the in memory version the value DataSeries
are stored in a Dictionary<EntityAttribute,DataSeries>
collection.
Each immutable DataSeries
can be atomically replaced.
All queries are executed up to a given transaction id or time.
This simplifies the locking model as entries can be updated at the same time as a consistent set of reads are made.
The problem is that a ReaderWriterLock
would still need to be used on the Dictionary
for the read and write side.
By creating lock-free for read collections the database can be simplified to fully lock-free for read access.
MapSlim, SetSlim and ListSlim
Three new collections MapSlim, SetSlim and ListSlim have been created in F#.
The collections are lock-free for read for immutable reference types or types that can be updated atomically.
They are based on and show similar performance to DictionarySlim
.
Internally care must be taken to add to and resize the collection atomically.
List
would only require a small change for this to be true.
The limitation is the API for these collections is slimmed down and has no Remove
or Clear
.
One benefit that comes from this is that the collection entries can also be indexed into.
Calling the possibly updating GetRef
method must be done using a lock while manipulating the ref
value.
These collections have great potential for caches and functionality such as memoize. The threading model becomes much simpler and eliminates the need to switch to concurrent collections.
Testing
Performance and multithreading testing provides a good demonstration of Expecto's abilities.
In under 250 lines of code the MapSlimTests have unit tests, property tests, performance tests and threading stress tests.
The other collections tests are SetSlimTests and ListSlimTests
The tests show large performance improvements over equivalent mutable collections. In multithreading situations such as server caches this difference will be even greater.
Conclusion
Fsion makes use of immutable DataSeries
and lock-free read collections like
MapSlim
to create a lock-free row versioning style concurrency model.
This produces an interesting hybrid approach possible in a functional-first language. It simplifies often complex and error prone locking required in server caches.
In future posts I hope to demonstrate how combining all of this can lead to a simple yet high performance database and server.
Happy holidays!
References
Datomic
Datomic: Event Sourcing without the hassle
Data-First Architecture
FASTER