MapSlim - From DictionarySlim to Fsion

PLEASE NOTE SOURCE CODE IS NO LONGER AVAILABLE AND THE POST IS HERE JUST FOR INFORMATION

This post is part of the F# Advent Calendar 2018 series. Many thanks again to Sergey Tihon for organizing these.

Recently while working on DictionarySlim and Fsion it became clear it is possible to create useful lock-free for read collections.

DictionarySlim is a Dictionary replacement that can be up to 2.5 times faster for the common TryGetValue & Add pattern. To do this it returns ref values. It is also more efficient with hash codes using & instead of the more costly %. It came from ideas used in the Benchmarks Game.

Fsion is a [WIP] bi-temporal database for F#. It stores a compressed set of historic values (with audit) for each entity-attribute. The idea is that functional techniques can be used to translate unstructured data to fully type safe representations. Compression and using attribute functions like Excel aims to keep database sizes minimal and possibly in memory.

Fsion locking model

For the in memory version the value DataSeries are stored in a Dictionary<EntityAttribute,DataSeries> collection. Each immutable DataSeries can be atomically replaced. All queries are executed up to a given transaction id or time. This simplifies the locking model as entries can be updated at the same time as a consistent set of reads are made.

The problem is that a ReaderWriterLock would still need to be used on the Dictionary for the read and write side. By creating lock-free for read collections the database can be simplified to fully lock-free for read access.

MapSlim, SetSlim and ListSlim

Three new collections MapSlim, SetSlim and ListSlim have been created in F#.

The collections are lock-free for read for immutable reference types or types that can be updated atomically. They are based on and show similar performance to DictionarySlim. Internally care must be taken to add to and resize the collection atomically. List would only require a small change for this to be true.

The limitation is the API for these collections is slimmed down and has no Remove or Clear. One benefit that comes from this is that the collection entries can also be indexed into.

Calling the possibly updating GetRef method must be done using a lock while manipulating the ref value.

These collections have great potential for caches and functionality such as memoize. The threading model becomes much simpler and eliminates the need to switch to concurrent collections.

Testing

Performance and multithreading testing provides a good demonstration of Expecto's abilities.

In under 250 lines of code the MapSlimTests have unit tests, property tests, performance tests and threading stress tests.

mapslim_stress

The other collections tests are SetSlimTests and ListSlimTests

mapslim_perf

The tests show large performance improvements over equivalent mutable collections. In multithreading situations such as server caches this difference will be even greater.

Conclusion

Fsion makes use of immutable DataSeries and lock-free read collections like MapSlim to create a lock-free row versioning style concurrency model.

This produces an interesting hybrid approach possible in a functional-first language. It simplifies often complex and error prone locking required in server caches.

In future posts I hope to demonstrate how combining all of this can lead to a simple yet high performance database and server.

Happy holidays!

References

Datomic
Datomic: Event Sourcing without the hassle
Data-First Architecture
FASTER