3

EDIT: I've updated my examples to use the https://github.com/StephenCleary/AsyncEx library. Still waiting for usable hints.

There are resources, which are identified by strings (for example files, URLs, etc.). I'm looking for a locking mechanism over the resources. I've found 2 different solutions, but each has its problems:

The first is using the ConcurrentDictionary class with AsyncLock:

using Nito.AsyncEx;
using System.Collections.Concurrent;

internal static class Locking {
    private static ConcurrentDictionary<string, AsyncLock> mutexes
        = new ConcurrentDictionary<string, AsyncLock>();

    internal static AsyncLock GetMutex(string resourceLocator) {
        return mutexes.GetOrAdd(
            resourceLocator,
            key => new AsyncLock()
        );
    }
}

Async usage:

using (await Locking.GetMutex("resource_string").LockAsync()) {
    ...
}

Synchronous usage:

using (Locking.GetMutex("resource_string").Lock()) {
    ...
}

This works safely, but the problem is that the dictionary grows larger and larger, and I don't see a thread-safe way to remove items from the dictionary when no one is waiting on a lock. (I also want to avoid global locks.)

My second solution hashes the string to a number between 0 and N - 1, and locks on these:

using Nito.AsyncEx;
using System.Collections.Concurrent;

internal static class Locking {
    private const UInt32 BUCKET_COUNT = 4096;

    private static ConcurrentDictionary<UInt32, AsyncLock> mutexes
        = new ConcurrentDictionary<UInt32, AsyncLock>();

    private static UInt32 HashStringToInt(string text) {
        return ((UInt32)text.GetHashCode()) % BUCKET_COUNT;
    }

    internal static AsyncLock GetMutex(string resourceLocator) {
        return mutexes.GetOrAdd(
            HashStringToInt(resourceLocator),
            key => new AsyncLock()
        );
    }
}

As one can see, the second solution only decreases the probability of collisions, but doesn't avoid them. My biggest fear is that it can cause deadlocks: The main strategy to avoid deadlocks is to always lock items in a specific order. But with this approach, different items can map to the same buckets in different order, like: (A->X, B->Y), (C->Y, D->X). So with this solution one cannot lock on more than one resource safely.

Is there a better solution? (I also welcome critics of the above 2 solutions.)

Crouching Kitten
  • 1,048
  • 12
  • 21
  • The `Lock`/`Unlock` API looks a bit awkward and error prone. Wouldn't you prefer an API that takes advantage of the convenient [`lock`](https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/lock-statement) statement? For example: `lock (mutex.Get("some_string")) {/*protected region*/}` – Theodor Zoulias Feb 01 '20 at 20:48
  • @TheodorZoulias Thanks, I was thinking of that. The pro is that, in case of an exception the lock would be automatically removed. But as a next step i'm extending this for async cases, using the AsyncAutoResetEvent from a NuGet lib. And I can't see similar stuff implemented using the `lock` statement. – Crouching Kitten Feb 01 '20 at 21:37
  • @TheodorZoulias although I could use this: https://github.com/StephenCleary/AsyncEx#asynclock So I'll move into the `lock` statement direction, thanks. – Crouching Kitten Feb 01 '20 at 21:49
  • Implementing an async disposable locker based on `SemaphoreSlim` is pretty trivial, but on the other hand you couldn't go wrong by using Stephen Cleary's well tested library! – Theodor Zoulias Feb 02 '20 at 00:37
  • Related: [Asynchronous locking based on a key](https://stackoverflow.com/questions/31138179/asynchronous-locking-based-on-a-key) – Theodor Zoulias Nov 22 '21 at 22:10

1 Answers1

2

You could probably improve upon the first solution by removing a lock from the dictionary when it stops being in use. The removed locks could then be added to a small pool, so that the next time you need a lock you just grab one from the pool instead of creating a new one.


Update: Here is an implementation of this idea. It is based on SemaphoreSlims instead of Stephen Cleary's AsyncLocks, because a custom disposable is required in order to remove unused semaphores from the dictionary.

public class MultiLock<TKey>
{
    private object Locker { get; } = new object();
    private Dictionary<TKey, LockItem> Dictionary { get; }
    private Queue<LockItem> Pool { get; }
    private int PoolSize { get; }

    public MultiLock(int poolSize = 10)
    {
        Dictionary = new Dictionary<TKey, LockItem>();
        Pool = new Queue<LockItem>(poolSize);
        PoolSize = poolSize;
    }

    public WaitResult Wait(TKey key,
        int millisecondsTimeout = Timeout.Infinite,
        CancellationToken cancellationToken = default)
    {
        var lockItem = GetLockItem(key);
        bool acquired;
        try
        {
            acquired = lockItem.Semaphore.Wait(millisecondsTimeout,
                cancellationToken);
        }
        catch
        {
            ReleaseLockItem(lockItem, key);
            throw;
        }
        return new WaitResult(this, lockItem, key, acquired);
    }

    public async Task<WaitResult> WaitAsync(TKey key,
        int millisecondsTimeout = Timeout.Infinite,
        CancellationToken cancellationToken = default)
    {
        var lockItem = GetLockItem(key);
        bool acquired;
        try
        {
            acquired = await lockItem.Semaphore.WaitAsync(millisecondsTimeout,
                cancellationToken).ConfigureAwait(false);
        }
        catch
        {
            ReleaseLockItem(lockItem, key);
            throw;
        }
        return new WaitResult(this, lockItem, key, acquired);
    }

    private LockItem GetLockItem(TKey key)
    {
        LockItem lockItem;
        lock (Locker)
        {
            if (!Dictionary.TryGetValue(key, out lockItem))
            {
                if (Pool.Count > 0)
                {
                    lockItem = Pool.Dequeue();
                }
                else
                {
                    lockItem = new LockItem();
                }
                Dictionary.Add(key, lockItem);
            }
            lockItem.UsedCount += 1;
        }
        return lockItem;
    }

    private void ReleaseLockItem(LockItem lockItem, TKey key)
    {
        lock (Locker)
        {
            lockItem.UsedCount -= 1;
            if (lockItem.UsedCount == 0)
            {
                if (Dictionary.TryGetValue(key, out var stored))
                {
                    if (stored == lockItem) // Sanity check
                    {
                        Dictionary.Remove(key);
                        if (Pool.Count < PoolSize)
                        {
                            Pool.Enqueue(lockItem);
                        }
                    }
                }
            }
        }
    }

    internal class LockItem
    {
        public SemaphoreSlim Semaphore { get; } = new SemaphoreSlim(1);
        public int UsedCount { get; set; }
    }

    public struct WaitResult : IDisposable
    {
        private MultiLock<TKey> MultiLock { get; }
        private LockItem LockItem { get; }
        private TKey Key { get; }

        public bool LockAcquired { get; }

        internal WaitResult(MultiLock<TKey> multiLock, LockItem lockItem, TKey key,
            bool acquired)
        {
            MultiLock = multiLock;
            LockItem = lockItem;
            Key = key;
            LockAcquired = acquired;
        }

        void IDisposable.Dispose()
        {
            MultiLock.ReleaseLockItem(LockItem, Key);
            LockItem.Semaphore.Release();
        }
    }
}

Usage example:

var multiLock = new MultiLock<string>();
using (await multiLock.WaitAsync("SomeKey"))
{
    //...
}

The default pool size for unused semaphores is 10. The optimal value should be the number of the concurrent workers that are using the MultiLock instance.

I did a performance test in my PC, and 10 workers were able to acquire the lock asynchronously 500,000 times in total per second (20 different string identifiers were used).

Theodor Zoulias
  • 24,585
  • 5
  • 40
  • 69
  • In the moment the `WaitOne` returns, another thread can "lock" on the event before I remove it from the dictionary. Then a next thread would create a new event, and 2 would use the same resource at the same time. – Crouching Kitten Feb 01 '20 at 21:39
  • Yes, this solution should be implemented robustly with thread-safety in mind, and basing it on the `ConcurrentDictionary` may not cut it because this class lacks a conditional `TryRemove` method. The obvious alternative is a normal `Dictionary` protected with a `lock`, but then the question is how much contention will be created if all dictionary operations require exclusive access to a single shared lock. It depends on how **hot** will be the paths that invoke your `Lock` method. For less than 500,000 invocations per second the contention should be minuscule. – Theodor Zoulias Feb 02 '20 at 00:51
  • @CrouchingKitten I updated my answer with an implementation of the proposed solution. – Theodor Zoulias Feb 04 '20 at 06:41
  • Thanks, +1. I'll have to check it more. So the only function of the pool is to avoid performance problems from garbage collection? Hmm it's using a global lock for the dictionary, but I guess this can be solved by locking an a hash of the key and then modifying a ConcurrentDictionary, like if its keys belong to partitions. – Crouching Kitten Feb 04 '20 at 15:51
  • @CrouchingKitten yeap, with the pool only a handful of `SemaphoreSlim`s are created during the lifetime of the application. I don't think that replacing the `Dictionary`+`lock` with a lock-less `ConcurrentDictionary` is possible, because there are multiple operations that must be done atomically, when a lock is requested and when is released. These operations are very cheap though, so I don't expect this to be a problem. – Theodor Zoulias Feb 04 '20 at 19:08
  • I made an optimization in the code, removing an interface that would slow things down (casting a `struct` to an `interface` would result to a needless allocation). – Theodor Zoulias Feb 04 '20 at 19:09
  • I fixed a bug in the release logic in case of a cancellation exception. Btw the `MultiLock` is a class that I could have used myself some years ago when I was searching for something similar. But I don't think I could write it back then, due to limited multithreading knowledge. – Theodor Zoulias Feb 04 '20 at 21:51