Cache Writing: How to Design an In-Memory Cache

Analysis and design

Assume a project with relatively high concurrency that uses multi-level caching: an in-memory cache sits in front of Redis, which in turn shields the database.

Before actually designing the in-memory cache, we need to consider the following issues:

1: Data exchange between memory and Redis, to improve the in-memory hit rate and reduce pressure on the next cache level.

2: Memory capacity is limited, so the number of cached entries must be controlled.

3: Hot data is updated at different rates, so the expiration time of each individual key must be configurable.

4: A sound cache-expiration deletion strategy.

5: The cache's data structures should be as simple as possible.

Regarding replacement and hit rate: the LRU algorithm is adopted because it is easy to implement and achieves a good key hit rate.

LRU means evicting the data accessed least recently; data that is accessed often is hot data.

Regarding the LRU data structure: because keys get promoted and evicted, an ordered structure is needed. Most implementations found online use a linked list.

That is, new data is inserted at the head of the list and hit data is moved to the head; insertion is O(1), while moving or finding an entry is O(N). A minimal sketch of that approach follows.
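For comparison only, here is a minimal sketch of that linked-list approach under the stated assumptions (the class name LinkedLruCache is illustrative and not part of the implementation below; duplicate keys are not handled):

<pre>
using System.Collections.Generic;

// Sketch of the classic linked-list LRU described above.
// Add inserts at the head in O(1); a hit must first scan for the node, which is O(N).
public class LinkedLruCache<TValue>
{
    private readonly int maxSize;
    private readonly LinkedList<KeyValuePair<string, TValue>> list =
        new LinkedList<KeyValuePair<string, TValue>>();

    public LinkedLruCache(int maxSize) { this.maxSize = maxSize; }

    public void Add(string key, TValue value)
    {
        if (list.Count >= maxSize)
            list.RemoveLast();  // evict the least recently used entry at the tail
        list.AddFirst(new KeyValuePair<string, TValue>(key, value));
    }

    public TValue Get(string key)
    {
        var node = list.First;
        while (node != null && node.Value.Key != key)  // O(N) search
            node = node.Next;
        if (node == null) return default(TValue);
        list.Remove(node);     // move the hit node to the head: it is hot again
        list.AddFirst(node);
        return node.Value.Value;
    }
}
</pre>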

Is there anything less complex? A dictionary gives O(1) complexity and the best performance. But then how do we track the priority of each cached key?

O(1) LRU Implementation

Define an LRUCache<TValue> class whose constructor parameter maxKeySize controls the maximum number of cached entries.

A ConcurrentDictionary serves as the cache container and guarantees thread safety.

<pre>
public class LRUCache<TValue> : IEnumerable<KeyValuePair<string, TValue>>
{
    private long ageToDiscard = 0;  // starting age for eviction
    private long currentAge = 0;    // latest age in the cache
    private int maxSize = 0;        // maximum cache capacity
    private readonly ConcurrentDictionary<string, TrackValue> cache;

    public LRUCache(int maxKeySize)
    {
        cache = new ConcurrentDictionary<string, TrackValue>();
        maxSize = maxKeySize;
    }
}
</pre>

The two monotonically increasing counters defined above, ageToDiscard and currentAge, mark how old or new each key in the cache is.

The implementation steps are as follows:

Each time a key is added, currentAge is incremented and its value is assigned to the Age of the cached entry. currentAge only ever increases.

<pre>
public void Add(string key, TValue value)
{
    Adjust(key);
    var result = new TrackValue(this, value);
    cache.AddOrUpdate(key, result, (k, o) => result);
}

public class TrackValue
{
    public readonly TValue Value;
    public long Age;

    public TrackValue(LRUCache<TValue> lv, TValue tv)
    {
        Age = Interlocked.Increment(ref lv.currentAge);
        Value = tv;
    }
}
</pre>

When adding, if the cache has reached its maximum size, check whether a key with age ageToDiscard exists in the dictionary. If not, keep incrementing ageToDiscard and checking in a loop; once such a key is found it is deleted, and the add can proceed.

The invariant ageToDiscard + maxSize = currentAge ensures old data can be evicted in O(1) rather than by moving linked-list nodes. For example, with maxSize = 3, adding a fourth key pushes currentAge to 4, so ageToDiscard advances to 1 and the entry with Age 1, the oldest, is removed.

<pre>
public void Adjust(string key)
{
    while (cache.Count >= maxSize)
    {
        long ageToDelete = Interlocked.Increment(ref ageToDiscard);
        var toDiscard = cache.FirstOrDefault(p => p.Value.Age == ageToDelete);
        if (toDiscard.Key == null)
            continue;
        TrackValue old;
        cache.TryRemove(toDiscard.Key, out old);
    }
}
</pre>

When a key is read it counts as a fresh access, so the latest currentAge is assigned to it, raising its priority:

<pre>
public TValue Get(string key)
{
    TrackValue value;
    if (cache.TryGetValue(key, out value))
    {
        value.Age = Interlocked.Increment(ref currentAge);
        return value.Value;
    }
    return default(TValue);  // avoid a null dereference on a cache miss
}
</pre>
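A minimal usage sketch of the pieces so far (assuming the IEnumerable members, which are omitted here, are implemented):

<pre>
var cache = new LRUCache<string>(2);
cache.Add("a", "1");   // Age 1
cache.Add("b", "2");   // Age 2
cache.Get("a");        // refreshes "a" to Age 3: it is now the hottest key
cache.Add("c", "3");   // capacity reached: Adjust evicts "b", the oldest untouched key
</pre>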

Expiration deletion strategy

In most cases LRU achieves a very high hit rate on hot data. However, a sudden burst of one-off accesses can fill memory with cold data, a problem known as cache pollution.

Cache pollution prevents LRU from hitting hot data, causing the cache system's hit rate to drop sharply. Variants such as LRU-K, 2Q, and MQ can be used to improve the hit rate.
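As a rough illustration of the LRU-K idea only (this sketch, including the LruKFilter name, is an assumption and not part of the cache above): a key is admitted into the main cache only after it has been accessed K times, so one-off scans of cold data cannot displace hot keys.

<pre>
using System.Collections.Concurrent;

// Simplified LRU-K-style admission filter (illustrative sketch).
// In practice the history dictionary itself would also need bounding and expiry.
public class LruKFilter<TValue>
{
    private readonly int k;
    private readonly LRUCache<TValue> cache;  // the cache built above
    private readonly ConcurrentDictionary<string, int> history =
        new ConcurrentDictionary<string, int>();

    public LruKFilter(int k, LRUCache<TValue> cache)
    {
        this.k = k;
        this.cache = cache;
    }

    public void Access(string key, TValue value)
    {
        int seen = history.AddOrUpdate(key, 1, (ky, n) => n + 1);  // count accesses
        if (seen >= k)
        {
            cache.Add(key, value);  // seen K times: promote into the real cache
            int dummy;
            history.TryRemove(key, out dummy);
        }
    }
}
</pre>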

Expiration configuration

A maximum expiration time is set so that cold data does not linger in memory any longer than necessary.

In most cases the caching lifetime required for each piece of data differs, so a per-key expiration-time field is needed.

<pre>
private TimeSpan maxTime;

public LRUCache(int maxKeySize, TimeSpan maxExpireTime)
{
    cache = new ConcurrentDictionary<string, TrackValue>();
    maxSize = maxKeySize;
    maxTime = maxExpireTime;
}

// TrackValue gains a creation time and a per-key expiration time
public readonly DateTime CreateTime;
public readonly TimeSpan ExpireTime;
</pre>

Deletion strategy

The best way to delete expired keys would be timed deletion, which frees the occupied memory as quickly as possible, but a large number of timers is obviously very unfriendly to the CPU.

So we use lazy deletion instead: whenever a key is accessed, check whether it has expired, and delete it on the spot if it has.

<pre>
public Tuple<TrackValue, bool> CheckExpire(string key)
{
    TrackValue result;
    if (cache.TryGetValue(key, out result))
    {
        var age = DateTime.Now.Subtract(result.CreateTime);
        if (age >= maxTime || age >= result.ExpireTime)
        {
            TrackValue old;
            cache.TryRemove(key, out old);
            return Tuple.Create(default(TrackValue), false);
        }
        return Tuple.Create(result, true);
    }
    return Tuple.Create(default(TrackValue), false);  // a missing key is not valid
}
</pre>
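One way to hook the check into the read path, shown as a sketch that revises the earlier Get:

<pre>
public TValue Get(string key)
{
    // Lazy deletion: validate expiration on every read.
    var checkedValue = CheckExpire(key);
    if (checkedValue.Item1 == null || !checkedValue.Item2)
        return default(TValue);  // missing, or just expired and removed

    // Still valid: refresh its age exactly as the original Get did.
    checkedValue.Item1.Age = Interlocked.Increment(ref currentAge);
    return checkedValue.Item1.Value;
}
</pre>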

Although lazy deletion performs best, on its own it still does not solve cache pollution by cold data, so periodic cleanup must be combined with lazy deletion.

For example, a single thread can traverse the cache every five minutes to check which keys have expired. This interval should be configurable, and if the cache holds many entries the traversal can be done in batches.

<pre>
public void Inspection()
{
    // Periodic cleanup: walk every entry and let CheckExpire remove expired keys.
    foreach (var item in this)
    {
        CheckExpire(item.Key);
    }
}
</pre>
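One way to wire this up, as a sketch: a System.Threading.Timer field added to the cache class, with the interval supplied from configuration (StartInspection is an illustrative name, not part of the original code; the five-minute figure above is just the example interval):

<pre>
private System.Threading.Timer cleanupTimer;

// Start periodic cleanup on a single background timer thread.
public void StartInspection(TimeSpan interval)
{
    cleanupTimer = new System.Threading.Timer(_ => Inspection(), null, interval, interval);
}
</pre>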

Lazy deletion combined with periodic deletion can satisfy most requirements.

Summary

This article draws on the related implementations in Redis and Orleans.

If we keep improving it, this becomes the embryonic form of an in-memory database similar to Redis: for example, adding key-deletion notification callbacks or supporting more data types.



