python cache

Python cache – the must read tips for code performance


Most of us may have experienced the scenarios that we need to implement some computationally expensive logic such as recursive functions or need to read from I/O or network multiple times, these functions typically requires more resources and longer CPU time, and eventually can cause performance issue if handle without care. For such case, you shall always pay special attention to it once you have completed all the functional requirements, as the additional costs on the resources and time may eventually lead to the user experience issue. In this article, I will be sharing how we can make use of the cache mechanism (aka memoization) to improve the code performance.


To follow the examples in below, you will need to have requests package installed in your working environment, you may use the below pip command to install:

pip install requests

With this ready, let’s dive into the problem we are going to solve today.

As I mentioned before, the computationally expensive logic such as recursive functions or reading from I/O or network usually have the significant impacts to the runtime, and are always the targets for optimization. Let me illustrate with a specific example, for instance, assume we need to call some external API to get the rates:

import requests
import json

def inquire_rate_online(dimension):
    result = requests.get(f"{dimension}")
    if result.status_code ==
        data = result.json()
        return data["args"]["dim"]
    return ''

This function needs to make a call through the network and return the result (for demo purpose, this API call just echo back the input as result). If you want to provide this as a service to everybody, there is a high chance that different people inquire the rate with same dimension value. And for this case, you may wish to have the result stored at somewhere after the first person inquired, so that later you can just return this result for the subsequent inquiry rather than making an API call again. With this sort of caching mechanism, it should speed up your code.

Implement cache with global dictionary

For the above example, the most straightforward way to implement a cache is to store the arguments and results in a dictionary, and every time we check this dictionary to see if the key exists before calling the external API. We can implement this logic in a separate function as per below:

cached_rate = {}
def cached_inquire(dim):
    if dim in cached_rate:
        print(f"cached value: {cached_rate[dim]}")
        return cached_rate[dim]
    cached_rate[dim]= inquire_rate_online(dim)
    print(f"result from online : {cached_rate[dim]}")
    return cached_rate[dim]

With this code, you can cache the previous key and result in the dictionary, so that the subsequent calls will be directly returned from the dictionary lookup rather than an external API call. This should dramatically improve your code performance since reading from dictionary is much faster than making an API through the network.

You can quickly test it from Jupyter Notebook with the %time magic:

%time cached_inquire(1)

For the first time you call it, you would see the time taken is over 1 seconds (depends on the network condition):

result from online : 1
Wall time: 1.22 s

When calling it again with the same argument, we should expect our cached result start working:

%time cached_inquire(1)

You can see the total time taken dropped to 997 microseconds for this call, which is over 1200 times faster than previously:

cached value: 1
Wall time: 997 µs

With this additional global dictionary, we can see so much improvement on the performance. But some people may have concern about the additional memory used to hold these values in a dictionary, especially if the result is a huge object such as image file or array. Python has a separate module called weakref which solves this problem.

Implement cache with weakref

Python introduced weakref to allow creating weak reference to the object and then garbage collection is free to destroy the objects whenever needed in order to reuse its memory.

For demonstration purpose, let’s modify our earlier code to return a Rate class instance as the inquiry result:

class Rate():
    def __init__(self, dim, amount):
        self.dim = dim
        self.amount = amount
    def __str__(self):
        return f"{self.dim} , {self.amount}"

def inquire_rate_online(dimension):
    result = requests.get(f"{dimension}")
    if result.status_code ==
        data = result.json()
        return Rate(float(data["args"]["dim"]), float(data["args"]["dim"]))
    return Rate(0.0,0.0)

And instead of a normal Python dictionary, we will be using WeakValueDictionary to hold a weak reference of the returned objects, below is the updated code:

import weakref

wkrf_cached_rate = weakref.WeakValueDictionary()
def wkrf_cached_inquire(dim):
    if dim in wkrf_cached_rate:
        print(f"cached value: {wkrf_cached_rate[dim]}")
        return wkrf_cached_rate[dim]

    result = inquire_rate_online(dim)
    print(f"result from online : {result}")
    wkrf_cached_rate[dim] = result
    return wkrf_cached_rate[dim]

With the above changes, if you run the wkrf_cached_inquire two times, you shall see the significant improvement on the performance:

python weakref cache

And the dictionary does not hold the instance of the Rate, rather a weak reference of it, so you do not have to worry about the extra memory used, because the garbage collection will reclaim it when it’s needed and meanwhile your dictionary will be automatically updated with the particular entry being removed. So subsequently the program can continue to call the external API like the first time.

If you stop your reading here, you will miss the most important part of this article, because what we have gone through above are good but just not perfect due to the below issues:

  • In the example, we only have 1 argument for the inquire_rate_online function, things are getting tedious if you have more arguments, all these arguments have to be stored as the key for the dictionary. In that case, re-implement the caching as a decorator function probably would be easier
  • Sometimes you do not really want to let garbage collection to determine which values to be cached longer than others, rather you want your cache to follow certain logic, for instance, based on the time from the most recent calls to the least recent calls, aka least recent used, to store the cache

If the least recent used cache mechanism makes sense to your use case, you shall consider to make use of the lru_cache decorator from functools module which will save you a lot of effort to reinvent the wheels.

Cache with lru_cache

The lru_cache accepts two arguments :

  • maxsize to limit the size of the cache, when it is None, the cache can grow without bound
  • typed when set it as True, the arguments of different types will be cached separately, e.g. wkrf_cached_inquire(1) and wkrf_cached_inquire(1.0) will be cached as different entries

With the understanding of the lru_cache, let’s decorate our inquire_rate_online function to have the cache capability:

from functools import lru_cache

def inquire_rate_online(dimension):
    result = requests.get(f"{dimension}")
    if result.status_code ==
        data = result.json()
        return Rate(float(data["args"]["dim"]), float(data["args"]["dim"]))
    return Rate(0.0,0.0)

If we re-run our inquire_rate_online twice, you can see the same effect as previously in terms of the performance improvement:

Python cache with lru_cache

And with this decorator function, you can also see the how the cache is used. The hits shows no. of calls have been returned from the cached results:

#CacheInfo(hits=1, misses=1, maxsize=None, currsize=1)

Or you can manually clear all the cache to reset the hits and misses to 0:



Let’s also talk about the limitations of the solutions we discussed above:

  • The cache mechanism works best for the deterministic function meaning by given the same set of inputs, it always returns the same set of results. And you would not benefit much if you try to cache the result of a nondeterministic function, e.g.:
def random_x(x):
    return x*random.randint(1,1000)
  • For keyword arguments, if you swap the position of the keywords, the two calls will be cached as separate entries
  • It only works for the arguments that are immutable data type.


In this article, we have discussed about the different ways of creating cache to improve the code performance whenever you have computational expensive operations or heavy I/O or network reads. Although lru_cache decorator provide a easy and clean solution for creating cache but it would be still better that you understand the underline implementation of cache before we just take and use.

We also discussed about the limitations for these solutions that you may need to take note before implementing. Nevertheless, it would still help you in a lot of scenarios where you can make use of these methods to improve your code performance.

You may also like

0 0 vote
Article Rating
Notify of
Inline Feedbacks
View all comments
Would love your thoughts, please comment.x