
Ring: Advanced cache interface for Python

174 points | youknowone | 6 years ago | ring-cache.readthedocs.io

45 comments

[+] suvelx|6 years ago|reply
Every example seems to follow this pattern

  client = pymemcache.client.Client(('127.0.0.1', 11211))  # create a client

  # save to memcache client, expire in 60 seconds.
  @ring.memcache(client, expire=60)
  def get_url(url):
      return requests.get(url).content

How are you supposed to configure the client at 'runtime' instead of 'compile time' (when the code is executed and not when it's imported)?

Careful placement of imports in order to correctly configure something just introduces delicate pain points. It'll work now, but an absent-minded import somewhere else later can easily lead to hours of debugging.

[+] sametmax|6 years ago|reply

   @ring.memcache(client, expire=60)   
   def get_url(url):
       return requests.get(url).content
can be written:

    def get_url(url):
        return requests.get(url).content

    get_url = ring.memcache(client, expire=60)(get_url)
Decorators are just syntactic sugar for that pattern.

You are then welcome to instantiate your ring.memcache object and bind it where it pleases you.
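That equivalence can be checked with any decorator. A minimal runnable sketch, using a hypothetical memoize decorator standing in for ring.memcache (the expire argument is accepted but ignored here):

```python
# Hypothetical stand-in for ring.memcache; expire is accepted but unused.
def memoize(expire=None):
    def decorator(func):
        store = {}
        def wrapper(*args):
            if args not in store:
                store[args] = func(*args)
            return store[args]
        return wrapper
    return decorator

calls = []

# Decorator syntax...
@memoize(expire=60)
def double(x):
    calls.append(x)
    return x * 2

# ...is identical to applying the decorator by hand:
def triple(x):
    return x * 3

triple = memoize(expire=60)(triple)
```

Both functions end up memoized the same way; the only difference is where the wrapping happens in the source.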

I would have provided a different API though:

   cache = pymemcache.client.Client(('127.0.0.1', 11211))

   @cache.lru(expire=60) # wrapper of ring.cache(client)
   def get_url(url):
       return requests.get(url).content
And accepted the alternative:

   cache = pymemcache.client.Client(conf_factory)

   def get_url(url):
       return requests.get(url).content

   get_url = cache.wraps.lru(get_url, expire=60)
  
It's better not to expect everyone to know the details of decorators just to use your API, and a factory is a nice hook to have anyway: it says where the code for that dynamic configuration should live, and code as documentation is the best doc.

Also a patch() context manager would be nice for temporary caching:

   with cache.patch('module.lru', expire=60):
        get_url()
But it's hard to do in a thread-safe way, so compromises would have to be made.
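For what it's worth, a single-threaded version of that patch() idea can be sketched with a context manager (the names and API here are hypothetical, not Ring's):

```python
import contextlib
import functools
import types

@contextlib.contextmanager
def patch_with_cache(owner, name):
    """Temporarily replace owner.name with a memoized wrapper."""
    original = getattr(owner, name)
    store = {}

    @functools.wraps(original)
    def cached(*args):
        if args not in store:
            store[args] = original(*args)
        return store[args]

    setattr(owner, name, cached)
    try:
        yield
    finally:
        # This swap is the thread-unsafe part: another thread can
        # observe either version while the block is active.
        setattr(owner, name, original)

# stand-in for a real module holding the function to patch
calls = []

def get_url(url):
    calls.append(url)
    return "body:" + url

mymodule = types.SimpleNamespace(get_url=get_url)
```

Inside `with patch_with_cache(mymodule, "get_url"):` repeated calls hit the memo; on exit the original, uncached function is restored.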
[+] vngzs|6 years ago|reply
You can use a closure to pass in the configuration.

    def configure_memcache(client_ip, port):
        client = pymemcache.client.Client((client_ip, port))
        @ring.memcache(client, expire=60)
        def get_url(url):
            return requests.get(url).content

        return get_url
Then in your code which imports the above library:

    get_url = configure_memcache('127.0.0.1', 11211)
    result = get_url('https://www.google.com')
[+] youknowone|6 years ago|reply
This is a good point. The asyncio backends now partially take an initializer function, because calling await at import time is nonsense.

I think it also needs to take a client configuration or a client initializer. Any advice from your use case?

[+] zzzeek|6 years ago|reply
dogpile.cache author here.

The way dogpile does this is that your decorator is configured in terms of a cache region object, which you configure with backend and options at runtime.

https://dogpilecache.sqlalchemy.org/en/latest/usage.html#reg...
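The region idea can be sketched in a few lines of plain Python (this is an illustration of the concept, not dogpile's actual implementation; only the `cache_on_arguments` name mirrors dogpile's decorator):

```python
class CacheRegion:
    def __init__(self):
        self.backend = None  # chosen later, at runtime

    def configure(self, backend):
        # Here backend is any dict-like store; the real thing takes a
        # backend name plus options (server address, expiration, ...).
        self.backend = backend

    def cache_on_arguments(self):
        def decorator(func):
            def wrapper(*args):
                if self.backend is None:
                    return func(*args)  # unconfigured: call through
                key = (func.__name__,) + args
                if key not in self.backend:
                    self.backend[key] = func(*args)
                return self.backend[key]
            return wrapper
        return decorator

region = CacheRegion()  # created at import time, unconfigured
calls = []

@region.cache_on_arguments()
def get_data(x):
    calls.append(x)
    return x * 2

region.configure({})  # later, at application startup
```

The decorators only ever reference the region object, so swapping the backend (or leaving it unconfigured in tests) never touches the decorated functions.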

I got this general architectural concept from my Java days, observing what EHCache did (that's where the word "region" comes from).

[+] orf|6 years ago|reply
Surely it's just:

   client = pymemcache.client.Client(('127.0.0.1', 11211))
   cache_wrapper = ring.memcache if some_condition else ring.whatever

   @cache_wrapper(...)
   def ...
[+] coleifer|6 years ago|reply
Extremely poor design:

* Not DRY. What if I want to use a cache in production but disable caching in development? And I have tens or even hundreds of functions that rely on the cache? Because the decorators contain implementation/client-specific parameters, I now have to add another entire layer of abstraction over this.

* Implementation is tied to the decorator, e.g. `ring.memcache` -- seriously? Why does it matter?

* What about setting application defaults, such as an encoding scheme, a key prefix/namespace, a default timeout?

I'm sorry but this is over-engineered garbage and good luck to anyone who uses it.
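For context, the abstraction layer the first bullet asks for can be quite small. A sketch with all names hypothetical: app-wide defaults (backend choice, key prefix) live in one object, and a null backend disables caching in development without touching any decorated function:

```python
import functools

class NullBackend:
    """Never stores anything -> caching effectively disabled (dev mode)."""
    def get(self, key):
        return None

    def set(self, key, value):
        pass

class DictBackend:
    """In-memory store, standing in for memcached/redis in this sketch."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

class AppCache:
    def __init__(self, backend, prefix="myapp:"):
        self.backend = backend  # implementation detail hidden here
        self.prefix = prefix    # application-wide key namespace

    def cached(self, func):
        @functools.wraps(func)
        def wrapper(*args):
            key = "%s%s:%r" % (self.prefix, func.__name__, args)
            hit = self.backend.get(key)
            if hit is None:  # caveat: None results are never cached
                hit = func(*args)
                self.backend.set(key, hit)
            return hit
        return wrapper

# one switch per environment; the decorated functions never change
cache = AppCache(DictBackend())
dev_cache = AppCache(NullBackend())

calls = []

@cache.cached
def expensive(x):
    calls.append(x)
    return x + 1

dev_calls = []

@dev_cache.cached
def expensive_dev(x):
    dev_calls.append(x)
    return x + 1
```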

[+] youknowone|6 years ago|reply
I agree these are missing features, but they are easy goals with small refactoring, so they will be solved soon. Issue #129 is about application defaults. After that, dry-run is just a matter of replacing the default action from 'get_or_update' to 'execute'.
[+] whalesalad|6 years ago|reply
> I'm sorry but this is over-engineered garbage and good luck to anyone who uses it.

I agree and wish people would speak up and share sentiment like this more often.

[+] tyingq|6 years ago|reply
Is there a Python equivalent to PHP's APCu? APCu, in the PHP world, leverages mmap to provide a multi-process kv store with fast, built-in serialization. So it's simple and very fast for single-server, multi-process caching.
[+] bpicolo|6 years ago|reply
It's not necessary in Python (or many other server frameworks), because Python doesn't typically follow a process-per-request model. You can just stick it in memory available to all of your threads.
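A minimal sketch of that point: in a single multi-threaded process, a module-level dict guarded by a lock already gives a process-wide cache. (A pre-forking multi-process server, the scenario APCu targets, would still need shared memory or an external store.)

```python
import threading

_cache = {}
_cache_lock = threading.Lock()

def cache_get(key, default=None):
    # the lock keeps compound operations consistent across threads
    with _cache_lock:
        return _cache.get(key, default)

def cache_set(key, value):
    with _cache_lock:
        _cache[key] = value
```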
[+] kristoff_it|6 years ago|reply
Great project. There is only one angle that I feel is missing: multiple requests for the same resource can cause duplicated work, especially if the value-generating function is slow.

I wrote a sample solution to that problem, feel free to reach out if you ever consider adding a similar feature, I'd be happy to contribute. (fyi: the current implementation is in Go)

https://github.com/kristoff-it/redis-memolock
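The in-process version of that idea (the linked memolock does it across processes via Redis) can be sketched with an event per key, so concurrent callers wait on one computation instead of repeating it:

```python
import threading

class Coalescer:
    """Deduplicate concurrent computations of the same key (sketch)."""
    def __init__(self):
        self._lock = threading.Lock()
        self._results = {}
        self._pending = {}  # key -> Event set when the value is ready

    def get_or_compute(self, key, compute):
        with self._lock:
            if key in self._results:
                return self._results[key]
            event = self._pending.get(key)
            if event is None:
                # first caller for this key: it will run compute()
                self._pending[key] = threading.Event()
        if event is not None:
            event.wait()  # another caller is computing; wait for it
            return self._results[key]
        value = compute()
        with self._lock:
            self._results[key] = value
            self._pending.pop(key).set()
        return value
```

The lock only guards the bookkeeping; the slow compute() runs outside it, so other keys are never blocked.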

[+] youknowone|6 years ago|reply
Actually this is a common request from users, but it hasn't been solved yet. I will check the project, thanks!
[+] bsdz|6 years ago|reply
Looks extensive and I'll likely try using the module at some point.

One thing, why not stash all the function methods under a "ring" or "cache" attribute, eg

  @ring.lru()
  def foo():
      ...

  foo.cache.update()
  foo.cache.delete() 
  ..
This might be less likely to clash with any existing function attributes (if you're wrapping a 3rd party function say).
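A sketch of that suggestion (hypothetical code, not Ring's API): the wrapper exposes a single .cache attribute holding all control methods, so only one name can ever collide with an existing function attribute:

```python
import functools

class CacheControls:
    """Namespace object attached to the wrapped function."""
    def __init__(self, store, func):
        self._store = store
        self._func = func

    def update(self, *args):
        self._store[args] = self._func(*args)  # recompute and overwrite

    def delete(self, *args):
        self._store.pop(args, None)

def lru(func):
    store = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in store:
            store[args] = func(*args)
        return store[args]

    wrapper.cache = CacheControls(store, func)  # the one attribute added
    return wrapper
```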
[+] youknowone|6 years ago|reply
Thanks for the great advice. I never thought about this problem.
[+] mrlinx|6 years ago|reply
Like this a lot.

How could you invalidate only the entries related to a specific client/customer/account?

I wonder how they cascade these invalidations in bigger and more complex systems.

[+] youknowone|6 years ago|reply
It doesn't have any cascading feature for now. @ring.redis_hash can be helpful for certain cases, but it is not a generic solution.

In the future, there is a plan for indirect invalidation, which will use another key to decide expiration. Though it is not designed for cascading, it will probably work for some of these cases.

[+] ergo14|6 years ago|reply
The API doesn't seem to be as fleshed out as dogpile.cache's yet.

Normally you don't want to pass cache backend instance to decorators on module level.

[+] TeeWEE|6 years ago|reply
How does this compare to dogpile?
[+] youknowone|6 years ago|reply
I reviewed a few cache libraries, but this is a new one I hadn't checked.

Roughly, Ring consists of 6 key features: sub-functions, the universal decorator, data coders, asyncio support, consistent and readable key generation, and abstract-transparent back-end access. I will check dogpile soon, thanks.

[+] Dowwie|6 years ago|reply
no mutex dogpile lock or get_or_create functionality..
[+] mychael|6 years ago|reply
>Cache is a popular concept widely spread on the broad range of computer science but its interface is not well developed yet.

This sentence is grammatically incorrect. Replace "Cache" with "Caching".

[+] alexeiz|6 years ago|reply
I needed something like this that allows access to and manual manipulation of the cache, and I ended up forking functools.lru_cache code. This library definitely fits the bill.
[+] tomnipotent|6 years ago|reply
> Memcached itself is out of the Python world

Don't know why this bothers me so much... but it's actually from Perl. It was born at LiveJournal, a well-known Perl shop.

[+] jteppinette|6 years ago|reply
I actually read this as outside.
[+] merlincorey|6 years ago|reply
To me, mocking of the caches for testing is super important and missing.

I searched the article, the linked "Why Ring?", and this page of responses for "mock", but no results.

Maybe it's just me!

[+] youknowone|6 years ago|reply
Thanks. I hadn't thought of adding this to the "Why Ring?" page. For now, real projects work like:

  if DEBUG:
      ring_cache = functools.partial(ring.dict, {}, default_action='execute')
  else:
      ring_cache = functools.partial(ring.redis, client)

  @ring_cache(...)
  def ...

Which is not a very good solution at all. I will fix the design and properly document it. Thanks for suggesting the "Why Ring?" page and a mock section.
[+] Dowwie|6 years ago|reply
no dogpile lock support?
[+] youknowone|6 years ago|reply
I want to say "not yet". It is a shame that I didn't know about the dogpile lock.