The QuerySet Cache

Overview

A QuerySet represents a collection of objects from your database. Every model has at least one manager, and the default manager is named objects. No database activity occurs until the QuerySet is evaluated.

What is QuerySet in Django?

A QuerySet represents a collection of objects from your database. It can have zero, one, or many filters. In SQL terms, a QuerySet corresponds to a SELECT statement and a filter to a limiting clause such as WHERE or LIMIT. You get a QuerySet from a model's Manager. Every model has at least one manager, and the default manager is named objects. Constructing, filtering, slicing, and passing around a QuerySet does not hit the database. No database activity occurs until the QuerySet is evaluated.

Evaluation

Evaluation is the process of hitting the database. When you first iterate over a QuerySet, the rows it matches are fetched from the database and converted into Django model instances. The QuerySet stores these instances in its built-in cache, so iterating over the same QuerySet again later does not hit the database again.

Enabling Cache

To use the cache, save the QuerySet in a variable and reuse that variable wherever the results are needed. Django's QuerySet class has a _result_cache attribute that stores the query results (Django model objects) in a list. _result_cache is None when nothing has been cached yet; otherwise it is a list of model objects. Iterating over a cached QuerySet is equivalent to iterating over _result_cache. If you build the same QuerySet twice and evaluate each one, both hit the database and are then thrown away, because neither was stored for future use. If you instead store the QuerySet in a variable, its first evaluation stores the results in its cache, _result_cache, and later uses of the variable read from that cache.
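The caching behaviour described above can be imitated in a few lines of plain Python. The sketch below is a toy model, not Django's implementation: the class LazyQuerySet, the fetch_rows function, and the hits list are hypothetical names invented for illustration, and only the attribute name _result_cache comes from the text.

```python
class LazyQuerySet:
    """Toy stand-in for a QuerySet: lazy until iterated, then cached."""

    def __init__(self, fetch):
        self._fetch = fetch          # hypothetical callable that "hits the database"
        self._result_cache = None    # None means "not evaluated yet"

    def __iter__(self):
        if self._result_cache is None:
            # First iteration: evaluate once and keep the rows as a list.
            self._result_cache = list(self._fetch())
        return iter(self._result_cache)


hits = []  # records every simulated database hit


def fetch_rows():
    hits.append("SELECT")
    return ["first headline", "second headline"]


# Two separate query sets: each one is evaluated, then thrown away.
list(LazyQuerySet(fetch_rows))
list(LazyQuerySet(fetch_rows))
hits_without_reuse = len(hits)

# One query set kept in a variable: evaluated once, then served from its cache.
hits.clear()
queryset = LazyQuerySet(fetch_rows)
first_pass = list(queryset)
second_pass = list(queryset)
hits_with_reuse = len(hits)
```

In this sketch the two throwaway query sets each trigger a simulated query, while the stored one triggers a single query no matter how many times it is iterated.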

Iteration is not the only thing that triggers evaluation. The following are the different ways in which a QuerySet is evaluated.

Iteration

A QuerySet is iterable, and the database is hit the first time you iterate over it, before the first row is returned. The results are then stored in the cache. For example, in a loop that prints a headline for each row, the database is hit and the results are cached before the first headline is printed.

Slicing

Slicing an unevaluated QuerySet returns a new QuerySet. The new QuerySet can be sliced further, but it cannot be modified in other ways, such as changing the ordering or adding filters. Sliced or not, a QuerySet stores its results in its cache when you iterate over it.

Slicing an evaluated QuerySet returns a list of objects rather than a QuerySet. This is because, once evaluated, the QuerySet serves requests from its own cache, which is a list.

Selecting a single element by index from an unevaluated QuerySet hits the database, whereas selecting one from an evaluated QuerySet uses the cache.

There is one exception: if you use the step parameter of Python's slice syntax on an unevaluated QuerySet, the query is executed immediately and a list of model objects is returned instead of a QuerySet.
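The slicing rules above can be sketched with a small plain-Python model. This is an illustration under stated assumptions, not Django's code: LazyQuerySet, db_hits, and the sample rows are hypothetical, and only non-negative slice bounds are handled.

```python
db_hits = []  # one entry per simulated database query


class LazyQuerySet:
    """Toy model of the slicing rules; not Django's actual QuerySet."""

    def __init__(self, rows, start=0, stop=None):
        self._rows = rows                # stands in for the database table
        self._start, self._stop = start, stop
        self._result_cache = None        # filled on first evaluation

    def __iter__(self):
        if self._result_cache is None:
            db_hits.append("query")
            self._result_cache = list(self._rows[self._start:self._stop])
        return iter(self._result_cache)

    def _sliced(self, key):
        # Only non-negative bounds are handled, to keep the sketch short.
        start = self._start + (key.start or 0)
        stop = self._stop if key.stop is None else self._start + key.stop
        if self._stop is not None and stop is not None:
            stop = min(stop, self._stop)
        return LazyQuerySet(self._rows, start, stop)

    def __getitem__(self, key):
        if self._result_cache is not None:
            return self._result_cache[key]       # evaluated: use the cache
        if isinstance(key, slice):
            clone = self._sliced(key)
            if key.step is not None:
                return list(clone)[::key.step]   # step: query now, return a list
            return clone                         # plain slice: still lazy
        db_hits.append("query")                  # single index: fetch one row
        return self._rows[self._start + key]


entries = LazyQuerySet(["a", "b", "c", "d", "e", "f"])

page = entries[0:4]                 # a new lazy query set, no query yet
hits_after_slice = len(db_hits)
every_second = entries[0:4:2]       # step forces a query and returns a list
hits_after_step = len(db_hits)
third = entries[2]                  # index on an unevaluated set: one query
list(entries)                       # evaluate the whole set: one query
cached_slice = entries[1:3]         # evaluated: a list straight from the cache
hits_total = len(db_hits)
```

Counting the entries in db_hits after each step shows which operations reach the simulated database and which are answered from the cache.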

Pickling/Caching

A QuerySet is evaluated when you pickle it or store it in a cache, because the results are read from the database at that point. When the QuerySet is used later, it returns the list of model objects from its cache.

Invalidation

Because queries are cached forever, stale data must never be accessible in the cache. In Johnny, the query-caching layer for Django provided by the johnny package, keys are never actually deleted; they are made inaccessible by the progression of a table's generation. In this context, invalidation means modifying a table's generation key. Query keys are built from several identifying aspects of a query: the SQL itself, the params, the ordering clause, the name of the database, and the generations of all the tables involved. Johnny's goal is not to avoid the database at all costs, so queries that apply different ordering clauses to the same dataset are considered two different queries. Invalidation takes place at the table level: modifying a table makes every cached query on that table inaccessible. Because invalidation is this greedy, it makes sense to test Johnny against your site to see whether this kind of caching is beneficial.
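The generation idea can be sketched in plain Python. The code below is a simplified illustration, not Johnny's actual key format or API: the names generations, cache, query_key, and invalidate, the choice of SHA-256, and the sample SQL are all assumptions made for the example.

```python
import hashlib

generations = {}  # table name -> current generation number (the "generation key")
cache = {}        # query key -> cached rows; entries are never deleted


def generation(table):
    """Return the current generation of a table, starting at 0."""
    return generations.setdefault(table, 0)


def query_key(sql, params, ordering, db, tables):
    """Build a key from the identifying aspects of a query."""
    parts = [sql, repr(params), repr(ordering), db]
    parts += ["%s:%d" % (t, generation(t)) for t in sorted(tables)]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def invalidate(table):
    """Invalidation only advances the table generation; nothing is deleted."""
    generations[table] = generation(table) + 1


sql = "SELECT * FROM entry WHERE rating > %s"
key_before = query_key(sql, (3,), ("headline",), "default", ["entry"])
cache[key_before] = ["row 1", "row 2"]

# A different ordering clause on the same dataset yields a different key.
other_order = query_key(sql, (3,), ("-headline",), "default", ["entry"])

invalidate("entry")  # for example, after a write to the entry table
key_after = query_key(sql, (3,), ("headline",), "default", ["entry"])
```

After the call to invalidate, the old entry is still stored under key_before, but the same query now produces key_after, so the stale rows can no longer be reached.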

Transactions

Transactions pose an interesting problem for caches such as Johnny. Generation keys are invalidated on write, but committing a transaction does not go through the same code path as that invalidation, so many scenarios involving transactions can cause problems.

The most obvious is a write and a read within a transaction that is then rolled back. The write invalidates the cache and the read stores the new records in the cache, but because of the rollback that new data never appears in the database. There are also concurrency problems with invalidating keys within a transaction, whether or not it is rolled back: the generation key is modified in Memcached, so it is not protected by the transaction. For this reason, enabling Johnny patches the django.db.transaction module in several places to add hooks around transaction commit and rollback. A transaction that you manage manually is what Django terms a managed transaction. While you are in one, Johnny automatically writes cache keys to a local store. These keys are pushed to the global cache on commit and discarded on rollback.
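The commit and rollback behaviour of the local store can be sketched as follows. This is a toy model in plain Python, not Johnny's implementation: ManagedTransaction, cache_set, global_cache, and the sample keys are hypothetical names chosen for illustration.

```python
global_cache = {}   # stands in for the shared cache (for example Memcached)


class ManagedTransaction:
    """Toy managed transaction that buffers cache writes locally."""

    def __init__(self):
        self.local_store = {}

    def cache_set(self, key, value):
        self.local_store[key] = value            # not visible globally yet

    def commit(self):
        global_cache.update(self.local_store)    # push to the global cache
        self.local_store.clear()

    def rollback(self):
        self.local_store.clear()                 # discard buffered keys


# A rolled-back transaction leaves the global cache untouched.
rolled_back = ManagedTransaction()
rolled_back.cache_set("entries:gen1", ["unsaved row"])
rolled_back.rollback()
cache_after_rollback = dict(global_cache)

# A committed transaction publishes its buffered keys.
committed = ManagedTransaction()
committed.cache_set("entries:gen1", ["saved row"])
committed.commit()
cache_after_commit = dict(global_cache)
```

Buffering writes this way keeps data that was read inside a rolled-back transaction from ever reaching the shared cache.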

Using with TransactionMiddleware

django.middleware.transaction.TransactionMiddleware is a middleware shipped with Django that wraps every request in a transaction and rolls it back if an exception is thrown within the view. Johnny pushes transactional data to the cache only on commit, but TransactionMiddleware leaves transactions that are not dirty (no writes were performed during the request) uncommitted. In other words, if you use TransactionMiddleware, views that do not write anything never populate the cache.

Savepoints

Johnny supports savepoints and has comprehensive tests for them, but the two backends that support savepoints may not behave in the same way. Savepoints are tested in both single-db and multi-db environments, from both inside and outside transactions.

Multiple Databases

Johnny supports multiple databases in a variety of configurations. If you use Johnny to cache results from a slave database, use the DATABASES.. JOHNNY_CACHE_KEY setting to ensure that the slave and master databases use the same database keys. Many of the typical problems of a master-slave database configuration still exist.

Usage

To enable the QuerySet cache, enable the middleware johnny.middleware.QueryCacheMiddleware.
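As a rough sketch, enabling the middleware might look like the settings fragment below. It assumes an older Django project that lists its middleware in the MIDDLEWARE_CLASSES setting; the position of the entry and the rest of the list are project-specific and are not specified by this article.

```python
# settings.py (fragment); assumes the older MIDDLEWARE_CLASSES setting name
MIDDLEWARE_CLASSES = (
    "johnny.middleware.QueryCacheMiddleware",
    # ... the rest of your project's middleware goes here ...
)
```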

Manual Invalidation

Use johnny.cache.invalidate to manually invalidate a model or table. johnny.cache.invalidate(*tables, **kwargs) invalidates the current generation for one or more tables. Each argument is either a string naming a database table or a model. To choose the database, pass the keyword argument using.

Using with Scripts, Management Commands, Asynchronous Workers, and The Shell

Because the QuerySet cache is enabled by middleware, queries made outside Django's request-response loop are neither cached nor able to invalidate the cache.

The functions johnny.cache.enable() and johnny.cache.disable() can be used to enable and disable the QuerySet cache.

To make sure Johnny is active in management commands, enable it in your project's __init__.py.

Conclusion

  • A QuerySet represents a collection of objects from your database.
  • Evaluation is the process of hitting the database.
  • To use the QuerySet cache, save the QuerySet in a variable and reuse it where required.
  • A QuerySet is iterable, and the database is hit the first time you iterate over it, before the first row is returned.
  • Slicing an unevaluated QuerySet returns a new QuerySet.
  • In Johnny, invalidation means modifying a table's generation key.
  • To enable Johnny's QuerySet cache, enable the middleware johnny.middleware.QueryCacheMiddleware.