File-Based Caching, Part 1

There are a variety of caching methods available to you when writing a web app. I am sure you have used session and the intrinsic .NET caching component to cache data. But there are times when neither of these is the right fit.

Let's consider when session and caching are most valuable. Session is extremely useful for user-specific data that is needed across a variety of applications. (I will assume that you are in a web farm environment, and are therefore storing your session data in the native SQL Server session store. Even if you are not, this discussion still applies, just to a lesser degree.) Since session is retrieved and stored on every page transition, it is best to keep the values stored in it small. Sometimes I store large data elements there, but I try to keep that to a minimum, and only for data that must be accessed all the time.

Caching is also a great mechanism, highly efficient and proven to work well. But think about how it is used: if all of your individual users have their data cached there, you will be eating up a lot of precious web server memory. Compound this with the fact that each web server needs to cache the data itself, so whatever process creates the data may execute once for each web server you have. If this process is expensive, you haven't gained a whole lot when different servers all need to process the same request. Finally, however much memory is used per user, you are using it on every web server. The result is a large expense.

Caching is very valuable when the data is not user-specific. If you have configuration settings, for example, or lookup values, or anything else that can be shared across all users, caching is a great approach. Not only will you significantly reduce the time to fetch the data, but the load on your infrastructure (for example, the database the data is retrieved from) will be reduced as well.

Now let's consider a common situation that is not a good fit for session or caching. Often, the data that a user retrieves in an application is specific to that user. Maybe some other users would get the same data, but it is not enterprise-wide. Let's also suppose that it is expensive to materialize this data initially; a common example is a SQL query that might require several seconds to run. You know you should cache this rather than recreate it each time. Session is not a good candidate because the results coming back are large. It would be too inefficient to save and retrieve them on every page transition, and the data is specific to one app anyway (i.e. you don't want it following the user around for the rest of their session).

.NET caching is not a good option because (1) we don't want to re-execute the expensive query, potentially once for each web server, and (2) the data is large and each user's data will be different, meaning you will be eating up a lot of memory on every machine.

I recently wrote a file-based caching component to deal with this exact problem. I created a shared directory that can be accessed by every web server, and I write a file there with anything I want to store and retrieve. The process uses a BinaryFormatter, which can take any serializable object as input and serialize and deserialize it. In addition, I compress the BinaryFormatter output to shrink the size of the file that is written and reloaded.
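To make the idea concrete, here is a minimal sketch of the same pattern. It is written in Python rather than .NET, with pickle standing in for the BinaryFormatter and gzip for the compression step. The FileCache class, its put and get methods, the hashed file names, and the write-then-rename step are all assumptions made for illustration; this is not the actual component.

```python
import gzip
import hashlib
import pickle
import uuid
from pathlib import Path


class FileCache:
    """Hypothetical file-backed cache: one compressed file per key."""

    def __init__(self, directory):
        # In a web farm this would be the shared directory every server can reach.
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path_for(self, key):
        # Hash the key so any string maps to a safe, fixed-length file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.cache"

    def put(self, key, value):
        # Serialize the object, then compress the bytes before writing.
        payload = gzip.compress(pickle.dumps(value))
        target = self._path_for(key)
        # Write to a uniquely named temp file, then rename it into place,
        # so a reader never picks up a half-written file.
        tmp = target.with_name(f"{target.name}.{uuid.uuid4().hex}.tmp")
        tmp.write_bytes(payload)
        tmp.replace(target)

    def get(self, key, default=None):
        try:
            payload = self._path_for(key).read_bytes()
        except FileNotFoundError:
            return default  # cache miss
        # Reverse the two steps: decompress, then deserialize.
        return pickle.loads(gzip.decompress(payload))
```

A caller would try get with a key first and, on a miss, run the expensive query and put the result. As with any general-purpose deserializer, a cache like this should only ever read files that the application itself wrote.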

Results have been impressive. Expensive requests can now be cached, and repeated calls to the pages that use them are lightning-fast. Best of all, the solution is completely generic, so I can easily cache results from a web service call, a database call, or custom objects with no additional work on my part, thanks to the BinaryFormatter.

Next time, I will provide some code and walk through usage of the component. I will also explain the use of expiration, which is a critical requirement of any caching mechanism.