ETC2
CS290F Fall 2006 - UCSB Computer Science - Thorsten von Eicken
How to implement and expire caches on paginated pages
Using "caches_page" or "caches_action" in your project can significantly improve your service's performance, since your app will access the database much less frequently. In RoR, you simply declare the actions to be cached with "caches_page" or "caches_action" in the controller, as below:
caches_action :show_cuisines
caches_action :show_reviews
caches_action :user_info
The declarations above cause the app to create cache files in the /tmp/cache directory under the machine name. If you use "caches_page" instead, it creates "html" files in the "/public" directory under the controller name. For instance, the app that runs on my development machine clippers.cs.ucsb.edu will have a directory
/tmp/cache/clippers.cs.ucsb.edu.3000/reviewer
which has three subdirectories:
show_cuisines
show_reviews
user_info
Each of these directories has "cache" files, or "html" files if you use "caches_page", that contain the html body of the pages that have been requested by users. So, when the next request arrives for one of those pages, the app will simply return the cached html file rather than accessing the database to generate a new html page for the same request.
If those methods always return static pages, you don't need to expire those pages. However, if those methods need to update pages under certain circumstances, you must delete the old cached pages in the directory accordingly. For example, in my project, whenever a user writes a new review or updates an existing review of some restaurant, my app must expire all the cached pages that contain information about that particular restaurant. This can be done by using "expire_action". For instance, in my "update" method, I added:
expire_action :action=>"show_reviews", :id=>@review.restaurant_id
expire_action :action=>"user_info", :id=>session[:username]
Now, the problem that I was facing was that I wanted to cache pages that are paginated.
When you apply the 'paginate' method, it generates a URI looking like below:
/reviewer/show_cuisines/0?page=1&cuisine_type=bars
Now, the problem here is that all the pages, such as "page=1", "page=2", "page=52" and so on, share the same id, in this case "id=0".
So, the requests
/reviewer/show_cuisines/0?page=1&cuisine_type=bars
/reviewer/show_cuisines/0?page=2&cuisine_type=bars
...
/reviewer/show_cuisines/0?page=52&cuisine_type=bars
will receive the identical page, whichever was initially cached under "id = 0"; when the app caches pages, it ignores all the other parameters, such as "page" and "cuisine_type", and saves the cache using only the "id" value. For instance, the directory "/tmp/cache/clippers.cs.ucsb.edu.3000/reviewer/show_cuisines" will have a single cache file "0.cache", which contains the response to the "?page=1&cuisine_type=bars" request. After this cache is saved, all following requests that have the URI prefix "/reviewer/show_cuisines/0?" will receive this "0.cache" file, regardless of their "page" and "cuisine_type" values.
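The collision can be reproduced outside Rails with a few lines of plain Ruby. This is only a minimal sketch, not Rails' actual implementation: "cache_file_for" is a hypothetical helper that mimics the behaviour described above by building the cache file name from the host, port and URL path, and dropping the query string.

```ruby
require "uri"

# Hypothetical helper (not part of Rails): mimics the behaviour described
# above, where the cache file name is built from the host, port and URL
# path, and the query string is ignored.
def cache_file_for(url, cache_root = "/tmp/cache")
  uri = URI.parse(url)
  File.join(cache_root, "#{uri.host}.#{uri.port}", "#{uri.path}.cache")
end

base  = "http://clippers.cs.ucsb.edu:3000/reviewer/show_cuisines/0"
page1 = cache_file_for("#{base}?page=1&cuisine_type=bars")
page2 = cache_file_for("#{base}?page=2&cuisine_type=bars")

puts page1
puts page1 == page2 # both requests map to the same cache file
```

Because the two URLs differ only in their query strings, both map to the same "0.cache" file, which is why every page of the listing gets the first cached response.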
In order to get around this issue, you need to modify the "config/routes.rb" file.
map.show_cuisines 'reviewer/show_cuisines/:id/:cuisine_type/:page', :controller => 'reviewer', :action => 'show_cuisines'
map.show_cuisines 'reviewer/show_cuisines/:id/:cuisine_type', :controller => 'reviewer', :action => 'show_cuisines'
I added the lines above to my "routes.rb" file, which changes the previous URI
/reviewer/show_cuisines/0?page=1&cuisine_type=bars
to
/reviewer/show_cuisines/0/bars/1
Now, my cache file in the directory looks like
/tmp/cache/clippers.cs.ucsb.edu.3000/reviewer/show_cuisines/0/bars/1.cache
If the request that I wanted before the "routes.rb" modification was:
/reviewer/show_cuisines/0?page=5&cuisine_type=fastfood
Now, I need to request it using the URL form
/reviewer/show_cuisines/0/fastfood/5
and it generates a new cache file in the directory
/tmp/cache/clippers.cs.ucsb.edu.3000/reviewer/show_cuisines/0/fastfood/5.cache
So, even with the same id, "id=0", it can generate multiple caches according to the other parameters.
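To see how the two route patterns split such a URL into parameters, here is a minimal sketch in plain Ruby. It is a simplified stand-in, not the Rails router: "match_route", "recognize" and the ROUTES constant are hypothetical names used only for illustration.

```ruby
# Hypothetical, simplified route matcher (not the Rails router): turns a
# pattern such as "reviewer/show_cuisines/:id/:cuisine_type/:page" into a
# regular expression with one named capture per ":name" segment.
def match_route(pattern, path)
  source = pattern.split("/").map do |segment|
    segment.start_with?(":") ? "(?<#{segment[1..-1]}>[^/]+)" : Regexp.escape(segment)
  end.join("/")
  match = Regexp.new("\\A/#{source}\\z").match(path)
  match && match.names.zip(match.captures).to_h
end

# The two patterns from routes.rb, most specific first.
ROUTES = [
  "reviewer/show_cuisines/:id/:cuisine_type/:page",
  "reviewer/show_cuisines/:id/:cuisine_type"
]

# Returns the parameters of the first matching pattern, or nil.
def recognize(path)
  ROUTES.each do |pattern|
    params = match_route(pattern, path)
    return params if params
  end
  nil
end

p recognize("/reviewer/show_cuisines/0/fastfood/5")
p recognize("/reviewer/show_cuisines/0/bars")
```

Since "cuisine_type" and "page" are now part of the path rather than the query string, each combination gets its own cache file.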
The next issue is how to expire those cached pages.
A simple 'expire_action :action => "", :id => ""' will not work; you must specify in much more detail which cache under the "action/id" directory you'd like to expire -- it could be under "/bars" or "/fastfood".
In my case, if a review is submitted or modified for the restaurant id = 1001, I need to delete all the cached pages under
/reviewer/show_reviews/1001/
This directory might contain 1.cache, 2.cache, 3.cache and so on.
To solve this issue, I found this beautiful expire function "expire_fragment", which is what "expire_action" uses internally.
It takes many forms of input, but I liked that it allows you to specify a regular expression. So, the expire calls in my "update" and "submit" methods look like:
expire_fragment( %r{.*/reviewer/show_reviews/#{@review.restaurant_id}/.*} )
expire_fragment( %r{.*/reviewer/user_info/#{session[:username]}/.*} )
The calls above delete every cache file whose path matches the pattern, thus eliminating all the cached pages generated by the "paginate" method.
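The effect of a regular-expression expiry on a file-based cache can be sketched in plain Ruby. This is an illustration under assumptions, not the Rails implementation: "expire_matching" and "demo_expire" are hypothetical helpers, and the cache files are created in a temporary directory.

```ruby
require "tmpdir"
require "fileutils"
require "find"

# Hypothetical helper (not the Rails implementation): deletes every
# ".cache" file under cache_root whose path matches the given pattern
# and returns the list of deleted paths.
def expire_matching(cache_root, pattern)
  removed = []
  Find.find(cache_root) do |path|
    next unless File.file?(path) && path.end_with?(".cache")
    next unless path =~ pattern
    File.delete(path)
    removed << path
  end
  removed
end

# Builds a small fake cache tree, expires restaurant 1001, and returns
# [number of files removed, number of files remaining].
def demo_expire
  Dir.mktmpdir do |root|
    %w[
      reviewer/show_reviews/1001/1.cache
      reviewer/show_reviews/1001/2.cache
      reviewer/show_reviews/2002/1.cache
    ].each do |relative|
      file = File.join(root, relative)
      FileUtils.mkdir_p(File.dirname(file))
      File.write(file, "<html>cached</html>")
    end

    restaurant_id = 1001
    removed = expire_matching(root, %r{.*/reviewer/show_reviews/#{restaurant_id}/.*})
    remaining = Dir.glob(File.join(root, "**", "*.cache"))
    [removed.size, remaining.size]
  end
end

p demo_expire
```

Only the pages cached for restaurant 1001 are removed; the cache for restaurant 2002 is left alone.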
Problem in Distributed Application Servers
A problem arises when application servers run on multiple, distributed machines and the cached pages are spread across those machines: the RoR cache expiration mechanism described above only removes the local cache. Thus, in order to expire outdated caches on all app machines, I created a simple bash script that deletes the targeted cache files using the ssh command:
#!/bin/bash
# Be sure that every machine's name is in root's .ssh/known_hosts.
# Change the hostnames accordingly.

# web host
webserver="domu-12-31-33-00-03-8c.usma1.compute.amazonaws.com"

# app hosts
hostname1="domu-12-31-33-00-03-8c.usma1.compute.amazonaws.com"
hostname2="domu-12-31-33-00-03-41.usma1.compute.amazonaws.com"
hostname3="domu-12-31-33-00-03-76.usma1.compute.amazonaws.com"

review=$1
user=$2

app=/home/user/Lib/Rails/u/apps/hotspots/current
key=$app/ec2.key
cache=$app/tmp/cache/$webserver/reviewer

chmod 600 "$key"

# Remove the cached pages for this review and this user on every app host.
for host in "$hostname1" "$hostname2" "$hostname3"; do
  ssh -i "$key" "root@$host" rm -fr "$cache/show_reviews/$review"
  ssh -i "$key" "root@$host" rm -fr "$cache/user_info/$user"
done
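Then, in the controller method, you can run the script above by adding a line like the one below. The arguments are passed separately so that the local shell does not interpret them:
system("./clean_cache.sh", @review.restaurant_id.to_s, session[:username].to_s)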
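As an alternative to a separate script, the same ssh commands could be built directly in Ruby. The following is a minimal sketch under assumptions: the host names, CACHE_ROOT and KEY_FILE are placeholder values, and "expire_commands" is a hypothetical helper. It only builds and prints the commands; running them is left as a commented-out line.

```ruby
require "shellwords"

# Assumed values for illustration only; substitute your own hosts and paths.
APP_HOSTS  = %w[app1.example.com app2.example.com app3.example.com]
WEB_HOST   = "web.example.com"
CACHE_ROOT = "/path/to/app/current/tmp/cache"
KEY_FILE   = "/path/to/ec2.key"

# Builds one ssh argument list per app host and cache directory.
# The remote path is escaped because the remote shell parses it again.
def expire_commands(review_id, username)
  targets = [
    File.join(CACHE_ROOT, WEB_HOST, "reviewer", "show_reviews", review_id.to_s),
    File.join(CACHE_ROOT, WEB_HOST, "reviewer", "user_info", username.to_s)
  ]
  APP_HOSTS.flat_map do |host|
    targets.map do |dir|
      ["ssh", "-i", KEY_FILE, "root@#{host}", "rm -rf -- #{Shellwords.escape(dir)}"]
    end
  end
end

commands = expire_commands(1001, "kyo")
commands.each { |argv| puts argv.shelljoin }
# To actually run them: commands.each { |argv| system(*argv) }
```

Keeping each command as an argument list avoids building one long interpolated shell string, and the user name should still be validated before it is used as part of a path.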
Thanks, and let me know if you have any questions.
Kyo
