The first step we undertook was creation of full text search implementation for the search feature of Gamezone. A MyISAM table for products table ( containing catalogid and the name of the game) was created with full text index on ProductName column. This was done to improve the performance of search which is in our critical path ( about 20% )
Previously the search was being done on an Innodb table with no indexes which could prove to be a bottle neck, hence we modified the same. We were using a sql query like " select * from
where gamename like %blah% " , which used to lock the entire table and hence causing a severe penalty on performance.
Limited our sql search query to return the first 100 results as it is only logical to assume that the user wouldn't generally navigate till the last page to find the product he is looking for but would rather modify his search. We found this method being used by several popular portals, including Y! shopping. Also, it is worth noting that several portals do limit the frequency with which a search can be performed by a user so as to prevent users from overwhelming the server.
In order to keep the hits to the database low, we denormalized the Reviews table to include the Reviewer name to avoid joins and enable faster retreival. This was one of the strategies we had planned since very early in the project.
We did a lot of revamping of the database. We dropped the unused columns from the database, we pulled out some un-necessary error checks, etc in the App. Our database schema designed as per the Y! Shopping web service had a few columns which were never used by our application, Hence we removed these columns.
Workload Justification
After a lot of debate, we agreed upon the following to be an appropriate ratio of our critical paths.
- Path 1 : Home -> Shop Online(Default Sorted by Top Rating) -> Paginate -> View Product -> View Product Specification -> Back to Shop Online Page : 25% of users
- Path 2 : Home -> (Search by Title) -> Paginate -> View Product -> View Product Specification -> Back to Shop Online Page : 25% of users
- Path 3 : Home -> Shop Online -> View Reviews -> Paginate -> Back to Product Details : 10% of users
- Path 4 : Home -> Login -> (Search by Title) / (Browse by Category) -> Paginate -> View Product -> Back to Shop Online Page -> Logout : 10% of users
- Path 5: Home -> Login -> View Product -> View Reviews -> Paginate -> Back to Product Details -> Logout : 10% of users
- Path 6: Home -> Login -> Shop Online -> (Sort by Top Price/Rating) / (Search by Title) / (Browse by Category) -> Paginate -> View Product -> View Product Specification -> -> Add to cart -> Checkout -> Logout. 20% of users
We worked under the following assumptions. Primarily, that more than 50% of users of the store would almost never login. They are at a portal mostly checking it out, window shopping, or just there to read the reviews. They may be interested in seeing "whats new". Of the people who do create logins, they can further be divided into 3 categories. Ones who just look for games, and not buy, the ones who see new products and reviews (may be write some once in a while) and the final and the most useful kind, the ones who actually buy something. We assume the last category to be an optimistic 20%.
Use of httperf and its parameters
We used httperf with the following parameters.
httperf --hog --server domU-12-31-34-00-02-A2.usma2.compute.amazonaws.com --wsesslog=100,0,gz_test1 --session-cookie --rate 5
And here is the explainatiion of teh parameters used.
--hog : This option requests to use up as many TCP ports as necessary.Without this option, httperf is typically limited to using
ephemeral ports (in the range from 1024 to 5000).
--server : This option is used to address the Host with which we are required to establish a connection and test for performance.
--wsesslog N1,N2,filename: This option requests the generation and measurement of sessions instead of individual requests. N1 is
the total number of sessions to generate. N2 is the time(in seconds) that user think separates consecutive call bursts. We keep N2
as , for faster execution of the load tests. Filename is the file which contains the session information for the simulates user.We
had 6 files for the six critical paths. Later, based on class recommendation, we designed the critical paths into one file, that
contined a proportal number of paths of each of the critical paths, so as to simulate the user-mix that was desirable.
--port : This option specifies the port number N on which the web server is listening for HTTP requests. By default, httperf uses
port number 80. hence we did not specify this option as our instances were running on Port 80 itself.
--rate : This option specifies the fixed rate at which connections or sessions are created. Since we requied to load the server
with concurrent and not sequential sessions, we varied the rate from 5 to 75 sessions per second.
--session-cookie : This option takes care of cookie on per session basis.
- Designed certain perl scripts to execute the httperf and mine data for later use in graphs.
- All our tests ran for atleast 2 mins.
- most of the time on the servers(during the tests) was spent in the MySQL database.
Observations and Results
- The above graph shows the reply time from server before and after optimizations have been performed. It is worth noting that the "after" graph has almost 1000 ms difference. However, one must remember that though the addition of the Index has had an impact on the response time, the impact is not very widely felt as only about 20% of the critical paths take advantage of the full text search MyISAM table that was added. Further, due to the limitations of this expermental setup, we were unable to evaluate the full impact by varying out search query so as to not hit the DB catches every time.
- This graph is very similar in nature to the one we saw above. It plots the request rate vs the rate at which the sessions were created. Request rate gives the rate at which HTTP requests were issued and the period that this rate corresponds to.
- Well, this graph is perhaps the most interesting of them all. It clearly shows that there has been improvement because of MyISAM table and also several other optimizations that we have performed. However, we feel that further optimizations are needed and there is still room for further scaling. We observed by running a "top" command on the server that most of the time was being spent in MySQL. This gives us a hint that we probably need to poke harder at the queries and probably optimise them either by good old query optimisation techniques or by result caching. We are able to hit a performance of atmost 16 replies/second.
Inspired by this and by this , we set out to try if we could get better performance by increasing the number of mongrel servers. We found that it was not easy to estimate. We were startled by what we discovered. It is something that we definitely want to dig deeper during scaling to multiple servers.
Unsolved Mysteries
- So far we have not been able to scale to any rate higher than 16 reply/second. We would like to scale that to ~20reply/second. To do this, we would like to go back to the logs and do some quality data mining and determine where the bottleneck exactly is. We also would like to investigate further into the possibility of finding an ideal ratio of number of mongrel servers per cpu and use it to our advantage. Also, our database size is 15 MB. we would like to know if this has any effect on the numbers that we are finding. Further, the use of another id (catalogid) as the primary key and not the "id" provided by rails definitely has penalties which we would want to know more about.
UPDATE (Dec 1st 06):: sweet sweet 23 :)
We were previously using the default action provided by Rails scaffolding for listing all the products in the database. This proved to be very inefficient as it did a
"select count(*) from products" for every click on the Shop Online page. This page was on our critical paths several times and was bleeding us dry. The MySQL slowlog was our first clue... it had select count (*) on our products several times. We later found out that was about 91% of our processing time.
From
Processing GamesController#index (for 169.231.18.19 at 2006-12-02 00:13:59) [GET]
Session ID: 623be41ed351d60b33547488f19a2551
Parameters: {"action"=>"index", "controller"=>"games"}
Rendering actionlistlayoutfalse within layouts/games
Rendering games/list
Completed in 0.27817 (3 reqs/sec) | Rendering: 0.02139 (7%) | DB: 0.25500 (91%) | 200 OK
To
Processing GamesController#index (for 169.231.18.19 at 2006-12-02 00:16:11) [GET]
Session ID: 623be41ed351d60b33547488f19a2551
Parameters: {"action"=>"index", "controller"=>"games"}
Rendering actionlistlayoutfalse within layouts/games
Rendering games/list
Completed in 0.09805 (10 reqs/sec) | Rendering: 0.07565 (77%) | DB: 0.00157 (1%) | 200 OK
nearly a 3 times gain! We are now able to get a 23 reply/second... Here is the graph (as compared to the one without the last optimization)
We modified the list action provided by Rails Scaffolding and introduced a limit on the number of records retreived, so a
select count(*) on products was not required
This is our new code for limiting the number of records retreived
DEFAULT_OPTIONS[:first_n] = nil
def count_collection_for_pagination(model, options)
unless options[:first_n].nil?
return options[:first_n]=== UPDATE (Dec 1st 06):: sweet sweet 23 :) ===
else
return model.count(:conditions => options[:conditions],
:joins => options[:join] || options[:joins],
:include => options[:include],
:select => options[:count])
end
end
def list
@products_pages, @products = paginate :products, :first_n => 500, :conditions => ['numberavailable > 0'] , :per_page => 10
end
This is our old code without any limit on the number of records retreived
def list
@products_pages, @products = paginate :products, :conditions => ['numberavailable > 0'] , :per_page => 10
end
