Fishbowl P4
CS290F Fall 2006 - UCSB Computer Science - Thorsten von Eicken
Fishbowl Project 4
Live Link
Note that all graphs are drawn on a log scale. For the list of data points on the x-axis, see the point table.
SnapShots
Part A: Scale a Single Server
Optimization
The following graphs were produced by varying the rate of session requests between 0.1 and 52 sessions/sec, increasing exponentially. The total number of sessions per data point was varied so that the experiment lasted ~1 min per point (see the point table). The red lines represent performance before optimization, and the green lines represent performance after optimization.
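The sizing rule above (about one minute of load per data point) can be sketched as follows. The doubling step between rates and the helper names are assumptions made for illustration; the values actually used are the ones in the point table.

```python
# Sketch: pick a session count per rate so each data point runs ~60 s.
# Assumption: rates double starting from 0.1 sessions/sec. The real
# values live in the point table and may differ.

TARGET_SECONDS = 60  # "~1 min per point"


def sessions_for_rate(rate, duration=TARGET_SECONDS):
    """Total sessions so that a run at `rate` sessions/sec lasts `duration` s."""
    return max(1, round(rate * duration))


def doubling_rates(start=0.1, limit=52.0):
    """Yield start, 2*start, 4*start, ... while <= limit (hypothetical spacing)."""
    rate = start
    while rate <= limit:
        yield round(rate, 1)
        rate *= 2


if __name__ == "__main__":
    for rate in doubling_rates():
        print(f"{rate:>5} sessions/sec -> {sessions_for_rate(rate)} sessions")
```

Under this rule a rate of 6.4 sessions/sec corresponds to 384 sessions, which agrees with the `--wsesslog=384,0,...` and `--rate=6.4` values in the httperf invocation shown in Part B.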
| Graphs | |
|---|---|
| Replies/sec | |
| Response Time | |
Scaling a Single Server
We simulated 100 users: 57% on Path 1, 29% on Path 3, and 14% on Path 2.
We chose these proportions because a large majority of users will be browsing for and applying to jobs, while only some of those applications will be reviewed by an employer, and an even smaller proportion of users will be posting new listings. We varied the connection rate from 2 to 14 new connections per second, in steps of 2.
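As a rough illustration of this mix, the sketch below splits a simulated population across the three paths. The function name and the largest-remainder rounding are assumptions and are not taken from the original test harness.

```python
# Sketch: split simulated users across critical paths by percentage.
# The percentages come from the text; the rounding scheme is an assumption.

PATH_MIX = {"path1": 57, "path3": 29, "path2": 14}  # percent of users


def users_per_path(total_users, mix=PATH_MIX):
    """Return {path: user_count} using largest-remainder rounding."""
    if sum(mix.values()) != 100:
        raise ValueError("percentages must sum to 100")
    exact = {p: total_users * pct / 100 for p, pct in mix.items()}
    counts = {p: int(v) for p, v in exact.items()}
    leftover = total_users - sum(counts.values())
    # Hand any leftover users to the paths with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda p: exact[p] - counts[p], reverse=True)
    for p in by_fraction[:leftover]:
        counts[p] += 1
    return counts


if __name__ == "__main__":
    print(users_per_path(100))
```

With 100 users the split is exact (57, 29, and 14), so the rounding step only matters for populations that are not a multiple of 100.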
Results (x-axis: requests/s; y-axis: response time in ms)
Part B: Scaling Multiple Servers
Explanation of Setup
All of our scenarios were tested over the pairs of sessions/sec and total number of sessions listed in the point table. Every configuration was also tested on each of our 3 paths.
- Scenarios:
- 1 machine with mongrel running directly on port 80 (without Apache as a load balancer)
- 1 machine with 2 mongrel instances
- 1 machine with 2 mongrel instances + memcache
- 2 machines with 4 mongrel instances on each + memcache
- 4 machines with 4 mongrel instances on each + memcache
- 8 machines with 4 mongrel instances on each + memcache
Httperf config options
httperf --hog --server domu-12-31-33-00-03-61.usma1.compute.amazonaws.com --wsesslog=384,0,fishbowl.path2 --session-cookie --rate=6.4
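To repeat this invocation across many data points, the command line can be assembled programmatically. The sketch below is a minimal example that uses only the flags appearing in the command above; the helper name `httperf_command` is hypothetical.

```python
# Sketch: assemble the httperf command line for one data point.
# Only flags that appear in the command above are used; the helper
# name is an assumption made for illustration.
import shlex

SERVER = "domu-12-31-33-00-03-61.usma1.compute.amazonaws.com"


def httperf_command(rate, sessions, path_file, server=SERVER):
    """Return the argv list for one httperf run.

    --wsesslog takes N,X,F: N sessions in total, X seconds of default
    think time, and F the session file to replay.
    """
    return [
        "httperf",
        "--hog",
        "--server", server,
        f"--wsesslog={sessions},0,{path_file}",
        "--session-cookie",
        f"--rate={rate}",
    ]


if __name__ == "__main__":
    # Reproduces the invocation shown above for Critical Path 2.
    print(shlex.join(httperf_command(6.4, 384, "fishbowl.path2")))
```

Returning an argv list rather than a single string keeps the result safe to pass to `subprocess.run` without shell quoting issues.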
Critical Path 1: Login, Search and Checkout
/
/account/login
/account/login method=POST contents="commit=Log+in&login=u1&password=u1"
/home/search method=POST contents="commit=Go%21&query[value]=0&query[type]=1&page=0"
/home/add_to_cart/1 method=POST contents="query_val=0&query_type=1&page=0"
/home/display_cart
/appliedjobs/apply_for_job/1 method=POST contents="commit=Apply&apply[1]=1&apply[2]=0"
/home/delete_from_cart/1
/home/display_cart
Critical Path 2 : Create a new Joblisting
/
/account/login
/account/login method=POST contents="commit=Log+in&login=u2&password=u2"
/home/list
/employer/list
/joblisting/new/1502
/joblisting/create method=POST contents="commit=Create&joblisting[city]=Goleta&joblisting[title]=Farmer&joblisting[state]=CA&joblisting[snippet]=Be+A+Farmer&employer_id=1502"
/employer/list
Critical Path 3 : Review applications for a joblisting
/
/account/login
/account/login method=POST contents="commit=Log+in&login=u3&password=u3"
/home/list
/employer
/joblisting/list/451
/appliedjobs/list_by_joblisting/126099
/appliedjobs/show/3
/appliedjobs/edit_status/3 method=POST contents="commit=Change+Status&app[status]=Hired"
/appliedjobs/list_by_joblisting/126099
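The critical paths above are httperf session logs, where each request is a URI followed by optional `method=` and `contents=` parameters. As a minimal sketch of how such a file can be inspected, the parser below assumes one request per line (as httperf's session-log format expects) and uses Critical Path 1 as sample input; the function name is hypothetical.

```python
# Sketch: parse httperf wsesslog lines into (uri, method, form fields).
# Assumes one request per line; the function name is hypothetical.
import shlex
from urllib.parse import parse_qsl

PATH_1 = """\
/
/account/login
/account/login method=POST contents="commit=Log+in&login=u1&password=u1"
/home/search method=POST contents="commit=Go%21&query[value]=0&query[type]=1&page=0"
/home/add_to_cart/1 method=POST contents="query_val=0&query_type=1&page=0"
/home/display_cart
/appliedjobs/apply_for_job/1 method=POST contents="commit=Apply&apply[1]=1&apply[2]=0"
/home/delete_from_cart/1
/home/display_cart
"""


def parse_session(text):
    """Return a list of request dicts, one per non-empty, non-comment line."""
    requests = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines separate sessions; '#' starts a comment
        # shlex honours the double quotes around the contents= value.
        uri, *params = shlex.split(line)
        opts = dict(p.split("=", 1) for p in params)
        requests.append({
            "uri": uri,
            "method": opts.get("method", "GET"),
            "form": parse_qsl(opts.get("contents", "")),
        })
    return requests


if __name__ == "__main__":
    for req in parse_session(PATH_1):
        print(req["method"], req["uri"], req["form"])
```

A parser like this makes it easy to check how many requests in a path are writes (POST) versus reads (GET) before interpreting the per-path graphs.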
Graphs of Performance (logscale)
| | Critical Path 1 | Critical Path 2 | Critical Path 3 |
|---|---|---|---|
| Replies/sec | | | |
| Response Time | | | |
| Total Errors | | | |
Conclusions
- Adding memcached reliably improved performance. We observed that even on a single machine with two mongrels, memcached produced an improvement.
- In our project, we never reached the point where the database was the bottleneck. Part of the reason is that job postings on our website are text-only snippets, and even with 110K records our database was under 50 MB and would easily fit in memory.
- We observed a linear increase in performance as we scaled the number of machines.
- One of our initial performance boosts came while scaling on a single machine, from adding an index to our Postgres database.
- We found that in some cases it was more efficient to run two or three small, efficient queries than one complex query that JOINs tables.
- Our choice of Postgres over MySQL finally paid off: we did not have to change the storage format of our tables (such as switching from InnoDB to MyISAM in MySQL), and with the TSearch2 plugin we could efficiently run full-text queries on the original database.
- It is interesting to note that the effects of scaling can be seen in the graphs not only in the vertical trends, where replies/sec increase and response time decreases with scaling, but also in the horizontal trends, where it is clear that as we scale out we can handle a much larger (by an order of magnitude) rate of session creation. With 32 mongrels on 8 machines, we were able to handle a rate of 102 session creations per second, whereas with a single machine even reaching 25 was not feasible.
- Finally, all three phases can be seen clearly in the graphs above:
- Unloaded case
- Peak performance
- Degradation in performance due to overloading
This trend is clearly reflected in the response time, which increases steadily and then finally dips once there are too many outstanding requests and httperf gives up because it has too many open file descriptors. The number of replies per second shows a similar pattern: it reaches a peak and then dips slightly before settling down. Finally, the total number of errors also increases as the load on the website increases.
