Fishbowl P4

CS290F Fall 2006 - UCSB Computer Science - Thorsten von Eicken

Fishbowl Project 4

Live Link

Fishbowl


Note that all graphs are drawn on a log scale.
For the list of data points on the x-axis, see the point table.


SnapShots

Fishbowl_Snapshots

Part A: Scale a Single Server

Optimization

The following graphs were produced by varying the rate of session requests between 0.1 and 52 sessions/sec, increasing exponentially. The total number of sessions per data point varied so that the experiment lasted about 1 minute per point (see the point table). The red lines show performance before optimization, and the green lines show performance after optimization.
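The point table itself is not reproduced on this page, so the following is only a sketch of how such a table could be generated. It assumes the rate doubles at each point starting from 0.1 sessions/sec and that each point runs for 60 seconds; the doubling factor and the function name `load_points` are assumptions, although the httperf example further down (rate 6.4 with 384 sessions) is consistent with them.

```python
def load_points(start_rate=0.1, max_rate=52.0, factor=2.0, duration_s=60):
    """Return (rate, total_sessions) pairs with the rate growing geometrically.

    The factor of 2 is an assumption; the source only says the rate
    increases exponentially between 0.1 and 52 sessions/sec.
    """
    points = []
    step = 0
    while True:
        rate = round(start_rate * factor ** step, 6)
        if rate > max_rate:
            break
        # Pick the session count so that one point lasts about duration_s seconds.
        sessions = max(1, round(rate * duration_s))
        points.append((rate, sessions))
        step += 1
    return points


if __name__ == "__main__":
    for rate, sessions in load_points():
        print(f"{rate:>6} sessions/sec  {sessions:>5} sessions")
```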

Path 1 optimization results (graphs: replies/sec and response time)


Scaling a Single Server

We simulated 100 users: 57% on Path 1, 29% on Path 3, and 14% on Path 2.

We chose these proportions because a large majority of users will be browsing for and applying for jobs, while only some of those applications will be reviewed by an employer, and new listings will be posted even less often. We simulated between 2 and 14 connections generated per second, in increments of 2.
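As a minimal sketch of how the user mix translates into per-path user counts, the helper below splits a user total by percentage share. The function name `users_per_path` and the largest-remainder rounding are assumptions for illustration; with 100 users the percentages map directly to counts.

```python
def users_per_path(total_users, shares):
    """Split total_users across paths by percentage share.

    Uses largest-remainder rounding so the counts always sum to total_users.
    """
    exact = {path: total_users * pct / 100 for path, pct in shares.items()}
    counts = {path: int(value) for path, value in exact.items()}
    leftover = total_users - sum(counts.values())
    # Hand the remaining users to the paths with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda p: exact[p] - counts[p], reverse=True)
    for path in by_remainder[:leftover]:
        counts[path] += 1
    return counts


# Mix taken from the text: 57% Path 1, 29% Path 3, 14% Path 2.
MIX = {"path1": 57, "path3": 29, "path2": 14}

if __name__ == "__main__":
    print(users_per_path(100, MIX))
```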


Results (x-axis: requests/s; y-axis: response time in ms)

Part B: Scaling Multiple Servers

Explanation of Setup

All scenarios were tested over the pairs of sessions/sec and total number of sessions listed in the point table. Every configuration was also tested on each of our 3 paths.

  • Scenarios:
    • 1 machine with mongrel running directly on port 80 (without apache as load balancer)
    • 1 machine with 2 mongrel instances
    • 1 machine with 2 mongrel instances + memcache
    • 2 machines with 4 mongrel instances on each + memcache
    • 4 machines with 4 mongrel instances on each + memcache
    • 8 machines with 4 mongrel instances on each + memcache

Httperf config options

httperf --hog --server domu-12-31-33-00-03-61.usma1.compute.amazonaws.com 
--wsesslog=384,0,fishbowl.path2 --session-cookie --rate=6.4
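The command above could be assembled for each data point with a small helper like the one below. This is a sketch, not the script we actually used: the function name `httperf_command` and the 60-second duration are assumptions (the duration follows from the roughly 1 minute per point mentioned earlier), and only the httperf flags shown above are used.

```python
def httperf_command(server, session_file, rate, duration_s=60, think_time=0):
    """Build the httperf argument list for one (rate, session file) point.

    The session count is chosen so the run lasts about duration_s seconds,
    e.g. rate 6.4 over 60 s gives 384 sessions, as in the command above.
    """
    sessions = max(1, round(rate * duration_s))
    return [
        "httperf",
        "--hog",
        "--server", server,
        f"--wsesslog={sessions},{think_time},{session_file}",
        "--session-cookie",
        f"--rate={rate}",
    ]


if __name__ == "__main__":
    cmd = httperf_command(
        "domu-12-31-33-00-03-61.usma1.compute.amazonaws.com",
        "fishbowl.path2",
        6.4,
    )
    # Print only; pass cmd to subprocess.run(cmd) to actually start a run.
    print(" ".join(cmd))
```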

Critical Path 1: Login, Search, and Checkout

/
/account/login
/account/login method=POST contents="commit=Log+in&login=u1&password=u1"
/home/search method=POST contents="commit=Go%21&query[value]=0&query[type]=1&page=0"
/home/add_to_cart/1 method=POST contents="query_val=0&query_type=1&page=0"
/home/display_cart
/appliedjobs/apply_for_job/1 method=POST contents="commit=Apply&apply[1]=1&apply[2]=0"
/home/delete_from_cart/1
/home/display_cart

Critical Path 2: Create a New Job Listing

/
/account/login
/account/login method=POST contents="commit=Log+in&login=u2&password=u2"
/home/list
/employer/list
/joblisting/new/1502
/joblisting/create method=POST contents="commit=Create&joblisting[city]=Goleta&joblisting[title]=Farmer&joblisting[state]=CA&joblisting[snippet]=Be+A+Farmer&employer_id=1502"
/employer/list

Critical Path 3: Review Applications for a Job Listing

/
/account/login
/account/login method=POST contents="commit=Log+in&login=u3&password=u3"
/home/list
/employer
/joblisting/list/451
/appliedjobs/list_by_joblisting/126099
/appliedjobs/show/3
/appliedjobs/edit_status/3 method=POST contents="commit=Change+Status&app[status]=Hired"
/appliedjobs/list_by_joblisting/126099
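The three path listings above are in the line format that httperf reads through --wsesslog: a URI followed by optional `method=` and `contents=` fields. The sketch below parses one such session so a path can be sanity-checked (request count, number of POSTs, decoded form fields) before a run. The function name `parse_session` is hypothetical, and the parser handles only the fields that appear on this page and a single session per input.

```python
import shlex
from urllib.parse import parse_qs


def parse_session(text):
    """Parse wsesslog-style lines into a list of request dicts.

    Only the fields used on this page are handled: the URI, method=...,
    and contents="...". Requests without method= default to GET.
    """
    requests = []
    for line in text.splitlines():
        tokens = shlex.split(line)  # keeps the quoted contents="..." together
        if not tokens:
            continue  # skip blank lines
        request = {"uri": tokens[0], "method": "GET", "params": {}}
        for token in tokens[1:]:
            key, _, value = token.partition("=")
            if key == "method":
                request["method"] = value
            elif key == "contents":
                request["params"] = parse_qs(value)  # decodes + and %XX
        requests.append(request)
    return requests


if __name__ == "__main__":
    sample = '/account/login method=POST contents="commit=Log+in&login=u1&password=u1"'
    for req in parse_session(sample):
        print(req["method"], req["uri"], req["params"])
```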

Graphs of Performance (logscale)

Performance graphs for Critical Paths 1, 2, and 3: replies/sec, response time, and total errors.

Conclusions

  1. Adding memcached improved performance: even on a single machine with two mongrels, it produced a measurable improvement.
  2. In our project, we never reached the point where the database was the bottleneck. Part of the reason is that job postings on our website are text-only snippets, so even with 110K records our database was below 50 MB and fit easily in memory.
  3. We observed a linear increase in performance as we scaled the number of machines.
  4. One of our first performance gains, seen while scaling on a single machine, came from adding an index to our Postgres database.
  5. We found that in some cases it was more efficient to run two or three small queries than one complex query with JOINs.
  6. Our choice of Postgres over MySQL paid off: we did not have to change the storage format of our tables (as in switching from InnoDB to MyISAM in MySQL), and with the TSearch2 plugin we could run full-text queries efficiently on the original database.
  7. The effects of scaling show in the graphs not only in the vertical trends, where replies/sec increase and response time decreases, but also in the horizontal trends: as we scale, we can handle a much larger rate of session creation (by an order of magnitude). With 32 mongrels on 8 machines, we handled 102 session creations per second, whereas on a single machine even 25 was not feasible.
  8. Finally, all three phases can be seen clearly in the graphs above:
    1. Unloaded case
    2. Peak performance
    3. Degradation in performance due to overloading

This trend shows in the response time, which increases steadily and then dips once there are too many outstanding requests and httperf gives up after running out of open file descriptors. The number of replies per second likewise reaches a peak and then dips slightly before settling down. Finally, the total number of errors increases as the load on the website increases.
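The three phases can be located mechanically in a replies/sec series, as in the sketch below. It labels every point before the throughput peak as unloaded, the maximum as the peak, and everything after it as degraded. The function name `label_phases` and this simple peak-based rule are assumptions for illustration; no measured values from our runs are reproduced here.

```python
def label_phases(replies_per_sec):
    """Label each point of a throughput series, ordered by increasing load.

    Points before the maximum are "unloaded", the maximum is "peak",
    and points after it are "degraded". Ties resolve to the first maximum.
    """
    if not replies_per_sec:
        return []
    peak = max(range(len(replies_per_sec)), key=replies_per_sec.__getitem__)
    labels = []
    for i in range(len(replies_per_sec)):
        if i < peak:
            labels.append("unloaded")
        elif i == peak:
            labels.append("peak")
        else:
            labels.append("degraded")
    return labels


if __name__ == "__main__":
    # Synthetic series for illustration only, not measured data.
    print(label_phases([1, 2, 4, 8, 7, 6]))
```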
