Biblioful Project 4

CS290F Fall 2006 - UCSB Computer Science - Thorsten von Eicken


In Project 4, we extended our analysis of the performance and scalability of biblioful, benchmarking the application under increasing loads with the httperf tool.

Single Server

The application is deployed entirely on one site: there is one instance of Rails running and one database. The goal of this phase of the tests is to measure the performance of the application when running on a single machine.

The database that we use during the tests has about 22K entries in the authors table, 12K entries in the papers table, and 10K entries in the bibliographies table.

Methodology

To execute meaningful performance measurements, we have two requirements:

  • Generating increasing loads with which to test the application,
  • Generating loads that are representative of the expected/intended/realistic usage mix of the application.


As required, we use httperf to generate the load. While httperf is adequate to satisfy the first requirement, it has several limitations that make it more difficult to achieve the second one. We found the following limitations particularly difficult to work around:

  • We could not find a way to parameterize the behavior of a session on the basis of the number of times the session has been executed in the current test run. This feature would be helpful, for example, to log in to the application as one user the first time the session is executed, and as a different user the second time. We solved this problem simply by redefining a session multiple times using different values for these parameters.
  • We could not find a way to dynamically modify the behavior of a session on the basis of previous responses from the application. This feature would be helpful, for example, to view the first paper in a list returned by a search operation. Our solution to this problem consists of predetermining the application responses and hard-coding them in the session specification. One problem with this approach is that it could reduce the set of items accessed during a test. Therefore, when caching is used, it would result in a higher cache-hit rate than would realistically be expected.
  • httperf provides two parameters to tweak the number of concurrent requests to the application: the total number of sessions to initiate (the first parameter to the wsesslog option) and the session initiation rate (the rate option). Requests are issued concurrently when two or more sessions are active at the same time, e.g., when the second session starts before the first one has finished. However, these parameters make it very difficult to control the number of concurrent requests directly and precisely. The life span of a session depends on many factors, such as its length and the load on the application. Furthermore, the number of concurrent requests issued depends on the load of the web application (requests inside a session are issued sequentially, so a request can be issued only when the previous one has completed) and on the load of the testing machine. We chose the values of these parameters by trial and error. In particular, we initially found it convenient to fix the session initiation rate and keep the maximum number of sessions to initiate as the only independent parameter. Note that, as long as the load does not saturate the application or the testing machine, the number of requests issued is roughly proportional to the maximum number of sessions to initiate.
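
As a rough aid for reasoning about these two parameters, the number of sessions active at the same time can be approximated with Little's law (average sessions in flight ≈ initiation rate × average session lifetime), capped by the total number of sessions initiated. The sketch below is not part of our test harness; the function name and the 10-second session lifetime are assumptions chosen only for illustration.

```python
def estimated_concurrent_sessions(rate, avg_session_seconds, total_sessions):
    """Estimate the average number of sessions active at the same time.

    Little's law: in steady state, items in flight = arrival rate * time in system.
    The estimate cannot exceed the total number of sessions httperf initiates.
    """
    if rate <= 0 or avg_session_seconds < 0 or total_sessions < 0:
        raise ValueError("rate must be positive; the other arguments non-negative")
    return min(rate * avg_session_seconds, float(total_sessions))


if __name__ == "__main__":
    # Hypothetical numbers: 0.7 sessions/sec and a 10-second session lifetime.
    for n in (1, 5, 25):
        print(n, estimated_concurrent_sessions(0.7, 10.0, n))
```

The estimate only holds while neither the application nor the testing machine is saturated; once response times grow, sessions last longer and the real concurrency rises above this figure.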

In our experiments, initially, we ran httperf as follows:

httperf --hog --server $server_ip --wsesslog=N,0,$filename --session-cookie --rate=0.7 --timeout 10

The parameters are:

  • N is the total number of sessions to be initiated. We let it vary from 1 to 25. In conjunction with the value of rate it controls the concurrency level in our tests (only indirectly as discussed above). We did not obtain significantly different results by allowing N to take values greater than 25.
  • The session initiation rate is fixed at 0.7 sessions/sec.
  • The timeout is 10 seconds, after which a request times out and is counted as an error. This models the fact that real users are only willing to wait for a response for a limited amount of time; 10 seconds seems a reasonable limit.
  • $filename is the session log filename (one per critical path, see below), which contains sessions for 100 different users.

However, after performing a series of optimizations (described later), we were able to significantly improve the performance of our system. As a result, a session initiation rate of 0.7 sessions/sec no longer produced enough load on the system. Consequently, after the optimizations, we ran httperf as follows:

httperf --hog --server $server_ip --wsesslog=50,0,$filename --session-cookie --rate=$i --timeout 10

In this case we always initiated 50 sessions for 50 different users and varied the session initiation rate from 0.1 to 5 sessions/sec.
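
A sweep over the session initiation rate can be scripted. The following is a minimal sketch that only builds and prints the httperf command lines, using the options from the command above. The list of intermediate rates, the server address, and the session file name are assumptions for illustration; they are not the exact values used in our runs.

```python
import shlex


def httperf_command(server_ip, session_file, rate, sessions=50, timeout=10):
    """Build the httperf argument list for one run at the given session rate."""
    return [
        "httperf", "--hog",
        "--server", server_ip,
        f"--wsesslog={sessions},0,{session_file}",
        "--session-cookie",
        f"--rate={rate}",
        "--timeout", str(timeout),
    ]


# Hypothetical sample points between 0.1 and 5 sessions/sec.
RATES = [0.1, 0.5, 0.7, 1, 2, 3, 4, 5]

if __name__ == "__main__":
    for rate in RATES:
        # Print instead of executing, so the sketch runs without httperf installed.
        print(shlex.join(httperf_command("10.0.0.1", "path1.log", rate)))
```

To execute a run instead of printing it, the argument list can be passed to subprocess.run.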


Our application runs on an EC2 instance and httperf is run from a different EC2 instance. Note that EC2 does not provide any way to select the datacenter where an instance will run. Therefore, it is possible that the two instances are allocated in different datacenters. We briefly experimented with this configuration and noticed that the round-trip time between the two instances increases by a factor of two. Consequently, we expect worse performance when the instances are not allocated in the same datacenter.


For each critical path, as the number of sessions to be initiated increases (and with it the number of concurrent requests), we measure the number of requests issued per second and the response time. We also measure the percentage of errors (generally, user timeouts).

Path 1

Corresponds to the following navigation sequence:

  • a user visits the entry page of the application
  • the user logs in
  • the user visits his/her bibliography page
  • the user searches for an author by name (here names starting with G)
  • a result of the search is shown (here author 29272)
  • the user views one of the author's papers (here paper 17630)
  • the user changes the publication year of the paper and views the updated paper information
  • the user logs out

This path exercises the database with read operations and modifies the session.

An example of a session in the corresponding wsesslog file is shown below:

/
    /stylesheets/biblioful.css
/account/login method=POST contents='login=user1&password=test&commit=Log+in'
/bibliography/list
/author/search
/author/do_search method=POST contents='author%5Bname%5D=G&commit=Search'
/author/show/29272
/paper/show/17630
/paper/edit/17630
/paper/update/17630 method=POST contents='paper%5Byear%5D=2003&commit=Edit'
/paper/show/17630
/account/logout
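
Because httperf cannot vary the login per execution, the session is repeated in the file once per user. The sketch below shows one way such a file could be generated from the session above. It assumes that logins follow the pattern user1, user2, ... with the same password, and that sessions in a wsesslog file are separated by a blank line; the function name is hypothetical.

```python
# One Path 1 session; {login} is the only part that differs between users.
PATH1_TEMPLATE = """\
/
    /stylesheets/biblioful.css
/account/login method=POST contents='login={login}&password=test&commit=Log+in'
/bibliography/list
/author/search
/author/do_search method=POST contents='author%5Bname%5D=G&commit=Search'
/author/show/29272
/paper/show/17630
/paper/edit/17630
/paper/update/17630 method=POST contents='paper%5Byear%5D=2003&commit=Edit'
/paper/show/17630
/account/logout
"""


def build_wsesslog(num_users):
    """Return wsesslog text with one Path 1 session per user.

    Each session already ends with a newline, so joining with another
    newline leaves exactly one blank line between consecutive sessions.
    """
    sessions = [
        PATH1_TEMPLATE.format(login=f"user{i}") for i in range(1, num_users + 1)
    ]
    return "\n".join(sessions)


if __name__ == "__main__":
    print(build_wsesslog(2))
```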

Path 2

Corresponds to the following navigation sequence:

  • a user visits the entry page of the application
  • the user logs in
  • the user visits his/her bibliography page
  • the user searches for a paper by title (here titles starting with A)
  • a result of the search is shown (here paper 16188)
  • the user visits his/her bibliography page
  • a paper in the user's bibliography is shown (here paper 16475)
  • the user updates the grade for this paper
  • the user logs out

This path exercises the database with read operations and modifies the session.

An example of a session in the corresponding wsesslog file is shown below:

/
    /stylesheets/biblioful.css
/account/login method=POST contents='login=user1&password=test&commit=Log+in'
/bibliography/list
/paper/search
/paper/do_search method=POST contents='paper%5Btitle%5D=A&commit=Search'
/paper/show/16188
/bibliography/list
/bibliography/show/16475
/bibliography/edit/16475
/bibliography/update/16475 method=POST contents='bibliography%5Bgrade%5D=3&commit=Edit'
/bibliography/show/16475
/account/logout

Path 3

Corresponds to the following navigation sequence:

  • a user visits the entry page of the application
  • the user logs in
  • the user visits his/her bibliography page
  • a paper in the user's bibliography is shown (here paper 16479)
  • the user updates the grade for this paper
  • a different paper in the user's bibliography is shown (here paper 16472)
  • the user updates the grade for this paper
  • the user logs out

This path exercises the database with both read and write operations. The session is also updated.

An example of a session in the corresponding wsesslog file is shown below:

/
    /stylesheets/biblioful.css
/account/login method=POST contents='login=user1&password=test&commit=Log+in'
/bibliography/list
/bibliography/show/16479
/bibliography/edit/16479
/bibliography/update/16479 method=POST contents='bibliography%5Bgrade%5D=3&commit=Edit'
/bibliography/show/16479
/bibliography/list
/bibliography/show/16472
/bibliography/edit/16472
/bibliography/update/16472 method=POST contents='bibliography%5Bgrade%5D=3&commit=Edit'
/bibliography/show/16472
/bibliography/list
/account/logout

Path 4

Corresponds to the following navigation sequence:

  • a user visits the entry page of the application
  • the user logs in
  • the user visits his/her bibliography page
  • the user views papers in the bibliography tagged with a selected tag (here security)
  • the user views papers in the bibliography tagged with a selected tag (here user1)
  • the user logs out

This path exercises the database with read operations and modifies the session.

An example of a session in the corresponding wsesslog file is shown below:

/
    /stylesheets/biblioful.css
/account/login method=POST contents='login=user1&password=test&commit=Log+in'
/bibliography/list
/bibliography/list/security
/bibliography/list/user1
/account/logout

Results

Before Optimizations

In this phase, we ran our performance measurements on the unoptimized version of the application. This version does contain the database optimizations (query reduction, indexes, use of session) introduced at the end of Biblioful Project 3.

Path 1

Average request rate, average response time, and error percentage as the number of initiated sessions varies:

[Figure: average request rate as the number of sessions to initiate varies]
[Figure: average response time and error percentage as the number of sessions to initiate varies]


Note: All errors are due to user timeouts.

Path 2

Average request rate, average response time, and error percentage as the number of initiated sessions varies:


Note: All errors are due to user timeouts.

Path 3

Average request rate, average response time, and error percentage as the number of initiated sessions varies:

Note: All errors are due to user timeouts.

Path 4

Average request rate, average response time, and error percentage as the number of initiated sessions varies:

Note: All errors are due to user timeouts.

Discussion

An analysis of the log messages produced during the tests showed that the three most expensive operations are:

  1. The login action in the account controller.
  2. The list action in the bibliography controller.
  3. The show action in the paper controller.

According to the logs, these operations completed at a rate of 1-2 reqs/sec, compared to an average of 24 reqs/sec for other operations. In all three cases, database accesses took 70-80% of the total request execution time.

An analysis of the source code and database queries revealed the following causes of the high response time, timeouts, and the low request rate for the above operations:

  1. After a user logs in, a number of items are stored in the session in order to reduce the number of queries issued to the database. In particular, the tags defined by the user are pre-fetched. The unoptimized version of the controller retrieves the user's tags by retrieving each bibliography item (user.bibliographies.each) and extracting the tags from the item. This approach results in the execution of (at least) one query per bibliography item. This can become an issue when the number of elements in the bibliography is large, even if the problem is mitigated by the fact that it only affects a user when logging in.
  2. The bibliography/list action lists the items in a user's bibliography. For each element, this requires fetching the paper's title and the names of the paper's authors from the database. In order to reduce the load on the database, we decided to cache the page fragment consisting of the bibliography listing. The difficulty with caching this information lies in the fact that the fragment depends on both session data (the current user) and request parameters (the page number, since the listing is paginated). We name fragments using the user id and the page number. The fragments are invalidated when the bibliography is modified (currently, when adding or deleting a paper from it).
  3. The paper/show action shows all information about a paper. At the bottom of the page we show an "Add to Bibliography" button to users who do not have this paper in their bibliographies. We referred to the list of the user's papers as current_user.bibliographies, which fetched all of the user's bibliographies from the database. This became a real bottleneck for users with relatively large bibliographies.
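
The fragment naming and invalidation scheme from item 2 can be illustrated with a small in-memory model. This is a sketch of the idea only, not the Rails code of the application: the class, the key layout, and the render callback are hypothetical.

```python
class FragmentCache:
    """In-memory stand-in for a fragment cache keyed by user id and page number."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key(user_id, page):
        # The fragment depends on the current user and on the page requested.
        return f"bibliography/list/{user_id}/{page}"

    def fetch(self, user_id, page, render):
        """Return the cached fragment, rendering and storing it on a miss."""
        k = self.key(user_id, page)
        if k not in self._store:
            self._store[k] = render(user_id, page)
        return self._store[k]

    def expire_user(self, user_id):
        """Drop every cached page of one user's listing (after an add or delete)."""
        prefix = f"bibliography/list/{user_id}/"
        for k in [k for k in self._store if k.startswith(prefix)]:
            del self._store[k]


if __name__ == "__main__":
    cache = FragmentCache()
    render = lambda user_id, page: f"<ul>listing for user {user_id}, page {page}</ul>"
    print(cache.fetch(1, 1, render))  # rendered on the first call
    print(cache.fetch(1, 1, render))  # served from the cache
    cache.expire_user(1)              # the bibliography changed
```

All pages of a user's listing are expired together because adding or deleting a paper can shift items across page boundaries.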


To improve the performance in the described situations, we did the following:

  • We implemented a method that fetches all tags defined by a user in one query.
  • We implemented caching for the bibliography/list page.
  • Rather than checking all papers in user's bibliography, we changed the check for whether a given paper is in a user's bibliography to one parameterized query.
  • We added the following additional indexes to the database:
    • index on the login field in the users table,
    • index on the user_id field in the bibliographies table,
    • indexes on the bibliographies_tags table
  • We implemented eager loading of associations for most of the database queries in the list, show, and edit actions, so that associations used in the same action are fetched together.
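
The first and third changes replace per-item queries with a single query each. The sketch below shows the shape of such queries against an in-memory SQLite database. The table names bibliographies and bibliographies_tags and the user_id column come from the description above; the remaining columns, the tags table, and the sample rows are assumptions made up for the example.

```python
import sqlite3


def make_db():
    """Create a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE bibliographies (id INTEGER PRIMARY KEY, user_id INTEGER, paper_id INTEGER);
        CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE bibliographies_tags (bibliography_id INTEGER, tag_id INTEGER);
        CREATE INDEX idx_bib_user ON bibliographies (user_id);
        CREATE INDEX idx_bt_bib ON bibliographies_tags (bibliography_id);
        INSERT INTO bibliographies VALUES (1, 1, 100), (2, 1, 101), (3, 2, 100);
        INSERT INTO tags VALUES (1, 'security'), (2, 'user1');
        INSERT INTO bibliographies_tags VALUES (1, 1), (2, 1), (2, 2), (3, 1);
    """)
    return conn


def tags_for_user(conn, user_id):
    """Fetch all tags a user has defined with one join, not one query per item."""
    rows = conn.execute(
        """SELECT DISTINCT t.name
           FROM tags t
           JOIN bibliographies_tags bt ON bt.tag_id = t.id
           JOIN bibliographies b ON b.id = bt.bibliography_id
           WHERE b.user_id = ?
           ORDER BY t.name""",
        (user_id,),
    ).fetchall()
    return [name for (name,) in rows]


def paper_in_bibliography(conn, user_id, paper_id):
    """Check membership with one parameterized query instead of loading all items."""
    row = conn.execute(
        "SELECT 1 FROM bibliographies WHERE user_id = ? AND paper_id = ? LIMIT 1",
        (user_id, paper_id),
    ).fetchone()
    return row is not None


if __name__ == "__main__":
    conn = make_db()
    print(tags_for_user(conn, 1))
    print(paper_in_bibliography(conn, 2, 101))
```

Both queries filter on user_id, which is why the index on that column matters for users with large bibliographies.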

After Optimizations

We re-ran our tests on a version of Biblioful extended with the improvements described above. Because we changed the httperf command to generate a more suitable load after the optimizations, the graphs showing performance after the optimizations are not directly comparable to the graphs showing performance before. However, the results can be compared by looking at the optimized graphs at a session initiation rate of 0.7 and at the pre-optimization graphs at N = 25.


Path 1

Average request rate, average response time, and error percentage as the session initiation rate varies:

Note: At a session initiation rate of 0.7, the request rate improved by a factor of 6-7, while the response time improved by a factor of 10-15.

Path 2

Average request rate, average response time, and error percentage as the session initiation rate varies:

Note: At a session initiation rate of 0.7, the request rate improved by a factor of 3-4, while the response time improved by a factor of 2.

Path 3

Average request rate, average response time, and error percentage as the session initiation rate varies:

Note: At a session initiation rate of 0.7, the request rate improved by a factor of 12, while the response time improved by a factor of 6.

Path 4

Average request rate, average response time, and error percentage as the session initiation rate varies:

Note: At a session initiation rate of 0.7, the request rate improved by a factor of 5, while the response time improved by a factor of 100.

Discussion

Overall, the results show that the modifications were effective. Across the four paths, the request rate improves by a factor of 3 to 12 and the response time by a factor of 2 to 100, as reported in the per-path notes. No errors are generated at a session initiation rate of 0.7. In addition, an analysis of the application log shows that the time required to satisfy a request for any of these actions dropped by a factor of 4 to 100.

We achieved most of the improvement on critical path 4 by using caching to eliminate the majority of the database accesses. In contrast, caching did not contribute to the improvement on critical paths 2 and 3 because of the update operations along these paths. All the improvements along paths 2 and 3 were due to optimizations in database accesses. The smallest improvement was on critical path 2, mainly because on this path every user performs a prefix search on papers, which quickly becomes a bottleneck.


Databases Comparison

All the tests described in the previous sections were run using PostgreSQL as the database backend. To get an estimate of the impact of different database engines on the overall performance of our application, we re-ran our tests switching to MySQL.

The following images show the performance along critical path 1:

In comparison with the results obtained when using PostgreSQL, performance with MySQL improves on all metrics:

  • The peak request rate increases to 18 req/sec (a 38% improvement).
  • The response time decreases at all session initiation rates and is less than 2000 ms at the maximum rate measured (an improvement of 20%).
  • No errors (timeouts) are generated.

Along different paths the improvements are less marked but are still consistent with those presented here.

We have not fully explored this issue. However, one possible reason for the observed differences is that, on the test machines, access to PostgreSQL goes through the pure-Ruby library postgres-pr, while access to MySQL goes through the native library ruby-mysql.

Multiple Servers

The application is deployed on multiple sites:

  • One machine is used for the database.
  • One machine is used for the Apache frontend.
  • One machine is used as the memcached session storage.
  • Finally, a varying number of machines is used for the Rails application itself.

We run memcached with 1 GB of space allocated and store the session information on the memcached machine. Unfortunately, memcached does not support invalidating fragments by regular expression, which is the type of invalidation we used to expire the cached fragments of a user's bibliography list. As a result, we were not able to use memcached to cache fragments in the multiple-server setup, and each server used its local filesystem cache for fragments. Fortunately, this does not produce inconsistent results for the given critical paths because, even though there are updates to the data, they do not touch data that appears in the cached fragments.

In a more realistic scenario, we would need to use different methods to tackle this problem. For example, we could use the database store to hold the fragments, or we could choose fragments of different granularity, whose expiration does not require the use of regular expressions (e.g., cache each item in a bibliography list, rather than the entire list).
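
The second alternative, caching each item of a bibliography list separately, can be sketched as follows. The cache class models only plain get, set, and delete operations on exact keys, which is what a memcached-style store offers; all names in the sketch are hypothetical and it is not code from the application.

```python
class KeyValueCache:
    """Minimal store with exact-key get/set/delete, as in a memcached-style cache."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


def item_key(bibliography_id):
    """One cache key per bibliography item, so no pattern matching is needed."""
    return f"bibliography_item/{bibliography_id}"


def render_listing(cache, bibliography_ids, render_item):
    """Assemble a listing from per-item fragments, rendering only the missing ones."""
    parts = []
    for bib_id in bibliography_ids:
        fragment = cache.get(item_key(bib_id))
        if fragment is None:
            fragment = render_item(bib_id)
            cache.set(item_key(bib_id), fragment)
        parts.append(fragment)
    return "\n".join(parts)


def expire_item(cache, bibliography_id):
    """Invalidate a single item by its exact key when that item changes."""
    cache.delete(item_key(bibliography_id))


if __name__ == "__main__":
    cache = KeyValueCache()
    render_item = lambda bib_id: f"<li>item {bib_id}</li>"
    print(render_listing(cache, [16479, 16472], render_item))
    expire_item(cache, 16479)  # only this item is rendered again next time
```

The trade-off is that the page still needs the ordered list of item ids for the requested page, so one cheap query remains even when every fragment is cached.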

We run the same tests that we used for the after-optimization benchmarks of the single server setup. In the following sections, we discuss the obtained results.

Results

The following results were obtained when running on one to five application servers.

Path 1

Path 2

Path 3

Path 4

Discussion

The results show that our application scales fairly well on multiple servers. We obtained an almost linear speedup on all analyzed critical paths except critical path 2, where the request rate does not improve much as we add servers. The main bottlenecks on this path are the prefix search on the papers table and the edit operation on users' bibliographies, which requires several database index updates.
