[orion-dev] Orion Server stress test results

Apologies if this is a bit exhaustive.  This morning Simon and I ran a stress test against a test Orion server.  The test only lasted about 10 minutes, since it became clear we could not kill it.  Herein is my attempt at offering some sysadmin insight.

Test method: a shell script that logs in, then loops indefinitely, performing one of the six actions below.  Unlike in the real world, each action has an equal, random chance of being executed (a rough sketch of such a loop follows the list).

- Get a random file, including a "zip-on-the-fly" export link
- Open a random directory
- Get the 'export' version of a directory or project (zip)
- Create a file in a random directory
- Edit an existing file (download a file, add 6KB of random content, then save it)
- Create a new folder somewhere in the tree
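
For reference, here is a minimal sketch of what such a loop might look like.  The server URL, login endpoint, credentials, paths and JSON bodies below are placeholders for illustration, not the actual Orion REST API calls we used:

    #!/bin/bash
    # Hypothetical reconstruction of the stress loop; endpoints are made up.

    SERVER="http://orion-test.example.com"
    COOKIES=$(mktemp)

    # Log in once and keep the session cookie for the rest of the run.
    curl -s -c "$COOKIES" -d "login=testuser&password=secret" "$SERVER/login" > /dev/null

    while true; do
        case $(( RANDOM % 6 )) in           # all six actions equally likely
            0) curl -s -b "$COOKIES" "$SERVER/file/proj/random.txt" > /dev/null ;;
            1) curl -s -b "$COOKIES" "$SERVER/file/proj/somedir?depth=1" > /dev/null ;;
            2) curl -s -b "$COOKIES" "$SERVER/export/proj.zip" > /dev/null ;;  # zip-on-the-fly
            3) curl -s -b "$COOKIES" -X POST -d '{"Name":"new.txt"}' \
                   "$SERVER/file/proj/somedir" > /dev/null ;;
            4) tmp=$(mktemp)                                 # edit an existing file:
               curl -s -b "$COOKIES" "$SERVER/file/proj/random.txt" -o "$tmp"
               head -c 6144 /dev/urandom >> "$tmp"           # append 6KB of random data
               curl -s -b "$COOKIES" -X PUT --data-binary @"$tmp" \
                   "$SERVER/file/proj/random.txt" > /dev/null
               rm -f "$tmp" ;;
            5) curl -s -b "$COOKIES" -X POST -d '{"Name":"newdir","Directory":true}' \
                   "$SERVER/file/proj" > /dev/null ;;
        esac
    done

Launching twelve copies of this in the background reproduces the load described below.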

A single instance of the script was launched, then another, until we eventually had 12 instances running.


Server Details:
Intel SR1600UR with 16 CPU cores (Intel Xeon E5540 @ 2.53 GHz)
SuSE Linux Enterprise Server 11 running under Xen virtualization
8 GB RAM @ 1066 MHz
Disk I/O: two regular SATA drives on an LSI RAID controller (RAID 1, mirroring)


By the numbers:
Total user accounts on server: over 20,000
Runtime: a few minutes
Total requests: 21261
GETs: 10650
POSTs: 9001
PUTs: 1609

For 2 minutes the system handled 25+ req/sec, peaking at 50+ req/sec for over 10 seconds.

Peak CPU load was 4 cores @ 100% with 12 idle cores
Peak Disk Read load was absolute zero.  The entire workspace was kept in RAM.
Peak Disk Write load was 21 MB/sec, with an average of about 12 MB/sec during the peak period.  Disk writes were quite high because new logins (each of which creates a workspace) and the 6KB file appends were both frequent.

Memory usage at start:  Free: 6.9G  Write Buffers: 181M  File Cache: 660M
Memory usage at end:    Free: 6.3G  Write Buffers: 195M  File Cache: 936M
(the server workspace grew about 330M and remained entirely in RAM)
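
For what it's worth, the CPU, disk and memory figures above are the kind reported by the standard Linux tools; something like the following (my guess at the setup, not necessarily the exact commands we ran) captures the same numbers once per second during the run:

    vmstat 1          # CPU busy/idle percentages and run queue, one line per second
    iostat -mx 1      # per-device read/write throughput in MB/s (from sysstat)
    free -m           # free memory, write buffers and file cache, in MB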

Conclusion:
A single server has the potential to serve hundreds of concurrent, active users.  Scalability could be achieved by placing the server workspace on a shared filesystem and load-balancing several Orion servers, perhaps with a caching HTTP server between the clients and the Orion servers (a rough sketch of such a front end follows).  The first two bottlenecks would be disk I/O and CPU.  I suspect CPU usage is partly due to the zip-on-the-fly nature of the export links.
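
To make the load-balancing idea concrete, here is a rough nginx sketch of what that front end could look like.  The hostnames, ports and cache settings are made up, and I haven't checked how Orion sessions behave behind a balancer:

    # Hypothetical front end: cache responses and spread requests across
    # two Orion servers that share one workspace filesystem.
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=orion:10m max_size=1g;

    upstream orion_servers {
        ip_hash;                       # pin each client to one back end (sessions)
        server orion1.internal:8080;
        server orion2.internal:8080;
    }

    server {
        listen 80;
        location / {
            proxy_pass http://orion_servers;
            proxy_cache orion;
            proxy_cache_valid 200 1m;  # short TTL; workspace content changes often
        }
    }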

I hope some of this is helpful.  Let me know if you have any questions.

Denis
