Princeton University
Computer Science 448
For this writeup, assume you work at a startup that is considering a move to Amazon's infrastructure. The startup stores video files in the cloud and provides automatic format conversion for those files. You will work in groups. Each group partner submits their own file writeup_6.txt individually, and the group submits one additional file, equations.pdf (designate one person to submit this file, and mention the name of the other group partner). The word limit for writeup_6.txt is reduced to 300 words for this writeup. The page limit for equations.pdf is one page. You can write the equations in any format (Microsoft Word, LaTeX, or handwritten and then scanned) as long as you convert the result to PDF.
You currently host your system on a small cloud provider that offers a flat monthly rate for a compute instance. The rate includes the total hard disk space, the allocated memory, and the maximum bandwidth allowed. Your startup is gaining clients and growing. Your CEO has asked you to present a report that analyzes the decision to move to Amazon's cloud infrastructure from an optimization point of view. Amazon offers three levels of service:
- On-demand Instances: You are interested in standard on-demand, large Linux instances. Current price is $0.34 per hour.
- Reserved Instances: You are interested in standard reserved, large Linux instances. Current price is $0.12 per hour.
- Spot Instances: You are interested in standard spot, large Linux instances at a variable price. Current price is $0.117 per hour, but it fluctuates a lot and is updated every 5 minutes. You can specify a maximum price; if the spot price exceeds it, Amazon shuts down (terminates) your instance instead of charging you. You buy spot instances by bidding, and each purchase covers one hour, i.e., after one hour you bid again and the price can change.
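As a rough illustration of how the quoted hourly prices compare, the following Python sketch computes the monthly cost of one instance and a simple break-even point between on-demand and reserved pricing. It rests on assumptions that are not stated in this assignment: a 30-day month, no upfront reservation fee, and a reserved instance that is billed for every hour whether or not it is used. The names `hourly_cost` and `break_even_utilization` are hypothetical helpers, not part of any Amazon API.

```python
# Hourly prices quoted in the list above (USD per instance-hour).
ON_DEMAND_RATE = 0.34
RESERVED_RATE = 0.12
SPOT_RATE_NOW = 0.117  # snapshot only; the spot price fluctuates

# Assumption (not from the assignment): a 30-day month.
HOURS_PER_MONTH = 24 * 30


def hourly_cost(rate: float, hours: float) -> float:
    """Cost of running one instance for `hours` at a fixed hourly rate."""
    return rate * hours


def break_even_utilization(on_demand: float, reserved: float) -> float:
    """Fraction of hours in use at which both options cost the same.

    Assumption: the reserved instance is billed for every hour, while the
    on-demand instance is billed only for the hours it actually runs.
    Any upfront reservation fee is ignored.
    """
    return reserved / on_demand
```

Under these assumptions, one instance running for the whole month costs $244.80 on-demand versus $86.40 reserved, and the reserved option becomes cheaper once the instance is needed for more than about 35% of the hours.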
Your current CPU utilization (on your current cloud provider) looks like this:
![]()
In the above figure, the X-axis is time divided into "slots" of 2 hours each. Each data point is the average CPU load over one 2-hour slot. During a single day you can see more than one spike in load. The Y-axis is CPU percent utilization. The scale is 0% to 800% instead of the simpler 0% to 100% because you have 8 cores, and the graph shows the utilization summed over all cores. For example, a data point of 215% means that the equivalent of 2 cores is used at full capacity and 1 core is used at 15%. If you see another data point of, say, 230%, that doesn't mean that the same 2 cores are used at 100% and the same third core is used at 30%; rather, a total of 2 cores (any cores) are used at 100% and 1 core (any core) is used at 30%. The horizontal line X is similar to what was discussed in class.
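To make the reading of the Y-axis concrete, here is a minimal Python sketch that converts a data point on the 0% to 800% scale into the number of fully used cores plus the load on one more core. The function names `split_load` and `per_core_average` are hypothetical and serve only as illustration.

```python
from typing import Tuple

CORES = 8  # the machine described above has 8 cores


def split_load(load_percent: float, cores: int = CORES) -> Tuple[int, float]:
    """Split a total load into (fully used cores, percent on one more core).

    Which physical cores carry the load is not specified; only the totals are.
    """
    if not 0 <= load_percent <= 100 * cores:
        raise ValueError("load must lie between 0% and 100% per core")
    full, partial = divmod(load_percent, 100)
    return int(full), float(partial)


def per_core_average(load_percent: float, cores: int = CORES) -> float:
    """Average utilization per core on the ordinary 0% to 100% scale."""
    return load_percent / cores
```

For the example in the text, `split_load(215)` gives 2 full cores plus 15% of another, and `per_core_average(215)` gives 26.875%.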
Now consider the following parameters (these are a subset of parameters you'll need):
![]()
For this writeup, each team should:
- In a group: Prepare the cost equation for deciding when to buy 'on-demand' instances and when to buy 'reserved' instances. You don't necessarily need to derive the equation, but you do need to explain briefly how you came up with it. Submit this in equations.pdf.
- In a group: Prepare the cost equation for replacing 'on-demand' instances with 'spot' instances. Spot instances usually cost less than on-demand instances, but they are terminated if the price exceeds your specified maximum price. Include this 'risk of termination' in the equation. Again, you don't need to derive the equation, but you do need to explain briefly how you came up with it. Submit this in equations.pdf.
- Individual: One team member writes a memo saying why they think moving to Amazon's infrastructure is a good idea that the company should pursue. Submit this as writeup_6.txt.
- Individual: The other team member writes a memo saying why moving to Amazon's infrastructure is not a good idea. Again, submit this as writeup_6.txt.
For the writeup, think broadly, not just about the cost equations. You're encouraged to use arguments based on all kinds of matters, including future growth, competition, ease of management, etc. You are welcome to use arguments based on the cost equations (that you developed as a group) as well.