What is EC2?
The Elastic Compute Cloud (EC2) is essentially this - Amazon has a vast array of servers running virtualization software. They rent out usage of these servers, charging for the hours that your own instance is turned on and for the bandwidth it uses. They are priced competitively with dedicated hosting.
The Problems with EC2
The first and biggest problem with EC2 is documentation. There is very little of it, the forums are barely searchable, and the official documentation that does exist is sparse - it tends to raise more questions than it answers. Only those of us who are dedicated to the task will work our way through it and get things up and running properly. Others will be forced to try external services (increasing costs and adding a knowledge gap to the mix), or they will end up giving up scalability in favor of ease of use.
I have started to document a few things on my wiki, but admittedly, my knowledge is still limited at the moment.
There are a few other problems, however. One is that your machine will end up with an unknown IP address and, on the same note (though I'm not entirely positive about this at the moment), an unknown MAC address as well. The IP address problem creates DNS issues, which can be resolved with dynamic DNS but definitely make for a non-optimal situation. The MAC address issue will create licensing problems and will probably force the use of virtualization on top of virtualization where those problems occur. Again - workable, but far from optimal.
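To make the dynamic DNS workaround a little more concrete, here is a minimal sketch of what an instance could run at boot. It assumes the instance metadata service answers at the URL below without extra authentication, and `update_record` is a hypothetical callback standing in for whatever API your dynamic DNS provider actually offers.

```python
from typing import Callable, Optional
from urllib.request import urlopen

# Assumption: the instance metadata service exposes the public IPv4 address here.
METADATA_URL = "http://169.254.169.254/latest/meta-data/public-ipv4"


def fetch_public_ip(url: str = METADATA_URL, timeout: float = 2.0) -> str:
    """Ask the metadata service which public address this instance was given."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("ascii").strip()


def sync_dns(
    current_ip: str,
    last_ip: Optional[str],
    update_record: Callable[[str], None],
) -> bool:
    """Call the (hypothetical) provider callback only when the address changed.

    Returns True if an update was sent, False if DNS was already correct.
    """
    if current_ip == last_ip:
        return False
    update_record(current_ip)
    return True
```

On a real instance you would call `sync_dns(fetch_public_ip(), last_known_ip, your_provider_update)` from a startup script. Keeping the network call separate from the comparison makes the logic easy to check without being on EC2.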
Another issue I'm seeing is that your kernel ends up being replaced by Amazon's Xen kernel. Their kernel is optimized, but you are at Amazon's mercy as far as updating your kernel for security or performance reasons. The upside is that since everybody is in the same boat, any major security holes should be patched pretty quickly. I suppose if you are willing and able to use the kernel version they use, you are in fine shape.
Amazon does have a series of public machine images that can be used - for a variety of Linux distros and many preconfigured LAMP/etc. instances - but they really just offer basic Amazon ratings for those images, and you don't really know what you are getting. I'm more comfortable with my own base install, even with some poorly chosen package inclusions, than with a base machine that has not been properly vetted for security or that may have a configuration issue from the get-go. There is no indication of whether any of the machine images they offer have any sort of support tied to them. I have noticed in the forums that some images seem to receive more attention from their creators than others.
Another big problem that I can see with Amazon's EC2 structure is support, or the lack thereof. I haven't heard of anybody being "left in the lurch", but really - if things aren't working, whether due to your own stupidity or because of problems within Amazon, your only real support options are the forums and email. There's no number to call in an emergency. Granted, you should be able to start up a new machine if you see a failure, and the S3 storage system is super-redundant, but the lack of direct phone support does leave you a little bit gun-shy.
Interesting Challenges
I am interested in understanding the physical location of my EC2 servers and their general geographic dispersion. You could feasibly run one instance as a database server and two instances as load-balanced web servers, but how close are the machines physically to one another? If you need to scale beyond one instance, what performance bottlenecks are you introducing to the system?
Along the same lines, in a load-balanced situation, how exactly are you going to handle the load balancing itself? A DNS hack? Running a front-end proxy server instance? Either of those two solutions presents potential problems of its own.
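As a rough illustration of what both options boil down to, here is a minimal sketch of round-robin backend selection, which is the core of a DNS rotation and of the simplest front-end proxy alike. The addresses are made up for the example, and this is not a description of anything Amazon provides.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin(backends: List[str]) -> Iterator[str]:
    """Return an endless iterator that hands out backends in rotation."""
    if not backends:
        raise ValueError("at least one backend is required")
    return cycle(backends)


# Hypothetical private addresses for two web server instances.
picker = round_robin(["10.0.0.11", "10.0.0.12"])
first = next(picker)   # "10.0.0.11"
second = next(picker)  # "10.0.0.12"
third = next(picker)   # back to "10.0.0.11"
```

Note that a rotation this naive keeps sending traffic to an instance that has died, which is one of the problems a real setup would need health checks to solve.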
These questions, even if the answers turn out to be non-optimal, could present a whole different set of opportunities. Geographically dispersed instances could make for a great low-cost edge-caching scenario. The lack of a real load balancer would force the issue of design-time optimization for high scalability - which in turn would make any web application much more usable in the long term.
We'll see what happens. As it stands right now, I've got a base OS installed and I'm almost ready to bundle it, upload it, and turn it on for the first time. One very positive thing I see is that the entire environment is essentially firewalled from the beginning - outside the OS. This is a definite bonus and something you would have to pay for from any dedicated hosting provider.
