- Users can now import a database from a file through the shelly gem. Check out the documentation for details.
- It's now possible to download a gzipped log file for a given day, also described in the documentation.
- Due to widespread demand, we've made a clear and comprehensive list of the native extensions we support.
Wednesday, June 12, 2013
Thursday, June 6, 2013
With Shelly it's really easy to forget about servers and the burdens of managing them. But for some applications (or just some really curious developers) you may still want to know exactly how it all works under the hood. Achieving high performance and reliability takes a lot of effort, and understanding the underlying architecture helps a great deal.
Deploying new code is basically a two-step process: you have to somehow deliver the code to the servers and then make sure all services using the old code are restarted to pick up the new code. Doing this in an automated way across multiple heterogeneous servers complicates matters a bit. To keep the picture clear, today we're going to focus on the simplest (and the most common) scenario: deploying new code to a running cloud without changing its layout. What happens when the Cloudfile is changed is a different matter, as are failure scenarios. We touch on these topics a bit in the FAQ section. We hope this will be useful not only to Shelly users, but also to other platform builders. Feel free to ask detailed questions about our deployment process in the comments.
1. Git push
Everything starts with a push to the git remote on Shelly Cloud. The Gemfile and Cloudfile are validated first, and if everything checks out, the process continues.
2. Deployment on the first server
From the pool of servers the application is running on, a random one is chosen and the following series of steps is performed:
- Crontab is cleared.
- New code is copied over to the server, with spec and test directories removed.
- Configuration files for databases (e.g. database.yml) are written, along with any custom configuration files added by the user.
- Bundle install is executed, downloading and installing gems as specified in the Gemfile. Bundler is run with the options --without test development cucumber --deployment --clean.
- Gems are compressed and packaged in preparation for transfer to other servers.
- Persistent file system is linked under disk/ directory.
- Before_migrate user hook, if present, is executed.
- Rake db:migrate task is run.
- Before_symlink user hook, if present, is executed.
- Directory containing new code is linked to "current" and symlinks are created to tmp/pids/ and log/ directories.
- Before_restart user hook, if present, is executed.
- All processes running user code are restarted. This includes web servers (thins and pumas), delayed_job and sidekiq workers, and clockwork daemon.
- After_restart user hook, if present, is executed.
- Old release directory is removed.
- Whenever, if enabled on this server, is run to update the crontab.
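To give a feel for where the user hooks fit in, here is a minimal sketch of a before_restart hook. The file location, hook API, and tmp/cache path below are illustrative assumptions, not Shelly's documented convention:

```ruby
# Hypothetical before_restart hook: runs after the new release is
# symlinked to "current" but before processes are restarted.
require "fileutils"

# Clear a file-based fragment cache so the restarted processes start
# from a clean slate (the tmp/cache path is an assumption).
cache_dir = File.join(Dir.pwd, "tmp", "cache")
FileUtils.rm_rf(Dir.glob(File.join(cache_dir, "*")))
```

Because hooks run as part of the deployment, a failure here aborts the deploy, so keep them small and defensive.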
If the whole procedure is successful, deployment continues in parallel on the other servers.
3. Deployment on the other servers
Steps executed on the other servers are similar; they differ in only two regards.
First, instead of running bundle install again, the previously prepared gem package is uncompressed and put in place, so gems are ready for use much faster.
Second, db:migrate is not run at all.
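The gem package fast path can be sketched roughly like this (all file names and paths below are made up for illustration; the real packaging is internal to Shelly):

```shell
# On the first server: pack the installed bundle after bundle install.
mkdir -p bundle_src/ruby/gems
echo demo > bundle_src/ruby/gems/rack.txt
tar -czf bundle.tar.gz -C bundle_src .

# On the other servers: unpack the package instead of re-running
# bundle install -- extracting an archive is much faster than
# resolving and compiling gems again.
mkdir -p shared/bundle
tar -xzf bundle.tar.gz -C shared/bundle
```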
It's worth repeating that this proceeds simultaneously on all remaining servers. The first server is already serving requests using the new code while the remaining ones are being deployed.
4. Deployment wrap up
When the procedure is finished on all servers, the Varnish cache is flushed to ensure no stale pages are served.
Finally, the after_successful_deploy user hook, if present, is executed on a random server. When it finishes, the deployment is done.
Q: Is there a way to achieve zero-downtime deployment?
A: Yes, but you need at least two servers serving requests, so that while one is being deployed, the other handles the traffic.
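A two-server layout like that might look roughly like this in the Cloudfile. The schema below is an illustrative sketch only; consult the Shelly documentation for the actual format:

```yaml
my-app:
  ruby_version: 1.9.3
  environment: production
  domains:
    - my-app.shellyapp.com
  servers:
    app1:
      size: small
      thin: 4
    app2:
      size: small
      thin: 4
```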
Q: At any point during deployment, is it possible for the old code to be running on the new (migrated) database schema?
A: Yes. rake db:migrate is run on the first server before any of the application servers are restarted. If you have more servers, the situation gets even more complicated, as you will have the old and the new code running simultaneously on the new schema. In this case you should plan your migrations accordingly; fortunately there are techniques to deal with this.
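One such technique is to split a breaking change (say, a column rename) into two backward-compatible deploys. The table and column names below are hypothetical, and the SQL is written out as plain strings just to show the sequencing:

```ruby
# Deploy 1: add the new column and backfill it. Old code that still
# reads and writes `name` keeps working against the new schema.
deploy_1 = [
  "ALTER TABLE users ADD COLUMN full_name varchar(255)",
  "UPDATE users SET full_name = name",
]

# Deploy 2: drop the old column only once no process runs the old
# code anymore. Until then, both columns coexist.
deploy_2 = [
  "ALTER TABLE users DROP COLUMN name",
]
```

The key property: at every point in time, every running version of the code can work with the current schema.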
Q: At any point during deployment, is it possible for the new code to be running alongside the old code?
A: Yes, but only if you have more than one application server.
Q: How does the deployment process affect application throughput?
A: It depends. Restarting application servers means there won't be as many of them to handle requests as during normal operation. Moreover, any in-process caches will have to be rebuilt, and any JIT optimizations (when running JRuby) will be lost. After the initial deployment on the first server, deployment proceeds simultaneously on the rest, so in the worst case only one server will be serving requests for the duration of the deployment, while the others are restarting. In practice, restarts on different servers happen at slightly different points in time, so it evens out a little. The exact figure depends on the number of servers and the time needed to restart the application.
Q: What happens if migrations fail on the first server?
A: The deployment process is immediately stopped. You will get an email with details.
Q: What happens when my after_successful_deploy hook fails?
A: The new code keeps running, but the deployment is marked as failed. You will get an email with details.
Q: What happens when I interrupt (^C) the git push?
A: Once "Deploying your application" has been printed, the deployment will continue no matter what you do with the terminal. If you close it, you can track the deployment progress with shelly info or via the web dashboard.
Q: Can I stop a deployment in progress?
A: No.
Wednesday, May 29, 2013
- Due to problems with XHR requests on the latest Firefox version, SPDY support for applications is now disabled by default. If you want to enable SPDY support for your application, you can do so manually in the Cloudfile.
- Upgraded MongoDB on virtual servers to 2.4.3.
- Upgraded Elasticsearch on virtual servers to 0.90.0.
- Added support for creating configuration files for sidekiq with custom queues.
- Resolved issues some of our users experienced when deploying with git push --force.
Tuesday, May 21, 2013
Last week we attended the Atmosphere Conference in Poznań, and today we'd like to share some thoughts about it.
The conference was very well organized. Lectures were delivered on schedule, the Wi-Fi was fast, and the tasty catering was praised by all. Oh, and we can't forget the free Raspberry Pi that we're already putting to good use! So even though this was the first Atmosphere, the organizers did a great job.
Based on the presented lectures and discussions with developers, our general impression is that while big companies build and manage their own servers, there is a growing interest in cloud solutions. In this case "in the cloud" means "in automated self-contained hosting" more than "somewhere on the Internet". The market is expanding fast, and some cloud hosting companies are now shifting toward being tool providers rather than complete solution providers. Rackspace with OpenStack and dotCloud with Docker are only two examples of this trend. The cloud hosting space is opening up and diversifying. These are very exciting times for hosting companies, and we at Shelly Cloud are happy to be part of that.
Below are some highlights of the conference: the lectures and people that we feel taught us the most.
SaaS Systems Lifecycle by Brian McCallister was the opening lecture of the conference. Brian described the evolution of a "typical" (if there even is such a thing) web startup application: from a monolithic Rails/Django app with a SQL datastore, through an ever-growing set of side components (load balancers, queueing systems, data stores, API wrappers, etc.) dictated by the performance needs of the moment, all the way up to a phase of refactoring this mess and implementing a custom architecture spanning multiple data centers. Interestingly, according to Brian, the move from the cloud to a custom solution is usually dictated not by cost, but by performance needs.
In a similar vein, Paul Hammond covered the issue of Infrastructure for Startups. He shared his experience on a diverse range of topics, including choosing a web framework, a database, queueing or monitoring tools, a hosting provider, and a CDN. In addition to technology, he also stressed the importance of building business infrastructure. We can attest from Shelly Cloud's own experience that some formalities around company creation or payments are not always as simple and swift as they should be. Paul had some cautionary words about trusting new and cool technologies, advising to opt for stable and actively maintained software instead.
Andrzej Grzesik and Marcin Sawicki continued the exploration of the human factor in their talk Continuous Delivery Antipatterns. They shared their experience with migrating systems and teams from yearly to weekly deployments. Lots of funny anecdotes and valuable tips in there.
There were a number of talks from Allegro, the main sponsor of the event. Kamil Benedykciński and Marek Gawiński described the architecture of their charity platform. Especially interesting to us was the way they utilize three levels of Varnish, taking advantage of its ESI features. Przemysław Nowaczyk talked about performance requirements for the most traffic-intensive components of their site: images, search, and auction pages. During peak hours they serve 70k images per second, 6k searches per second, and 5k auction pages per second. A nice surprise came during the lightning talks, when Allegro announced the open sourcing of two pieces of software they use internally: a monitoring tool called Selena and a Protocol Buffers library for PHP.
Maciej Kuźniar from Oktawave gave an inspiring talk about the autoscaling possibilities that cloud solutions open up, in particular with on-demand processing power and storage. While not heavy on technical details, it was nevertheless worth listening to.
Whether you host your application in the cloud or not, you can't avoid the topic of monitoring. Lorenzo Alberton tackled this problem from the perspective of human perception and interpretation of monitoring data. He stressed the importance of clear presentation of data. Each chart and diagram has a person on the receiving end, and it's all too easy for engineers to forget that (yes, we're looking at you, Munin). We'll surely remember those insights in our work at Shelly Cloud.
Wes Mason from Server Density described in his talk how HTTP is a good fit for their service-oriented system. They utilize REST APIs and also serve a lot of asynchronous traffic through WebSockets, all unified with sockii, a node.js proxy they open sourced recently. The slides are also a pleasure to look at, so be sure to check them out.
Finally, since Brian McCallister gave the first talk, it was only fitting that he would give the last one as well. His hands-on lecture presented the major features of Docker: resource isolation, union filesystem management and versioning, port mapping, logging, and deployment, all in a neat and easy-to-use package. If you want to recreate Brian's setup, he has the code on GitHub.
As already mentioned, we learned a lot and we'll definitely be attending Atmosphere next year. See you there!
Photos courtesy of Atmosphere Facebook page.
Wednesday, May 15, 2013
- Sometimes we need to do maintenance work on a user's cloud, and this requires blocking deployments for a while, so we've introduced a new "maintenance" status which explicitly says that your cloud is running but deployments are blocked. Of course, if you want more details about maintenance on your cloud, feel free to contact support.
- Shelly Cloud attended the first edition of the Atmosphere Conference. We had a great time and learned a lot. We'll post more about our experience soon.
Wednesday, May 8, 2013
- We've added a new, free support channel for registered users: direct messages. Just click on "Messages" in the navigation bar to ask a question or report a problem.
- On May 13 we are adding restrictions on what can be hosted on Shelly Cloud to our Terms of Service. Review the changes on the draft page.
- New HTML emails.
Monday, May 6, 2013
Let's talk performance today. Have you ever wondered how Shelly Cloud's performance compares to Heroku's? How much power do you get for a buck? What are the pros and cons of each model? If so, you're in luck, as you're going to find out all of that below. And if you don't trust us, you can run the tests yourself.
The chart above shows response times for a sample Sinatra application with the same endpoints, running on Shelly and on Heroku. The Heroku application runs 10 thins, each on a separate dyno, while on Shelly Cloud we used a small (1 GB) server, also with 10 thins. When talking about response times, the lower the value, the better. The difference in response times is clearly visible, with Heroku being on average 50 ms slower and its response times being less stable.
The test was performed with the ab tool with 10 concurrent connections. The Heroku setup achieved 121.96 requests per second, while the same app on Shelly got 291.12 rps. Of course, performance cannot be assessed without correlating it with costs. It's obvious that you can buy higher performance by paying more. The more interesting question, then, is: what is the performance per dollar? An application with 10 dynos on Heroku costs about $335/month, while a small server on Shelly costs only about $28/month.
So it seems that, at least when considering baseline computational performance, Shelly is the more cost-effective solution. With a similar web server setup, Shelly Cloud is not only 2.4 times faster than Heroku, but also 12 times cheaper.
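The headline numbers can be sanity-checked directly from the figures above:

```ruby
# Measured throughput and monthly cost from the benchmark above.
heroku = { rps: 121.96, cost: 335.0 }
shelly = { rps: 291.12, cost: 28.0 }

rps_ratio  = shelly[:rps] / heroku[:rps]    # how much faster Shelly is
cost_ratio = heroku[:cost] / shelly[:cost]  # how much cheaper Shelly is
per_dollar = ->(s) { s[:rps] / s[:cost] }   # rps bought per dollar

puts format("Shelly is %.1fx faster and %.1fx cheaper", rps_ratio, cost_ratio)
puts format("rps per dollar: Heroku %.2f, Shelly %.2f",
            per_dollar.(heroku), per_dollar.(shelly))
```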
Seasoned Heroku users may say, "Wait a minute, you should use unicorns instead of thins." OK then, we'll see where that gets us. But first, a short digression.
Why do we even bother with a comparison instead of presenting bare latency and rps values? Hosting is a competitive business, so there's always room for improvement. We've chosen to compare ourselves with Heroku because their model is substantially different from ours. They also recently opened a data center in Europe, and we were curious how it performed. We hope the comparison will be genuinely interesting. Moreover, we've open sourced all the code used to gather the data presented below, so there's nothing stopping you from recreating the results.
Heroku gives you "dynos": virtualized containers for processes that run on top of EC2 instances, which are themselves virtual machines. Dynos from different applications share the CPU and RAM of those virtual instances, although they are still isolated from each other. On Shelly Cloud you get virtual servers, which are roughly equivalent to EC2 instances but with a dead-simple deployment procedure. Each server is owned by a single application, so there's no sharing involved at that level.
To be as fair as we could, we tested this from a box in Germany, with Shelly servers running in France and Heroku dynos running in the new European region in Ireland. A Heroku dyno gets 512 MB of RAM, while a small server on Shelly has 1 GB, so we've assumed you can run twice as many processes of your application on a Shelly small server as on a Heroku dyno.
That's why, getting back to our unicorn setup, we've configured the application to use 5 workers per unicorn. This way we can get away with running just 2 dynos. Here are the results.
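For reference, a 5-worker setup like this boils down to a unicorn.rb along these lines. Only worker_processes comes from the benchmark description; the preload and timeout values are our own illustrative additions:

```ruby
# config/unicorn.rb -- evaluated by unicorn's configuration DSL
worker_processes 5   # 5 workers per dyno, as in the benchmark
preload_app true     # load the app once, then fork workers (saves RAM)
timeout 30           # kill workers stuck for longer than 30 s
```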
We've cropped the results so you can actually compare them. Performance for about 70% of the requests is the same, but it degrades for the rest, with 5% of requests taking more than one second.
That's the full-scale diagram, and you can see that for the unicorn setup response times quickly reach 5 seconds. So it seems that what you save in dollars you have to pay back in performance.
Even more revealing is a comparison of requests per second. Here are rps values for all three setups next to each other:
And here are the monthly costs of each setup:
This may look too good to be true, but Shelly offers the best performance for the lowest cost.
More concurrent connections
So far we've arranged everything so that the application setup corresponds to the load it is tested with, i.e. we've tested 10 concurrent connections over 10 processes. Now let's see how each of our setups behaves under higher load. Below are the results of running the same benchmark, but with concurrency set to 30.
This time the thin setup on Heroku behaves more stably, although long response times occur there as well. About 6% of all requests take more than one second to complete when the small Shelly server is used, but that number jumps to 27% when two Heroku dynos with unicorns are used. This "sudden" jump in response times for a subset of requests is an effect of requests waiting in line to be processed.
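A back-of-envelope model shows why the queueing kicks in once concurrency exceeds the worker count (the 100 ms service time below is an assumed, illustrative figure, not a measured one):

```ruby
workers      = 10     # worker processes available
concurrency  = 30     # concurrent clients hammering the app
service_time = 0.100  # seconds per request, assumed

# With every worker busy, the surplus clients sit in a queue; a queued
# request waits roughly queued/workers service times before it starts.
queued = concurrency - workers
wait   = (queued.to_f / workers) * service_time
puts format("avg extra wait: ~%.0f ms", wait * 1000)
```

With only 10 unicorn workers across two dynos the queue is twice as deep per worker, which is exactly the tail-latency blow-up the chart shows.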
At concurrency 30 rps values also changed:
The topic of web application performance is a complex one. There are many aspects to consider, and each application has its own specific needs. The importance of empirical data cannot be overstated. Different cloud hosting solutions may all seem abstract and alike, but this couldn't be further from the truth: as we've shown in this post, they may differ significantly. We encourage everyone to measure the actual performance of their application before committing to a single solution. We hope the code in the baseline-performance repository will come in handy. It should be easy enough to test a different Ruby cloud or a different kind of application (e.g. one utilizing the file system or a database). Send us a pull request and we'll be sure to publish the results.