Thursday 26 November 2009

gem bundle with capistrano

[Update: There's a capistrano recipe in bundler now, if you just require "bundler/capistrano" apparently it automatically runs bundler install and installs the gems in shared/vendor_bundle]

Ok, so now you've converted your existing rails project to using gem bundle, and it runs fine on your development box, how do you update your capistrano recipes to do it automatically?

Current wisdom is varied. You can:

  1. check *everything* into your source control (very heavy!)
  2. check in only your Gemfile and the vendor/bundler_gems/cache (which holds all the *.gem files) into source control (still pretty heavy, but makes sure you don't need internet access every time you deploy - use 'gem bundle --cached' in your recipes
  3. symlink your vendor/bundler_gems/cache directory to a shared location, and then all you need to checkin is the Gemfile (quicker, still needs to unpack/bundle gems each time)
  4. symlink all your vendor/bundler_gems/ directories to a shared location, and then all you need to checkin is the Gemfile (quick and not too heavy)
  5. symlink the entire vendor/bundler_gems directory to a shared location (much quicker).

5: Symlink the lot!

This would be my preference. For a single project deployed on production, there should be no reason why we can't just symlink the whole bundled-gem directory, and let the Gemfile keep that directory up-to-date. This feels no different to using a shared file-attachments directory.

Sadly doesn't work due to this:
No such file or directory - MyProjectRailsRoot/vendor/bundler_gems/../../Gemfile
which will resolve to two directories above the *shared symlinked* directory :P

This is because the bundled_gems generates an 'environment.rb' that points back up the directory at the Gemfile that created it... by using a relative path that unfortunately hits the symlink as described above. It'd be nice to be able to tell gem_bundler to make that link absolute...

If anyone knows a way around this, please do let me know!

So we fall back on 3 or 4. My first attempt was to use 3:

3: ci the Gemfile, symlink the cache directory

This seems to be a reasonable compromise. Our deployments are getting pretty damn bloated as it is - with all those plugins and vendored rails etc etc... we don't want to add anything more if we can avoid it. Even gems. We have had no problem with downloading gemfiles, so there's no need to check them in, as long as the deployed directory has access to them when needed. Thus, checking the Gemfile, symlink the cache directory... maybe even symlink, the 'gems' directory sometime too.

So, how to do it?

First - prepare the bundler_gem directory for the first time login to your deployment server, and create the shared directory in, eg:

mkdir ../shared/bundler_gems 
mkdir ../shared/bundler_gems/cache

Next add the symlink and gem bundle commands to your capistrano deployment recipe:

# symlink to the shared gem-cache path then bundle our gems
namespace :gems do
  task :bundle, :roles => :app do
    run "cd #{release_path} && ln -nfs #{shared_path}/bundler_gems/cache #{release_path}/vendor/bundler_gems/cache && gem bundle"
  end
end

after 'deploy:update_code', 'gems:bundle'

And you *should* be able to deploy fine from there. My experience wasn't completely smooth, and I had to set up on the server manually the first time - but from then on the capistrano-task worked fine. Would love to hear your own experiences...

Hudson

If you're using hudson for Continuous integration, you can achieve the same effect by updating the Hudson build script adding:

# prepare gems and shared gem-cache path
ln -nfs <hudson config path>/bundler_gems/cache  vendor/bundler_gems/cache
gem bundle

Also - if you have test/development-environment-only gems... make sure you add your integration-server's environment to the Gemfile environment list, or Hudson won't be able to find, say, rcov or rspec - and then the build will break.

4: ci the Gemfile, symlink the sub-directories

I started with the solution described above, but it still takes a while unpacking all the gems. I'd much prefer to use the solution #5, but that fails due to the relative links in environment.rb. So the ugly compromise is to symlink everything *except* the environment.rb

It works and we can deploy ok with capistrano... but I'm looking for a better solution.

...and after a few deploys...

Well, it was a nice try, but after a few deploys suddenly we started getting breakages... the application couldn't find shoulda for some reason. Now we use a hacked version of shoulda and have vendored it as that's a quickie way to monkey patch a gem.

We told gem bundle where we'd vendored it, and it all seemed to work fine. Unfortunately it broke, and a quick look at the symlinked 'gems' directory tells us why:

rails-2.3.4 -> /var/www/apps/my_app/releases/20091201120443/vendor/rails/railties/
shoulda-2.10.2 -> /var/www/apps/my_app/releases/20091201120443/vendor/bundler_gems/dirs/shoulda

What you can't see are these lines blinking red. What you can see are that these gems are pointing at a specific revision's vendor directory... and not the latest one! Coupled with a :keep_releases = > 4... that release's directory is quite likely to disappear very quickly - and in this case, it already has. This makes these symlinks (set up by gem bundle during release 20091201120443) point to a vendor directory that no longer exists. So it's really not as much of a surprise that our app can't find shoulda anymore.

I think the problem comes along because of gem bundle's rather useful feature of not reloading a gem if it's already installed. It looks to see if the specification exists - if so, it assumes that it's been installed correctly - it doesn't check that the vendored location still exists. That unfortunately spells our doom when capistrano deploys: because gem bundle runs from the newly-created release-directory, that's where the symlink is initially set up. gem bundle doesn't then check it later - even though later its been swept away in a release-cleanup.

So - we're currently working on a fix. Two options present themselves:
1) find a way for gem bundle to symlink from 'releases/current'. This means it has to exist before we do a gem bundle... and that's dangerous because it lets users through into a not-yet-working release. Or
2) we could not vendor any gems - but set up our own gem-server. This will work, but a bit more of a hassle than we prefer for vendored gems.

Troubleshooting:

Git repository 'git://github.com/taryneast/shoulda.git' has not been cloned yet

The recipes online all tell you to use gem bundle --cached and that's a great idea if network connectivity is scarce. As it uses gems that are already in a cache (without going to fetch them remotely)... but it will fail if you don't already have the gems in the cache. SO it relies on you applying solution number 2 above.

There are two solutions:
The first is to just use gem bundle always in your recipes.
The other is to use gem bundle on your development directory - then check the cache into your source control instead of symlinking it to a shared directory. (ie use solution 2 above). This will work just fine, and if you add a new gem, you'll have to make sure you run gem bundle on your working directory before checking in.

I'm not sure of a nice way to get around this if you're using solution 3. It might be worth setting up a rake task that checks if the Gemfile changed and opportunistically runs gem bundle instead of the --cached version (in fact, it'd be nice if gem bundle --cached had a --fetch-if-not-present option!). If you have a solution that worked for you, let me know.

rake aborted! no such file to load -- spec/rake/spectask

I got this when running rcov. It just means you need to add a few extras gems to your Gemfile. We needed all of these. YMMV

  only [:development, :test, :integration] do
    gem 'rcov'  # for test coverage reports
    gem 'rspec' # rcov needs this...
    gem 'ci_reporter' # used somehow by rake+rcov
    gem 'ruby-prof' # used by Hudson for profiling
  end

:bundle => false with disable_system_gems

Looks like these two don't play nice. ie if you choose for a single gem to be unbundled - but have disable_system_gems set - it isn't smart enough to realise that this one gem is meant to be an exception to the 'disable' rule. If you have any unbundled gems, make sure you remove disable_system_gems - or your application simply won't find it.

Wednesday 25 November 2009

Convert from config.gem to gem bundler

Why gem bundler?

Our sysadmin hates rake gems:install

It seems to work for me, but all sorts of mayhem ensues when he tries to run it on production... of course I have a sneaking suspicion it's really our fault. After all - we tend to forget that we already happened to globally-install some gem while we were just playing around with it... which means we didn't bother putting it into the config.gem section of environment.rb... oops!

However, there's a new option on the horizon that looks pretty interesting, and is built to sort out some of the uglier issues involved in gem-herding: gem bundler

Yehuda has written a very thorough a tutorial on how to set up gem bundler. But I find it kinda mixes rails and non-rails instructions and it's not so clear on where some things go. I found it a little fiddly to figure out. So here's the step-by step instructions that I used to convert an existing Rails 2.3 project.

Step-by-step instructions

1: Install bundler

sudo gem install bundler

2: create the preinitializer.rb file

Yehuda gave some code in his tutorial that will load everything in the right order. You only need to copy/paste it once and it will then Just Work.

Go grab the code from his tutorial (about halfway down the page) and save it in a file: config/preinitializer.rb

Don't forget to add that file to your source control!

Update: If you're using Passenger, update the code to use:

module Rails
  class Boot
  #...
  end
end

Instead of:

class Rails::Boot
  #...
end

3. create the new gems directory

mkdir vendor/bundler_gems

Add this directory to your source control now - while it's still empty!

While you're at it, open up your source-control's ignore file (eg .gitignore) and add the following:

vendor/bundler_gems/gems
vendor/bundler_gems/specifications
vendor/bundler_gems/doc
vendor/bundler_gems/environment.rb

4. Create the Gemfile

Edit a file called Gemfile in the rails-root of your application (ie right next to your Rakefile)

At the top, add these lines (comments optional):

# because rails is precious about vendor/gems
bundle_path "vendor/bundler_gems"
# this line forces us to use only the bundled gems - making it safer to
# deploy knowing that we won't accidentally assume a gem in existence
# somewhere in the wider world.
disable_system_gems

Again: don't forget to add the Gemfile to your source control!

5. Add your gems to the Gemfile

Now comes the big step - you must copy all your config.gem settings from environment.rb to Gemfile. You can do this almost completely wholesale. For each line, remove the config. from the beginning and then, if they have a version number, remove the :version => and just put the number as the second param. I think an example is in order, so the following line:
config.gem 'rubyist-aasm', :version => '2.1.1', :lib => 'aasm'
becomes:
gem 'rubyist-aasm', '2.1.1', :require => 'aasm'

For most simple gem config lines, this should be enough so that they Just Work. For more complicated config.gem dependencies, refer to the Gemfile documentation in Yehuda's tutorial.

If you already have gems in vendor/gems You can specify that bundler uses them - but you have to be specific about the directory. eg:
gem 'my_cool_gem', '2.1.1', :path => 'vendor/gems/my_cool_gem-2.1.1'

Extra bonus: if you have gems that are *only* important for, say, your test environments, you can add special 'only' and 'except' instructions (or whole sections!) that are environment-specific and keep your staging/production environments gem-free eg:

gem "sqlite3-ruby", :require => "sqlite3" , :only => :development
only :cucumber do
  gem 'cucumber'
  gem 'selenium-client', :require => 'selenium/client'
end
except [:staging, :production] do
  gem 'mocha'          # http://mocha.rubyforge.org/
  gem "redgreen"       # makes your test output pretty!
  gem "rcov"
  gem 'rspec'
end

5a: Add rails

Now... at the top of the Gemfile, add:
gem 'rails', '2.3.4'
(or whatever version you currently use)... otherwise your rails app won't do much! :)

Obviously - if you've vendored rails you will need to specify that in the Gemfile way eg:
gem 'rails', '2.3.4', :path => 'vendor/rails/railties'

If you've opted *not* to disable_system_gems, you won't need this line at all. Alternatively, you could tell the Gemfile to use the system-version anyway thus:
gem 'rails', '2.3.4', :bundle => false

Also, I'd recommend adding the following too:

 gem 'ruby-debug', :except => 'production'  # used by active-support!

6. Let Rails/Rake find your gems:

Edit 'config/environment.rb' and at the bottom (just immediately after the Rails::Initializer.run block, add:

# This is for gem-bundler to find all our gems
require 'vendor/bundler_gems/environment.rb' # add dependenceies to load paths
Bundler.require_env :optional_environment    # actually require the files

7. Give it a whirl

From the rails root directory run gem bundle

The bundler should tell you that it is resolving dependencies, then downloading and installing the gems. You can watch them appearing in the bundler_gems/cache directory :)

and you're done!

...unless something fails and it can't find one - which means you probably forgot to add it to config.gems in the first place! ;)

PS: ok... so I've also noticed you sometimes have to specify gems that your plugins use too - so it may not be entirely your fault... ;)

PPS: if Rake can't find your bundled gems - check that config/preinitializer.rb is set up correctly!