Rails 3 with jumploader

Java applets went out of fashion some years ago, and you don't see them very often now. But they can still be useful. We had a file upload problem that was not solved by plupload, uploadify, swfupload or any of the other common solutions. The answer was JumpLoader - with this we can upload files bigger than 2GB, and it's got a host of other nice options and features. (Yes, I know it may not be a good idea to try to send files >2GB through a browser, but that's what we need to offer - and at least we can arrange to resume failures.) The downside is that it needs to be code-signed if you do not want your users to see a self-signed certificate warning when they run it for the first time. Those certificates are quite expensive (in the region of $200 per year). But assuming that's not a problem, let's dive in and get it working with Rails 3. There are two things you need to send when posting a file upload to Rails - the request forgery protection token in with the post params, and the cookie that identifies you in the header. Fortunately, JumpLoader offers us both possibilities. I've installed the java files in public/java. So this is how my applet tag looks, in HAML: And the resulting tag in the HTML output: Here we have set the requestProperties to the session cookie, which ensures it is sent with the request headers. We also tell the applet to 'fireAppletInitialized' which is a JavaScript function. Here is the JavaScript from the head of the document, in ERB this time: This sets the token in a parameter that will be sent with the posted file. Now the upload should pass the CSRF check. Note the 1 second delay before setting the parameter in the applet. This is because I was seeing hangs in Safari when setting the parameter directly from the appletInitialized function.

Much faster Rails tests if you use MyISAM

Here's a little insight I had the other day. If you use MyISAM tables, and there are various performance-related reasons you might, then you've been stuck having to turn transactional fixtures off in your tests. But unless you are using some special feature of MyISAM that is not present in InnoDB, then why not use InnoDB tables in your test database? I wrote a simple plugin that has a rake task that clones the development database to the test database while changing the ENGINE to InnoDB. You just run the task, turn transactional fixtures on in test/test_helper.rb, and you're off. Result: our tests used to take 8.5 minutes to run. Now they take just 3 minutes. That's about 65% less time than it used to take. The plugin is here: http://github.com/sdsykes/fast_fixture
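As a rough illustration of the engine swap (this is not the plugin's actual code), the core step can be thought of as rewriting the ENGINE clause in a schema dump before loading it into the test database. The method name convert_engine_to_innodb and the sample dump below are assumptions made for the example.

```ruby
# Sketch only: rewrite the storage engine in a MySQL schema dump so that the
# test database uses InnoDB. The method name and sample dump are illustrative
# assumptions, not taken from the fast_fixture plugin.
def convert_engine_to_innodb(schema_sql)
  # Match "ENGINE=MyISAM" with optional spaces around "=", case-insensitively.
  schema_sql.gsub(/\bENGINE\s*=\s*MyISAM\b/i, "ENGINE=InnoDB")
end

dump = <<~SQL
  CREATE TABLE `users` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `name` varchar(255) DEFAULT NULL,
    PRIMARY KEY (`id`)
  ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
SQL

puts convert_engine_to_innodb(dump)
```

Once the test tables are InnoDB, transactional fixtures can be switched back on in test/test_helper.rb with self.use_transactional_fixtures = true.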

FastImageInline - inline images in your html code

You may not know this, and I didn't until recently, but you can place images directly in your img tags. Not just the address of them, but the whole binary data - base64 encoded. The technique is based on the Data URI scheme. Take a look at the source of Google News. Scroll down a bit and you should start to see image tags that contain base64 encoded data (well you will if you are on IE8 or another make of browser - not IE7). That's what I'm talking about. The technique is particularly suitable if you have small images that change often and you do not need the browser to cache them. In this case, the HTTP connection and fetch that inlining the images saves you is a performance win. In case you find it useful I have extended my FastImage series with FastImage Inline, which is a gem and Rails plugin that will take care of inlining your images. It's simple to use - just like image_tag there is now a helper method inline_image_tag (and inline_image_path for image_path).   Example
inline_image_tag("bullet.gif")
Result for request from a data-uri capable browser:
<img alt="Bullet" src="data:image/gif;base64,R0lGODlhCAANAJECAAAAAP///////wA
  AACH5BAEAAAIALAAAAAAIAA0AAAITlI+pyxgPI5gAUvruzJpfi0ViAQA7" />
Result for a non-capable browser (eg IE7 or below):
<img alt="Bullet" src="/images/bullet.gif?1206090639" />
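To make the mechanics concrete, here is a minimal sketch of how a data URI can be built by hand in Ruby. It is not the FastImage Inline implementation; the helper names data_uri and inline_img_tag are made up for the example, and the MIME type is assumed to be known by the caller.

```ruby
# Sketch only: build a data URI for a small image. Helper names are
# hypothetical and this is not how the gem itself is written.
def data_uri(bytes, mime_type)
  # pack("m0") base64-encodes without inserting line breaks.
  encoded = [bytes].pack("m0")
  "data:#{mime_type};base64,#{encoded}"
end

# The alt text is assumed to be safe to embed; no HTML escaping is done here.
def inline_img_tag(bytes, mime_type, alt)
  %(<img alt="#{alt}" src="#{data_uri(bytes, mime_type)}" />)
end

# Only the six-byte GIF signature is encoded here, to keep the output short.
gif_signature = "GIF89a".b
puts inline_img_tag(gif_signature, "image/gif", "Bullet")
```

In a real page the bytes would come from reading the whole image file in binary mode.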
  Installation
Note that the FastImage gem must be installed first; check the requirements section below.
  As a Rails plugin
./script/plugin install git://github.com/sdsykes/fastimage_inline.git
  As a Gem
sudo gem install sdsykes-fastimage_inline -s http://gems.github.com
Install the gem as above, and configure it in your environment.rb file as below:
...
  Rails::Initializer.run do |config|
    ...
    config.gem "sdsykes-fastimage_inline", :lib=>"fastimage_inline"
    ...
  end
  ...
  Requirements
* FastImage http://github.com/sdsykes/fastimage
  Browser support
All modern browsers support this technique except for IE versions 7 and below. This is still a major segment of the market of course, but as IE users migrate to IE8 this will become less of a problem. FastImage Inline uses a simple browser detection mechanism by looking at the user agent string. If the browser is known to not have support, or if we do not recognise it at all, we serve a normal image tag which includes the path to the image file in the src attribute. But if we know the browser can handle it, we send the image inline, and the browser won't need to fetch it separately.
  Limits
Reportedly IE8 will not handle data strings longer than 32k bytes. But it is probably unwise to inline images this big anyway. Google News serves images that are up to about 3.5k in length, and this seems a reasonable approach. However, FastImage Inline does not enforce any particular constraints; it is for you to decide. FastImage Inline does not cache the images it has read - so every time an image is sent it will be read from disk. Caching may be added in a later release.
  Conclusion
Inlining images is not for everyone, but it's a useful technique in your toolbox for optimising delivery of certain kinds of pages or content. For more information check the comprehensive list of advantages and disadvantages on the Data URI scheme Wikipedia page.
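The detection rule described under Browser support (inline only for browsers known to cope, fall back for everything else) can be sketched roughly like this. The patterns and method names are illustrative assumptions, not the list the gem actually uses.

```ruby
# Sketch only: decide from the User-Agent string whether to inline an image.
# The patterns below are illustrative assumptions, not the gem's real list.
DATA_URI_CAPABLE = [
  /MSIE (?:[89]|\d\d)\./,  # IE 8 and later
  /Firefox\//,
  /AppleWebKit\//          # Safari, Chrome
].freeze

# Unknown or missing user agents are treated as not capable.
def data_uri_capable?(user_agent)
  return false if user_agent.nil?
  DATA_URI_CAPABLE.any? { |pattern| user_agent =~ pattern }
end

# Pick the inline data URI for capable browsers, the plain path otherwise.
def image_src(user_agent, path, inline_uri)
  data_uri_capable?(user_agent) ? inline_uri : path
end

ie7 = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)"
ie8 = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)"
puts image_src(ie7, "/images/bullet.gif", "data:image/gif;base64,R0lGODlh")
puts image_src(ie8, "/images/bullet.gif", "data:image/gif;base64,R0lGODlh")
```

Using an allow-list means a browser nobody has heard of gets the ordinary image path, which always works.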

Recent code - FastImage resize, Scrooge and Read From Slave

A roundup of some of my projects that may be of interest:
1. FastImage Resize
This builds on my work on FastImage to provide an image resize facility.  The resize code calls libgd to do the work of resampling and resizing the image - this is a library that is very likely to be already installed on your system if it is some flavour of unix / linux or even OS X.  And if not, it is very easy to install.  This is a light and simple option if you don't wish to install heavier libraries such as RMagick (which relies on ImageMagick or GraphicsMagick) or ImageScience (which relies on FreeImage).
2. Scrooge
This is a plugin and gem to optimise queries to the database based on a learning algorithm that looks at how the results of each query are used.  I worked on this with Lourens Naudé earlier this year, and I will shortly make a minor release with a few further optimisations and tests.  Try this if your database is slowing you down, but also see slim-attributes if you are using MySQL.
3. Read From Slave
A gem to force your database reads to a slave database while your writes go to the master.  It's fast and simple, it works a treat, and we have it in production use.

FastImage finds image dimensions fast using minimal resources

I just released a gem to find image dimensions and type information fast. I have previously done some work in this area, but this is a much more comprehensive solution, and fixes problems with certain jpegs.
FastImage finds the size or type of an image given its URI by fetching as little as needed.
The problem
Your app needs to find the size or type of an image. This could be for adding width and height attributes to an image tag, for adjusting layouts or overlays to fit an image or any other of dozens of reasons. But the image is not locally stored – it’s on another asset server, or in the cloud – at Amazon S3 for example. You don’t want to download the entire image to your app server – it could be many tens of kilobytes, or even megabytes – just to get this information. For most image types, the size of the image is simply stored at the start of the file. For JPEG files it’s a little bit more complex, but even so you do not need to fetch most of the image to find the size. FastImage does this minimal fetch for image types GIF, JPEG, PNG and BMP. And it doesn’t rely on installing external libraries such as RMagick (which relies on ImageMagick or GraphicsMagick) or ImageScience (which relies on FreeImage). You only need to supply the URI, and FastImage will do the rest.
Examples
require 'fastimage'

FastImage.size("http://stephensykes.com/images/ss.com_x.gif")
=> [266, 56]  # width, height
FastImage.type("http://stephensykes.com/images/pngimage")
=> :png
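To show why only the first few bytes are needed, here is a small sketch that reads the dimensions out of a GIF or PNG header held in a string. It is not FastImage's own code, it skips the network fetch entirely, and the method name dimensions_from_header is made up for the example.

```ruby
# Sketch only: read image dimensions from the first bytes of a file.
# Covers GIF and PNG headers only; returns nil for anything else.
def dimensions_from_header(bytes)
  bytes = bytes.b  # treat the data as raw binary
  if bytes.start_with?("GIF87a".b, "GIF89a".b)
    # GIF: width and height are 16-bit little-endian values at offsets 6 and 8.
    bytes[6, 4].unpack("vv")
  elsif bytes.start_with?("\x89PNG\r\n\x1A\n".b)
    # PNG: after the 8-byte signature, 4-byte chunk length and "IHDR" tag,
    # width and height are 32-bit big-endian values starting at offset 16.
    bytes[16, 8].unpack("NN")
  end
end

# Hand-built headers standing in for the start of real image files.
gif = "GIF89a".b + [266, 56].pack("vv")
png = "\x89PNG\r\n\x1A\n".b + [13].pack("N") + "IHDR".b + [640, 480].pack("NN")

p dimensions_from_header(gif)
p dimensions_from_header(png)
```

So 10 bytes are enough for a GIF and 24 for a PNG, which is why fetching the whole file is unnecessary.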

Installation
Gem
sudo gem install sdsykes-fastimage -s http://gems.github.com
Rails
Install the gem as above, and configure it in your environment.rb file as below:
...
Rails::Initializer.run do |config|
...
config.gem "sdsykes-fastimage", :lib=>"fastimage"
...
end
...
Then you’re off – just use FastImage.size() and FastImage.type() in your code as in the examples.
Documentation
http://rdoc.info/projects/sdsykes/fastimage

Ferret on Ruby 1.9.1

I took the trouble to port ferret to Ruby 1.9.1 yesterday evening.  I have it working on my Mac. Here's a gem for you to try - I have labelled it 0.11.6.19.  If you use it let me know how it runs, but it's at your own risk; I haven't extensively tested it. [UPDATE: this gem was updated on 5th April 2009 - please test.  There is also a fork at github] I've made mostly simple changes in the code:

  • Changed all struct RString -> ptr to use the RSTRING_PTR macro, except for cases where it was being used to add items to an array where rb_ary_store was used.
  • Changed all struct RString -> len to use the RSTRING_LEN macro
  • Changed all struct RArray -> ptr to use the RARRAY_PTR macro
  • Changed all struct RArray -> len to use the RARRAY_LEN macro
  • Removed manual adjustment of the len member of RArray. In fact Ruby 1.9 stores small arrays of 3 items or fewer differently from larger ones, and this adds complexity. It is better to use the rb_ary_store function, which will use the correct pointer and will keep the length in sync with the number of items in the array.
  • Changed all struct RHash -> tbl to ntbl
  • Removed references to rb_thread_critical
  • Removed 4th argument from calls to rb_cvar_set
  • Included ruby/re.h and not regex.h, and altered tokenizer code to correctly use the new regexp library
  • Included ruby/st.h and not st.h
  • Some other minor changes to error message formats that were causing compiler warnings

By the way, acts_as_ferret also runs with some very minor surgery; Thomas von Deyen has a fork here.

Rails 2.3 breakage and fixage

Rails 2.3 will be with us soon, so I took the time to update our app to be compatible. It's a reasonably large app (26,000 LOC), so there were bound to be some issues. The first thing to notice is that the PStore store for sessions has completely gone away. This means that saying something like config.action_controller.session_store = :p_store in your environment file will no longer work. We don't use the cookie store because our sessions can get bigger than the 4k limit in certain circumstances. So we use the memcache store on the production machines, and pstore on the dev and test machines. And we can't do that any more, which is a shame as it worked well - particularly with Hongli Lai's improvements. The remaining options are DRb, Memcached, or SQL. We didn't want to add complexity to our environments, so none of those looked attractive. So we ended up rewriting some of our code so that the cookie store would be usable in most cases. We'll keep memcache as the store on the production systems though. Talking of memcache, it seems we now need to include require 'memcache' in our production.rb file. It's not automatically loaded before we want to configure it. The rest of the problems weren't with the app itself, but with the incredible number of failing and erroring tests due to changes in the Rails testing system. Firstly, none of the unit tests were even running because they all inherited from Test::Unit::TestCase. Nowadays they need to inherit from ActiveSupport::TestCase; in Rails 2.3 this is required. Also make sure your test_helper.rb opens the right class:
ENV["RAILS_ENV"] = "test"
require File.expand_path(File.dirname(__FILE__) + "/../config/environment")
require 'test_help'

class ActiveSupport::TestCase
...

Next, if you were using assert_valid model_item you must change this to assert model_item.valid?, see here. There was no deprecation warning in 2.2 that it would be removed, but never mind, the fix is quite easy. We have tests for our routing. They live in the unit tests - it's handy to test all the routes in one place. But in 2.3 the assert_routing method has disappeared. In fact it's just not automatically available in unit tests any more; you can get it back by doing this in your test class:
include ActionController::Assertions::RoutingAssertions
But the routing assertion also needs clean_backtrace which seems to be part of the ActionController::TestCase. We opted to just define it in test_helper.rb (for a quick fix, just add this code):
  def clean_backtrace(&block)
    yield
  rescue ActiveSupport::TestCase::Assertion => error
    framework_path = Regexp.new(File.expand_path(
                                     "#{File.dirname(__FILE__)}/assertions"))
    error.backtrace.reject! {|line| File.expand_path(line) =~ framework_path }
    raise
  end
We also test cookies in functional tests, and the usage has changed in 2.3. So you'll need to check through those. If you send multipart emails and have file fixtures (of the expected email contents) to test them, we noticed that instead of just saying Content-Type: text/plain in the header before the mime encoded parts, we now get Content-Type: text/plain; charset=iso-8859-1. Those need to be edited. Finally, if you are using assert_select_email in your tests for your mailer classes, you will find it is also no longer available. The fast solution is to put include ActionController::Assertions::SelectorAssertions in your mailer test class. We have worked around some of the issues presented to us with minimal changes to our code. It seems like Rails is encouraging us to organise our tests differently, particularly where functionality in ActionController::Assertions is no longer automatically available to unit tests. Working around this feels somewhat unclean, so we'll take another look at whether tests should be moved or rewritten once the dust has settled on 2.3.

Using acts as ferret with phusion passenger / mod_rails

The passenger manual makes it clear that you need to close and reestablish your connections to things like memcached after it forks to avoid inadvertently sharing file handles. The reason is well and clearly explained there. The API to do this is simple - just place this kind of code in your environment.rb file:
if defined?(PhusionPassenger)
  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    if forked
      # We're in smart spawning mode.
      # ... reestablish connections here ...
    else
      # We're in conservative spawning mode. We don't need to do anything.
    end
  end
end
All well and good, but how exactly do you reestablish those connections? In our case we have to deal with memcached and ferret (with ferret running in a DRb server via the acts_as_ferret plugin). Memcached is dead easy:
CACHE.reset
Ferret is not so easy. It turns out that DRb has no built-in way to close its pool of connections. So a monkey patch is the only thing to do. I was inspired by some code you can find here. But since we want to blindly close all the connections, our case is simpler:
  class DRb::DRbConn
    def self.close_all
      @mutex.synchronize do
        @pool.each {|c| c.close}
        @pool = []
      end
    end
  end

  DRb::DRbConn.close_all

DRb will happily reconnect by itself when needed after its connection pool has been emptied. Putting it all together, it looks like this:
if defined?(PhusionPassenger)
  # monkey patch drb so we can close its connections
  class DRb::DRbConn
    def self.close_all
      @mutex.synchronize do
        @pool.each {|c| c.close}
        @pool = []
      end
    end
  end

  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    if forked
      # We're in smart spawning mode.
      CACHE.reset  # memcached
      DRb::DRbConn.close_all  # ferret
    else
      # We're in conservative spawning mode. We don't need to do anything.
    end
  end
end
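The close-everything-then-reconnect-lazily idea behind close_all can be seen in isolation with a toy pool. This is only a sketch: FakeConnection and ConnectionPool are hypothetical stand-ins, not DRb classes, and nothing here touches DRb's internals.

```ruby
# Sketch only: a toy connection pool illustrating close_all.
# FakeConnection and ConnectionPool are hypothetical, not part of DRb.
class FakeConnection
  attr_reader :closed

  def initialize
    @closed = false
  end

  def close
    @closed = true
  end
end

class ConnectionPool
  def initialize
    @mutex = Mutex.new
    @pool = []
  end

  # Hand out a pooled connection, or open a new one if the pool is empty.
  def checkout
    @mutex.synchronize { @pool.pop || FakeConnection.new }
  end

  # Return a connection to the pool for reuse.
  def checkin(conn)
    @mutex.synchronize { @pool.push(conn) }
  end

  # Close every pooled connection and empty the pool, under the lock.
  def close_all
    @mutex.synchronize do
      @pool.each(&:close)
      @pool = []
    end
  end

  def size
    @mutex.synchronize { @pool.size }
  end
end

pool = ConnectionPool.new
inherited = pool.checkout
pool.checkin(inherited)
pool.close_all            # what the forked worker does on startup
fresh = pool.checkout     # a new connection is made on demand
puts inherited.closed
puts fresh.closed
```

Because the pool is simply emptied, the next checkout opens a fresh connection that belongs to the worker process alone.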

Breakage and fixage in Rails 2.2

Finally our app is completely Rails 2.2 ready. Some quick notes on some issues and things that needed to be fixed:
1. Default error messages
This call is no good any more:
ActiveRecord::Errors.default_error_messages
Use this instead:
I18n.translate('activerecord.errors.messages')
2. Use ActiveSupport::Inflector rather than Inflector
The warning tells you all you need to know:
DEPRECATION WARNING: Inflector is deprecated! 
Use ActiveSupport::Inflector instead.
3. Integration tests are broken if you are not using the cookie store
See here for details.  If you are seeing "NoMethodError: You have a nil object when you didn't expect it!" inexplicably from your integration tests, then this could be the issue. I ended up placing this in environments/test.rb, even though it should not be needed:
config.action_controller.session = { :session_key => "_myapp_session", 
  :secret => "some secret phrase of at least 30 characters" }
4. Use of string keys in assert_redirected_to in tests no longer works
Consider this code:
assert_redirected_to "host"=>"foobar.com", 
  "action"=>"something", "controller"=>"hw"
It used to work, but in Rails 2.2 it does not.  You need to use symbols for the keys, like this:
assert_redirected_to :host=>"foobar.com", 
  :action=>"something", :controller=>"hw"
I think the change was made in this commit.
5. render_partial is gone
If you are still using render_partial in places, you should replace it with render :partial=>"partial_name"
6. HAML is not yet compatible with Rails 2.2
There is a problem in HAML that causes output from calls to content_tag (and other tag helpers) to be lost.  See this thread for details. If you use HAML I do not recommend upgrading to Rails 2.2 until this issue has been sorted out.  However, it'll probably be fixed in a day or two, and the thread I linked to contains details of the patch I used if you need to fix it before that.
7. Components are deprecated
We had to rewrite some old code that was using components.  It actually wasn't too much effort in the end, and the resulting refactoring was an improvement anyway.
--
As ever our tests were extremely valuable during this process.  Once HAML is sorted out we will be upgrading our production server to Rails 2.2, so hopefully that will be in a day or so.

Slim-Attributes v0.5.0 released

I just released a new version of slim-attributes.  There are some small speed gains and some other minor changes from 0.4.1, but there are no big changes. Read more about slim-attributes at the slim-attributes homepage, or read on below.
Introduction
Slim-attributes is a small patch to the ActiveRecord MySQL adapter that stops Rails from immediately making Ruby strings from the column names and data from your database queries. Because you probably don't need them all! So Ruby strings are lazily created on demand - it's faster and uses less memory. And it drops directly in, requiring only the installation of a gem and adding one line to environment.rb. Measuring with just ActiveRecord code - fetching stuff from the database - we see anything up to a 50% (or more) speed increase, but it really depends on your system and environment, and what you are doing with the results from the database.  The more columns your tables have, the better the improvement will likely be.  Measure your own system and send me the results!
Installation
Try:
gem install slim-attributes -- --with-mysql-config
or:
gem install slim-attributes
then add this to environment.rb:
require 'slim_attributes'
Description
Normally the MySQL adapter in Rails returns a hash of the data returned from the database, one hash per active record object returned by the query. The routine that generates these hashes is called all_hashes, and this is what we replace. The reason for overriding all_hashes is threefold:
  • making a hash of each and every row returned from the database is slow
  • rails makes frozen copies of each column name string (for the keys) which results in a great many strings which are not really needed
  • we observe that it's not often that all the fields of rows fetched from the database are actually used
So this is an alternative implementation of all_hashes that returns a 'fake hash' which contains a hash of the column names (the same hash of names is used for every row), and also contains the row data in an area memcpy'd directly from the MySQL API (which is much faster than creating Ruby strings). The field contents are then instantiated into Ruby strings on demand - Ruby strings are only made if you need them - when you ask for a particular attribute from the model object. Note that if you always look at all the columns when you fetch data from the database then this won't necessarily be faster than the unpatched MySQL adapter.  But it won't be much slower either, and we do expect that most times not all the columns from a result set are accessed.
Future development
I speculate that further speed gains might be had through keeping the MySQL result objects from mysql-ruby around, and not copying the data from them at all until it is needed. However, mysql-ruby limits the non-freed result sets to just 20 before calling GC.start, so surgery inside mysql-ruby would be required to achieve this.
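The shape of the 'fake hash' can be imitated in plain Ruby to show the access pattern. This is only a sketch: the real gem does the work in C against the MySQL API, LazyRow is a hypothetical name, and in pure Ruby the raw values are already objects, so it demonstrates the laziness rather than the speed gain.

```ruby
# Sketch only: a pure-Ruby imitation of the lazy "fake hash" idea.
# LazyRow is a hypothetical class, not part of slim-attributes.
class LazyRow
  # column_index: Hash of column name => position, shared by every row
  # raw_fields:   Array holding this row's raw field values
  def initialize(column_index, raw_fields)
    @column_index = column_index
    @raw_fields = raw_fields
    @materialised = {}
  end

  # Build the Ruby string for a column only when it is first asked for.
  def [](name)
    return @materialised[name] if @materialised.key?(name)
    position = @column_index[name]
    return nil if position.nil?
    raw = @raw_fields[position]
    @materialised[name] = raw.nil? ? nil : raw.to_s
  end

  # How many columns have actually been turned into strings so far.
  def materialised_count
    @materialised.size
  end
end

columns = { "id" => 0, "name" => 1, "bio" => 2 }.freeze
row = LazyRow.new(columns, [1, "Stephen", "a long text column"])

puts row["name"]
puts row.materialised_count
```

Only the column that was read gets materialised; the untouched columns cost nothing beyond the raw data.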