Filed in
rails, ruby |
5 November, 2009
Here’s a little insight I had the other day.
If you use MyISAM tables, and there are various performance related reasons you might, then you’ve been stuck having to turn transactional fixtures off in your tests.
But unless you are using some special feature of MyISAM that is not present in InnoDB, then why not use InnoDB tables in your test database?
I wrote a simple plugin that has a rake task that clones the development database to the test while changing the ENGINE to InnoDB.
You just run the task, turn transactional fixtures on in test/test_helper.rb, and you’re off.
Result: our tests used to take 8.5 minutes to run. Now they take just 3 minutes. That’s about 65% less time than it used to take.
The plugin is here: http://github.com/sdsykes/fast_fixture
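As a rough illustration of the engine swap (a sketch under assumptions, not the plugin's actual code): if the development schema is available as a SQL dump, the ENGINE clause of each CREATE TABLE statement can be rewritten before the dump is loaded into the test database. The method name to_innodb and the sample dump are made up for this example.

```ruby
# Hypothetical helper: rewrite MyISAM table definitions in a SQL dump so the
# tables are created as InnoDB instead. Only the ENGINE clause is touched.
def to_innodb(sql_dump)
  sql_dump.gsub(/ENGINE\s*=\s*MyISAM/i, "ENGINE=InnoDB")
end

# Made-up schema fragment, in the style of mysqldump output.
dump = <<~SQL
  CREATE TABLE `users` (
    `id` int(11) NOT NULL auto_increment,
    PRIMARY KEY (`id`)
  ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
SQL

puts to_innodb(dump)
```

A plain text substitution like this would also rewrite the phrase inside string data, so it only suits schema-only dumps.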
Filed in
rails, ruby |
23 September, 2009
You may not know this, and I didn’t until recently, but you can place images directly in your img tags. Not just the address of them, but the whole binary data – base 64 encoded.
The technique is based on the Data URI scheme.
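To make the scheme concrete, here is a minimal sketch of how such a URI can be assembled in plain Ruby. The helper name data_uri is made up for illustration and is not part of any library; the input is just the six-byte GIF signature rather than a complete image.

```ruby
# Build a data URI from raw bytes. pack("m0") is Ruby's built-in base64
# encoder with line breaks switched off.
def data_uri(bytes, mime_type)
  "data:#{mime_type};base64,#{[bytes].pack('m0')}"
end

# Only the GIF signature, used here to keep the output short.
puts data_uri("GIF89a", "image/gif")
# → data:image/gif;base64,R0lGODlh
```

For a real image you would pass the file contents, for example File.binread("bullet.gif"), and place the result in the src attribute.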
Take a look at the source of Google News. Scroll down a bit and you should start to see image tags that contain base64 encoded data (well you will if you are on IE8 or another make of browser – not IE7). That’s what I’m talking about.
The technique is particularly suitable if you have small images that change often and you do not need the browser to cache them. In this case, the saved http connection and fetch that inlining the images affords you is a performance win.
In case you find it useful I have extended my FastImage series with FastImage Inline, which is a gem and rails plugin that will take care of inlining your images.
It’s simple to use – just like image_tag there is now a helper method inline_image_tag (and inline_image_path for image_path).
Example
inline_image_tag("bullet.gif")
Result for request from a data-uri capable browser:
<img alt="Bullet" src="data:image/gif;base64,R0lGODlhCAANAJECAAAAAP///////wA
AACH5BAEAAAIALAAAAAAIAA0AAAITlI+pyxgPI5gAUvruzJpfi0ViAQA7" />
Result for a non-capable browser (eg IE7 or below):
<img alt="Bullet" src="/images/bullet.gif?1206090639" />
Installation
Note that the FastImage gem must be installed first; check the requirements section below.
As a Rails plugin
./script/plugin install git://github.com/sdsykes/fastimage_inline.git
As a Gem
sudo gem install sdsykes-fastimage_inline -s http://gems.github.com
Install the gem as above, and configure it in your environment.rb file as below:
...
Rails::Initializer.run do |config|
...
config.gem "sdsykes-fastimage_inline", :lib=>"fastimage_inline"
...
end
...
Requirements
* FastImage http://github.com/sdsykes/fastimage
Browser support
All modern browsers support this technique except for IE versions 7 and below. This is still a major segment of the market of course, but as IE users migrate to IE 8 this will become less of a problem.
FastImage Inline uses a simple browser detection mechanism by looking at the user agent string. If the browser is known to not have support, or if we do not recognise it at all, we serve a normal image tag which includes the path to the image file in the src attribute. But if we know the browser can handle it, we send the image inline, and the browser won’t need to fetch it separately.
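The decision described above could look roughly like the following. This is a sketch under assumptions, not the gem's actual detection code: the method names and the list of recognised engines are invented for the example.

```ruby
# Hypothetical capability check: inline only for browsers we positively
# recognise; anything unknown gets the ordinary image tag.
def data_uri_capable?(user_agent)
  ua = user_agent.to_s
  if (match = ua.match(/MSIE (\d+)/))
    return match[1].to_i >= 8 # IE 7 and below cannot handle data URIs
  end
  !!(ua =~ /Gecko|AppleWebKit|Opera/)
end

# Pick the src value: the inline data URI if supported, else the file path.
def image_src(user_agent, path, data_uri)
  data_uri_capable?(user_agent) ? data_uri : path
end
```

Treating unknown user agents as not capable costs an extra request at worst, whereas the opposite mistake would leave the image broken.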
Limits
Reportedly IE8 will not handle data strings longer than 32k bytes. But it is probably unwise to inline images this big anyway. Google News serves images that are up to about 3.5k in length, and this seems a reasonable approach. However, FastImage Inline does not enforce any particular constraints; it is for you to decide.
FastImage Inline does not cache the images it has read – so every time an image is sent it will be read from disk. This feature may be added in a later release.
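If you want to enforce a limit yourself, the encoded size can be computed before any encoding is done, because base64 turns every 3 bytes into 4 characters. A minimal sketch follows; the helper names and the default limit are assumptions for illustration, and nothing like this is built into FastImage Inline.

```ruby
# Number of base64 characters needed for byte_count bytes (with padding).
def base64_length(byte_count)
  ((byte_count + 2) / 3) * 4
end

# True if the whole data URI, prefix included, fits within the limit.
# The 32k default mirrors the reported IE8 figure and is only an assumption.
def small_enough_to_inline?(byte_count, mime_type, limit = 32 * 1024)
  prefix = "data:#{mime_type};base64,"
  prefix.bytesize + base64_length(byte_count) <= limit
end
```

A much lower limit, closer to the few kilobytes Google News uses, is probably the more sensible setting in practice.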
Conclusion
Inlining images is not for everyone, but it’s a useful technique in your toolbox for optimising delivery of certain kinds of pages or content. For more information check the comprehensive list of advantages and disadvantages on the Data URI scheme wikipedia page.
Filed in
rails, ruby |
14 July, 2009
A roundup of some of my projects that may be of interest:
1. FastImage Resize
This builds on my work on FastImage to provide an image resize facility. The resize code calls libgd to do the work of resampling and resizing the image – this is a library that is very likely to be already installed on your system if it is some flavour of unix / linux or even OSX. And if not, it is very easy to install. This is a light and simple option if you don’t wish to install heavier libraries such as RMagick (which relies on ImageMagick or GraphicsMagick) or ImageScience (which relies on FreeImage).
2. Scrooge
This is a plugin and gem to optimise queries to the database based on a learning algorithm that looks at how the results of each query are used. I worked on this with Lourens Naudé earlier this year, and I will shortly make a minor release with a few further optimisations and tests. Try this if your database is slowing you down, but also see slim-attributes if you are using MySQL.
3. Read From Slave
A gem to force your database reads to a slave database while your writes go to the master. It’s fast and simple, it works a treat, and we have it in production use.
Filed in
rails, ruby |
11 June, 2009
I just released a gem to find image dimensions and type information fast. I have previously done some work in this area, but this is a much more comprehensive solution, and fixes problems with certain jpegs.
FastImage finds the size or type of an image given its uri by fetching as little as needed
The problem
Your app needs to find the size or type of an image. This could be for adding width and height attributes to an image tag, for adjusting layouts or overlays to fit an image or any other of dozens of reasons.
But the image is not locally stored – it’s on another asset server, or in the cloud – at Amazon S3 for example.
You don’t want to download the entire image to your app server – it could be many tens of kilobytes, or even megabytes just to get this information. For most image types, the size of the image is simply stored at the start of the file. For JPEG files it’s a little bit more complex, but even so you do not need to fetch most of the image to find the size.
FastImage does this minimal fetch for image types GIF, JPEG, PNG and BMP. And it doesn’t rely on installing external libraries such as RMagick (which relies on ImageMagick or GraphicsMagick) or ImageScience (which relies on FreeImage).
You only need supply the uri, and FastImage will do the rest.
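To show why so few bytes are needed, here is a sketch that reads the dimensions from the first bytes of a GIF or PNG. It is an illustration of the file layouts only, not FastImage's own code, and the method name size_from_header is made up.

```ruby
# Return [width, height] from the leading bytes of a GIF or PNG,
# or nil if the format is not recognised or too few bytes were supplied.
def size_from_header(bytes)
  header = bytes.b # treat the input as raw binary
  png_signature = "\x89PNG\r\n\x1a\n".b

  if header.start_with?("GIF87a".b, "GIF89a".b) && header.bytesize >= 10
    # GIF: 6-byte signature, then width and height as little-endian 16-bit.
    header.byteslice(6, 4).unpack("vv")
  elsif header.start_with?(png_signature) && header.bytesize >= 24
    # PNG: 8-byte signature, 4-byte chunk length, "IHDR", then width and
    # height as big-endian 32-bit integers.
    header.byteslice(16, 8).unpack("NN")
  end
end

gif_header = "GIF89a".b + [266, 56].pack("vv")
p size_from_header(gif_header)
# → [266, 56]
```

So 10 bytes are enough for a GIF and 24 for a PNG. JPEG is left out here because, as noted above, finding its size means walking through several segments first.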
Examples
require 'fastimage'
FastImage.size("http://stephensykes.com/images/ss.com_x.gif")
=> [266, 56] # width, height
FastImage.type("http://stephensykes.com/images/pngimage")
=> :png
Installation
Gem
sudo gem install sdsykes-fastimage -s http://gems.github.com
Rails
Install the gem as above, and configure it in your environment.rb file as below:
…
Rails::Initializer.run do |config|
…
config.gem "sdsykes-fastimage", :lib=>"fastimage"
…
end
…
Then you’re off – just use FastImage.size() and FastImage.type() in your code as in the examples.
Documentation
http://rdoc.info/projects/sdsykes/fastimage
Filed in
rails, ruby |
24 March, 2009
I took the trouble to port ferret to ruby 1.9.1 yesterday evening. I have it working on my mac.
Here’s a gem for you to try – I have labelled it 0.11.6.19. If you use it let me know how it runs, but it’s at your own risk, I haven’t extensively tested it.
[UPDATE: this gem has been updated 5th April 2009 - please test. There is also a fork at github]
I’ve made mostly simple changes in the code:
- Changed all struct RString -> ptr to use the RSTRING_PTR macro, except for cases where it was being used to add items to an array where rb_ary_store was used.
- Changed all struct RString -> len to use the RSTRING_LEN macro
- Changed all struct RArray -> ptr to use the RARRAY_PTR macro
- Changed all struct RArray -> len to use the RARRAY_LEN macro
- Removed manual adjustment of the len member of RArray. In fact ruby 1.9 stores small arrays of 3 items or less differently from larger ones, and this adds complexity. It is better to use the rb_ary_store method which will use the correct pointer and will keep the length in sync with the number of items in the array.
- Changed all struct RHash -> tbl to ntbl
- Removed references to rb_thread_critical
- Removed 4th argument from calls to rb_cvar_set
- Included ruby/re.h and not regex.h, and altered tokenizer code to correctly use the new regexp library
- Included ruby/st.h and not st.h
- Some other minor changes to error messages formats causing compiler warnings
By the way, acts_as_ferret also runs with some very minor surgery, Thomas von Deyen has a fork here.
Filed in
rails, ruby |
4 March, 2009
Rails 2.3 will be with us soon, so I took the time to update our app to be compatible. It’s a reasonably large app (26,000 LOC), so there were bound to be some issues.
The first thing to notice is that the PStore store for sessions has completely gone away. This means that saying something like config.action_controller.session_store = :p_store in your environment file will no longer work.
We don’t use the cookie store because our sessions can get bigger than the 4k limit in certain circumstances. So we use the memcache store on the production machines, and pstore on the dev and test machines. And we can’t do that any more, which is a shame as it worked well – particularly with Hongli Lai’s improvements.
The remaining options are DRb, Memcached, or SQL. We didn’t want to add complexity to our environments, so none of those looked attractive. So we ended up rewriting some of our code so that the cookie store would be usable in most cases. We’ll keep memcache as the store on the production systems though.
Talking of memcache, it seems we now need to include require 'memcache' in our production.rb file. It’s not automatically loaded before we want to configure it.
The rest of the problems weren’t with the app itself, but with the large number of failing and erroring tests due to changes in the Rails testing system.
Firstly, none of the unit tests were even running because they all inherited from Test::Unit::TestCase. In Rails 2.3 they need to inherit from ActiveSupport::TestCase.
Also make sure your test_helper.rb opens the right class:
ENV["RAILS_ENV"] = "test"
require File.expand_path(File.dirname(__FILE__) + "/../config/environment")
require 'test_help'
class ActiveSupport::TestCase
…
Next, if you were using assert_valid model_item you must change this to assert model_item.valid?, see here. No deprecation warning in 2.2 that it would be removed, but never mind, the fix is quite easy.
We have tests for our routing. They live in the unit tests – it’s handy to test all the routes in one place. But in 2.3 the assert_routing method has disappeared. In fact it’s just not automatically available in unit tests any more, you can retrieve it by doing this in your test class:
include ActionController::Assertions::RoutingAssertions
But the routing assertion also needs clean_backtrace which seems to be part of the ActionController::TestCase. We opted to just define it in test_helper.rb (for a quick fix, just add this code):
def clean_backtrace(&block)
  yield
rescue ActiveSupport::TestCase::Assertion => error
  framework_path = Regexp.new(File.expand_path("#{File.dirname(__FILE__)}/assertions"))
  error.backtrace.reject! {|line| File.expand_path(line) =~ framework_path }
  raise
end
We also test cookies in functional tests, and the usage has changed in 2.3. So you’ll need to check through those.
If you send multipart emails and have file fixtures (of the expected email contents) to test them, we noticed that instead of just saying Content-Type: text/plain in the header before the mime encoded parts, we now get Content-Type: text/plain; charset=iso-8859-1. Those need to be edited.
Finally, if you are using assert_select_email in your tests for your mailer classes, you will find it is also no longer available. The fast solution is to put include ActionController::Assertions::SelectorAssertions in your mailer test class.
We have worked around some of the issues presented to us with minimal changes to our code. It seems like Rails is encouraging us to organise our tests differently, particularly where functionality in ActionController::Assertions is no longer automatically available to unit tests. Working around this feels somewhat unclean, so we’ll take another look at whether tests should be moved or rewritten once the dust has settled on 2.3.
Filed in
rails, ruby |
2 March, 2009
The passenger manual makes it clear that you need to close and reestablish your connections to things like memcached after it forks to avoid inadvertently sharing file handles. The reason is well and clearly explained there.
The api to do this is simple – just place this kind of code in your environment.rb file:
if defined?(PhusionPassenger)
  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    if forked
      # We're in smart spawning mode.
      # Reestablish connections here.
    else
      # We're in conservative spawning mode. We don't need to do anything.
    end
  end
end
All well and good, but how exactly do you reestablish those connections?
In our case we have to deal with memcached and ferret (with ferret running in a DRb server via the acts_as_ferret plugin).
Memcached is dead easy:
CACHE.reset
Ferret not so easy. It turns out that DRb has no in-built way to close its pool of connections. So a monkey patch is the only thing to do. I was inspired by some code you can find here. But since we want to blindly close all the connections, our case is simpler:
class DRb::DRbConn
  def self.close_all
    @mutex.synchronize do
      @pool.each {|c| c.close}
      @pool = []
    end
  end
end
DRb::DRbConn.close_all
DRb will happily reconnect by itself when needed after its connection pool has been emptied.
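The pattern in that patch can be tried in isolation with a stand-in class. The sketch below does not touch DRb at all: SketchConnPool and SketchConn are invented names that only mirror the shape the patch relies on (a class-level @pool guarded by a class-level @mutex).

```ruby
# Stand-in for a connection pool guarded by a mutex. This is NOT DRb's
# class; it only imitates the structure the monkey patch depends on.
class SketchConnPool
  @mutex = Mutex.new
  @pool = []

  class << self
    attr_reader :pool

    # Return a connection to the pool.
    def checkin(conn)
      @mutex.synchronize { @pool << conn }
    end

    # Close every pooled connection and start again with an empty pool.
    def close_all
      @mutex.synchronize do
        @pool.each { |c| c.close }
        @pool = []
      end
    end
  end
end

# Hypothetical connection object that just records whether it was closed.
SketchConn = Struct.new(:closed) do
  def close
    self.closed = true
  end
end

conns = [SketchConn.new(false), SketchConn.new(false)]
conns.each { |c| SketchConnPool.checkin(c) }
SketchConnPool.close_all
```

Holding the mutex while closing means no other thread can check a connection out of the pool halfway through the reset.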
Putting it all together, it looks like this:
if defined?(PhusionPassenger)
  # monkey patch drb so we can close its connections
  class DRb::DRbConn
    def self.close_all
      @mutex.synchronize do
        @pool.each {|c| c.close}
        @pool = []
      end
    end
  end
  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    if forked
      # We're in smart spawning mode.
      CACHE.reset # memcached
      DRb::DRbConn.close_all # ferret
    else
      # We're in conservative spawning mode. We don't need to do anything.
    end
  end
end
Filed in
rails, ruby |
29 November, 2008
Finally our app is completely Rails 2.2 ready.
Some quick notes on some issues and things that needed to be fixed:
1. Default error messages
This call is no good any more
ActiveRecord::Errors.default_error_messages
Use this instead:
I18n.translate('activerecord.errors.messages')
2. Use ActiveSupport::Inflector rather than Inflector
The warning tells you all you need to know:
DEPRECATION WARNING: Inflector is deprecated!
Use ActiveSupport::Inflector instead.
3. Integration tests are broken if you are not using the cookie store
See here for details. If you are seeing “NoMethodError: You have a nil object when you didn’t expect it!” inexplicably from your integration tests, then this could be the issue.
I ended up placing this in environments/test.rb, even though it should not be needed:
config.action_controller.session = { :session_key => "_myapp_session",
:secret => "some secret phrase of at least 30 characters" }
4. Use of string keys in assert_redirected_to in tests no longer works
Consider this code:
assert_redirected_to "host"=>"foobar.com",
"action"=>"something", "controller"=>"hw"
It used to work, but in rails 2.2 it does not. You need to use symbols for the keys, like this:
assert_redirected_to :host=>"foobar.com",
:action=>"something", :controller=>"hw"
I think the change was made in this commit.
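I have not checked how the assertion compares the options internally, but plain Ruby shows why the two forms are easy to get wrong: a string key and a symbol key never match. If you have many such hashes to convert, a small helper can do it. The name symbolize_keys here is defined locally for the example.

```ruby
# Convert string keys to symbol keys (shallow, one level only).
def symbolize_keys(options)
  options.each_with_object({}) { |(key, value), result| result[key.to_sym] = value }
end

old_style = { "host" => "foobar.com", "action" => "something", "controller" => "hw" }
new_style = { :host => "foobar.com", :action => "something", :controller => "hw" }

puts old_style == new_style                 # false: "host" and :host are different keys
puts symbolize_keys(old_style) == new_style # true
```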
5. Render_partial is gone
If you are still using render_partial in places, you should replace it with render :partial=>"partial_name"
6. HAML is not yet compatible with Rails 2.2
There is a problem in HAML that causes output from calls to content_tag (and other tag helpers) to be lost. See this thread for details.
If you use HAML I do not recommend upgrading to rails 2.2 until this issue has been sorted out. However, it’ll probably be fixed in a day or two, and the thread I linked to contains details of the patch I used if you need to fix it before that.
7. Components are deprecated
We had to rewrite some old code that was using components. It actually wasn’t too much effort in the end, and the resulting refactoring was an improvement anyway.
–
As ever our tests were extremely valuable during this process. Once HAML is sorted out we will be upgrading our production server to Rails 2.2, so hopefully that will be in a day or so.
Filed in
rails, ruby |
14 October, 2008
I just released a new version of slim-attributes. There are some small speed gains and some other minor changes from 0.4.1, but there are no big changes.
Read more about slim-attributes at the slim-attributes homepage, or read on below.
Introduction
Slim-attributes is a small patch to the ActiveRecord Mysql adaptor that stops rails from immediately making ruby strings from the column names and data from your database queries. Because you probably don’t need them all!
So ruby strings are lazily created on demand – it’s faster and uses less memory. And it drops directly in, requiring only the installation of a gem and adding 1 line to environment.rb.
Measuring with just ActiveRecord code – fetching stuff from the database – we see anything up to a 50% (or more) speed increase, but it really depends on your system and environment, and what you are doing with the results from the database. The more columns your tables have, the better the improvement will likely be. Measure your own system and send me the results!
Installation
Try:
gem install slim-attributes -- --with-mysql-config
or:
gem install slim-attributes
then add this to environment.rb:
require 'slim_attributes'
Description
Normally the mysql adaptor in Rails returns a hash of the data returned from the database, one hash per active record object returned by the query. The routine that generates these hashes is called all_hashes, and this is what we replace. The reason for overriding all_hashes is threefold:
- making a hash of each and every row returned from the database is slow
- rails makes frozen copies of each column name string (for the keys) which results in a great many strings which are not really needed
- we observe that it’s not often that all the fields of rows fetched from the database are actually used
So this is an alternative implementation of all_hashes that returns a ‘fake hash’ which contains a hash of the column names (the same hash of names is used for every row), and also contains the row data in an area memcpy’d directly from the mysql API (which is much faster than creating ruby strings).
The field contents are then instantiated into Ruby strings on demand – ruby strings are only made if you need them – when you ask for a particular attribute from the model object.
Note that if you always look at all the columns when you fetch data from the database then this won’t necessarily be faster than the unpatched mysql adapter. But it won’t be much slower either, and we do expect that most times not all the columns from a result set are accessed.
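The idea can be sketched in pure Ruby. The real implementation is a C extension working on data copied from the mysql API, so this is only an analogy; the class name LazyRow and the counter are invented for the example.

```ruby
# Sketch of a "fake hash" row: the column index is shared between rows and
# Ruby strings are only built for the fields that are actually read.
class LazyRow
  attr_reader :materialized_count

  def initialize(column_index, raw_fields)
    @column_index = column_index # shared { "name" => position } hash
    @raw_fields = raw_fields     # stand-in for the copied row buffer
    @cache = {}
    @materialized_count = 0
  end

  # Build the string on first access, then serve it from the cache.
  def [](column)
    position = @column_index.fetch(column)
    @cache.fetch(position) do
      @materialized_count += 1
      @cache[position] = @raw_fields[position].dup
    end
  end
end

columns = { "id" => 0, "name" => 1, "email" => 2 }.freeze
row = LazyRow.new(columns, ["1", "Alice", "alice@example.com"])
row["name"]
row["name"]
puts row.materialized_count
# → 1
```

Here three columns were fetched but only one string was ever created, which is the saving the patch aims for on wide tables.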
Future development
I speculate that further speed gains might be had through keeping the mysql result objects from mysql-ruby around, and not copying the data from them at all until it is needed. However, mysql-ruby limits the non freed result sets to just 20 before calling GC.start, so surgery inside mysql-ruby would be required to achieve this.
Filed in
rails, ruby |
6 October, 2008
I was reading what Paul Barry had to say about splitting models into smaller files. It resonated with me a little – some of our models are approaching 1000 lines.
But I felt the name ‘concerned_with’ did not fully / appropriately describe what is being done, and that there should be an easier way than having to specify every file to be required.
So I ended up modifying the code to be a little easier to use. If you place it in an initializer (i.e. in a file in your initializers directory), then you can specify in your model that you wish to require all the files from a subdirectory of the same name as the model.
So if your model is called Customer, then the model file is customer.rb. Now you can also have a subdirectory called customer that contains further files containing model code.
In the original model class file you should add require_class_subdirectory to it, like this:
class Customer
require_class_subdirectory
...
end
This will cause all the files in the subdirectory to be required.
In each file in the subdirectory you should open the model class like so:
class Customer
def something # you can cut/paste code in from the main model file
end
...
end
The filenames you use don’t matter – in the above case it could be ‘something.rb’ for instance.
So, to recap, your main class file customer.rb has ‘require_class_subdirectory’ added to it. You create a folder called ‘customer’ in your models directory, and place some .rb files in there. In each of those files you re-open the class (‘class Customer’) and place code there just as if you were writing into the main class file.
This allows you to separate code according to function within a model, and to keep file sizes manageable.
Here is the code to put in the initializer:
class << ActiveRecord::Base
def require_class_subdirectory
ActiveSupport::Dependencies.load_paths.select{|lp| lp =~ /app\/models/}.each do |path|
Dir["#{path}/#{name.underscore}/*.rb"].each do |filename|
require_dependency "#{name.underscore}/#{File.basename(filename)}"
end
end
end
end
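To try the idea outside Rails, here is a self-contained sketch that uses a temporary directory and plain load in place of require_dependency. The helper name load_class_subdirectory, the file name billing.rb and the method billing_name are all made up for the example.

```ruby
require "tmpdir"

# Hypothetical stand-in for the Rails version: loads every .rb file found in
# <models_dir>/<underscored class name>/ using plain `load`.
def load_class_subdirectory(models_dir, class_name)
  subdir = class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  Dir[File.join(models_dir, subdir, "*.rb")].sort.each { |file| load file }
end

Dir.mktmpdir do |models_dir|
  # Create customer/billing.rb, which re-opens the class to add a method.
  Dir.mkdir(File.join(models_dir, "customer"))
  File.write(File.join(models_dir, "customer", "billing.rb"), <<~RUBY)
    class Customer
      def billing_name
        "billing"
      end
    end
  RUBY

  load_class_subdirectory(models_dir, "Customer")
end

puts Customer.new.billing_name
# → billing
```

In Rails itself require_dependency is the better fit, since it takes part in the dependency reloading done in development mode.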
Other approaches to this problem are possible. In particular it may be feasible to patch or hook the constant missing mechanism in rails to automatically load the files in the subdirectory, which would remove the need for the require_class_subdirectory line in your main model file.
Finally, not everyone thinks this is a problem that needs to be solved. My colleague who uses Aptana says it has a good outline mode, which makes it easier to work with one large file for a model than with lots of smaller ones. In TextMate I find the smaller files easier.