<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Speed up mysql in rails</title>
	<atom:link href="http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/feed/" rel="self" type="application/rss+xml" />
	<link>http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/</link>
	<description>A blog about Ruby, Rails and other tech.  Mostly.</description>
	<lastBuildDate>Thu, 02 Sep 2010 19:56:26 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: roger faith</title>
		<link>http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/comment-page-1/#comment-323</link>
		<dc:creator>roger faith</dc:creator>
		<pubDate>Thu, 09 Oct 2008 23:37:32 +0000</pubDate>
		<guid isPermaLink="false">http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/#comment-323</guid>
		<description>Wow that was fast :)
That makes me wary of ruby_xmalloc. Maybe it wasn&#039;t such a great idea :)

Another further thought would be:
imagine that a db row will only have one column accessed for it [quite common for web driven stuff--you want the content of the blog entry, or the name of the school, and that&#039;s all you want].

Could be optimized maybe.
ex: have a flag per row which says &quot;the cache has ever been hit&quot; then check that each time--if not you can avoid the cache check [or does it already do this?]
Maybe you could cache the index of the most recently requested column.
Anyway just thinking out loud.
Thanks for your help.  slim_attributes is WAY faster than even a C based all_hashes.
-=R</description>
		<content:encoded><![CDATA[<p>Wow that was fast :)<br />
That makes me wary of ruby_xmalloc. Maybe it wasn&#8217;t such a great idea :)</p>
<p>Another further thought would be:<br />
imagine that a db row will only have one column accessed for it [quite common for web driven stuff--you want the content of the blog entry, or the name of the school, and that's all you want].</p>
<p>Could be optimized maybe.<br />
ex: have a flag per row which says &#8220;the cache has ever been hit&#8221; then check that each time&#8211;if not you can avoid the cache check [or does it already do this?]<br />
Maybe you could cache the index of the most recently requested column.<br />
Anyway just thinking out loud.<br />
Thanks for your help.  slim_attributes is WAY faster than even a C based all_hashes.<br />
-=R</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: roger faith</title>
		<link>http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/comment-page-1/#comment-321</link>
		<dc:creator>roger faith</dc:creator>
		<pubDate>Thu, 09 Oct 2008 15:23:26 +0000</pubDate>
		<guid isPermaLink="false">http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/#comment-321</guid>
		<description>Some comments on the gem:

Just throwing these out:
instead of using malloc you may want to use ruby_xmalloc [so ruby can keep track of the memory growth]
Also, for a small speedup you may want to cache the ids for your &quot;@variables&quot; and then can call rb_ivar_set

Looks good though.  I might roll it into an upcoming version of mysqlplus :)
-=R</description>
		<content:encoded><![CDATA[<p>Some comments on the gem:</p>
<p>Just throwing these out:<br />
instead of using malloc you may want to use ruby_xmalloc [so ruby can keep track of the memory growth]<br />
Also, for a small speedup you may want to cache the ids for your &#8220;@variables&#8221; and then can call rb_ivar_set</p>
<p>Looks good though.  I might roll it into an upcoming version of mysqlplus :)<br />
-=R</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stephen Sykes</title>
		<link>http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/comment-page-1/#comment-200</link>
		<dc:creator>Stephen Sykes</dc:creator>
		<pubDate>Sun, 31 Aug 2008 17:45:54 +0000</pubDate>
		<guid isPermaLink="false">http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/#comment-200</guid>
		<description>Thanks for the info.  I advise you to use the gem - it works perfectly with 2.1
http://rubyforge.org/frs/?group_id=5954</description>
		<content:encoded><![CDATA[<p>Thanks for the info.  I advise you to use the gem &#8211; it works perfectly with 2.1<br />
<a href="http://rubyforge.org/frs/?group_id=5954" rel="nofollow">http://rubyforge.org/frs/?group_id=5954</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: roger faith</title>
		<link>http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/comment-page-1/#comment-197</link>
		<dc:creator>roger faith</dc:creator>
		<pubDate>Sat, 30 Aug 2008 04:25:27 +0000</pubDate>
		<guid isPermaLink="false">http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/#comment-197</guid>
		<description>Re: Note that the mysql patch by Stefan Kaes
has been basically incorporated into the Mysql lib as of version
2.8pre1

MysqlPlus is a fork of 2.8pre4, so also incorporates it.

Unfortunately the current gem is an earlier version and thus still creates more garbage than necessary.  But anyway if people do want it they can upgrade.

I&#039;m hoping that the rails team will help with the slow rendering time. One can hope.

Also does this plugin work with 2.1?
Thanks!
-=R</description>
		<content:encoded><![CDATA[<p>Re: Note that the mysql patch by Stefan Kaes<br />
has been basically incorporated into the Mysql lib as of version<br />
2.8pre1</p>
<p>MysqlPlus is a fork of 2.8pre4, so also incorporates it.</p>
<p>Unfortunately the current gem is an earlier version and thus still creates more garbage than necessary.  But anyway if people do want it they can upgrade.</p>
<p>I&#8217;m hoping that the rails team will help with the slow rendering time. One can hope.</p>
<p>Also does this plugin work with 2.1?<br />
Thanks!<br />
-=R</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stephen Sykes</title>
		<link>http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/comment-page-1/#comment-130</link>
		<dc:creator>Stephen Sykes</dc:creator>
		<pubDate>Sun, 03 Aug 2008 10:40:07 +0000</pubDate>
		<guid isPermaLink="false">http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/#comment-130</guid>
		<description>Ok, I had a look at our caching code - I won&#039;t post it because it&#039;s rather too complex for what you need because we also use the same code to cache external resources.

I agree there may be some performance value in checking updated_at in C before anything is set up as ruby objects so an in-process cache can be used, but I fear it won&#039;t just drop-in like slim-attributes does now.

Anyway, we find that only part of the performance issue is solved by doing things at the AR level.  Much time is also spent processing and rendering.  So we also try to avoid this by using cached fragments whose expiry is tied to particular DB models or DB rows (whatever you specify).

If an object changes or any object in a model changes then we automatically invalidate a bunch of cached items as appropriate (done via after_save hooks).  It&#039;s all automatic, and we know whether we need to run controller code also based on whether the view cache is valid or not.

For this we employ a home grown system that is strikingly similar to this, although they were developed independently:
http://blog.evanweaver.com/files/doc/fauna/interlock/files/README.html</description>
		<content:encoded><![CDATA[<p>Ok, I had a look at our caching code &#8211; I won&#8217;t post it because it&#8217;s rather too complex for what you need because we also use the same code to cache external resources.</p>
<p>I agree there may be some performance value in checking updated_at in C before anything is set up as ruby objects so an in-process cache can be used, but I fear it won&#8217;t just drop-in like slim-attributes does now.</p>
<p>Anyway, we find that only part of the performance issue is solved by doing things at the AR level.  Much time is also spent processing and rendering.  So we also try to avoid this by using cached fragments whose expiry is tied to particular DB models or DB rows (whatever you specify).</p>
<p>If an object changes or any object in a model changes then we automatically invalidate a bunch of cached items as appropriate (done via after_save hooks).  It&#8217;s all automatic, and we know whether we need to run controller code also based on whether the view cache is valid or not.</p>
<p>For this we employ a home grown system that is strikingly similar to this, although they were developed independently:<br />
<a href="http://blog.evanweaver.com/files/doc/fauna/interlock/files/README.html" rel="nofollow">http://blog.evanweaver.com/files/doc/fauna/interlock/files/README.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: roger</title>
		<link>http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/comment-page-1/#comment-126</link>
		<dc:creator>roger</dc:creator>
		<pubDate>Sat, 02 Aug 2008 21:48:55 +0000</pubDate>
		<guid isPermaLink="false">http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/#comment-126</guid>
		<description>Sure go for it.
Yeah appropriate expiry would be a concern, especially for changing tables.  I didn&#039;t worry about it too much because our tables only change like once a day, so a restart would work :)

Other options would be: tracking somehow the time of the last write to each table [memcache? mysql?] to know when to invalidate local caches--which would be annoying because of the setup involved.

Possibly use an updated_at column or compute a hash or checksum of each row to make sure it matches our cached version [in C, or use mysql&#039;s md5()].

In terms of fitting it into RAM, it appears that with our app, rails by default uses 50MB, and with all of the &quot;common&quot; objects in memory it uses 74MB.  So a little bit of a hit but probably not too bad.  The mongrels tend to grow to about 100MB total, over time, anyway, so it&#039;s not out of scope.  Also we could use an LRU cache to limit memory used.

With regard to a time expiry, that&#039;s also a possibility [1].

An interesting thought is that if you only had one mongrel and only updated your DB through AR, you could just LRU cache everything you ever use.  And never worry about having to refresh them.  Which would be scary but imagine the speedup :)

Just some ideas.  For all I know somebody has written this plugin I just don&#039;t know about it :)

Take care.
-R
[1] http://www.nongnu.org/pupa/ruby-cache-README.html</description>
		<content:encoded><![CDATA[<p>Sure go for it.<br />
Yeah appropriate expiry would be a concern, especially for changing tables.  I didn&#8217;t worry about it too much because our tables only change like once a day, so a restart would work :)</p>
<p>Other options would be: tracking somehow the time of the last write to each table [memcache? mysql?] to know when to invalidate local caches&#8211;which would be annoying because of the setup involved.</p>
<p>Possibly use an updated_at column or compute a hash or checksum of each row to make sure it matches our cached version [in C, or use mysql's md5()].</p>
<p>In terms of fitting it into RAM, it appears that with our app, rails by default uses 50MB, and with all of the &#8220;common&#8221; objects in memory it uses 74MB.  So a little bit of a hit but probably not too bad.  The mongrels tend to grow to about 100MB total, over time, anyway, so it&#8217;s not out of scope.  Also we could use an LRU cache to limit memory used.</p>
<p>With regard to a time expiry, that&#8217;s also a possibility [1].</p>
<p>An interesting thought is that if you only had one mongrel and only updated your DB through AR, you could just LRU cache everything you ever use.  And never worry about having to refresh them.  Which would be scary but imagine the speedup :)</p>
<p>Just some ideas.  For all I know somebody has written this plugin I just don&#8217;t know about it :)</p>
<p>Take care.<br />
-R<br />
[1] <a href="http://www.nongnu.org/pupa/ruby-cache-README.html" rel="nofollow">http://www.nongnu.org/pupa/ruby-cache-README.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stephen Sykes</title>
		<link>http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/comment-page-1/#comment-125</link>
		<dc:creator>Stephen Sykes</dc:creator>
		<pubDate>Sat, 02 Aug 2008 17:49:33 +0000</pubDate>
		<guid isPermaLink="false">http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/#comment-125</guid>
		<description>Well your number 1 problem is deciding when to expire your cached objects (particularly if you run multiple instances - eg mongrels or thins).  IF expiry can be handled in a simple way for your case, and IF storing all your objects in-process does not use too much RAM (don&#039;t forget the AR associations may be stored there too) then this could be a good approach.

In our app we do, for instance, cache in memory the results of queries against our Countries table.  For various reasons the available countries we wish to show do in fact change, but not that often.  We use a 5 minute timeout for this cache, which means that if we change something it will show up reasonably quickly.  I&#039;ll post the code if you are interested.</description>
		<content:encoded><![CDATA[<p>Well your number 1 problem is deciding when to expire your cached objects (particularly if you run multiple instances &#8211; eg mongrels or thins).  IF expiry can be handled in a simple way for your case, and IF storing all your objects in-process does not use too much RAM (don&#8217;t forget the AR associations may be stored there too) then this could be a good approach.</p>
<p>In our app we do, for instance, cache in memory the results of queries against our Countries table.  For various reasons the available countries we wish to show do in fact change, but not that often.  We use a 5 minute timeout for this cache, which means that if we change something it will show up reasonably quickly.  I&#8217;ll post the code if you are interested.</p>
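<p>A minimal sketch of such a timeout cache, assuming a hypothetical TimedCache class held in each process. The class name, the injectable clock and the fetch interface are illustrative assumptions, not the code from the app described above.</p>
<pre>
```ruby
# Hypothetical in-process cache with a fixed time-to-live.
# All names here are illustrative assumptions.
class TimedCache
  def initialize(ttl_seconds, clock = -> { Time.now.to_f })
    @ttl = ttl_seconds
    @clock = clock   # injectable so expiry can be exercised without sleeping
    @store = {}      # key maps to [value, stored_at]
  end

  # Return the cached value, or run the block and cache its result
  # when the entry is missing or at least ttl_seconds old.
  def fetch(key)
    value, stored_at = @store[key]
    if stored_at.nil? || @clock.call - stored_at >= @ttl
      value = yield
      @store[key] = [value, @clock.call]
    end
    value
  end
end
```
</pre>
<p>With a 300 second TTL, a call such as cache.fetch(:countries) { ... } would run the query in the block at most once every five minutes per process. Each mongrel holds its own copy, so a change can take up to the full timeout to appear everywhere.</p>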
]]></content:encoded>
	</item>
	<item>
		<title>By: roger</title>
		<link>http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/comment-page-1/#comment-124</link>
		<dc:creator>roger</dc:creator>
		<pubDate>Sat, 02 Aug 2008 12:12:06 +0000</pubDate>
		<guid isPermaLink="false">http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/#comment-124</guid>
		<description>Very interesting about the results in conjunction and compared to hash_extension.  I guess what matters is if the number of columns received is large and also if any of those columns are, for example, extremely large [and unused], both of which would seem to benefit more from slim_attributes.

My latest thought with this style speedup is I wonder if you can&#039;t do something like object caching...
background:
if I remember correctly,
def action_name
  @@my_setting &#124;&#124;= Setting.find(:some_thing)
end
is actually quite fast for every run [except the first :P ]

Furthermore, at least where I work, we have a few small&#039;ish tables that we constantly query for &#039;similar&#039; data.  Say one table with 100 rows and another with 3000 are basically queried over and over, but typically with differing queries and different elements returned.

Also note that, as mentioned with hash_extension, lots of time you just want to view an AR object, not edit it.

So I wonder if there&#039;s an object cache around somewhere that does something like the following.
a = Program.find(:some_conditions, :cache_the_individual_objects =&gt; true)
b = Program.find(:some_conditions_which_will_make_it_share_some_objects_with_a, :cache_the_individual_objects =&gt; true)

When it runs b, it gets back the results, then for each entry in the results, instead of immediately creating a new AR object, it first checks to see if that exact object is already existing somewhere [pre-cached].  If it is then just use it [avoid creating an AR object--just re-use the existing one].

Now throw this all into an LRU cache to not cache too much and, for some queries from a common small set, I could see this resulting in a speedup.  

I suppose there could even be further optimizations, like querying &#039;only the id&#039; from the DB [then figuring out which ones aren&#039;t cached, creating objects for them] and there could be other optimizations like a C based &#039;heap based&#039; cache, freezing the AR instance attributes, etc.

Thoughts?
-R</description>
		<content:encoded><![CDATA[<p>Very interesting about the results in conjunction and compared to hash_extension.  I guess what matters is if the number of columns received is large and also if any of those columns are, for example, extremely large [and unused], both of which would seem to benefit more from slim_attributes.</p>
<p>My latest thought with this style speedup is I wonder if you can&#8217;t do something like object caching&#8230;<br />
background:<br />
if I remember correctly,<br />
def action_name<br />
  @@my_setting ||= Setting.find(:some_thing)<br />
end<br />
is actually quite fast for every run [except the first :P ]</p>
<p>Furthermore, at least where I work, we have a few small&#8217;ish tables that we constantly query for &#8216;similar&#8217; data.  Say one table with 100 rows and another with 3000 are basically queried over and over, but typically with differing queries and different elements returned.</p>
<p>Also note that, as mentioned with hash_extension, lots of time you just want to view an AR object, not edit it.</p>
<p>So I wonder if there&#8217;s an object cache around somewhere that does something like the following.<br />
a = Program.find(:some_conditions, :cache_the_individual_objects =&gt; true)<br />
b = Program.find(:some_conditions_which_will_make_it_share_some_objects_with_a, :cache_the_individual_objects =&gt; true)</p>
<p>When it runs b, it gets back the results, then for each entry in the results, instead of immediately creating a new AR object, it first checks to see if that exact object is already existing somewhere [pre-cached].  If it is then just use it [avoid creating an AR object--just re-use the existing one].</p>
<p>Now throw this all into an LRU cache to not cache too much and, for some queries from a common small set, I could see this resulting in a speedup.  </p>
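<p>The reuse-plus-LRU idea above could be sketched roughly as follows. LruObjectCache and its fetch method are hypothetical names, and the block stands in for building an AR object from a result row. Nothing here is an existing ActiveRecord option.</p>
<pre>
```ruby
# Hypothetical LRU cache of row objects keyed by id (illustrative only).
# Ruby hashes keep insertion order, so the first entry is always the
# least recently used one.
class LruObjectCache
  def initialize(max_size)
    @max_size = max_size
    @objects = {}
  end

  # Return the already-built object for id, or build it with the block.
  def fetch(id)
    if @objects.key?(id)
      obj = @objects.delete(id)   # re-inserted below to mark it as recent
    else
      obj = yield
      # Evict the least recently used entry once the cache is full.
      @objects.shift if @objects.size >= @max_size
    end
    @objects[id] = obj
  end

  def size
    @objects.size
  end
end
```
</pre>
<p>For each row of a result set, cache.fetch(row_id) { build_object(row) } would then hand back the existing instance when there is one; build_object is a placeholder for whatever creates the object.</p>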
<p>I suppose there could even be further optimizations, like querying &#8216;only the id&#8217; from the DB [then figuring out which ones aren't cached, creating objects for them] and there could be other optimizations like a C based &#8216;heap based&#8217; cache, freezing the AR instance attributes, etc.</p>
<p>Thoughts?<br />
-R</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stephen Sykes</title>
		<link>http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/comment-page-1/#comment-86</link>
		<dc:creator>Stephen Sykes</dc:creator>
		<pubDate>Fri, 02 May 2008 08:35:26 +0000</pubDate>
		<guid isPermaLink="false">http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/#comment-86</guid>
		<description>Dan: I will email to you directly the code I am using to benchmark against hash_extension.

The meat of it is like this (there are 100 items in the DB):

&lt;pre&gt;
# Load every row on each pass, then read three attributes from one of them
2000.times do &#124;n&#124;
  product = Product.find(:all)[n % 100]
  x = product.name
  y = product.comment
  z = product.created_at
end
&lt;/pre&gt;

Just running it now gives these figures:

Without slim-attributes: 11.133781s
Without slim-attributes with hash_extension: 9.673203s
With slim-attributes: 6.658067s
With slim-attributes with hash_extension: 5.422445s

Without slim-attributes - hash_extension improvement: 13.12%
With slim-attributes - hash_extension improvement: 18.56%
Without hash_extension - slim-attributes improvement: 40.20%
With hash_extension - slim-attributes improvement: 43.94%
With hash_extension &amp; slim-attributes improvement over plain AR: 51.30%

The table used has over 40 columns - this is realistic for our live application.  If you use a smaller table then hash_extension begins to make a bigger difference than slim-attributes.

I will look at writing a postgresql implementation, it shouldn&#039;t be too hard to port.

Roger: hash_extension naturally uses slim-attributes when slim-attributes is installed because it uses connection.select_all.  This is what eventually calls all_hashes.

The benefit in hash_extension is avoiding the active record object creation.  The benefit of slim-attributes is avoiding the creation of a hash, and that the attributes are lazily created as ruby objects on demand.  And you get ultimate performance by combining the two - assuming you don&#039;t need active record objects.</description>
		<content:encoded><![CDATA[<p>Dan: I will email to you directly the code I am using to benchmark against hash_extension.</p>
<p>The meat of it is like this (there are 100 items in the DB):</p>
<pre>
# Load every row on each pass, then read three attributes from one of them
2000.times do |n|
  product = Product.find(:all)[n % 100]
  x = product.name
  y = product.comment
  z = product.created_at
end
</pre>
<p>Just running it now gives these figures:</p>
<p>Without slim-attributes: 11.133781s<br />
Without slim-attributes with hash_extension: 9.673203s<br />
With slim-attributes: 6.658067s<br />
With slim-attributes with hash_extension: 5.422445s</p>
<p>Without slim-attributes &#8211; hash_extension improvement: 13.12%<br />
With slim-attributes &#8211; hash_extension improvement: 18.56%<br />
Without hash_extension &#8211; slim-attributes improvement: 40.20%<br />
With hash_extension &#8211; slim-attributes improvement: 43.94%<br />
With hash_extension &#038; slim-attributes improvement over plain AR: 51.30%</p>
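<p>The percentages above follow directly from the four timings. A small sketch of the arithmetic, where improvement is a hypothetical helper name and the numbers are copied from the figures above:</p>
<pre>
```ruby
# Timings in seconds, copied from the benchmark figures.
plain     = 11.133781   # without slim-attributes
hash_ext  = 9.673203    # hash_extension only
slim      = 6.658067    # slim-attributes only
slim_hash = 5.422445    # both together

# Percentage of the baseline time saved by the faster variant.
def improvement(baseline, faster)
  ((baseline - faster) / baseline * 100).round(2)
end

puts improvement(plain, hash_ext)    # hash_extension, without slim-attributes
puts improvement(slim, slim_hash)    # hash_extension, with slim-attributes
puts improvement(plain, slim)        # slim-attributes, without hash_extension
puts improvement(hash_ext, slim_hash) # slim-attributes, with hash_extension
puts improvement(plain, slim_hash)   # both, compared with plain AR
```
</pre>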
<p>The table used has over 40 columns &#8211; this is realistic for our live application.  If you use a smaller table then hash_extension begins to make a bigger difference than slim-attributes.</p>
<p>I will look at writing a postgresql implementation, it shouldn&#8217;t be too hard to port.</p>
<p>Roger: hash_extension naturally uses slim-attributes when slim-attributes is installed because it uses connection.select_all.  This is what eventually calls all_hashes.</p>
<p>The benefit in hash_extension is avoiding the active record object creation.  The benefit of slim-attributes is avoiding the creation of a hash, and that the attributes are lazily created as ruby objects on demand.  And you get ultimate performance by combining the two &#8211; assuming you don&#8217;t need active record objects.</p>
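<p>As a rough Ruby illustration of the lazy part only: slim-attributes itself is a C extension, so the LazyRow class, its fields and the array of strings standing in for the driver&#8217;s row buffer are all hypothetical.</p>
<pre>
```ruby
# Hypothetical illustration of creating attribute objects on demand.
# An array of strings stands in for the database driver's row buffer.
class LazyRow
  attr_reader :materialised_count

  def initialize(column_names, raw_values)
    @index = {}
    column_names.each_with_index { |name, i| @index[name] = i }
    @raw = raw_values
    @cache = {}               # holds only the columns that were read
    @materialised_count = 0
  end

  # Build the Ruby object for a column on first access, then reuse it.
  def [](name)
    return @cache[name] if @cache.key?(name)
    i = @index.fetch(name)
    @materialised_count += 1
    @cache[name] = @raw[i].dup
  end
end
```
</pre>
<p>On a table with over 40 columns where only three attributes are read, a row like this would create three values rather than a full hash of all of them.</p>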
]]></content:encoded>
	</item>
	<item>
		<title>By: Roger</title>
		<link>http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/comment-page-1/#comment-85</link>
		<dc:creator>Roger</dc:creator>
		<pubDate>Thu, 01 May 2008 21:37:12 +0000</pubDate>
		<guid isPermaLink="false">http://pennysmalls.com/2008/04/02/speed-up-mysql-in-rails/#comment-85</guid>
		<description>How were you able to &#039;combine&#039; hash_extension + this?

Also is there a benefit of using hash_extension in that &#039;later&#039; accesses are faster?  [like calling instance.name would be faster using hashes].  Thoughts?</description>
		<content:encoded><![CDATA[<p>How were you able to &#8216;combine&#8217; hash_extension + this?</p>
<p>Also is there a benefit of using hash_extension in that &#8216;later&#8217; accesses are faster?  [like calling instance.name would be faster using hashes].  Thoughts?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
