Amazon DynamoDB - SSD backed non-relational DB is here!

So - pretty exciting stuff, for us at least - Amazon DynamoDB (previously mentioned here) has now seen the light of day and is appearing within the console. Instructions are linked from within the console.

We're already getting cracking on giving it a good play - we'll report back findings of interest!


We're hiring!

Bucking the trend of interplanetary meltdown, we're looking for some 'awesome' people to join our team. We're fleshing out the full details right now and will be posting ads in all the predictable places in the next few weeks, but in the meantime if you fall into any of the following job descriptions and are interested in applying send your CV and examples of your work through to .

These are all permanent positions. Salaries are to be decided based on experience and skills but will be based on market rates, as will the overall package on offer. We're currently based in Fareham, Hampshire but may also be relocating soon - wherever we go will be within 15 miles and will be near a mainline station. So if you're currently London based then you'll need to suffer a nice commute (we've all done it the opposite way round!) or perhaps a nice relocation to the Hampshire countryside.

Openings:

User Interface/Web Designer

We need someone with strong skills in designing web based interfaces and graphics. Our clients require designs that are clean, professional and modern with a sprinkling of innovation and spark. We're also interested in people who dabble in pixel art as it's a current favourite of ours and will give us a head start on some of our current internal projects!

Ruby Developer (x2)

We predominantly work with Ruby based applications. Whether it be Rails, Sinatra or anything else, we need someone proficient in Ruby as a language with a strong foundation in clean, sensible programming. The more skills you have, the more chance you have of getting a job with us - make sure your CVs are full of the things you can do rather than the names of as many technologies as you can think of!

Experience is beneficial, but if you can demonstrate a keen background in the scene or can bring along a strong portfolio of homegrown projects for review then we're happy to consider you. Qualifications are secondary to skill, attitude and enthusiasm.

We are also happy to accept interns/gap year students falling into any of the above skillsets - send us your details and we'll take a look!


Revisited: Tamper-proof cookies in Rails 3

Here's a revisited post that's fairly short and sweet: way back in 2008 I blogged about my implementation of tamper-proof cookies which used a similar technique to that used by Rails for its cookie-based session store. Back then the solution involved a custom cookie jar, the OpenSSL library to generate a HMAC, overriding the ApplicationController#cookies method and a slightly unorthodox method signature for reading cookie values.
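To illustrate the general idea (this is not the code from either post), here is a minimal Python sketch of signing a cookie value with an HMAC and checking it on the way back in. The secret, the `--` separator and the function names are all assumptions made for the example.

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical secret; a real app would load this from config


def sign(value: str, secret: bytes = SECRET) -> str:
    """Return 'value--digest', where digest is an HMAC-SHA256 of the value."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{value}--{digest}"


def verify(cookie: str, secret: bytes = SECRET):
    """Return the original value if the signature matches, otherwise None."""
    value, sep, digest = cookie.rpartition("--")
    if not sep:
        return None  # no signature present at all
    expected = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences
    if hmac.compare_digest(digest.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None
```

A value that has been altered after signing no longer matches its digest, so `verify` returns `None` and the cookie can be discarded.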

Read more of this entry

Revisited: roll your own pagination links with will_paginate and Rails 3

With a final release of Rails 3 edging closer every day it seems like a good time to revisit some of my old articles from the last few years and bring them up to date.

Back in the summer of 2008 I wrote about custom link renderers using will_paginate and, as it is still one of the most popular posts on the blog, it’s the one I’ve decided to refresh first. Don’t worry if you haven’t read the original article as I’ll be covering the same things here. So without further ado, let’s get stuck in!

Read more of this entry

Rails 2.3.8, Rack 1.1 and the curious case of the missing quotes

If you're using Rails 2.3.8 for your application and thought that you were safe after May's comedy of errors produced three point updates in as many days, think again. Unfortunately there's a little bug that can lead to parameters being altered or potentially even truncated without warning.

Read more of this entry

Create a bootable EBS AMI from a running instance

A quick set of notes on how to create a bootable EBS snapshot from a running EC2 instance - for example, an instance that has been started from an S3 backed AMI.

We've had to do this a few times over the last few months, so for the benefit of others we've outlined how we currently do it. This is based on a number of articles that were surfacing at the time of our research, but I don't have the links to hand. If there's a better way out there feel free to jump in!

Read more of this entry

Ooh la la: Paperclip et les European S3 buckets

At the end of my last blog about Paperclip I mentioned that you need to do some patching if you want to use European S3 buckets to store your files. The problem was introduced when Paperclip made the move from RightAWS to Marcel Molina’s AWS::S3 gem. Unfortunately, despite several forks containing patches to AWS::S3 and a four-month-old bug report, nothing has been done to officially fix the problem.

So my fellow Europeans, what are we to do?

Read more of this entry

Conditional duplicate key updates with MySQL

In one of our larger Rails apps the sheer volume of data we process means we’ve had to rely more and more on direct SQL queries, denormalised tables and summary tables to speed things up. When updating summary tables we typically use ON DUPLICATE KEY UPDATE, a MySQL extension to INSERT statements available since version 4.1, which allows a record to either be inserted or updated in one query.
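As a rough illustration of the insert-or-update idea, here is a Python sketch using SQLite's `ON CONFLICT ... DO UPDATE` clause. That is a different syntax from MySQL's ON DUPLICATE KEY UPDATE but plays a similar role. The `page_summary` table and its columns are made up for the example, and it assumes SQLite 3.24 or later.

```python
import sqlite3


def record_hits(conn, page, hits):
    """Insert a summary row, or add to the existing one if the key is already there."""
    conn.execute(
        "INSERT INTO page_summary (page, hits) VALUES (?, ?) "
        # 'hits' is the stored value, 'excluded.hits' the value we tried to insert
        "ON CONFLICT(page) DO UPDATE SET hits = hits + excluded.hits",
        (page, hits),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_summary (page TEXT PRIMARY KEY, hits INTEGER NOT NULL)")
record_hits(conn, "/home", 3)  # no row yet, so this inserts
record_hits(conn, "/home", 2)  # key exists, so this updates in place
total = conn.execute(
    "SELECT hits FROM page_summary WHERE page = ?", ("/home",)
).fetchone()[0]
```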

Read more of this entry

InfiniDB, Infobright and MonetDB - Day 3: MonetDB

Day 3 of my database exploration mission brings me to MonetDB. Binary downloads are available for Debian, Fedora, Ubuntu and (strangely!) Windows! If we still had any Windows users left here at HQ then it'd be a rare treat, but instead (as usual) our platform of choice (CentOS 5) isn't directly available in binary form. We downloaded the Fedora source RPMs and built our own - in case they're of any use, I've put them up on a Google Code site for others to download.

After installing the RPMs you're ready to get started - before you can do anything you have to start the merovingian process (you could either set up an init script, or run the binary manually for now). For information, the instructions say:

merovingian is a daemon process that controls a collection of database servers, i.e. mserver5 processes, each looking after a single physical database. Start this program to gain access to your MonetDB database farm. merovingian is designed to be used in a system initialisation script in production environments.

With merovingian running you're ready to create a database. For this you use monetdb, and you then start the database using the same command, for example:

> monetdb create twf
successfully created database 'twf'

> monetdb status
     name        state     
twf            stopped              
  
> monetdb start twf
starting database 'twf'... done

> monetdb status
     name        state     
twf            running
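If you wanted to check the state from a script, a minimal Python sketch for reading that status output might look like this. It assumes the simple two-column layout shown above (other versions of the tool may print more columns), and `parse_status` is a name made up for the example.

```python
def parse_status(output: str) -> dict:
    """Map database name to state from two-column `monetdb status` output."""
    lines = [line for line in output.splitlines() if line.strip()]
    states = {}
    for line in lines[1:]:  # skip the 'name  state' header row
        fields = line.split()
        if len(fields) >= 2:
            states[fields[0]] = fields[1]
    return states


# Sample text copied from the session above
sample = """\
     name        state
twf            stopped
"""
```

With the sample above, `parse_status(sample)` gives a dictionary mapping `twf` to `stopped`.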

You now have a running database and can connect to it using mclient. This is similar to most command line clients where you can perform changes to your database as well as query for data.

The first step to transferring the database was as usual - inspect the schema on our MySQL database and update it to make the correct use of the supported data types. As with the other systems, there's no support for unsigned values. It also wasn't immediately obvious to me what the maximum length of a varchar is.
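One way to deal with the unsigned columns is to move each one to the next wider signed type. The Python sketch below shows that idea on column definitions. The mapping is my own assumption rather than anything from the MonetDB documentation, and `widen_unsigned` is a hypothetical helper name.

```python
import re

# Assumed mapping: each unsigned MySQL integer type goes to the next wider signed type.
# BIGINT UNSIGNED is left as BIGINT here, so values above 2**63 - 1 would not fit.
WIDER_SIGNED = {
    "TINYINT": "SMALLINT",
    "SMALLINT": "INT",
    "MEDIUMINT": "INT",
    "INT": "BIGINT",
    "INTEGER": "BIGINT",
    "BIGINT": "BIGINT",
}

# Matches e.g. 'INT UNSIGNED' or 'int(10) unsigned', including the display width
UNSIGNED_RE = re.compile(
    r"\b(TINYINT|SMALLINT|MEDIUMINT|INTEGER|INT|BIGINT)(\s*\(\d+\))?\s+UNSIGNED\b",
    re.IGNORECASE,
)


def widen_unsigned(column_def: str) -> str:
    """Rewrite 'INT(10) UNSIGNED' style column types as wider signed types."""
    return UNSIGNED_RE.sub(lambda m: WIDER_SIGNED[m.group(1).upper()], column_def)
```

Signed columns are left untouched, so the function can be run over every line of a schema dump.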

With the tables created it was time to try and migrate some data. Given that MonetDB has been around for quite a while, there seemed to be pretty scarce resources with any detailed instructions - I couldn't, for example, find any simple migration tools or documentation detailing the best path for migration. I guess this could be because MonetDB is more often tackled by people with bigger brains or with more time to figure things out.

I attempted to use the following to dump data from MySQL:

select * from h into outfile '/dbtmp/tmp/h' fields terminated by "|" enclosed by '"';

And then the following to import into my MonetDB table:

copy 1000000 records into h from '/dbtmp/tmp/h' using delimiters '|','\n', '"'  null as '';

This yielded reasonable results, though I did have to do some tidying up in the middle with sed. In the end I gave up as there were some string values causing me problems, so I decided to rest on it and went to bed!
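The exact tidying isn't recorded here, so the following is only a guess at one plausible step, sketched in Python rather than sed. By default MySQL writes NULL as `\N` in INTO OUTFILE dumps, whereas the copy statement above expects an empty field (null as ''). The helper name `blank_out_nulls` is made up, and the sketch does not try to handle every quoting edge case.

```python
import re

# Matches a field consisting solely of \N, at the start of the line or after a '|'
NULL_FIELD_RE = re.compile(r"(^|\|)\\N(?=\||$)")


def blank_out_nulls(line: str) -> str:
    """Replace bare \\N fields with empty fields in one '|'-delimited line."""
    return NULL_FIELD_RE.sub(r"\1", line.rstrip("\n"))
```

Applied to each line of the dump file before the import, this leaves quoted values alone and only empties the fields that MySQL marked as NULL.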

In the morning I came back to find the merovingian process was dead, and the status of the database was showing as crashed. I started up the processes and took a look at the status - it said the health was 67% so I'm not really sure what's going on with it!

Performance

In the time I had available I was only able to get a 1 million row table imported successfully to play with - a shocking performance I know, but MonetDB was being quite fussy and I wasn't pressing the right buttons! I did run a few tests and also ran them against the same dataset in MySQL for comparison. All are run from cold, i.e. MySQL and MonetDB are both restarted before each query. I don't expect these queries to be representative of real world cases; I was just thinking of some nasty queries that I could throw at a single table in order to cause some pain.

Query 1

MySQL takes 250 msec. MonetDB:

sql>select count(*) from h;
+---------+
| L1      |
+=========+
| 1000000 |
+---------+
1 tuple
Timer       1.532 msec 1 rows

Query 2

MySQL takes 420 msec. MonetDB:

sql>select count(*) from h group by intcolumn;
+-------+
| L2    |
+=======+
+-------+
65 tuples
Timer     142.260 msec 65 rows

Query 3

MySQL takes 44,000 msec. MonetDB:

sql>select count(*) from h group by varcharcolumn;
+-------+
| L1    |
+=======+
+-------+
12743 tuples
Timer    1464.389 msec 12743 rows

Query 4

MySQL takes 37,500 msec. MonetDB:

sql>select count(*) as total from h group by varcharcolumn order by total;
+-------+
| L1    |
+=======+
+-------+
12743 tuples
Timer    1496.537 msec 12743 rows

Query 5

MySQL takes 373,000 msec (not a typo, it's more than 6 minutes). MonetDB:

sql>select count(*) as total from h group by varcharcolumn, anothervarcharcolumn order by total;
+-------+
| L1    |
+=======+
+-------+
69696 tuples
Timer    4170.520 msec 69696 rows
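
Putting the figures above side by side, a few lines of Python give the MySQL to MonetDB ratio for each query. The numbers are copied straight from the timings quoted in this post; the code is only a convenience for the arithmetic.

```python
# (MySQL msec, MonetDB msec) for each query, copied from the figures above
TIMINGS = {
    "Query 1": (250.0, 1.532),
    "Query 2": (420.0, 142.260),
    "Query 3": (44000.0, 1464.389),
    "Query 4": (37500.0, 1496.537),
    "Query 5": (373000.0, 4170.520),
}


def speedups(timings):
    """Return how many times faster MonetDB was than MySQL for each query."""
    return {name: mysql / monetdb for name, (mysql, monetdb) in timings.items()}


for name, ratio in speedups(TIMINGS).items():
    print(f"{name}: MonetDB roughly {ratio:.0f}x faster")
```

On these figures the ratio ranges from about 3x for Query 2 to about 160x for Query 1, which is only a rough indication given the small table and cold starts.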

Summary

Obviously this quick trial of each of these is not comprehensive enough to make any solid comparisons of performance. The next step will be for me to come up with a proper test plan in order to be a little more methodical about things. However, it has given me a good grounding in how the 3 systems compare with respect to installing and getting started.

I'll be keeping a close eye on InfiniDB: while it's not stable enough right now, I'm sure they'll keep things rolling and I look forward to taking another look.

If I can overcome the import obstacles and also the different 'feel' of MonetDB then the basic query results make a compelling case for taking a further look. There's also more to learn here with respect to architecture, deployment techniques, monitoring, etc.

Finally, Infobright: it would make my life easier if we could use it on an insert/update/delete basis. As it is I think we'd have a tough time getting clients to pay the license fee. Perhaps if it were bundled with something like EC2 instances with a smaller incremental cost then it may be more palatable and help to increase adoption (it may be that Infobright have lots of customers with open wallets - in which case please share them!). In terms of immediate ease of use, with some visible performance improvements, Infobright fits the bill, but until I've had a chance to compare MonetDB and Infobright in a bit more detail I'll reserve my final judgement!


InfiniDB, Infobright and MonetDB - Day 2: Infobright

Day 2 of my tour of column based storage brings me on to Infobright Community Edition (ICE). The first impressive point was that, on the back of yesterday's blog post, I already had an email from Mark in Community Relations at Infobright offering help and advice. Despite me calling him the wrong name (I was having a bad day!) he was immediately helpful and also offered to get some of his team to look into my queries.

As an aside, John from Calpont was also kind enough to drop by to respond to some of my points. This gives me a warm fuzzy feeling that both Infobright and Calpont are taking the community seriously - I guess for these products to gain traction they need to make sure people can get motoring with them to improve adoption.

Read more of this entry

InfiniDB, Infobright and MonetDB - Day 1: InfiniDB

We're taking a whistlestop tour of some of the column based storage systems out there for a project we're working on (where the use case seems to fit better with this form of storage rather than straight MySQL). After reading through the series of articles on the MySQL Performance Blog we chose to look at InfiniDB, Infobright and MonetDB, with the two that talk MySQL coming first for ease of integration right now. I'm also going to do this as a three parter, so first up is InfiniDB.

Read more of this entry

I need your designer glasses, your blue jeans and your black turtle-neck sweater

Ok so it’s not quite Schwarzenegger but last week I terminated a twenty year relationship with Microsoft and bought a Mac. Now I just need to get hold of the Apple uniform and I’ll officially be part of the club!

After more than a year of pontification on what exactly to buy as a replacement for my ageing Dell Inspiron laptop, I finally settled on a shiny new 13” MacBook Pro and an even shinier 24” Cinema Display. So far I’m pretty chuffed with my choice.

Read more of this entry

Protecting your Paperclip downloads

Way back last November when I first blogged about Paperclip I included a brief mention of hiding files behind a controller rather than simply putting them in the public directory for all to see. Since then I’ve noticed that the question of how to actually do this has come up regularly over on Rails Forum. A couple of weeks ago I also had to figure out how to update some of our code to protect assets that we had migrated from the local file system to Amazon S3 storage. So I figured it’s probably a worthwhile technique to share.

Read more of this entry

Bugmash!

Well it’s day two of the first ever Rails BugMash and so far I’ve managed to score a sneaky 1,000 points just by updating my one-line binary fixtures test patch. Meanwhile Matt Duncan and Rizwan Reza are on fire with 4,350 and 4,000 points at the time of writing.

Unfortunately the event has coincided with what may be the only nice weekend of the Great British summer, so I’ve been torn between the chance to mash bugs or to enjoy the sunshine. I’m currently trying to combine the two sat out in the garden squinting to see my laptop screen in the glare of the sun!

My next attempt to score some points is an updated patch, now improved and including a test case, for a lack of quoting of aliased table names in SQL joins which has been (too eagerly) marked as resolved even though it’s still broken. If you get the opportunity please do take a look and comment on the ticket as it’d be nice to get it fixed.

After that, I’m hoping to try and sneak my patch for anonymous extension modules for belongs_to and has_one associations into the bugmash as it has been sat on Lighthouse since March and already has three +1s. Even if it isn’t eligible for the bugmash, I still think it’s a worthy patch so again please take a look and comment on the ticket if you get chance.

And of course there are still plenty more tickets tagged with bugmash to be looked at so even if you’ve never contributed to Rails before, now is a pretty good time to start!


Thin, Rails 1.2.3/1.2.6 and ActionController::Dispatcher (NameError)

Please note: This patch has now been applied to the Thin master repository so will be fixed in all future releases.

Whilst trying to get an old Rails app up and running with Thin (Gem version 1.2.2) I encountered a spot of bother:

load_missing_constant: uninitialized constant ActionController::Dispatcher (NameError)

Read more of this entry
