Amazon wishlist - Please give us an EC2 Recovery Console!

In the olden days with our racks of Dell servers we used to have the luxury of the iDRAC card - Integrated Dell Remote Access Controller (I often wondered what the acronym meant!) - other manufacturers like HP, IBM, etc would all have something similar. Essentially, it lived on a different network connection, and gave you remote access to the machine - even when things had gone badly wrong. You could even mount ISO files over the network - meaning you could install Operating Systems over it - great for when things had all gone a bit wrong, or even for upgrading with some peace of mind that you could access the machine regardless of what happened. When working with XenServer it was a pretty useful way of upgrading between major releases (for some reason it didn't allow remote upgrades).

In the virtual server space, one of the great things about Linode was LISH - an out of band access tool to your virtual server instance. Again, this provided console access - great for resolving boot time issues, allowing you to drop into single user mode.

But, with EC2 there isn't yet an equivalent - if you've used an EBS backed instance then you can stop your instance, mount the volume elsewhere and attempt to diagnose and fix any issues - there is a guide over here. This doesn't allow you to drop into single user mode, and is a bit more effort - it would be great if Amazon could provide something more like LISH...

Why would it be useful? Sometimes you can be a dork and make changes to the system that don't become a problem until you restart - changes to fstab are a classic example - you can see the log telling you to press a button, but you can't get to it. As above - there are workarounds, but it would be great if you could jump straight in and connect to the console. Likewise when creating new AMIs - it would aid debugging of issues to have console access and single user mode without having to start over or mount/unmount volumes from other instances.

So, Dear Amazon Web Services.. please give me a console!

The Defensio Gem versus HTTParty YAML Deprecation!

As part of the recent round of security issues relating to Rails and YAML, support for using YAML has been removed from the HTTParty Gem (in this commit).

This has a knock-on effect on any Gems that use YAML with HTTParty and don't pin a specific, YAML-friendly version of HTTParty - one such example is the gem for Defensio.

Luckily, the fix is simple - the gem uses the Defensio 2.0 API - which already supports XML or JSON (as well as YAML) - a quick fork and 5 minutes later we have a JSON consuming version of the Gem.

UPDATE! My pull request has been accepted so the official repository is now using JSON rather than YAML.

Ebook DRM - a closed book that needs to be opened?

In a departure from the normal format of our blogs, I'm going to attempt a more commentary-based post!

I've spent a significant amount of the last 18 months working with Ebooks - from integrating ONIX (and proprietary) data feeds through to working on displaying epubs in a browser. One area that frustrates me is the mandatory inclusion of DRM in many contracts - I don't think I am alone in this frustration.

Read more of this entry

Using Resque with short-lived (i.e. quick) workers

We use Resque all the time - but on a recent project we came across a slight issue - our jobs were executing so quickly that the overhead of forking for every job was massively affecting the time taken to process the queue.

The first line of thought was to see if we could make the jobs 'do more' - but they really were both as concise and as complex as they needed to be!

Luckily, it looks like our problems have already been solved - there are a number of options already in the wild for improving worker performance with fast running jobs.
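
To make the forking overhead concrete, here is a minimal sketch of the general idea of running several jobs per fork rather than one. It is not Resque's API or any particular plugin: `process_in_batches`, `jobs` and `jobs_per_fork` are hypothetical names used only for illustration, and it assumes a platform where `fork` is available.

```ruby
# Hypothetical sketch: fork once per batch of jobs instead of once per job.
# `jobs` is an array of callables; `jobs_per_fork` is the batch size.
# Returns the number of forks performed.
def process_in_batches(jobs, jobs_per_fork)
  forks = 0
  jobs.each_slice(jobs_per_fork) do |batch|
    # The child process works through the whole batch before exiting.
    pid = fork do
      batch.each(&:call)
    end
    Process.wait(pid) # the parent waits for the child, as a worker would
    forks += 1
  end
  forks
end

# Ten trivially quick jobs: one fork each versus five jobs per fork.
quick_jobs = Array.new(10) { -> { 1 + 1 } }
puts process_in_batches(quick_jobs, 1)
puts process_in_batches(quick_jobs, 5)
```

The fewer forks needed to drain the queue, the less the fixed cost of forking dominates when each job only takes a moment.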

Read more of this entry

Amazon DynamoDB - SSD backed non-relational DB is here!

So - pretty exciting stuff, for us at least - Amazon DynamoDB (previously mentioned here) has now seen the light of day, and is appearing within the console. Instructions are linked from within the console.

We're already getting cracking on giving it a good play - we'll report back findings of interest!

We're hiring!

Bucking the trend of interplanetary meltdown, we're looking for some 'awesome' people to join our team. We're fleshing out the full details right now and will be posting ads in all the predictable places in the next few weeks, but in the meantime if you fall into any of the following job descriptions and are interested in applying send your CV and examples of your work through to .

These are all permanent positions. Salaries are to be decided based on experience and skills but will be based on market rates, as will the overall package on offer. We're currently based in Fareham, Hampshire but may also be relocating soon - wherever we go will be within 15 miles and will be near a mainline station. So if you're currently London based then you'll need to suffer a nice commute (we've all done it the opposite way round!) or perhaps a nice relocation to the Hampshire countryside.

Openings:

User Interface/Web Designer
We need someone with strong skills in designing web based interfaces and graphics. Our clients require designs that are clean, professional and modern with a sprinkling of innovation and spark. We're also interested in people who dabble in pixel art as it's a current favourite of ours and will give us a head start on some of our current internal projects!

Ruby Developer (x2)
We predominantly work with Ruby based applications. Whether it be Rails, Sinatra or anything else, we need someone proficient in Ruby as a language with a strong foundation in clean, sensible programming. The more skills you have, the more chance you have of getting a job with us - make sure your CVs are full of the things you can do rather than the names of as many technologies as you can think of!

Experience is beneficial, but if you can demonstrate a keen background in the scene or can bring along a strong portfolio of homegrown projects for review then we're happy to consider you. Qualifications are secondary to skill, attitude and enthusiasm.

We are also happy to accept interns/gap year students falling into any of the above skillsets - send us your details and we'll take a look!

Revisited: Tamper-proof cookies in Rails 3

Here's a revisited post that's fairly short and sweet: way back in 2008 I blogged about my implementation of tamper-proof cookies which used a similar technique to that used by Rails for its cookie-based session store. Back then the solution involved a custom cookie jar, the OpenSSL library to generate a HMAC, overriding the ApplicationController#cookies method and a slightly unorthodox method signature for reading cookie values.
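
As a rough illustration of the general technique (signing a cookie value with an HMAC so that tampering can be detected), here is a minimal sketch. It is not the code from the 2008 post or from Rails itself: `SECRET`, `sign`, `verify` and `secure_equal?` are hypothetical names, and the `value--signature` layout is an assumption made for the example.

```ruby
require "openssl"

# Hypothetical secret for illustration; a real app would load this from config.
SECRET = "change-me"

# Hex HMAC-SHA256 of the value under the given secret.
def signature_for(value, secret)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), secret, value)
end

# Returns the value with its signature appended: "value--signature".
def sign(value, secret = SECRET)
  "#{value}--#{signature_for(value, secret)}"
end

# Compares two strings without returning early at the first differing byte.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Returns the original value if the signature matches, otherwise nil.
def verify(signed, secret = SECRET)
  value, separator, signature = signed.to_s.rpartition("--")
  return nil if separator.empty?
  secure_equal?(signature, signature_for(value, secret)) ? value : nil
end

signed = sign("user_id=42")
puts verify(signed).inspect                   # the original value
puts verify(signed.sub("42", "43")).inspect   # nil, the value was altered
```

The value itself stays readable by the client; the signature only ensures it can't be changed without knowing the secret.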

Read more of this entry

Revisited: roll your own pagination links with will_paginate and Rails 3

With a final release of Rails 3 edging closer every day it seems like a good time to revisit some of my old articles from the last few years and bring them up to date.

Back in the summer of 2008 I wrote about custom link renderers using will_paginate and, as it is still one of the most popular posts on the blog, it’s the one I’ve decided to refresh first. Don’t worry if you haven’t read the original article as I’ll be covering the same things here. So without further ado, let’s get stuck in!

Read more of this entry

Rails 2.3.8, Rack 1.1 and the curious case of the missing quotes

If you're using Rails 2.3.8 for your application and thought that you were safe after May's comedy of errors produced three point updates in as many days, think again. Unfortunately there's a little bug that can lead to parameters being altered or potentially even truncated without warning.

Read more of this entry

Create a bootable EBS AMI from a running instance

A quick set of notes on how to create a bootable EBS snapshot from a running EC2 instance - for example, an instance that has been started from an S3 backed AMI.

We've had to do this a few times over the last few months - for the benefit of others, we've outlined how we currently do it - this is based on a number of articles that were surfacing at the time of our research, but I don't have the links to hand. If there's a better way out there feel free to jump in!

Read more of this entry

Ooh la la: Paperclip et les European S3 buckets

At the end of my last blog about Paperclip I mentioned that you need to do some patching if you want to use European S3 buckets to store your files. The problem was introduced when Paperclip made the move from RightAWS to Marcel Molina’s AWS::S3 gem. Unfortunately, despite several forks containing patches to AWS::S3 and a 4 month old bug report, nothing has been done to officially fix the problem.

So my fellow Europeans, what are we to do?

Read more of this entry

Conditional duplicate key updates with MySQL

In one of our larger Rails apps the sheer volume of data we process means we’ve had to rely more and more on direct SQL queries, denormalised tables and summary tables to speed things up. When updating summary tables we typically use ON DUPLICATE KEY UPDATE, a MySQL extension to INSERT statements since version 4.1, that allows a record to either be inserted or updated in one query.

Read more of this entry

InfiniDB, Infobright and MonetDB - Day 3: MonetDB

Day 3 of my database exploration mission brings me to MonetDB. Binary downloads are available for Debian, Fedora, Ubuntu and (strangely!) Windows! If we still had any Windows users left here at HQ then it'd be a rare treat, but instead (as usual) our platform of choice (CentOS 5) isn't directly available in binary form. We downloaded the Fedora source RPMs and built our own - in case they're of any use, I've put them up on a Google Code site for others to download.

Once the RPMs are installed you're ready to get started - but before you can do anything you have to start the merovingian process (you could either set up an init script, or run the binary manually for now). For information, the instructions say:

merovingian is a daemon process that controls a collection of database servers, i.e. mserver5 processes, each looking after a single physical database. Start this program to gain access to your MonetDB database farm. merovingian is designed to be used in a system initialisation script in production environments.

With merovingian running you're ready to create a database - for this you use monetdb - and then start the database using the same command, for example:

> monetdb create twf
successfully created database 'twf'

> monetdb status
     name        state     
twf            stopped              
  
> monetdb start twf
starting database 'twf'... done

> monetdb status
     name        state     
twf            running

You now have a running database and can connect to it using mclient. This is similar to most command line clients where you can perform changes to your database as well as query for data.

The first step in transferring the database was the usual one - inspect the schema on our MySQL database and update it to make correct use of the supported data types. As with the other systems, there's no support for unsigned values. It also wasn't immediately obvious to me what the maximum length of a varchar is.

With the tables created it was time to try and migrate some data. Given that MonetDB has been around for quite a while, there seemed to be pretty scarce resources with any detailed instructions - I couldn't, for example, find any simple migration tools or documentation detailing the best path for migration. I guess this could be because MonetDB is more often tackled by people with bigger brains or with more time to figure things out.

I attempted to use the following to dump data from MySQL:

select * from h into outfile '/dbtmp/tmp/h' fields terminated by "|" enclosed by '"';

And then the following to import into my MonetDB table:

copy 1000000 records into h from '/dbtmp/tmp/h' using delimiters '|','\n', '"'  null as '';

This yielded reasonable results - though I did have to do some tidying up in the middle with sed - in the end I gave up as there were some string values causing me problems, so I decided to rest on it and went to bed!

In the morning I came back to find the merovingian process was dead, and the status of the database was showing as crashed. I started up the processes and took a look at the status - it said the health was 67% so I'm not really sure what's going on with it!

Performance

In the time I had available I was only able to get a 1 million row table imported successfully to play with - a shocking performance I know, but MonetDB was being quite fussy and I wasn't pressing the right buttons! I did run a few tests, and also ran them against the same dataset in MySQL for comparison. All are run from cold - i.e. MySQL and MonetDB are both restarted before each query. I don't expect these queries to be representative of real world cases; I was just thinking of some nasty queries that I could throw at a single table in order to cause some pain. For each query the MySQL time is quoted first, followed by the MonetDB output.

Query 1

MySQL takes 250msec:

sql>select count(*) from h;
+---------+
| L1      |
+=========+
| 1000000 |
+---------+
1 tuple
Timer       1.532 msec 1 rows

Query 2

MySQL takes 420msec:

sql>select count(*) from h group by intcolumn;
+-------+
| L2    |
+=======+
+-------+
65 tuples
Timer     142.260 msec 65 rows

Query 3

MySQL takes 44,000msec:

sql>select count(*) from h group by varcharcolumn;
+-------+
| L1    |
+=======+
+-------+
12743 tuples
Timer    1464.389 msec 12743 rows

Query 4

MySQL takes 37,500msec:

sql>select count(*) as total from h group by varcharcolumn order by total;
+-------+
| L1    |
+=======+
+-------+
12743 tuples
Timer    1496.537 msec 12743 rows

Query 5

MySQL takes 373,000msec (not a typo, it's more than 6 minutes):

sql>select count(*) as total from h group by varcharcolumn, anothervarcharcolumn order by total;
+-------+
| L1    |
+=======+
+-------+
69696 tuples
Timer    4170.520 msec 69696 rows
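
To put the five timings side by side, here is a small sketch that works out how many times longer MySQL took than MonetDB for each query. The figures are copied from the runs above; `TIMINGS_MSEC`, `speedup` and `SPEEDUPS` are hypothetical names for the example, and since these are single cold runs on one table the ratios are only a rough indication.

```ruby
# Timings in milliseconds, copied from the runs above: [MySQL, MonetDB].
TIMINGS_MSEC = {
  "Query 1" => [250.0, 1.532],
  "Query 2" => [420.0, 142.260],
  "Query 3" => [44_000.0, 1464.389],
  "Query 4" => [37_500.0, 1496.537],
  "Query 5" => [373_000.0, 4170.520]
}.freeze

# How many times longer MySQL took than MonetDB, to one decimal place.
def speedup(mysql_msec, monetdb_msec)
  (mysql_msec / monetdb_msec).round(1)
end

SPEEDUPS = TIMINGS_MSEC.transform_values do |(mysql_msec, monetdb_msec)|
  speedup(mysql_msec, monetdb_msec)
end

SPEEDUPS.each do |query, ratio|
  puts "#{query}: MySQL took roughly #{ratio}x as long as MonetDB"
end
```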

Summary

Obviously this quick trial of each of these is not comprehensive enough to make any solid comparisons of performance - the next step will be for me to go through and come up with a proper test plan in order to be a little more methodical about things. However, it has given me a good grounding in how the 3 systems compare with respect to installing and getting started.

I'll be keeping a close eye on InfiniDB - while not stable enough right now, I'm sure they'll keep things rolling and I look forward to taking another look. If I can overcome the import obstacles and also the different 'feel' of MonetDB then the basic query results make a compelling case for taking a further look - there's also more to learn here with respect to architecture, deployment techniques, monitoring, etc.

Finally, Infobright - it would make my life easier if we could use it on an insert/update/delete basis - as it is I think we'd have a tough time getting clients to pay the license fee - perhaps if bundled with something like EC2 instances with a smaller incremental cost then it may be more palatable and help to increase adoption (it may be that Infobright have lots of customers with open wallets - in which case please share them!). In terms of immediate ease of use, with some visible performance improvements, Infobright fits the bill - but until I've had a chance to compare MonetDB and Infobright in a bit more detail I'll reserve my final judgement!

InfiniDB, Infobright and MonetDB - Day 2: Infobright

Day 2 of my tour of column based storage brings me on to Infobright Community Edition (ICE). The first impressive point was that, on the strength of yesterday's blog post, I already had an email from Mark in Community Relations at Infobright offering help and advice - despite me calling him the wrong name (I was having a bad day!) he was immediately helpful and also offered to get some of his team to look into my queries.

As an aside, John from Calpont was also kind enough to drop by to respond to some of my points - to me this gives me a warm fuzzy feeling that both Infobright and Calpont are taking the community seriously - I guess for these products to gain traction they need to make sure people can get motoring with them to improve adoption.

Read more of this entry

InfiniDB, Infobright and MonetDB - Day 1: InfiniDB

We're taking a whistlestop tour of some of the column based storage systems out there for a project we're working on (where the use case seems to fit better with this form of storage rather than straight MySQL). After reading through the series of articles on the MySQL Performance Blog we chose to look at InfiniDB, Infobright and MonetDB - with the two that talk MySQL coming first for ease of integration right now. I'm also going to do this as a three parter - so first up is InfiniDB.

Read more of this entry
