S3 
October 5th, 2007
In the life of a web application, there comes a point where that shared hosting account just isn’t good enough (and you find out because your provider kicked you off), or your server just can’t pull queries from the database fast enough. Then one day, you finally hit the filesystem error EMLINK, which is VERY hard to google.
The cause is simple: you’ve created the maximum number of subdirectories that a single directory can hold. Surprisingly, this isn’t reported as a common issue with file_column, acts_as_attachment or attachment_fu, although I’m shocked as to why not. So, what do you do when you’re faced with scalability issues and your image-handling plugin is broken?
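For context: on ext3, a directory tops out at roughly 32,000 subdirectories, because each child costs the parent a hard link, and that is where EMLINK comes from. A common workaround (sketched here in plain Ruby with hypothetical paths, not anything these plugins ship) is to fan uploads out across nested directories derived from the record ID, so no single directory ever accumulates that many children:

```ruby
require 'fileutils'

# Map a numeric record ID to a nested path, e.g.
# id 1234567 -> "0012/345/67", so each level holds a bounded
# number of entries instead of everything piling into one directory.
def nested_path(id)
  padded = format('%09d', id)
  File.join(padded[0, 4], padded[4, 3], padded[7, 2])
end

# Hypothetical usage: build the storage directory for an upload
dir = File.join('public', 'uploads', nested_path(1_234_567))
FileUtils.mkdir_p(dir)
```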
THROW IT ALL AWAY!
That’s what I had to do. Recently we worked on a site and decided that, because it was getting hammered, we would put the images on S3. Then we found the ultimate weakness of S3: it can’t easily handle batch processing. We used the AWS::S3 library for most of the movement of the files, but we found that if we made a mistake, it would cost us hours to get the files back.
Eventually, the day was saved with jetS3t and Cockpit. Using jetS3t, we were finally able to work through all the S3 issues (actually, Dave saved the day at the end; my computer kept running out of memory). We managed to get S3 support into the app, and all we had to do was sacrifice File Column and replace it with this:
def user_image=(blob)
  # Establish the S3 connection (aws-s3 gem)
  AWS::S3::Base.establish_connection!(
    :access_key_id     => AWS_ACCESS_KEY_ID,
    :secret_access_key => AWS_SECRET_ACCESS_KEY
  )

  datestamp   = Time.now.strftime('%d%m%Y')
  identifier  = UUID.random_create.to_s   # uuidtools gem
  object_path = 'images/' + datestamp + '/' + identifier + '/'
  object_key  = object_path + blob.original_filename

  self.image     = blob.original_filename
  self.image_dir = 'http://s3.amazonaws.com/bucket/' + object_path

  image_data = blob.read

  # Send the full-size file to S3
  AWS::S3::S3Object.store(object_key, image_data, 'bucket', :access => :public_read)

  # Resize to a thumbnail with RMagick
  img = Magick::Image.from_blob(image_data).first
  thumbnail = img.resize_to_fit!(96, 96)

  # Store the thumbnail under a thumb/ sub-key
  thumb_key = object_path + 'thumb/' + self.image
  AWS::S3::S3Object.store(thumb_key, thumbnail.to_blob, 'bucket', :access => :public_read)
end
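To make the key layout concrete, here is a minimal plain-Ruby sketch (with a made-up identifier standing in for the UUID, no S3 calls) of how a key built this way maps onto the public URL the model stores:

```ruby
require 'time'

bucket     = 'bucket'
datestamp  = Time.parse('2007-10-05').strftime('%d%m%Y')  # "05102007"
identifier = 'abc123'                                     # stands in for the UUID
object_key = 'images/' + datestamp + '/' + identifier + '/photo.jpg'

# Public-read objects were reachable under the path-style URL:
url = 'http://s3.amazonaws.com/' + bucket + '/' + object_key
```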
However, if you have to use S3, I would highly recommend using a long, structured key so that you can sort your results based on it! The biggest gotcha I found when adding S3 integration to my Rails app was including AWS::S3: if you include and require it, it will break your routing, which can cause hours of headaches, especially if you’re busy doing something else. In the end, we learned that S3 is a misnomer. For a large number of files, it’s far from simple.
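To illustrate why structured keys help (a plain-Ruby sketch with made-up keys; no S3 calls): S3 lists keys in lexicographic order, so a fixed-width date prefix lets you pull out one day’s uploads by prefix alone. The aws-s3 gem exposes the same idea server-side via Bucket.objects with a :prefix option.

```ruby
keys = [
  'images/05102007/aaa/photo.jpg',
  'images/04102007/bbb/photo.jpg',
  'images/05102007/ccc/thumb/photo.jpg'
]

# A fixed-width date prefix groups one day's uploads together.
# Note: a yyyymmdd stamp, unlike the ddmmyyyy used above, would
# also sort chronologically across months.
october_5th = keys.select { |k| k.start_with?('images/05102007/') }.sort
```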
This entry was posted on Friday, October 5th, 2007 at 5:36 pm and is filed under Linux, Rails, Ruby, Uncategorized. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.