Pushing GAE Limits
While GAE might have great infrastructure for web applications (I’m using the Java version), storing large amounts of data is pushing the envelope of what it can do. To store large files, I had to break each file up into chunks of a little less than 1MB each (1MB is the largest size allowed for a single GAE datastore object). When I started to retrieve the 1MB objects, I would periodically run into GAE datastore timeout issues.
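For illustration, here is a minimal sketch of the chunking step in plain Java. It is not my actual application code: the class name FileChunker, the methods split and join, and the 1,000,000 byte chunk size are all assumptions made for the example, and the GAE datastore calls that would store each chunk are left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a file's bytes into chunks that fit under
// the 1MB object limit, and joins them back together after retrieval.
class FileChunker {
    // Assumed chunk size: a little under 1MB (1,048,576 bytes), leaving
    // headroom for whatever else is stored alongside the chunk.
    static final int CHUNK_SIZE = 1_000_000;

    // Split data into consecutive chunks of at most chunkSize bytes.
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        int offset = 0;
        while (offset < data.length) {
            int len = Math.min(chunkSize, data.length - offset);
            chunks.add(Arrays.copyOfRange(data, offset, offset + len));
            offset += len;
        }
        return chunks;
    }

    // Reassemble the original bytes from chunks, in order.
    static byte[] join(List<byte[]> chunks) {
        int total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        byte[] out = new byte[total];
        int pos = 0;
        for (byte[] chunk : chunks) {
            System.arraycopy(chunk, 0, out, pos, chunk.length);
            pos += chunk.length;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] file = new byte[2_500_000]; // stand-in for a real file
        List<byte[]> chunks = split(file, CHUNK_SIZE);
        System.out.println(chunks.size() + " chunks"); // prints "3 chunks"
    }
}
```

Each chunk would then be stored as its own object, with the chunk's position recorded somewhere (in its key, for example) so the pieces can be fetched and joined in the right order.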
Retrieving large entities does not seem to be worked out yet. I would even get timeout errors when retrieving a single 1MB entity by its primary key. After fiddling around a bit, I have decided that I’m pushing the GAE infrastructure further than it is ready to go. So, I’m back to trying out Amazon’s S3. Now that I have a prototype of the application, it shouldn’t be too hard to port to S3.
In fact, I’ve been able to set up an S3 look-alike instance on my local hardware using Eucalyptus; but that’s a story for another day.
I’ve been using a combination of GAE and S3 as well. I’ve been storing and serving all static files, SWFs, and images from S3.