Thursday, January 15, 2009

Setting up an Amazon AMI with Java and MySQL on EBS using the AWS Management Console

Introduction

In this post I'll describe the steps I performed to create my own Amazon Machine Image using the AWS Management Console and Windows XP as my local machine.
Just recently Amazon released their AWS Management Console (AMC). It's still in beta, but it already makes life so much easier: before it was available, you had to use quite a few scripts to get your own AMI ready.

The final AMI will have installed on it:

  • Fedora Core 8

  • 32-bit architecture

  • Java JDK 7 (1.7.0)

  • JEE 5

  • Tomcat 5.5.27

  • Apache 2.2.9

  • MySQL 5.0.45

I will also set up MySQL on Elastic Block Store (EBS) so that you can shut down the instance and not lose your MySQL data.
In the end you should be able to deploy, for example, a Java .war file (if correctly assembled, of course) with the following frameworks without any problem:

  • Spring 2.5

  • Hibernate 3.2.5

  • Wicket 1.3

  • Sitemesh 2.2.1

  • Quartz 1.6.2

For the steps described below, I'm assuming you've already set up your keys and know how to start an AMI and get access to it via a browser and putty. If you don't know how to do this, there's a good introductory video on the AMC homepage.

Starting up an instance

You can start with a very basic AMI with only an OS installed on it, or use one that already has a lot more software installed on it.
As a starting point I used the publicly available Java Web Starter AMI. Notice that you can look up AMIs without being logged in to AWS.
Start the instance such that you see something like this:


Check that Tomcat is running by going to the public IP of the instance. In my case I had to go to http://ec2-174-129-150-80.compute-1.amazonaws.com/. And I do see Tomcat:



Setting up MySQL and Tomcat passwords

In the basic AMI I'm using, MySQL root has no password and the Tomcat Manager login is admin/password. You don't want that in your final version, so let's change that. First login with putty to the instance (don't forget to use the ppk version of the key). You can login as root w/o a password because the key takes care of the authentication. Then change the passwords:
  1. MySQL password: login to mysql:


    mysql -u root



    And execute:


    GRANT ALL ON *.* TO 'root'@'localhost' IDENTIFIED BY 'mysecretpassword';



    Note that we're only allowing root access from localhost (the instance itself), and of course replace the 'mysecretpassword' text with your own password. Double check that you can now log in with the new password. If not, you might have to restart the AMI (all changes are lost) and try again.

  2. Tomcat password:


    cd /usr/share/tomcat5/conf



    Edit tomcat-users.xml, change the password field with value 'password' where it says username="admin" to your desired new password. Restart Tomcat to let the change take effect: /etc/init.d/tomcat5 restart. Go to your instance again with your browser and check that the new Tomcat Manager login works.
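
The tomcat-users.xml edit above could also be scripted. What follows is only a minimal sketch, not something I ran on the AMI: set_tomcat_password is a made-up helper name, and it assumes the user element sits on a single line and that the new password contains no '|', '&', backslash or double quote characters.

```shell
# Hypothetical helper (not part of Tomcat): replace the password attribute
# on the line that mentions the given username.
# Assumptions: the <user .../> element is on one line; the new password
# contains none of | & \ "
set_tomcat_password() {
    file=$1
    user=$2
    newpass=$3
    tmp="$file.tmp.$$"
    # Only lines containing username="<user>" are touched.
    sed "/username=\"$user\"/ s|password=\"[^\"]*\"|password=\"$newpass\"|" "$file" > "$tmp" \
        && cat "$tmp" > "$file" \
        && rm -f "$tmp"
}

# Possible usage on the instance, followed by the restart from the step above:
#   set_tomcat_password /usr/share/tomcat5/conf/tomcat-users.xml admin 'mynewpassword'
#   /etc/init.d/tomcat5 restart
```

Writing the result back with cat rather than mv keeps the original file's owner and permissions.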


If there's anything else you'd like to change on your AMI, you should do it now, while the instance is still running: once you terminate it, the changes are gone. Tip: a reboot is fine btw, that keeps the AMI settings!

Setting up MySQL on Amazon EC2 with Elastic Block Store

As a basis for these steps I'll be using "Running MySQL on Amazon EC2 with Elastic Block Store". The steps given in there were written before the AWS Management Console was available, so here I'll describe the changes necessary using the AMC. The steps in the article are also specifically for Ubuntu 8.04 LTS, but as you can see from the AMI I use as a starting point, I'm using Fedora 8. Only some of the commands differ; for the rest the steps work for both OSs. I'll also be setting up ext3 as filesystem, not XFS.
  1. Create an EBS volume with the AMC: click on Volumes on the left and click on the 'Create Volume' button. Enter the required capacity; I put in 25GB as size. Update: I originally assumed you're not charged as long as you're not using the data blocks, but as Eric Hammond points out in the comments below, you are charged for the whole volume even if some blocks are unused. Select a zone (I don't know how much it matters which one you pick, just make sure you stay in the same zone as your instance). This should result in something like this:


    Now attach the volume to the instance as device /dev/sdf (in the original article sdh is used) via the 'Attach Volume' button. When successful, the 'Attachment Information' status field has changed to "attached":


  2. Now let's format the volume and mount it. For that I deviated from the article and followed the steps you can also see when you click on the 'Help' button on the (current) EBS Volumes page in AMC. Thus execute in your putty session:


    mkfs -t ext3 /dev/sdf



    Answer yes to any questions. Then create a directory to mount the EBS volume on. Let's use a more distinctive name than '/vol'. Let's create:


    mkdir /ebsmnt



    Then mount it:


    mount /dev/sdf /ebsmnt



    Note: at my first effort, I wanted to use /mnt/data (in the Volumes 'Help' button in the AMC they also give /mnt as an example), so I created that directory. But: the bundling command you'll see below does not include the /mnt directory by default! So then the /mnt/data directory doesn't exist on the new AMI, and then the auto-mount from fstab at bootup of the AMI will always fail for that reason!

  3. Make sure the mount is performed at startup by adding it in /etc/fstab:


    /dev/sdf /ebsmnt ext3 defaults 0 0




  4. Back up the new config file to the EBS volume, in its own separate directory; you may need it some time:


    mkdir /ebsmnt/configs
    rsync -a /etc/fstab /ebsmnt/configs/
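
If you script step 3, it helps to make it safe to run more than once, so /etc/fstab doesn't collect duplicate lines. A minimal sketch under that assumption; ensure_fstab_entry is a hypothetical helper name, and the file path is a parameter so you can try it on a copy first:

```shell
# Hypothetical helper: append an fstab line only when the mount point is not
# listed yet. Arguments: fstab file, device, mount point, filesystem type.
ensure_fstab_entry() {
    fstab=$1
    device=$2
    mountpoint=$3
    fstype=$4
    # In a non-comment fstab line, field 2 is the mount point.
    if awk -v mp="$mountpoint" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab"; then
        echo "already present: $mountpoint"
    else
        printf '%s %s %s defaults 0 0\n' "$device" "$mountpoint" "$fstype" >> "$fstab"
        echo "added: $mountpoint"
    fi
}

# Possible usage on the instance:
#   ensure_fstab_entry /etc/fstab /dev/sdf /ebsmnt ext3
```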




Now let's tell MySQL to use the EBS volume to store its databases.
  1. Stop MySQL:


    /etc/init.d/mysqld stop



    And to be safe:


    killall mysqld_safe




  2. Move the existing database files. Since there isn't much yet except a couple of test databases, not much needs to be done. First let's make a separate dir for MySQL on the EBS volume:


    mkdir /ebsmnt/mysql



    Then:


    cd /ebsmnt/mysql
    mkdir lib log



    And start moving stuff:


    mv /var/lib/mysql /ebsmnt/mysql/lib/
    mkdir /var/lib/mysql # Note that we need this dir for the mysql.sock file
    chown mysql:mysql /var/lib/mysql # Give it again the correct permissions
    mv /var/log/mysqld.log /ebsmnt/mysql/log/




  3. Tell MySQL to look on the mounted EBS volume from now on. Edit /etc/my.cnf and change it as below. The '# Was: ' comments indicate what was there originally:


    [mysqld]
    # Was: datadir=/var/lib/mysql
    datadir=/ebsmnt/mysql/lib/mysql
    socket=/var/lib/mysql/mysql.sock
    user=mysql
    # Default to using old password format for compatibility with mysql 3.x
    # clients (those using the mysqlclient10 compatibility package).
    old_passwords=1

    [mysqld_safe]
    # Was: log-error=/var/log/mysqld.log
    log-error=/ebsmnt/mysql/log/mysqld.log
    pid-file=/var/run/mysqld/mysqld.pid



    Note that I also put the log-error file on the mount. The reason for this is that I want to keep the logfile even when I shut down an instance. If you don't care, you can leave it as it was.
    Another advantage I found out in practice is that when you've set log-error like above, you can't detach the volume because MySQL is still accessing that logfile. So you can't accidentally detach and maybe leave MySQL in an inconsistent state, which is a good thing.
    In that case you'll have to stop your instance w/o detaching the volume first (this makes sure MySQL shuts down first). Update: as far as I can tell from /etc/rc.d, unmounting from fstab is done as the last thing, so doing a 'Terminate' of the instance to automatically let the shutdown sequence do the unmount and detach of the volume should be safe.
    Note also that I didn't modify the mysql.sock socket file. The advantage is that I can keep using mysql and mysqladmin the way they are. If you change the location of the socket file, say to /ebsmnt/mysql/lib/mysql/mysql.sock, then you will have to start mysql and mysqladmin with the '-S <socket_file>' option.

  4. Back up the new config file to the EBS volume, just to be safe:


    rsync -a /etc/my.cnf /ebsmnt/configs/




  5. Restart MySQL again:


    /etc/init.d/mysqld start



    You can check everything is going ok by tailing the logfile (on the mount of course):


    tail -f /ebsmnt/mysql/log/mysqld.log




  6. To see later on that the data is still available after having terminated the AMI, create an example database:


    mysql -p -e 'CREATE DATABASE esb_test_database'
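
The my.cnf change from step 3 can be sketched as a small script too. This is only an illustration under assumptions: relocate_mysql_paths is a made-up helper name, it expects datadir= and log-error= to start at the beginning of a line as in the file shown above, and it is not idempotent (run it twice and you get a second '# Was:' line). It leaves the socket line alone, as in the steps above.

```shell
# Hypothetical helper: point datadir and log-error at a new base directory
# and keep the old values as '# Was:' comments.
# Arguments: my.cnf style file, new base directory (e.g. /ebsmnt/mysql).
relocate_mysql_paths() {
    cnf=$1
    base=$2
    tmp="$cnf.tmp.$$"
    awk -v base="$base" '
        /^datadir=/   { print "# Was: " $0; print "datadir=" base "/lib/mysql"; next }
        /^log-error=/ { print "# Was: " $0; print "log-error=" base "/log/mysqld.log"; next }
        { print }
    ' "$cnf" > "$tmp" \
        && cat "$tmp" > "$cnf" \
        && rm -f "$tmp"
}

# Possible usage, with MySQL stopped:
#   relocate_mysql_paths /etc/my.cnf /ebsmnt/mysql
```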




So now the database is set up on the EBS volume. If you want to create snapshots, you can do that via the AWS Console or follow the instructions in the mentioned article, which also covers automated snapshots and cleaning up everything that has been created in EBS when following the above steps (handy to know, because otherwise you'll be charged for using it until the end of time! hahahahaaa).

So now the AMI is how I want it to be. Let's store it on S3 to make it persistent.

Bundling the new Linux AMI

This part is based upon Bundling a Linux or UNIX AMI from the online EC2 Developer Guide. There are currently almost no AWS Management Console facilities to do this, so we'll have to use the AMI Tools on the host AMI.

  1. Installing the AMI Tools: luckily they are already installed on the AMI we're using (how do you think it got created? :-). Try this command to see that they are indeed installed (well of course that only proves this one is installed ;-):


    ec2-bundle-image --manual



    If you're using an AMI that doesn't have these tools installed, follow the steps in section 'Installing the AMI Tools' in the article.

  2. Bundling an AMI Using the AMI Tools: Let's do the bundling in /tmp:


    cd /tmp



    Then to bundle execute:


    ec2-bundle-vol --prefix Fedora8-JEE5-JDK7-Tomcat55-MySQL50 \
        -k <private_keyfile> -c <certificate_file> -u <user_id>



    Note that I specify a prefix. By default it's 'image', so you'll get 'image.manifest.xml'. When you look at all the AMIs out there, they should be more descriptive than that, so I used Fedora8-JEE5-JDK7-Tomcat55-MySQL50.
    The other parameters are explained in the article. The user_id you can find in the AMC under the 'Your Account --> Account Activity' menu.
    Your private_keyfile is the pk*.pem file, which you need to upload to the AMI (host). The same goes for the cert*.pem file. Note I put them in /mnt so they won't get put on the new AMI, which you don't want!
    An example pscp copy from Windows command prompt to get the files over would be:


    pscp -i Fedora8.ppk pk*.pem cert*.pem root@ec2-174-129-150-80.compute-1.amazonaws.com:/mnt/



    Fedora8.ppk is the filename of the key generated for putty with puttygen, as clearly described in the before-mentioned video on the AMC homepage.
    You can accept all the defaults at the prompts.
    The warnings:


    "NOTE: rsync with preservation of extended file attributes failed. Retrying
    rsync without attempting to preserve extended file attributes..."




    and


    "NOTE: rsync seemed successful but exited with error code 23. This probably means
    that your version of rsync was built against a kernel with HAVE_LUTIMES defined,
    although the current kernel was not built with this option enabled. The bundling
    process will thus ignore the error and continue bundling. If bundling completes
    successfully, your image should be perfectly usable. We, however, recommend that
    you install a version of rsync that handles this situation more elegantly.
    "

    are (apparently) not serious problems, so these can be ignored.



    Note that a file with the same name as the --prefix parameter gets created in the directory where you started the command.
    Notice from the manual of the above command, that it skips /mnt (amongst others), so in case you mounted it there, the command won't include the whole mounted EBS in the bundle (luckily)!
    Note also that it seems the script is smart enough to not include the mounted /ebsmnt data either. If you want to be really sure, umount it first before running the bundle command.
    Running the command takes about 5-10 minutes.
    The message


    "Unable to read instance meta-data for product-codes"



    at the end does not seem to be a serious problem, I haven't found any problems at least :-).
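
If you want to be really sure the EBS data stays out of the bundle, a script could check whether /ebsmnt is still mounted before running the bundle command. A minimal sketch, assuming the usual /proc/mounts layout (device, mount point, type, ...); is_mounted is a hypothetical helper name, and the mounts table is a parameter so it can be tried against a test file:

```shell
# Hypothetical helper: succeed if the directory is listed as a mount point.
# Arguments: directory, optional mounts table (defaults to /proc/mounts).
is_mounted() {
    dir=$1
    table=${2:-/proc/mounts}
    # Field 2 of each line in the mounts table is the mount point.
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$table"
}

# Possible usage before bundling (stop MySQL first, it keeps files open there):
#   if is_mounted /ebsmnt; then
#       /etc/init.d/mysqld stop
#       umount /ebsmnt
#   fi
```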

Uploading a Bundled AMI

Here you can just follow the steps in the article, thus:


ec2-upload-bundle -b <bucket> -m Fedora8-JEE5-JDK7-Tomcat55-MySQL50.manifest.xml \
    -a <access_key> -s <secret_key>



The two keys should be under the menu 'Your Account --> Access Identifiers' in the AWS Management Console.
You can think of <bucket> as a directory in your S3 storage. Let's use "mybucket" as name.

Registering the AMI

There are two ways of registering. The AMC way is described first, followed by the old way via the Amazon EC2 API command line tools as in the article.

Via AWS Console
  1. Go to the 'AMIs' page and click the 'Register New AMI' button.

  2. In the popup enter the path to your above uploaded manifest file, including the bucket. So the whole path becomes:


    http://s3.amazonaws.com:80/mybucket/Fedora8-JEE5-JDK7-Tomcat55-MySQL50.manifest.xml



    Thus the popup would look something like this:



  3. Hit 'Register' to register the AMI. Too late for this tutorial I noticed this 'Register New AMI' button, thus here my personal practical experience ends for this option. I used the command line tools as described below. But I guess the result will be the same: a registered AMI!

Via Amazon EC2 API command line tools

For this I used the steps described here in the AWS Developer Guide.
For this you need to download and install the Amazon EC2 API command line tools and a Java runtime on your local machine; I just used the Java 5 JDK.
Then run:


ec2-register mybucket/Fedora8-JEE5-JDK7-Tomcat55-MySQL50.manifest.xml



where mybucket is the <bucket> you specified in the above 'ec2-upload-bundle' command.
It returns the id of the created AMI, e.g.


IMAGE ami-aabb5cc3



Now check that you can find it in the AMIs 'Owned By Me' in the AMC by searching for the above AMI ID ami-aabb5cc3. And yes there it is:


If you want, you can make it public such that other users can see it etc. But I won't describe that here.

That's it! And as final step:

Let's see if the new AMI actually works

Let's see if it starts up, the EBS volume can be connected to it and the newly created test database is still there.
First terminate the running (modified) AMI. Note that you can't detach, since MySQL still uses the volume!
Start the new AMI. Now the problem is that you can't attach the volume until the OS has started to boot, but /ebsmnt can't be mounted until the volume is attached! Thus MySQL can't find its data.
So what you can do is run the 'ec2-attach-volume' command repeatedly until it succeeds, or check the instance status until it reaches a "specific intermediate state" and then run the ec2-attach-volume command. This way you "hope" that the attachment succeeds before the mounting. I never managed to get this working.
The safest solution seems to be to attach the volume once the instance has reached status 'running', and then reboot the instance.
Of course you can put this all in scripts to automate it as much as possible.
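
The "run the command until it succeeds" idea could look like the sketch below. The retry function is a generic, hypothetical helper of my own naming (it is not part of the EC2 tools); it just reruns whatever command you give it, a limited number of times, with a pause in between.

```shell
# Hypothetical helper: run a command until it succeeds.
# Arguments: max attempts, seconds to sleep between attempts, then the command.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Only sleep when another attempt will follow.
        [ "$attempt" -le "$max" ] && sleep "$delay"
    done
    return 1
}

# Possible usage from the local workstation, with the example ids used in
# this post (try up to 30 times, 10 seconds apart):
#   retry 30 10 ec2-attach-volume -d /dev/sdf -i i-ec8f0e85 vol-52d4303b
```

Even when the attach succeeds this way, the mount from fstab may already have failed, so the reboot afterwards is still the safer route.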

Note that my AWS Management Console did not show my newly running AMI instance when trying to attach a volume via the AMC. There might be some delay in status updates(?). The command 'ec2-attach-volume' did work in that case. An example of that command on your local workstation:


ec2-attach-volume -d /dev/sdf -i i-ec8f0e85 vol-52d4303b



Also don't forget to use the same zone for the volume and the instance, otherwise you also might not see the running instance in the 'Attach' popup window.

Done

Did you notice BTW what a large amount of memory you get, even for a small AMI instance:


Mem: 1747764k total, 176724k used, 1571040k free, 12236k buffers
Swap: 917496k total, 0k used, 917496k free, 64340k cached



Tip: for managing your AMI and other data that you (want to) store on Simple Storage Service S3, you can either use the more manual tools from Amazon or the great graphical Amazon S3 Firefox Organizer (S3Fox) Firefox plugin.

25 comments:

Eric Hammond said...

You do get charged for all of the EBS volume even if you are not using some blocks on it.

Beware of taking snapshots of an EBS volume holding a database without flushing tables, locking the database, and freezing the file system during snapshot creation (which can be fast).

The ext3 file system does not support freezing by itself, so the file system may not be consistent, and the database may be corrupt when you try to restore.

The procedure I outlined in the tutorial describes how to take consistent snapshots with XFS, but I have not tested this with Fedora.

http://ec2ebs-mysql.notlong.com

Techie said...

Thanks, good to know!

Cory said...

Great article, it was clear and concise. I used your AMI to bring up a server with a java stack.

I found an issue with the configuration of the AMI and thought you might want to know about it.

The 1.7 version of java has issues loading keys from a keystore. This functionality is necessary for configuring tomcat to run with https.

The only solution that I have found for this problem, so far, is to replace java 1.7 with a sun version of the jdk (1.6).

I'm wondering if you'd be interested in building the same AMI with a java 1.6? That would make it much easier for me :)

Techie said...

No, no plans for that by me :-). I agree that's the only not-so-great-thing about this AMI: JDK 7 is way too advanced, I think it's not even officially out yet.
But it should not be too hard to also install JDK 6, and make the settings that you use that JDK instead of JDK 7. In Ubuntu you can do something like 'sudo update-alternatives --config javac' to configure the JDK /usr/bin/javac should be set to. Must be possible in Fedora...

Anonymous said...

Thanks, I will definitely try out this AMI image. I appreciate the very well-written article!

Daniel D. Shaw said...

Pretty sweet. What's your average monthly cost for this setup?

Techie said...

For February it was in total $69.82. The AMI ran the whole month for a total of 672 hours. In January it only ran 324hrs and that was $42.73.

Unknown said...

Amazing blog!!
really got me started with Amazon EC2.
Keep writing!
Amir

Anonymous said...

Just wanted to say thank you for the GREAT post! You saved me (and non-profit org I work for) hours of "yahooing", "googling" and countless trial and error.
Btw, have you tried modifying existing AMI and then saving it in the same S3 bucket with the same manifest name? Is it possible?

Techie said...

Haven't tried it, but I guess that won't be a problem because you get an AMI id for each AMI you register. That id seems to be the unique identifier of an AMI.

Anonymous said...

I am using this Java Web Starter AMI and your blog to get myself started in EC2. Thanks a ton for the clear explanation. I have a question though. How is Apache forwarding all the requests to tomcat. Where can I find this exact config. I am not a linux guru. Kindly help.

Techie said...

Good point, because I have Apache mentioned as software pre-installed on the AMI.
But it does seem it is not being used to forward requests, and that Tomcat is also serving HTTP. I need to do some more investigating to be sure...
httpd is running but only with a helloworld config file; you can see it when you 'ps -ef|grep httpd':

/usr/sbin/httpd -f /home/webuser/helloworld/conf/httpd.conf

Unknown said...

Very cool thanks! will help get me up and running quickly!

Just wondering how will this work if/when you need to scale up; running multiple EC2 instances; in terms of the following

- from a DNS load balancing point of view?
- from MySQL sharing the same storage area?

Thanks
Roger

Unknown said...

attaching apache to tomcat

http://tomcat.apache.org/connectors-doc/webserver_howto/apache.html

Thanks
Roger

Techie said...

@MrRe: I haven't used the loadbalancing features of AWS yet so can't help you there. Depending on the type of your application, you can potentially immediately use the AWS auto-scaling and load balancing. If your app is completely stateless, it should be easy to setup. Otherwise, you'll have to take provision for it in your code and setup to handle multiple instances accessing for example the database storage area (as you mention). Recently AWS announced new auto-scaling and load-balancing features, maybe you can find some more info there regarding your questions. Good luck!

Unknown said...

The starting point of this AMI was the "Java Web Starter" AMI:
ami-45e7002c
/aws-console-quickstart-amis/tomcat/1.1/tomcatquickstart.manifest.xml

As Techie said, it runs the Apache Web Server via the command:
/usr/sbin/httpd -f /home/webuser/helloworld/conf/httpd.conf

The config file:
/home/webuser/helloworld/conf/httpd.conf
configures it to run the server as "webuser", not "apache", and it keeps the Web pages, pid file, logs, etc. in
/home/webuser/helloworld
instead of the standard places.

Furthermore, it includes the lines:
LoadModule jk_module /etc/httpd/modules/mod_jk.so
JkWorkersFile /home/webuser/helloworld/conf/workers.properties
JkLogFile /home/webuser/helloworld/logs/mod_jk.log
JkLogLevel info
JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
JkMount /* wlb
which configure it to pass all HTTP requests to Tomcat, as documented in the link:
http://tomcat.apache.org/connectors-doc/webserver_howto/apache.html
that MrRe posted.

That explains HOW it is specifying its config file, and HOW that config file arranges to forward requests to Tomcat.

My question is WHY is it launched with a config file that is not in the standard place (/etc/httpd/conf/httpd.conf)? And WHY is it launched from cron instead of from the standard startup files?

The AMI has standard /etc/rc.d/init.d files and /etc/rc.d/rc*.d sym links
to cause the various files to stop (K) and start (S) automatically
at reboot. However it does not have any S links for tomcat,
apache, or mysql, so they don't get started. It has only K links
for them. Instead, it starts them from cron via the
/var/spool/cron/root file which contains:

# restart tomcat on reboot
@reboot /etc/init.d/tomcat5 start

# restart apache on reboot
@reboot /home/webuser/helloworld/bin/run_apache

# restart mysql server on reboot
@reboot /etc/init.d/mysqld start


Any idea why it would have been set up this way?
Why not use the standard S links, and the standard
config files?

Thanks,
--Fred Stluka

golfdude said...

Awesome article. Quick question: In AWS, after creating a ebs volume, the "attach volume" button is disabled for me. Any ideas ?

Thanks

gd

golfdude said...

I had to make a change to httpd.conf in this instance and I used kill -9 to kill all the processes as init.d/httpd stop did not work. I noticed that the instance restart uses /home/webuser/helloworld/.../httpd.conf to setup apache. But init.d/httpd uses /etc/httpd. Curious on how the original "http" is getting started. I couldn't find anything in inetd.conf. Maybe I am missing something simple. Any ideas ?

Thx

pady

golfdude said...

To my previous post, who is kicking off webuser/helloworld/bin/run_apache ? I did a find/grep on etc and cannot find a reference ? Have to be missing something simple...

thx

pady

Techie said...

@gd: strange, I can't remember having this problem. Can't help you there...

@pady: Maybe Fred's question above clarifies it a bit (though I guess you read that one already)?
Otherwise the next thing to do is a find/grep at root level...

Techie said...

@Fred: sounds quite non-standard to me too. Haven't found why this approach was taken.

Niel Eyde said...

This post was a really big help. I ended up creating my own AMI with Fedora8, Tomcat6.0, ApacheWebServer (for front-end), and MySQL. Your instructions on setting up EBS was perfect.

As far as cost, have you considered using a reserved instance? It looks like it could save some significant cash. I've been experimenting with OnDemand instances, but I think I'm ultimately going to go with a reserved instance.

golfdude said...

Hi Niel,

I just got my server up with the same image config ( and setup log-bin backups to s3 and weekly backups also ). I am also tempted to get a reserved instance ( RI ) - but one catch I just learned is that with a RI, you cannot upgrade a reserved instance. ie the reserved instance is tied to the type of an instance. So if you have a small reserved instance, and decide to upgrade ur current small instance to a large instance, you cannot use the reserved instance. You have to get another reserved instance for the large instance. Check the docs. Might as well get a large reserved instance and a large instance. But if you do not have the need, then reserved instance is an obvious choice...

golfdude said...

Here is the RI catch I mentioned in my prev post...( see "Important Notes about Purchases" section )

http://aws.amazon.com/ec2/reserved-instances/

Techie said...

Update: note that Amazon RDS now offers easier relational database support (MySql for now). I haven't tried this out yet though, so can't tell you how well it works.