Graylog2 v.0.91.3 + Elasticsearch v.1.3.4: backing up Graylog data to AWS S3 made easy

AllCloud Blog:
Cloud Insights and Innovation

On November 5th the Graylog2 team released a new server and web-interface version, v.0.91.3, which now supports Elasticsearch v.1.3.4.

One of the major benefits over the previous versions (v.0.90.0 and lower), as can be seen in the release notes, is compatibility with the Elasticsearch v.1.3 series, finally included in a stable build.

Before detailing the advantages of moving to this new version: a quick guide on how to install Graylog can be found here, to which I am adding these notes:

1. The server and web-interface installation process is exactly the same for Graylog2 v.0.91.x as for v.0.20.x; just replace the packages with the most recent ones.

2. The Elasticsearch server must be v.1.3.4 specifically; backward compatibility with older ES versions is not guaranteed.

3. The MongoDB version requirements are unchanged: use the latest stable build (if for some reason you choose not to, make sure the version you are using is at least v.2.0).

Now, about the benefits: backing up your Graylog2 data has never been easier. With support for the Elasticsearch v.1.3 series, we can take full advantage of Elasticsearch's Snapshot & Restore API (introduced in v.1.0.0, and until now not usable with Graylog2).

I am going to illustrate a simple example of backing up your data to AWS S3, using the Elasticsearch AWS Cloud Plugin, which provides S3 repositories. Some of you may have used the S3 Gateway until now, but support for it was removed earlier this year.

First, let's install the plugin (v.2.3.0 for Elasticsearch v.1.3.x), then restart the node:

{path_to_elasticsearch}/bin/plugin -install elasticsearch/elasticsearch-cloud-aws/2.3.0
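The plugin also needs AWS credentials, unless the node already runs under an IAM role with access to the bucket. A minimal sketch of the relevant `elasticsearch.yml` settings (placeholder values; key names per the plugin's documentation):

```yaml
# elasticsearch.yml -- credentials used by the cloud-aws plugin for S3 access
cloud:
    aws:
        access_key: AKIA_YOUR_ACCESS_KEY
        secret_key: YOUR_SECRET_KEY
```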


Then register your backup S3 bucket as a snapshot repository with Elasticsearch, like so:

$ curl -XPUT 'localhost:9200/_snapshot/my_s3_repository' -d '{
    "type": "s3",
    "settings": {
        "bucket": "my_bucket_name",
        "region": "us-west-1"
    }
}'
Once this is completed, it is time to run the backup. You can do so by issuing a command similar to:

$ curl -XPUT "localhost:9200/_snapshot/my_s3_repository/snapshot_1?wait_for_completion=true"

Important note: a repository can contain multiple snapshots of the same cluster, so snapshot names must be unique if you do not want to overwrite previous data. The naming convention is up to you; in the above example we use "snapshot_1".
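One simple convention, sketched below, is to build the snapshot name from the current date, so each day's backup gets its own snapshot:

```shell
# Derive a unique, date-based snapshot name, e.g. "snapshot_2014_11_20".
SNAPSHOT="snapshot_$(date +%Y_%m_%d)"
echo "$SNAPSHOT"

# The daily snapshot would then be taken with:
#   curl -XPUT "localhost:9200/_snapshot/my_s3_repository/${SNAPSHOT}?wait_for_completion=true"
```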

If you want to try out the restore, simply run the following (which will restore all the indices and the cluster state):

$ curl -XPOST "localhost:9200/_snapshot/my_s3_repository/snapshot_1/_restore"

Here's a link for more details on the Snapshot & Restore API; it provides further, more complex examples of how to get the desired usage out of this feature. You can do selective or partial restores, monitor snapshot/restore progress, delete snapshots, get their status, and so on.
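For example, a selective restore of only the Graylog indices (the "graylog2_*" index pattern is an assumption; match it to your deployment) would send a request body like this:

```shell
# Build the restore request body: only indices matching "graylog2_*",
# and skip restoring the cluster's global state.
BODY='{"indices": "graylog2_*", "include_global_state": false}'
echo "$BODY"

# The request itself would then be:
#   curl -XPOST "localhost:9200/_snapshot/my_s3_repository/snapshot_1/_restore" -d "$BODY"
```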

And here's the link to the AWS plugin repository on GitHub; some pointers can be found there as well.

You can now go ahead and script this procedure, add it to cron, etc., to back up your Graylog data in a simple and efficient way.
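Putting the pieces together, a nightly backup script could look like the sketch below (host, path, and repository names are illustrative; adjust them to your setup):

```shell
#!/bin/sh
# graylog-backup.sh -- take a dated Elasticsearch snapshot of the Graylog data.
# Host and repository names are illustrative; adjust to your setup.
ES="localhost:9200"
REPO="my_s3_repository"
NAME="snapshot_$(date +%Y_%m_%d)"

echo "Creating snapshot ${NAME} in repository ${REPO}"
curl -sf -XPUT "${ES}/_snapshot/${REPO}/${NAME}?wait_for_completion=true" \
  || echo "snapshot request failed" >&2
```

A crontab entry such as `0 2 * * * /usr/local/bin/graylog-backup.sh >> /var/log/graylog-backup.log 2>&1` would then run it every night at 02:00.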

Simona Miroiu
