How to browse AWS RDS snapshots
Browsing Amazon Relational Database Service (RDS) snapshots is not an obvious task, since they are stored not as regular DB dumps but as snapshots of RDS instances. The benefit of this format is obvious: you can quickly restore a snapshot to a new instance. The downside is that it takes more space and, therefore, is more profitable for Amazon. But what if we do not want to restore the DB and only want a quick look at the data in a specific database or table?
Understanding export
AWS offers a relatively new option to export your DB snapshot to an S3 bucket. You can use it in two ways: with the AWS CLI or with the AWS Console. In this article, let’s look at the simplest way to achieve the result, the AWS Console. If you want to dive deep into every detail, you can read the entire page in the AWS documentation: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ExportSnapshot.html
First of all, let’s consider several limitations of the export itself. The export feature is not available in all AWS regions. Please take a look at this page to check whether your DB engine and region support the export to S3 feature: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RDS_Fea_Regions_DB-eng.Feature.ExportSnapshotToS3.html
Secondly, your RDS instance (and snapshot) should be in the same region as the S3 bucket you want to export to. I prefer to create a new bucket for database exports, which makes it easier to clean up all the unnecessary resources afterwards.
So, once all the prerequisites are sorted out and you know which snapshot you want to export, we can start!
Creating resources for export
As I mentioned, we need to create an S3 bucket to export our DB snapshot to. Let’s navigate to the S3 dashboard and create a new bucket. Enter a name and select a region. Let’s leave all other options at their defaults:
- unversioned bucket;
- private access;
- no tags (for production use, I recommend setting meaningful tags, though).
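If you prefer to script this step instead of clicking through the console, a minimal sketch could look like the one below. It assumes boto3 (the AWS SDK for Python), which this article does not otherwise use; the function names are made up for illustration, and the client is passed in as a parameter so the code does not depend on how you configure credentials. A freshly created bucket is unversioned and private unless you change those settings, which matches the defaults listed above.

```python
def bucket_request(name: str, region: str) -> dict:
    """Build the keyword arguments for the S3 create_bucket call."""
    params = {"Bucket": name}
    # us-east-1 is the default location and must not be sent as a
    # LocationConstraint; every other region has to be named explicitly.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return params


def create_export_bucket(s3_client, name: str, region: str) -> None:
    """Create the export bucket, leaving all other options at their defaults."""
    s3_client.create_bucket(**bucket_request(name, region))
```

With boto3 installed, a call might look like `create_export_bucket(boto3.client("s3", region_name="eu-central-1"), "my-rds-exports-demo", "eu-central-1")`, where both the bucket name and the region are placeholders. Remember that the region has to match the region of your snapshot.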
As the next step, we will create an encryption key to encrypt our export in the S3 bucket and to decrypt the data when we use it. Let’s navigate to the Key Management Service (KMS) dashboard and create a new customer managed key. We need the “Symmetric” key type with the “Encrypt and decrypt” key usage.
We set a meaningful alias for our key on the wizard’s next step.
And in the next one, we select a key administrator. For security reasons, I will not share this screen. Just select your user name for the sake of simplicity.
On the next screen, select key user “AWSServiceRoleForRDS.”
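For reference, the key creation steps above can be sketched with boto3 as well. This is only an illustration under assumptions: the function name and description text are invented, the client is passed in, and the sketch relies on the default key policy, so it does not reproduce the key administrator and key user choices made in the wizard. Those would still have to be granted separately.

```python
def create_export_key(kms_client, alias: str) -> str:
    """Create a symmetric encrypt/decrypt key, attach an alias, return its ARN."""
    response = kms_client.create_key(
        Description="Key for RDS snapshot exports",  # free-form text, hypothetical
        KeySpec="SYMMETRIC_DEFAULT",                 # "Symmetric" in the console
        KeyUsage="ENCRYPT_DECRYPT",                  # "Encrypt and decrypt"
    )
    metadata = response["KeyMetadata"]
    # KMS requires alias names to start with the "alias/" prefix.
    kms_client.create_alias(AliasName=f"alias/{alias}", TargetKeyId=metadata["KeyId"])
    return metadata["Arn"]
```

The returned ARN is what the export later needs as its encryption key, so it is worth keeping.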
And now, as we have all the necessary resources to start our snapshot export, we can navigate to the RDS snapshot dashboard.
Export RDS Snapshot
Select the snapshot you want to explore and choose the “Export to Amazon S3” option in the “Actions” menu.
Let’s take a look at the wizard window and explore available options.
In the first section, we must define a unique name for our export.
The second section allows us to define which database part we will export. I will keep “All” for demo purposes, but you can select a specific database and even a table to export.
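Outside the console, a partial export is, to my knowledge, expressed as a list of dotted identifiers (the `ExportOnly` parameter of the export API). The helper below is a small sketch of how such identifiers could be built; the function name and the example database and table names are hypothetical, and which forms are accepted depends on your DB engine, so please check the documentation page linked above before relying on it.

```python
from typing import List, Optional


def export_only_filter(database: str,
                       table: Optional[str] = None,
                       schema: Optional[str] = None) -> List[str]:
    """Build a one-element ExportOnly list for a partial export.

    database only          -> "database"
    database + table       -> "database.table"         (MySQL / MariaDB style)
    database + schema      -> "database.schema"        (PostgreSQL style)
    database/schema/table  -> "database.schema.table"  (PostgreSQL style)
    """
    parts = [database]
    if schema:
        parts.append(schema)
    if table:
        parts.append(table)
    return [".".join(parts)]
```

For example, `export_only_filter("shop", table="orders")` describes a single table of a MySQL database, while leaving the filter out entirely corresponds to the “All” option.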
In the following two sections, we need to select the S3 bucket where the wizard will store the export files and specify the IAM role used to access this bucket. I will create a new role for our bucket.
And in the last section, we select our key to encrypt the export data.
At last, we can press the “Export to S3” button! If everything is correct, we will see a new page with our export status. After several minutes, once the process is finished, we can check our exported data in the S3 bucket.
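The same export can also be started and watched from a script. The sketch below assumes boto3 and an RDS client passed in by the caller; the function names, the polling interval, and every ARN or identifier you pass in are placeholders, not values from this article. It mirrors the wizard: export name, snapshot, bucket, IAM role, and KMS key, followed by a simple status check.

```python
import time


def export_request(export_id, snapshot_arn, bucket, role_arn, kms_key_arn,
                   export_only=None) -> dict:
    """Collect the wizard's fields as start_export_task keyword arguments."""
    params = {
        "ExportTaskIdentifier": export_id,  # the unique export name
        "SourceArn": snapshot_arn,          # the snapshot to export
        "S3BucketName": bucket,
        "IamRoleArn": role_arn,             # role allowed to write to the bucket
        "KmsKeyId": kms_key_arn,            # key that encrypts the export
    }
    if export_only:
        params["ExportOnly"] = list(export_only)  # omit to export everything
    return params


def wait_for_export(rds_client, export_id, poll_seconds=30, sleep=time.sleep) -> str:
    """Poll the export task until it reaches a final status and return it."""
    while True:
        tasks = rds_client.describe_export_tasks(
            ExportTaskIdentifier=export_id)["ExportTasks"]
        if not tasks:
            raise LookupError(f"no export task named {export_id!r}")
        status = tasks[0]["Status"].upper()
        if status in ("COMPLETE", "FAILED", "CANCELED"):
            return status
        sleep(poll_seconds)
```

A run would then be `rds_client.start_export_task(**export_request(...))` followed by `wait_for_export(rds_client, export_id)`. The `sleep` parameter exists only so the loop can be exercised without real waiting.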
You can navigate to your bucket and find your data stored in the Apache Parquet format. You can open this format in many data analysis tools. My personal preference is JetBrains DataGrip with the “Big Data Tools” plugin.
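If you would rather locate the files from a script than browse the bucket by hand, the following sketch lists the Parquet objects under a prefix. It assumes boto3 with an S3 client passed in, and the function name is hypothetical. The export may also contain other files next to the data (for example JSON metadata), which is why the keys are filtered by extension.

```python
from typing import List


def parquet_keys(s3_client, bucket: str, prefix: str = "") -> List[str]:
    """Return the keys of all .parquet objects under the given prefix."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix carry no "Contents" entry at all.
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".parquet"):
                keys.append(obj["Key"])
    return keys
```

Each key can then be fetched with `s3_client.download_file(bucket, key, local_path)` and opened in the tool of your choice. Since the export is encrypted with our KMS key, the user downloading the files most likely needs permission to decrypt with that key.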
As a rule of thumb, remember to remove unused AWS resources when you are done to avoid unexpected costs. Happy hacking!