This blog post is the final part of the Data Warehouse Migration to Amazon Redshift series. The second part of the series, Data Warehouse Migration to Amazon Redshift – Part 2, detailed how to get started with Amazon Redshift and the business and technical benefits of using it.
1. Migrating to Amazon Redshift
The migration strategy that you choose depends on several factors, such as:
- The size of the database and its tables
- Network bandwidth between the source server and AWS
- Whether the migration and switchover to AWS will be done in one step or a sequence of steps over time
- The data change rate in the source system
- Transformations during migration
- The partner tool that you plan to use for migration and ETL
One-step migration is a good choice for small databases that do not require continuous operation. Customers can extract existing databases as comma-separated value (CSV) files, and then use services such as AWS Snowball to deliver datasets to Amazon S3 for loading into Amazon Redshift. Customers then test the destination Amazon Redshift database for data consistency with the source. Once all validations have passed, the database is switched over to AWS.
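The extract-and-load flow above can be sketched in a few lines. This is a minimal illustration, not the AWS tooling itself: it uses SQLite as a stand-in source database, and the table name, S3 bucket, and IAM role are hypothetical placeholders.

```python
import csv
import sqlite3

def export_table_to_csv(conn, table, csv_path):
    """Dump every row of `table` to a CSV file with a header row."""
    cur = conn.execute(f"SELECT * FROM {table}")
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cur.description])
        writer.writerows(cur.fetchall())

def build_copy_statement(table, bucket, key, iam_role):
    """Build the Redshift COPY command that loads the CSV from S3."""
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV IGNOREHEADER 1;"
    )

if __name__ == "__main__":
    # Stand-in source database with a sample table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 9.5), (2, 4.0)])
    export_table_to_csv(conn, "sales", "sales.csv")
    print(build_copy_statement(
        "sales", "my-migration-bucket", "sales.csv",
        "arn:aws:iam::123456789012:role/RedshiftCopyRole"))
```

After the CSV lands in S3 (via Snowball or a direct upload), the generated COPY statement would be run against the Redshift cluster to perform the load.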
Two-step migration is commonly used for databases of any size, and consists of the following steps:
Initial data migration: The data is extracted from the source database, preferably during nonpeak usage to minimize the impact. The data is then migrated to Amazon Redshift by following the one-step migration approach described above.
Changed data migration: Data that changed in the source database after the initial data migration is propagated to the destination before switchover. This step synchronizes the source and destination databases.
Once all the changed data is migrated, validate the data in the destination database, perform the necessary tests, and if all tests pass, switch over to the Amazon Redshift data warehouse.
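The changed-data step can be pictured as replaying captured change events against the target so the two databases stay in sync before switchover. The sketch below is a simplified model of that idea; the row layout and the operation names ("insert", "update", "delete") are illustrative assumptions, not a specific tool's change format.

```python
def apply_changes(target, changes):
    """Apply captured change events to a target table (dict keyed by primary key)."""
    for op, pk, row in changes:
        if op in ("insert", "update"):
            target[pk] = row       # upsert the new row image
        elif op == "delete":
            target.pop(pk, None)   # remove the row if present
    return target

# Target table contents after the initial full load.
target = {1: {"amount": 9.5}, 2: {"amount": 4.0}}

# Changes that happened on the source since the initial load.
changes = [
    ("update", 1, {"amount": 12.0}),
    ("insert", 3, {"amount": 7.25}),
    ("delete", 2, None),
]

print(apply_changes(target, changes))
# -> {1: {'amount': 12.0}, 3: {'amount': 7.25}}
```

Once the change stream is drained and the two sides match, the validation and switchover described above can proceed.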
Data Migration Tools
There are several tools and technologies available for data migration. You can use AWS native services, third-party partner tools, or open-source tools, individually or in combination.
While migrating your database from one engine to another, you have two major tasks to consider: converting the schema and code objects, and migrating and converting the data itself. AWS has tools to help with both. You can convert schema and code with AWS SCT and migrate data with AWS DMS. AWS DMS helps you migrate your data easily and securely with minimal downtime.
AWS Database Migration Service (DMS)
AWS Database Migration Service (AWS DMS) helps you migrate databases to AWS efficiently and securely. The source database can remain fully operational during the migration, minimizing downtime for applications that depend on the database. This tool supports both the one-step and the two-step migration processes described above. To follow the two-step migration process, you should enable supplemental logging to capture changes to the source system. You can enable supplemental logging at the table or database level.
Use AWS DMS to migrate data, tables, and primary keys to the target database; all other database elements are not migrated. The service supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle to MySQL or MySQL to Amazon Aurora. Either the source or the target database must be on an AWS service.
AWS DMS takes a minimalist approach and creates only those objects required to efficiently migrate the data, for example, tables with a primary key.
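A DMS replication task is told which schemas and tables to migrate through a table-mapping document of selection rules. The sketch below builds one such document following the DMS selection-rules JSON layout; the schema and table name patterns are hypothetical placeholders for your own database.

```python
import json

def selection_rule(rule_id, schema, table, action="include"):
    """Build one DMS selection rule for the given schema/table pattern."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": str(rule_id),
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": action,
    }

table_mappings = {"rules": [
    selection_rule(1, "sales_db", "%"),                   # migrate every table...
    selection_rule(2, "sales_db", "audit_%", "exclude"),  # ...except audit tables
]}

print(json.dumps(table_mappings, indent=2))
```

A document like this is what you would supply as the table-mappings parameter when creating the DMS replication task, whether through the console, the CLI, or an SDK.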
AWS Schema Conversion Tool (SCT)
AWS SCT converts an existing database schema from one database engine to another. You can convert a relational OLTP schema or a data warehouse schema. The converted schema is suitable for an Amazon RDS MySQL DB instance, an Amazon Aurora DB cluster, an Amazon RDS PostgreSQL DB instance, or an Amazon Redshift cluster.
AWS SCT provides a project-based UI to automatically convert the database schema of your source database into a format compatible with your target Amazon RDS instance. If schema from your source database can't be converted automatically, AWS SCT provides guidance on how to create equivalent schema in your target Amazon RDS database.
In addition to converting your existing database schema from one database engine to another, AWS SCT has some additional features that help you move your data and applications to the cloud:
- Use data extraction agents to extract data from your data warehouse to prepare to migrate it to Amazon Redshift.
- Use AWS SCT to create AWS DMS endpoints and tasks, and to run and monitor those tasks.
- In some cases, database features can't be converted to equivalent Amazon RDS or Amazon Redshift features. The AWS SCT extension pack wizard can help you install AWS Lambda functions and Python libraries to emulate the features that can't be converted.
- AWS SCT helps optimize your existing Amazon Redshift database by recommending sort keys and distribution keys.
- With AWS SCT, you can convert SQL embedded in your C++, C#, Java, or other application code, then view, analyze, edit, and save the converted SQL code.
Amazon Redshift Best Practices
Amazon Redshift differs from other SQL database systems. To fully realize the advantages of the Amazon Redshift architecture, you need to explicitly configure, build, and load your tables to take advantage of massively parallel processing, columnar data storage, and columnar data compression.
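To make the table-design points concrete, the snippet below holds an illustrative Redshift CREATE TABLE statement that sets a distribution key, a sort key, and per-column compression encodings. The table and column names are hypothetical, and the key and encoding choices are an example of the kind of recommendation AWS SCT makes, not a universal prescription.

```python
# Illustrative Redshift DDL, kept as a Python string for readability.
ddl = """
CREATE TABLE sales (
    sale_id   BIGINT        ENCODE az64,   -- columnar compression per column
    cust_id   BIGINT        ENCODE az64,
    sale_date DATE          ENCODE az64,
    amount    DECIMAL(12,2) ENCODE az64
)
DISTSTYLE KEY
DISTKEY (cust_id)       -- co-locate rows joined on cust_id on the same slice
SORTKEY (sale_date);    -- date-range queries skip blocks outside the range
"""

print(ddl.strip())
```

Choosing the join column as the distribution key reduces data shuffling across nodes during joins, while sorting on the date column lets range-restricted scans skip whole blocks.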
For additional details on best practices for designing tables, loading data into tables, and writing queries check the References section.
There is a strategic shift in data warehousing as enterprises migrate their analytics databases and solutions from on-premises systems to the cloud to take advantage of the cloud's simplicity, performance, and cost-effectiveness. AWS offers a broad set of services and a strong partner ecosystem that enable you to easily build and run enterprise data warehousing in the cloud.
About the Author
Prabhu Chennupati is an avid technologist with extensive experience in enterprise application design and development. He leads technical teams in the Master Data Management (MDM) space spread across multiple continents, delivering high-quality solutions on time.