

Denamo Markos
Traffic Flow — Data Warehouse Migration from PostgreSQL to MySQL and Redash to Apache Superset
Introduction
Our AI startup has achieved significant milestones with successful sales, revenue generation, and market recognition. This success attracted an investor with a strong technical background who has made the implementation of a more scalable tech stack a prerequisite for the second round of investment.
Info: This migration project aims to modernize our data infrastructure to handle growing data volumes and provide better analytics capabilities.
Objective
The primary goal is to transition to a more scalable technical infrastructure through two main migrations:
- Data Storage: PostgreSQL → MySQL
- Analytics Platform: Redash → Apache Superset
Technology Stack
Core Technologies
Technology | Purpose | Description |
---|---|---|
MySQL | Database | Open-source relational database management system |
Airflow | Orchestration | Workflow management platform for data pipelines |
DBT | Transformation | Data transformation and testing tool |
Apache Superset | Visualization | Data exploration and visualization platform |
Technology Comparison
MySQL vs PostgreSQL
Advantages of MySQL
- Simple troubleshooting and maintenance
- Lightweight architecture
- Performance-focused design
- Scalable for various application sizes
Warning: While MySQL offers many benefits, consider these limitations before migration.
Limitations of MySQL
- Lacks object-relational features (ORDBMS)
- No ACID compliance
- No MVCC support
- Limited security features
Apache Superset vs Redash
Superset Advantages
- Rich visualization library
- Intuitive dashboard sharing
- Enhanced collaboration features
Superset Limitations
- Fewer data source integrations
- Steeper learning curve
- More complex user interface
Implementation Details

In the ELT pipeline, the migration happens after all the data has been loaded to the PostgreSQL database. The original table and all the tables that were produced as a result of transformation in dbt will be migrated by the orchestration of airflow.

In the migration dag, we have two operations. The first operation migrates the entire data from PostgreSQL to MySQL. The second operation migrates the tables that were created as a result of transformation in dbt.

The above image demonstrates the superset dashboard for the traffic flow data. The graphs are created in superset by querying the transformed tables in the MySQL database.
ELT Migration Pipeline
The migration process follows these key steps:
1async function migrationPipeline() { 2 // 1. Load data to PostgreSQL 3 await loadDataToPostgres(); 4 5 // 2. Transform data using dbt 6 await performDbtTransformations(); 7 8 // 3. Migrate to MySQL 9 await migrateToMySQL(); 10 11 // 4. Verify data integrity 12 await verifyDataIntegrity(); 13}
Info: The migration occurs after data loading to PostgreSQL, with Airflow orchestrating the transfer of both original and transformed tables.
Technical Challenges
We encountered several significant challenges during the migration:
- Dashboard Migration
- Complex chart configurations
- Different visualization paradigms
- Custom widget compatibility
- User Management
- Permission structure differences
- User role mapping
- Access control migration
Warning: Careful planning is required for user permission migration to maintain security standards.
Best Practices and Recommendations
- Pre-Migration
- Document all existing dashboards
- Map user permissions
- Create backup strategies
- During Migration
- Implement parallel running
- Validate data consistency
- Monitor system performance
- Post-Migration
- Verify all functionalities
- Train users on new system
- Document new processes
Conclusion
This migration project significantly improved our data infrastructure scalability and analytical capabilities. Despite the challenges in dashboard and user migration, the benefits of the new stack justify the effort invested.