architecture · Oct 2023 – Sep 2024
Migrating Aviation Operational Data from Relational DBs to MongoDB
Query performance +20% and downtime −15% migrating legacy relational systems to MongoDB across aviation operational workflows.
- Role
- Owned the schema redesign, aggregation-pipeline design, and the migration rollout
- Team
- Backend integrations team
- Scope
- Operational data powering enterprise integrations and reporting at Universal Weather and Aviation
- Level
- Software Engineer (SDE-1)
Query performance
Method: Compared latency on the top operational queries before vs after migration, on the read-heavy hot paths the redesign targeted.
Window: post-migration
Operational downtime
Method: Compared downtime attributable to the data layer before vs after the migration.
Window: post-migration
Context
At Universal Weather and Aviation I worked on the backend integrations powering enterprise airline operations across multiple GDS platforms. The operational data behind those integrations — and the reporting on top of it — lived in normalized relational tables, and the hot queries that joined across several tables were getting slower as the data grew.
The brief was to migrate that data to MongoDB. The interesting part wasn't moving bytes; it was deciding what shape the data should take on the other side, and how to move live operational data without taking the integrations down.
The data-model decision
The trap in a relational-to-document migration is to lift your tables straight across — keep the same normalized shape, replace SQL joins with application-side lookups, and end up paying the relational cost without the relational guarantees.
Instead I modeled documents around how the data is actually read: embed what's read together, and precompute the reporting views as indexed aggregation pipelines. The workload was overwhelmingly read- and report-heavy, so trading a little write-time complexity for read-time speed was the right exchange. That's where the ~20% query improvement came from.
Migrating without an outage
Because this was live operational data, I migrated incrementally rather than in a big-bang cutover: dual-write to both stores, verify that reads matched, then shift traffic one collection at a time. Every step was independently reversible — which is what kept the downtime change to −15% instead of risking a hard outage.
What broke
The honest part: an early aggregation pipeline that flew in staging did full collection scans in production, because the compound index I'd built didn't match the pipeline's match-and-sort order — the equality-then-sort-then-range field ordering a compound index needs in order to actually be used. Staging simply didn't have the data volume to expose it. A reporting path was slow for a short window after one cutover until I corrected the index. The lesson stuck — now I read the explain plan on every pipeline against production-scale data before a cutover, not after, and confirm it's an index scan rather than a collection scan.
Outcomes
Query performance improved roughly 20% on the hot paths, data-layer downtime dropped about 15%, and the document-modeling approach became the template for later reporting work on the team.
Decision Records
Where I Got It Wrong
Failure 1
An early aggregation pipeline that was fast in staging did full collection scans in production, because the supporting compound index didn't match the pipeline's match-and-sort order
Cost: A reporting path was slow for a short period after one of the incremental cutovers, until the index was corrected
Lesson: Validate aggregation pipelines against production-scale data and read the explain plan before cutover — staging data volume hides index problems that only surface at scale
Long-Term Impact
The document model and aggregation-pipeline approach became the pattern the team reused for later reporting work.