Pixel Federation, a leading Slovak mobile gaming company, successfully migrated their on-premise data warehouse infrastructure to a modern, cloud-native architecture on AWS, achieving significant gains in both performance and operational efficiency. The project transformed their legacy Hadoop and Spark setup into a scalable, cost-optimized solution built on Amazon EKS, Apache Spark, Apache Airflow, and Amazon S3. The migration delivered a 60% performance improvement while substantially reducing operational overhead and maintenance burden.
Customer Background & Challenge
Pixel Federation is a prominent Slovak mobile gaming company developing and operating multiple successful titles with millions of active users worldwide. Like all modern gaming companies, Pixel Federation relies heavily on data analytics to understand player behavior, optimize game mechanics, improve user retention, and drive business decisions.
The Legacy Infrastructure
Prior to the migration, Pixel Federation operated an on-premise data warehouse built on Hadoop and Apache Spark. While this infrastructure had initially served the company well, it had evolved into a significant operational burden with several critical limitations:
- Limited Automation: The original setup required substantial manual intervention for routine operations, updates, and maintenance. In-place updates were virtually impossible. Hardware replacement was the only viable upgrade path, requiring procurement of new servers and construction of a parallel cluster before any migration could occur.
- Scaling Constraints: As game popularity grew, the on-premise infrastructure struggled to scale. Adding capacity required physical hardware procurement, installation and configuration: a process measured in weeks or months.
- High Maintenance Overhead: Considerable engineering time was consumed by physical infrastructure maintenance, software updates, hardware failure triage, and system availability management, diverting resources from strategic work.
- Cost Inefficiency: The on-premise model demanded significant upfront capital investment, ongoing maintenance, and data center costs. Resources frequently sat idle during low-demand periods, representing poor capital utilization.
- Licensing Complexity: Managing software licenses across the Hadoop ecosystem added a further layer of cost and operational complexity.
Strategic Objectives
Pixel Federation recognized that modernizing their data infrastructure was essential to sustain growth and competitive advantage. The primary migration objectives were:
- Reduce operational complexity and maintenance burden
- Improve scalability to handle growing data volumes and analytical workloads
- Optimize costs through more efficient resource utilization
- Enhance performance for faster analytical insights
- Ensure long-term maintainability with modern, well-supported technologies
- Leverage existing cloud infrastructure to create a unified operational environment
Solution Architecture
The decision to migrate to AWS was straightforward: Pixel Federation had already been running all their game backends on AWS for several years. This existing relationship provided familiarity with AWS services, established networking infrastructure, and operational expertise directly applicable to the data warehouse migration.
Core Technology Stack
- Amazon EKS: Container orchestration platform. AWS-managed control plane reduces ops overhead while maintaining Kubernetes portability and ecosystem benefits.
- Apache Spark: Primary data processing engine. Retained for continuity with existing pipelines; the team's existing expertise minimized retraining while cloud-native deployment patterns were adopted.
- Apache Airflow: Workflow orchestration and scheduling. Robust, programmable platform for managing complex data pipeline dependencies and execution schedules.
- Amazon S3 Data Lake: Foundation for data storage. Virtually unlimited scalability, 99.999999999% durability, and intelligent tiering for cost-optimized access patterns.
- LARA Platform: Labyrinth Labs' proprietary Terraform/EKS/GitOps framework. Accelerated deployment from weeks to days, ensuring consistency, repeatability, and IaC best practices across all environments.
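As a concrete sketch of how this stack fits together, Spark jobs can be submitted directly against the EKS cluster's Kubernetes API. The cluster endpoint, namespace, container image, and S3 paths below are illustrative placeholders, not Pixel Federation's actual configuration:

```shell
# Illustrative spark-submit against an EKS cluster's Kubernetes API.
# Endpoint, namespace, image, and bucket names are placeholders.
spark-submit \
  --master k8s://https://EKS_CLUSTER_ENDPOINT:443 \
  --deploy-mode cluster \
  --name daily-player-metrics \
  --conf spark.kubernetes.namespace=data-warehouse \
  --conf spark.kubernetes.container.image=ACCOUNT.dkr.ecr.eu-west-1.amazonaws.com/spark:3.5 \
  --conf spark.executor.instances=10 \
  s3a://example-bucket/jobs/player_metrics.py
```

In this pattern the Spark driver and executors run as ordinary pods, so the same Kubernetes-level tooling (autoscaling, node selectors for Spot capacity, observability) applies to analytical workloads as to any other service on the cluster.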
Cost Optimization Strategy
Cost optimization was a primary consideration from the outset. The team implemented several strategies to maximize cost efficiency:
- Spot Instances: The majority of Spark analytical compute workloads run on Amazon EC2 Spot Instances, offering up to 60% savings vs on-demand. Since analytical jobs tolerate interruptions, Spot provides an ideal cost-performance balance.
- Right-Sizing: Cloud infrastructure enables precise matching of compute resources to actual workload requirements, eliminating fixed capacity for peak loads.
- Storage Tiering: S3 Intelligent-Tiering automatically moves data between access tiers based on usage patterns, optimizing storage costs without manual intervention.
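The arithmetic behind the Spot strategy is straightforward. The sketch below uses assumed node counts and hourly rates purely for illustration; the 60% discount figure is the "up to" bound cited above, not a guaranteed rate:

```python
# Hypothetical illustration of Spot vs. on-demand fleet cost.
# Node count, hours, and hourly rate are assumptions, not actual figures.

def monthly_compute_cost(nodes: int, hours: float, hourly_rate: float,
                         spot_discount: float = 0.0) -> float:
    """Estimated monthly cost; spot_discount is the fraction saved vs. on-demand."""
    return nodes * hours * hourly_rate * (1.0 - spot_discount)

# A 20-node Spark fleet running continuously (~730 hours/month).
on_demand = monthly_compute_cost(nodes=20, hours=730, hourly_rate=0.40)
spot = monthly_compute_cost(nodes=20, hours=730, hourly_rate=0.40,
                            spot_discount=0.60)  # "up to 60% savings"

print(f"on-demand: ${on_demand:,.2f}/month, spot: ${spot:,.2f}/month")
```

Because Spot capacity can be reclaimed, the discount is only "free" for workloads that tolerate interruption, which is why it is applied to batch analytics rather than to latency-sensitive services.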
Following the initial migration, Pixel Federation continued to refine the platform through several significant enhancements, including Graviton (ARM) instances, WarpStream, and improved caching, each delivering further improvements in performance, cost, and operational efficiency.
Results & Business Impact
Performance Gains
The headline result was the 60% improvement in analytical workload performance. This improvement resulted from multiple converging factors: more powerful compute instances available on demand, better data locality with the S3 data lake, Spark configurations optimized for cloud deployment, and efficient caching of frequently accessed data.
Operational Efficiency
The cloud-native architecture dramatically reduced the time and effort required to maintain the data warehouse infrastructure. Tasks that previously required manual intervention are now automated or handled by AWS-managed services:
- No hardware maintenance or replacement cycles
- Automated software updates for managed services
- Self-healing infrastructure through Kubernetes
- Simplified capacity planning and scaling
This operational improvement freed the engineering team to focus on higher-value activities: developing new analytical capabilities, optimizing data pipelines, and supporting data science initiatives.
Cost Optimization
While specific figures are confidential, the combination of Spot Instances, Graviton processors, WarpStream, and improved resource utilization delivered substantial cost savings compared to both the previous on-premise infrastructure and a naive lift-and-shift cloud migration approach.
Ongoing Challenges
- Cost Observability & Attribution: Accurately attributing costs to specific users, teams, queries, and projects in a shared, dynamically allocated environment requires sophisticated tagging, monitoring, and cost allocation strategies. Some challenges stem from inherent platform constraints; for example, attributing exact S3 API call volumes to individual queries is extremely difficult. The team continues to refine their approach through enhanced tagging, custom allocation tools, AWS Cost Explorer and Cost and Usage Report (CUR) integration, and internal chargeback models.
- Continuous Optimization: The flexibility of the cloud environment means optimization is an ongoing discipline, not a one-time event. The team maintains a backlog of potential improvements and regularly evaluates new AWS services.
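The tag-based attribution approach described above can be sketched in a few lines, assuming cost records have already been exported (for example, from the AWS Cost and Usage Report) into plain records. The field and tag names here are illustrative, not the actual CUR schema:

```python
# Minimal sketch of tag-based cost attribution over exported cost records.
# Field names ("cost_usd", "tags", "team") are illustrative assumptions.
from collections import defaultdict

def attribute_costs(records, tag_key="team"):
    """Sum cost per value of the given tag; untagged spend goes to 'unallocated'."""
    totals = defaultdict(float)
    for rec in records:
        owner = rec.get("tags", {}).get(tag_key, "unallocated")
        totals[owner] += rec["cost_usd"]
    return dict(totals)

records = [
    {"service": "AmazonEC2", "cost_usd": 120.0, "tags": {"team": "analytics"}},
    {"service": "AmazonS3",  "cost_usd": 45.5,  "tags": {"team": "analytics"}},
    {"service": "AmazonEC2", "cost_usd": 30.0,  "tags": {}},  # missing tag
]
print(attribute_costs(records))
# -> {'analytics': 165.5, 'unallocated': 30.0}
```

The "unallocated" bucket is the key operational signal: its size shows how much spend the tagging strategy is failing to capture, which is exactly the gap chargeback models have to close.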
Long-Term Maintainability
Perhaps the most significant but hardest-to-quantify benefit is improved long-term maintainability:
- Standard Technologies: Widely adopted open-source and AWS-managed services ensure long-term support and rich community resources.
- Easier Updates: Containerization and infrastructure-as-code make updates more predictable and less risky.
- Talent Attraction: Modern technology stacks make it easier to recruit and retain skilled engineers.

"Labyrinth Labs' expertise in cutting-edge technologies helped us take our innovations and progress to another level."
Lessons Learned
- Leverage Existing Infrastructure: Building on an existing AWS presence avoided multi-cloud complexity and leveraged established expertise and networking, accelerating time-to-value.
- Iterate and Optimize: The initial migration established a solid foundation; subsequent optimizations (WarpStream, Graviton, caching, custom scheduling) delivered additional compounding benefits. Treat migration as a journey, not an event.
- Balance Familiarity & Innovation: Retaining familiar technologies (Spark, Airflow) minimized disruption while adopting cloud-native deployment patterns provided modernization benefits.
- Stay Architecture-Agnostic: The ability to migrate to Graviton (ARM) instances with minimal effort demonstrated the value of architecture-agnostic code that can exploit new hardware as it emerges.
- Plan Cost Observability Early: Implementing comprehensive cost tracking and attribution from the start is far easier than retrofitting it later.
- Use Platform Management Tools: Scalable, modular platform management software such as LARA significantly streamlines infrastructure orchestration, improves operational consistency, and reduces complexity at scale.
Best Practices for Cloud-Native Data Warehouses
Based on this engagement, organizations planning similar migrations should consider:
- Start with a solid foundation: invest in proper architecture, networking, security, and IaC design upfront
- Implement cost controls early: build tagging strategies, monitoring, and cost allocation mechanisms from day one
- Use managed services strategically: balance reduced operational burden against cost and control trade-offs
- Plan for Spot Instance interruptions: design workloads to handle interruptions gracefully
- Embrace containerization: containers provide portability, resource efficiency, and operational benefits
- Automate everything: IaC, CI/CD pipelines, and automated testing reduce errors and accelerate deployments
- Plan for sustainable maintenance: decisions made today have recurring consequences; architect for the long term
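The "plan for Spot Instance interruptions" practice above boils down to making each unit of work safely re-runnable. The sketch below is a hedged illustration of that idea; the `Interrupted` exception and the stage function are hypothetical stand-ins for whatever failure a reclaimed node surfaces in a real pipeline:

```python
# Illustrative retry wrapper for interruption-tolerant batch stages.
# "Interrupted" stands in for the error a reclaimed Spot node would raise.
import time

class Interrupted(Exception):
    """Raised when the underlying node is reclaimed mid-stage."""

def run_with_retries(stage, max_attempts=3, backoff_s=0.0):
    """Run an idempotent stage, re-running it if its node is interrupted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return stage()
        except Interrupted:
            if attempt == max_attempts:
                raise
            time.sleep(backoff_s)  # give the scheduler time to replace the node

# Simulated stage that fails once (as if its Spot node was reclaimed),
# then succeeds on the retry.
attempts = {"n": 0}
def flaky_stage():
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise Interrupted("spot node reclaimed")
    return "stage complete"

print(run_with_retries(flaky_stage))  # succeeds on the second attempt
```

The important design constraint is that each stage must be idempotent (re-running it cannot corrupt output), which is what makes interruption merely a scheduling delay rather than a correctness problem.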
Future Considerations
Pixel Federation continues to evaluate opportunities for further optimization:
- Enhanced Cost Attribution: More sophisticated cost allocation and chargeback mechanisms to provide per-team, per-project, and per-query visibility.
- Advanced Analytics: Exploring Amazon Athena for ad-hoc queries, AWS Glue for data cataloguing, and Amazon Redshift for specific data warehouse use cases.
- ML Integration: Investigating tighter integration of machine learning workflows with the data lake, potentially leveraging Amazon SageMaker.
- Multi-Region: Evaluating multi-region deployment patterns for improved latency and data residency compliance as the business grows globally.
- Sustainability Focus: Continuing to optimize for energy efficiency via ARM processors and AWS sustainability tooling.
