The CTO Podcast with Fexingo: Technical Leadership, Architecture, and Engineering Org

Episodes

How GitLab Runs Remote Engineering with 2000 Developers

Jun 7 2026

In this episode, Lucas and Luna dive into how GitLab manages a fully remote engineering organization of over 2,000 developers. They explore the company's handbook-first culture, how it maintains code quality across time zones, and the specific tools it uses for asynchronous communication. Lucas shares key metrics: GitLab ships 40 releases per year with a median merge request cycle time of under 6 hours. They also discuss how the company handles onboarding, performance reviews, and incident response without a physical office. A must-listen for anyone leading or building a remote engineering team.

#GitLab #RemoteEngineering #EngineeringManagement #AsynchronousWork #DevOps #CodeReview #TechLeadership #BusinessAndTechnology #FexingoBusiness #BusinessPodcast #CTO #EngineeringOrg #RemoteWork #HandbookDriven #MergeRequest #Onboarding #IncidentResponse #Culture

Keep every episode free: buymeacoffee.com/fexingo

11 mins

How Figma Scales Real-Time Collaboration With CRDTs

Jun 7 2026

Episode 36 of The CTO Podcast dives into how Figma built its real-time collaboration engine using Conflict-Free Replicated Data Types (CRDTs). Lucas and Luna unpack the architectural decision to move from Operational Transform to CRDTs, how Figma handles merge conflicts at scale, and the engineering tradeoffs behind its vector-based multi-user editing. They walk through the key design choices: why Figma chose a custom CRDT instead of off-the-shelf libraries, how it serializes operations for low-latency sync across hundreds of collaborators on a single file, and the surprising way it prioritizes local responsiveness over consistency. Luna asks the hard questions about production incidents, and Lucas breaks down the monitoring approach behind Figma's 'real-time' guarantee. A concrete look at distributed systems theory meeting product design.

#Figma #CRDT #RealTimeCollaboration #DistributedSystems #ConflictFreeReplicatedDataTypes #OperationalTransform #ProductDesign #Collaboration #Latency #Engineering #Architecture #Whiteboard #MultiUserEditing #Sync #VectorGraphics #BusinessAndTechnology #FexingoBusiness #BusinessPodcast

10 mins

How Elasticsearch Powers Netflix's Search and Observe

Jun 6 2026

Netflix runs one of the largest Elasticsearch deployments in the world: over 150 clusters and thousands of nodes, processing tens of billions of documents. In this episode, Lucas and Luna unpack how Netflix uses Elasticsearch not just for log aggregation, but to power its internal search, real-time monitoring, and even the titles you see when you open the app. They walk through the architecture behind Netflix's search, from how it handles partial matches across 17,000 titles to how it keeps observability data flowing without crashing the clusters. Along the way, they cover shard sizing, index lifecycle management, and the painful lessons Netflix learned when Elasticsearch failed at scale. A practical episode for any engineering leader running search or observability at scale.

#Elasticsearch #Netflix #SearchArchitecture #Observability #Logging #DistributedSystems #Sharding #IndexLifecycleManagement #RealTimeMonitoring #EngineeringLeadership #CTO #TechnicalDebt #Infrastructure #SiteReliabilityEngineering #BusinessAndTechnology #FexingoBusiness #BusinessPodcast #TheCTOPodcast

10 mins

How Discord Rebuilt Its Voice Engine for Latency

Jun 6 2026

In this episode of The CTO Podcast, Lucas and Luna dive into Discord's architectural overhaul of its real-time voice system. They explore how the team reduced latency from hundreds of milliseconds to under 50 milliseconds by switching from a traditional client-server model to a mesh-based WebRTC architecture. The discussion covers the trade-offs of running their own media servers versus outsourcing, the engineering challenge of synchronizing 50 users in a single voice channel without a central coordinator, and how Discord handled the transition without disrupting its 150 million monthly active users. Lucas explains the key insight: rather than optimizing the existing pipeline, Discord rethought the entire signaling and media routing layer around a 'selective forwarding unit' pattern. Luna presses on the operational cost of running proprietary infrastructure at scale, and Lucas shares the surprising finding that the rewrite actually reduced server spend by 30 percent. The episode closes with a reflection on when to rebuild versus patch.

#Discord #VoiceEngine #WebRTC #LowLatency #RealTimeCommunication #MeshArchitecture #SelectiveForwardingUnit #CTO #EngineeringOrg #Scaling #Infrastructure #TechnicalLeadership #Business #Technology #FexingoBusiness #BusinessPodcast #TheCTOPodcast #Architecture

8 mins

How AWS Built Its Control Plane for 200 Services

Jun 5 2026

Amazon Web Services runs over 200 services, each with its own control plane. In this episode, Lucas and Luna break down how AWS's internal architecture team designed a unified control plane framework that handles millions of API requests per second across regions. They explore the concept of 'control plane as a platform': a set of reusable primitives for authorization, rate limiting, and state management that lets service teams focus on business logic. Lucas walks through the key design decisions: separating data plane from control plane at the infrastructure level, using eventual consistency for global state, and the 'cell-based architecture' that isolates failures. Luna asks how this affects developers building on AWS today and whether the pattern is reproducible outside of hyperscalers. A specific look at one of the most complex distributed systems ever built, and what it teaches us about scaling engineering orgs.

#AWS #ControlPlane #DistributedSystems #CloudArchitecture #EngineeringAtScale #TechLeadership #PlatformEngineering #FexingoBusiness #BusinessPodcast #CTOPodcast #AWSreInvent #CellBasedArchitecture #APIDesign #Authorization #RateLimiting #EventualConsistency #InfrastructureAsCode #Scaling

9 mins

How Stripe Runs a Global Payment Platform With 99.999 Percent Uptime

Jun 5 2026

Stripe processes hundreds of billions in payments annually, but behind the API is a reliability architecture that few people talk about. In this episode, Lucas and Luna dive into how Stripe achieves five-nines uptime across its payment infrastructure: the layers of redundancy, the careful rollout strategy, and the incident response playbook that keeps money moving. They explore Stripe's use of circuit breakers, gradual canary deployments, and a global multi-region database topology that can survive an entire cloud region going dark. Specific numbers: Stripe's documented 99.999% uptime goal, the 30-minute maximum recovery time for critical services, and how they test failure scenarios weekly. If you're building systems where every millisecond counts, this is a masterclass in production resilience. No marketing fluff, just the engineering reality behind one of the most critical payment platforms on the internet.

#Stripe #PaymentInfrastructure #ReliabilityEngineering #FiveNines #Uptime #IncidentResponse #CanaryDeployments #CircuitBreakers #MultiRegion #FaultTolerance #SRE #ProductionResilience #PaymentProcessing #GlobalInfrastructure #BusinessAndTechnology #FexingoBusiness #BusinessPodcast #CTOPodcast

8 mins

How Uber Rebuilt Its Maps for 40 Million Daily Rides

Jun 4 2026

Episode 31 of The CTO Podcast digs into how Uber's engineering team rebuilt its mapping and routing stack from scratch between 2019 and 2022 to handle over 40 million daily rides across 10,000 cities. We look at the specific reasons they abandoned the old pipeline (vendor lock-in with Google Maps and a 40 percent cost increase in a single quarter) and how they designed a modular routing engine called Michelangelo Maps. Lucas explains the architecture: a C++ shortest-path kernel that runs in under 50 milliseconds, a tile-based geocoding layer that reduced queries by 80 percent, and a machine learning model that predicts travel time to within 5 percent of actual trip duration. Luna pushes back on whether rebuilding a core piece of infrastructure that touches every single ride was worth the three-year timeline and the hundreds of engineers it took. We also touch on the trade-off between cost savings and reliability during the 2020 ridership drop. No hot takes, just the concrete decisions Uber's technical leadership made and the numbers that justified them.

#Uber #Maps #RoutingEngine #MichelangeloMaps #CPlusPlus #Geocoding #MachineLearning #Architecture #Scaling #Infrastructure #CTOPodcast #Fexingo #BusinessAndTechnology #Engineering #TechLeadership #TravelTimePrediction #FexingoBusiness #BusinessPodcast

9 mins

How Spotify Migrated to Google Cloud Without Breaking Discover Weekly

Jun 4 2026

In 2016, Spotify announced it was moving its entire infrastructure from its own data centers to Google Cloud Platform. The migration took four years and involved moving over 1,200 services, petabytes of data, and the machine learning pipelines powering Discover Weekly, all while keeping the music streaming without audible interruption. Lucas and Luna break down how Spotify's engineering team pulled off one of the largest cloud migrations in tech history, the architectural decisions that made it possible, and the lessons for any organization facing a big infrastructure move. Featuring the surprising role of a custom tool called 'Sisyphus' and why Spotify chose to keep its own storage layer running on top of Google's network.

#Spotify #GoogleCloud #CloudMigration #Infrastructure #MusicStreaming #DiscoverWeekly #MLPipelines #Sisyphus #DataCenters #Engineering #Architecture #Scalability #Business #Technology #FexingoBusiness #BusinessPodcast #TechLeadership #CTO

9 mins

