How Data Scientists Use Active Learning to Label Smarter

In this milestone 50th episode, Lucas and Luna explore active learning, a machine learning paradigm in which the model itself chooses which data points should be labeled, which can sharply reduce manual annotation costs. They break down the core idea with a concrete example: training a fraud detection model for a payment processor that handles 10 million transactions per day. Lucas explains uncertainty sampling, query-by-committee, and the 'exploration vs. exploitation' trade-off. Luna raises the practical challenge of label noise and how to handle it. They also discuss when active learning fails, for example when the unlabeled pool doesn't represent the real-world distribution. The conversation ties back to the broader theme: getting more value from fewer labels, a critical skill for any data scientist working with a limited annotation budget.

#ActiveLearning #MachineLearning #DataScience #UncertaintySampling #QueryByCommittee #FraudDetection #Labeling #Annotation #SemiSupervisedLearning #ExplorationVsExploitation #ModelTraining #DataEfficiency #MLStrategy #Technology #Podcast #FexingoBusiness #BusinessPodcast #DataSciencePodcast

Keep every episode free: buymeacoffee.com/fexingo
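
For readers who want to see the two query strategies named above in concrete form, here is a minimal Python sketch. It is not taken from the episode: the function names, the toy probabilities, and the scoring rules (distance from 0.5 for uncertainty sampling, vote entropy for query-by-committee) are illustrative assumptions, chosen because they are common textbook formulations.

```python
import math

# Hypothetical helpers for illustration; not the hosts' implementation.

def uncertainty_sampling(fraud_proba, k):
    """Return indices of the k transactions whose predicted fraud
    probability is closest to 0.5, i.e. where a binary model is least sure."""
    ranked = sorted(range(len(fraud_proba)),
                    key=lambda i: abs(fraud_proba[i] - 0.5))
    return ranked[:k]

def vote_entropy(votes):
    """Entropy of the 0/1 labels a committee assigns to one transaction.
    Zero when every member agrees, larger when the committee is split."""
    p_fraud = sum(votes) / len(votes)
    return -sum(p * math.log(p) for p in (p_fraud, 1.0 - p_fraud) if p > 0)

def query_by_committee(committee_votes, k):
    """committee_votes[m][i] is the label model m gives transaction i.
    Return indices of the k transactions with the most disagreement."""
    n = len(committee_votes[0])
    scores = [vote_entropy([member[i] for member in committee_votes])
              for i in range(n)]
    ranked = sorted(range(n), key=lambda i: -scores[i])
    return ranked[:k]

# Toy data (made up): fraud probabilities for five unlabeled transactions.
proba = [0.02, 0.48, 0.91, 0.55, 0.10]
print(uncertainty_sampling(proba, 2))

# Toy data (made up): three committee members voting on four transactions.
votes = [
    [0, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
print(query_by_committee(votes, 2))
```

In both cases the selected indices would be sent to a human annotator, the model retrained on the enlarged labeled set, and the loop repeated until the annotation budget runs out.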