In this presentation, I will talk about my firsthand experience with the unique challenges of running Kafka at massive scale. If you have ever thought that running Kafka is difficult, this talk may change your mind and give you valuable insights into how to configure a Kafka cluster efficiently, how to manage Kafka for enterprise customers, and how to measure, monitor, and maintain the quality of a Kafka service. Our production Kafka cluster runs on more than 1,500 VMs and serves over 10 GBPS of data spread across hundreds of topics for multiple teams across Microsoft. We built a self-serve Kafka management service to make the process manageable and scalable across many teams. I will also share insights about running Kafka in private vs. multi-tenant mode, supporting failover and disaster recovery requirements, and making Kafka compliant with regulatory certifications such as ISO, SOC, and FedRAMP.
Principal Software Engineering Manager, Microsoft