Streaming Audio: Apache Kafka® & Real-Time Data

Confluent, founded by the original creators of Apache Kafka®

Monthly
 
Streaming Audio features all things Apache Kafka®, Confluent, real-time data, and the cloud. We cover frequently asked questions, best practices, and use cases from the Kafka community—from Kafka connectors and distributed systems, to data mesh, data integration, modern data architectures, and data mesh built with Confluent and cloud Kafka as a service. Join our hosts as they stream through a series of interviews, stories, and use cases with guests from the data streaming industry. Apache®️, ...
 
Learn what every engineer should know about building and scaling SaaS products from leaders who built world-class SaaS. We will share lessons learned, advice, tips, and great stories. This podcast is part of the SaaS community. You can also join our Slack at https://launchpass.com/all-about-saas and follow our YouTube channel at https://www.youtube.com/channel/UCZuLNqvV4oUMVyNq70mFF0g
 
 
You can't manage what you don't measure, and this includes your cloud costs. But how detailed should this measurement be? And how will the data translate into impact? I sat down with Adam Shugar, co-founder and CTO of Dashdive, to discuss his approach to cloud costs. He shared his advice, not only on cost cutting but also technology, growing a start…
 
Bill Tarr has the most interesting job in the world. He and his team of SaaS Evangelists at the AWS SaaS Factory work with companies large and small to help them build SaaS on AWS. In this conversation, we discuss the different ways technology and business interact when building SaaS - from cool technical options available only to SaaS companies to…
 
Ryan Worl, co-founder and CTO of WarpStream, is on a mission to re-engineer fundamental infrastructure on top of S3. Starting with metrics at Uber, continuing with Husky - DataDog's platform for events, logs and everything except metrics and then... Ryan thought "it will be cool to re-build Kafka on S3", reached out to other developers to hear thei…
 
We invited Anna Povzner, director of engineering at Confluent, to the show to discuss Kora. Kora is a Cloud Native platform based on Apache Kafka, which her team at Confluent built. Anna and her team recently published a paper about Kora in VLDB 2023, and it won the industry's best paper award. Kora VLDB paper: https://www.vldb.org/pvldb/vol16/p3822…
 
In the last year or two I started hearing a lot about cell-based architectures. Usually in the form of “We had a lot of issues scaling our infrastructure, but then we moved to cell-based architectures” and “I wish I’d learned about cell-based architectures earlier, it would have saved me a lot of pain”. As a result, I’ve wanted to share knowledge …
 
Krishna Raman has many years of experience building platforms for developers. Now he's applying this experience at Delta Streams to build a serverless platform for stream processing in SQL. In our conversation, we discussed the serverless developer experience, some of the secret sauce behind Delta Stream's Flink Operator, how to deliver a bring-you…
 
You are a founding engineer at a SaaS startup. You built the MVP, and to everyone's great delight - usage is picking up. What's next? In this episode, Jeffrey shares how a messy MVP can gradually evolve into a scalable SaaS product. We discuss the critical design decisions engineers have to make in the early days of building the product: Tabs or sp…
 
Did you know that Postgres lets you write data that you can’t query? Events that will show up in the write-ahead log (WAL) of the database but not in any table. Gunnar Morling, senior staff engineer at Decodable and world expert on change data capture, walks us through data capture basics, this not-new but little-known feature, and dives into …
 
Andrew Atkinson took a Rails web application that was struggling with load, and optimized it to handle over 9000 HTTP requests per second with an average latency of 35ms end to end. Handling a much higher load, on a smaller RDS instance, with lower latencies. He then shared his expertise by writing a book: "High-Performance Postgres with Rails." And…
 
Ensuring your production system is well-behaved is table stakes for any SaaS. The space around monitoring and alerting is complex and moving fast, so it is too easy to end up with results that are worse than useless. Alert fatigue is real, and developers must learn to avoid this. Shahar and Tal, founders of Keep and active SaaS Developer community …
 
Cloudflare is no longer "just" a CDN serving 55M HTTP requests/sec; they now offer a wide range of cloud services on the edge. These services run on a data layer with 15 Postgres clusters running hundreds of databases. Vignesh Ravichandran, engineering manager of Cloudflare's database team, joined us to discuss the challenges of running this large-…
 
The Pinot team at Uber wrote an excellent paper about the real-time analytics platform they built. Chinmay, formerly a principal engineer at Uber and now head of product at StarTree, joined me for a conversation. We discussed the challenges they encountered at Uber, the solutions they came up with, the platform they built, and how to best apply their ex…
 
Shayon wrote a great blog post on the principles that guided him and his team at Loom as they evolved Loom's data platform through a period of hypergrowth. I invited Shayon to the show to discuss the challenges he encountered and how he solved them - and I learned that he is now at Tines - solving an entirely new set of challenges wit…
 
Colt McNealy is re-imagining the future of microservices orchestration and he decided to build it entirely on Kafka Streams. In this conversation we discuss how Kafka Streams provides the low latency, reliability, availability and elasticity that is needed for the next generation of microservices orchestration. Colt also shares the most exciting up…
 
If you have used a relational database at all, you have probably heard of transaction isolation levels. Transaction isolation levels have a massive impact on the behavior of your application - correctness, performance, and error rates. Your database may be distributed these days, so you may have to reason about distributed transactions too. In this video, I e…
 
YouTube and Twitter are full of “things developers should never do”. There's an endless demand for simple advice that applies in all situations. And that's not a bad thing. If there's a simple solution that works 80% of the time, this is valuable information. More practical than just "it depends." But advice-givers and advice-getters can do better. …
 
When developers talk about Serverless, they often focus on FaaS. But the best Serverless experience, by far, is delivered by a data store. S3. Why? Because it "just works" and lets developers focus on their code. Serverless databases help you focus on your queries and workload. They abstract the compute. Which also means - usage based pricing. In t…
 
Apache Kafka® 3.5 is here with the capability of previewing migrations from ZooKeeper clusters to KRaft mode. Follow along as Danica Fine highlights key release updates. Kafka Core: KIP-833 provides an updated timeline for KRaft. KIP-866 is now in preview and allows migration from an existing ZooKeeper cluster to KRaft mode. KIP-900 introduces a wa…
 
After recording 64 episodes and featuring 58 amazing guests, the Streaming Audio podcast series has amassed over 130,000 plays on YouTube in the last year. We're extremely proud of these achievements and feel that it's time to take a well-deserved break. Streaming Audio will be taking a vacation! We want to express our gratitude to you, our valued …
 
The storage team at Airtable published a blog post describing, in detail, the migration of their petabyte-scale storage layer from MySQL 5.6 to MySQL 8.0. Andrew Wang, the lead of Airtable's storage team, joined us to discuss the migration, Airtable's storage architecture, data isolation levels, engineering culture, and more. The blog: https://medi…
 
Have you ever struggled with managing data long term, especially as the schema changes over time? In order to manage and leverage data across an organization, it’s essential to have well-defined guidelines and standards in place around data quality, enforcement, and data transfer. To get started, Abraham Leal (Customer Success Technical Architect, …
 
Can you use Apache Kafka® and Python together? What’s the current state of Python support? And what are the best options to get started? In this episode, Dave Klein joins Kris to talk about all things Kafka and Python: the libraries, the tools, and the pros & cons. He also talks about the new course he just launched to support Python programmers en…
 
In this episode, Kris interviews Doron Porat, Director of Infrastructure at Yotpo, and Liran Yogev, Director of Engineering at ZipRecruiter (formerly at Yotpo), about their experiences and strategies in dealing with data modeling at scale. Yotpo has a vast and active data lake, comprising thousands of datasets that are processed by different engine…
 
Migrating Apache Kafka® clusters can be challenging, especially when moving large amounts of data while minimizing downtime. Michael Dunn (Solutions Architect, Confluent) has worked in the data space for many years, designing and managing systems to support high-volume applications. He has helped many organizations strategize, design, and implement…
 
SaaS applications are multi-tenant, so whether you are writing the first line of code in a new app or worried about scaling your successful SaaS fast enough - you need to be aware of multi-tenant requirements: isolation, access control, performance, operations, scale, and compliance. In this video, Ram Subramanian, Nile CEO and SaaS Community founder…
 
dbt is known as being part of the Modern Data Stack for ELT processes. Being in the MDS, dbt Labs believes in having the best of breed for every part of the stack. Oftentimes folks are using an EL tool like Fivetran to pull data from the database into the warehouse, then using dbt to manage the transformations in the warehouse. Analysts can then bu…
 
What’s the next big thing in the future of streaming data? In this episode, Greg DeMichillie (VP of Product and Solutions Marketing, Confluent) talks to Kris about the future of stream processing in environments where the value of data lies in their ability to intercept and interpret data. Greg explains that organizations typically focus on the inf…
 
What can online gaming teach us about making large-scale event management more collaborative in real-time? Ben Gamble (Developer Relations Manager, Aiven) has come to the world of real-time event streaming from an unusual source: the video games industry. And if you stop to think about it, modern online games are complex, distributed real-time data s…
 
Apache Kafka® 3.4 is released! In this special episode, Danica Fine (Senior Developer Advocate, Confluent) shares highlights of the Apache Kafka 3.4 release. This release introduces new KIPs in Kafka Core, Kafka Streams, and Kafka Connect. In Kafka Core: KIP-792 expands the metadata each group member passes to the group leader in its JoinGroup sub…
 
How can you use OpenTelemetry to gain insight into your Apache Kafka® event systems? Roman Kolesnev, Staff Customer Innovation Engineer at Confluent, is a member of the Customer Solutions & Innovation Division Labs team working to build business-critical OpenTelemetry applications so companies can see what’s happening inside their data pipelines. I…
 
Data democratization allows everyone in an organization to have access to the data they need, and the necessary tools needed to use this data effectively. In short, data democratization enables better business decisions. In this episode, Rama Ryali, a Senior IT and Data Executive, chats with Kris Jenkins about the importance of data democratization…
 
Is it possible to manage and test data like code? lakeFS is an open-source data version control tool that transforms object storage into Git-like repositories, offering teams a way to use the same workflows for code and data. In this episode, Kris sits down with guest Adi Polak, VP of DevX at Treeverse, to discuss how lakeFS can be used to facilita…
 
Gunnar Morling asked a great question on Twitter: ""Separating storage and compute" vs. "Predicate push-down" -- I can't quite square these two with each other. Is there a world where they co-exist, or is it just two opposing patterns/trends in DB tech. ?" Those are complementary patterns and you definitely want them together. They appear contradi…
 
How does leader election work in Apache Kafka®? For the past 2 ½ years, Adithya Chandra, Staff Software Engineer at Confluent, has been working on Kafka scalability and performance, specifically partition leader election. In this episode, he gives Kris Jenkins a deep dive into the power of leader election in Kafka replication, why we need it, how i…
 
Are bad customer experiences really just data integration problems? Can real-time data streaming and machine learning be democratized in order to deliver a better customer experience? Airy, an open-source data-streaming platform, uses Apache Kafka® to help business teams deliver better results to their customers. In this episode, Airy CEO and co-fo…
 