
Content provided by Demetrios Brinkmann. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Demetrios Brinkmann or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.

Build Reliable Systems with Chaos Engineering // Benjamin Wilms // #237

46:57
 

Join us at our first in-person conference on June 25 all about AI Quality: https://www.aiqualityconference.com/.

Benjamin Wilms is a developer and software architect at heart, with 20 years of experience. He fell in love with chaos engineering and now spreads his enthusiasm and knowledge as a speaker and author, especially in the field of chaos and resilience engineering.

Build Reliable Systems with Chaos Engineering // MLOps podcast #237 with Benjamin Wilms, CEO & Co-Founder of Steadybit.

Huge thank you to Amazon Web Services for sponsoring this episode. AWS - https://aws.amazon.com/

// Abstract
How to build reliable systems under unpredictable conditions with Chaos Engineering.

// Bio
Benjamin has over 20 years of experience as a developer and software architect. He fell in love with chaos engineering 7 years ago and shares his knowledge as a speaker and author. In October 2019, he founded the startup Steadybit with two friends, focusing on developers and teams embracing chaos engineering. He relaxes by mountain biking when he's not knee-deep in complex and distributed code.

// MLOps Jobs board
https://mlops.pallet.xyz/jobs

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
Website: https://steadybit.com/

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Benjamin on LinkedIn: https://www.linkedin.com/in/benjamin-wilms/

Timestamps:

[00:00] Benjamin's preferred coffee

[00:28] Takeaways

[02:10] Please like, share, leave a review, and subscribe to our MLOps channels!

[02:53] Chaos Engineering tldr

[06:13] Complex Systems for smaller Startups

[07:21] Chaos Engineering benefits

[10:39] Data Chaos Engineering trend

[15:29] Chaos Engineering vs ML Resilience

[17:57 - 17:58] AWS Trainium and AWS Inferentia Ad

[19:00] Chaos engineering tests system vulnerabilities and solutions

[23:24] Data distribution issues across different time zones

[27:07] Expertise is essential in fixing systems

[31:01] Chaos engineering integrated into machine learning systems

[32:25] Pre-CI/CD steps and automating experiments for deployments

[36:53] Chaos engineering emphasizes tool over value

[38:58] Strong integration into observability tools for repeatable experiments

[45:30] Invaluable insights on chaos engineering

[46:42] Wrap up


354 episodes
