Scalyr: Column-Oriented Log Management with Steve Newman



Log messages are high-volume, fast-arriving, unstructured data. Logs are often the source of metrics, alerts, and dashboards, so these critical systems sit downstream of the log management system. A log management system therefore needs to be highly available, so that a failure in one part of your infrastructure is not correlated with a failure of the log management system itself.

Users of a log management system often build tools on top of its query engine. For example, I might build a dashboard with a line graph showing how often a memory warning appears in my logs. I write a query that returns the instances of these memory warnings, and my line graph is a visual representation of that query. A log management system needs to serve these users quickly, whether their queries feed dashboards or are ad hoc.
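To make the dashboard example concrete, here is a minimal sketch of the kind of query behind such a line graph: count memory warnings per minute. The log format, the sample lines, and the function name `memory_warnings_per_minute` are all assumptions made for illustration, not anything taken from the episode or from a particular log management product.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log lines; the "timestamp level message" format is an assumption.
LOGS = [
    "2018-10-01T12:00:05 WARN memory usage at 91%",
    "2018-10-01T12:00:40 INFO request served in 12ms",
    "2018-10-01T12:01:10 WARN memory usage at 93%",
    "2018-10-01T12:01:55 WARN memory usage at 95%",
]

def memory_warnings_per_minute(lines):
    """Return {"HH:MM": count} for WARN lines that mention memory."""
    counts = Counter()
    for line in lines:
        timestamp, level, message = line.split(" ", 2)
        if level == "WARN" and "memory" in message:
            # Truncate the timestamp to the minute to form a time bucket.
            minute = datetime.fromisoformat(timestamp).replace(second=0)
            counts[minute.strftime("%H:%M")] += 1
    return dict(counts)

# Each (minute, count) pair would be one point on the line graph.
series = memory_warnings_per_minute(LOGS)
```

A dashboard would rerun a query like this on a schedule, which is one reason query latency matters so much for a log management system.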

When logs are ingested by a log management system, they are parsed in a way that brings some structure to the blob of text that is a raw log message. Some log management systems then add the log message to an index. An index allows very fast lookups for particular types of queries. But an index also has limitations: some queries, such as regular expression searches, are hard to serve from it.
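The parse-then-index flow, and the limitation around regular expressions, can be sketched roughly as follows. This is a toy illustration under stated assumptions: the log format, the `LINE` pattern, and the whitespace-token inverted index are hypothetical and do not describe how any specific product works.

```python
import re
from collections import defaultdict

# Hypothetical raw log lines; the format is an assumption.
RAW = [
    "2018-10-01T12:00:05 WARN memory usage at 91%",
    "2018-10-01T12:00:40 INFO request served in 12ms",
    "2018-10-01T12:01:10 ERROR out of memory in worker-7",
]

# Parsing: turn a blob of text into named fields.
LINE = re.compile(r"(?P<ts>\S+) (?P<level>[A-Z]+) (?P<message>.*)")

def parse(line):
    match = LINE.match(line)
    # Fall back to an unstructured event if the line does not fit the pattern.
    return match.groupdict() if match else {"message": line}

events = [parse(line) for line in RAW]

# Indexing: map each whitespace-separated token to the events containing it.
index = defaultdict(set)
for position, event in enumerate(events):
    for token in event["message"].lower().split():
        index[token].add(position)

# An exact-term query is a single dictionary lookup.
term_hits = sorted(index["memory"])

# A regular expression does not correspond to one index key, so this toy
# falls back to scanning every message.
pattern = re.compile(r"worker-\d+")
regex_hits = [i for i, e in enumerate(events) if pattern.search(e["message"])]
```

The term lookup touches only the index, while the regular expression query has to read every message, which is the kind of constraint the paragraph above describes.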

Steve Newman is the CEO and founder of Scalyr, a log management system that uses column-oriented data storage instead of the index-based approach of more conventional log management systems. Today’s episode is a great case study in distributed systems tradeoffs. Steve talks in great detail about how Scalyr maintains high uptime, and about its system for ingesting logs and serving queries.
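For readers unfamiliar with the term, the general idea of a column-oriented layout can be sketched in a few lines. This is a generic illustration only; the field names and the `scan` helper are hypothetical, and nothing here is claimed to reflect Scalyr's actual implementation.

```python
# Row layout: each event is stored as one record (hypothetical sample data).
rows = [
    {"ts": 1, "level": "WARN", "message": "memory usage at 91%"},
    {"ts": 2, "level": "INFO", "message": "request served"},
    {"ts": 3, "level": "WARN", "message": "memory usage at 95%"},
]

# Column layout: one list per field, aligned by position.
columns = {field: [row[field] for row in rows] for field in rows[0]}

def scan(column_name, predicate):
    """Return positions whose value in a single column satisfies the predicate."""
    return [i for i, value in enumerate(columns[column_name]) if predicate(value)]

# A filter on "level" reads only the "level" column, not whole records.
warn_positions = scan("level", lambda value: value == "WARN")

# Other columns are fetched only for the matching positions.
warn_messages = [columns["message"][i] for i in warn_positions]
```

Because a query reads only the columns it needs, a scan over one field stays cheap even when events carry many fields.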


The post Scalyr: Column-Oriented Log Management with Steve Newman appeared first on Software Engineering Daily.
