Big Data Life Cycle Gets The Self-service Treatment Talking Data Podcast

Big data life cycle gets the self-service treatment

7y ago 8:46



As 2017 winds down, we invite you to take a look behind the big data curtain. There, you will find data engineers, data scientists, end-users and others working to move a big data concept into production. It doesn’t take much digging to find that more self-service capabilities are needed at each stage in the data life cycle.

That is among the take-aways from this latest edition of the Talking Data Podcast. In this and a subsequent episode, Ed Burns and I discuss recent user stories that graced the editorial pages of SearchBusinessAnalytics.com and SearchDataManagement.com – ones that speak to some of the outstanding trends of the year just winding down.

One of the telling threads we found was self-service; that is, self-service as it relates to ETL, as it relates to interactive data queries, and as it relates to cluster configuration. In the last case we have as an example restaurateur Panera Bread. The chain is among the companies with particularly aggressive web initiatives underway.

More and more, when lunchtime arrives, orders come in via cell phone. That can stress operational systems. Aware of this threat, Panera Bread built a Spark-Hadoop system to analyze computing needs for the processing involved in handling the lunchtime crush. It was the first in a series of Hadoop apps that Panera is spinning up quickly, after deciding to use automated container configuration software.

Panera announced earlier this year that annual digital sales had gone past $1 billion, and that projected digital sales could double by 2019. The ability to let individuals spin up big data jobs at will is going to become handier going forward, one of the company’s engineering leads said.

Self-service that empowers more individuals in the data pipeline is a fact of life that IT has generally come to accept. It seems now to be a big part of moving at the speed of innovation. Listen to this podcast and feel free to come back for seconds.

