FLOSS Weekly 458: Crail

59:11
 

Crail is a storage platform for sharing performance-critical data in distributed data processing jobs at very high speed. Crail is built entirely upon principles of user-level I/O and specifically targets data center deployments with fast network and storage hardware (e.g., 100 Gbps RDMA, plenty of DRAM, NVMe flash, etc.) as well as new modes of operation such as resource disaggregation or serverless computing. Crail is written in Java and integrates seamlessly with the Apache data processing ecosystem (e.g., Spark, Hadoop, Flink). It can be used as (i) a backbone to accelerate high-level data operations such as shuffle, reduce, or broadcast; (ii) a cache to store hot data that is queried repeatedly; (iii) a storage platform for sharing inter-job data in complex multi-job pipelines. Last week, Crail was voted in as an Apache Incubator project.

Hosts: Randal Schwartz and Dan Lynch

Guests: Patrick Stuedi and Animesh Trivedi
