Artwork

Content provided by HPR Volunteer and Hacker Public Radio. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by HPR Volunteer and Hacker Public Radio or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.
Player FM - Podcast App
Go offline with the Player FM app!

HPR2698: XSV for fast CSV manipulations - Part 1

 
Share
 

Manage episode 222633845 series 108988
Content provided by HPR Volunteer and Hacker Public Radio. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by HPR Volunteer and Hacker Public Radio or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.

XSV for fast CSV manipulations - Part 1: Basic Usage

https://github.com/BurntSushi/xsv

Introduction

xsv is a command line program for indexing, slicing, analyzing, splitting and joining CSV files. Commands should be simple, fast and composable:

  1. Simple tasks should be easy.
  2. Performance trade offs should be exposed in the CLI interface.
  3. Composition should not come at the expense of performance.

We will be using the CSV file provided in the documentation.

Commands covered in this episode

  • count - Count the rows of CSV data
  • headers - Show the headers of CSV data, or show the intersection of all headers between many CSV files
  • index - Create an index for a CSV file. This is very quick and provides constant time indexing into the CSV file.
  • frequency - Build frequency tables of each column in CSV data.
  • stats - Show basic types and statistics of each column in the CSV file. (i.e., mean, standard deviation, median, range, etc.)
  • sort - Sort CSV data
  • select - Select or re-order columns from CSV data.
  • slice - Slice rows from any part of a CSV file. When an index is present, this only has to parse the rows in the slice (instead of all rows leading up to the start of the slice).
  • search - Run a regex over CSV data. Applies the regex to each field individually and shows only matching rows.
  • table - Show aligned output of any CSV data using elastic tabstops.
  • flatten - A flattened view of CSV records. Useful for viewing one record at a time.
  continue reading

4110 episodes

Artwork
iconShare
 
Manage episode 222633845 series 108988
Content provided by HPR Volunteer and Hacker Public Radio. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by HPR Volunteer and Hacker Public Radio or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.

XSV for fast CSV manipulations - Part 1: Basic Usage

https://github.com/BurntSushi/xsv

Introduction

xsv is a command line program for indexing, slicing, analyzing, splitting and joining CSV files. Commands should be simple, fast and composable:

  1. Simple tasks should be easy.
  2. Performance trade offs should be exposed in the CLI interface.
  3. Composition should not come at the expense of performance.

We will be using the CSV file provided in the documentation.

Commands covered in this episode

  • count - Count the rows of CSV data
  • headers - Show the headers of CSV data, or show the intersection of all headers between many CSV files
  • index - Create an index for a CSV file. This is very quick and provides constant time indexing into the CSV file.
  • frequency - Build frequency tables of each column in CSV data.
  • stats - Show basic types and statistics of each column in the CSV file. (i.e., mean, standard deviation, median, range, etc.)
  • sort - Sort CSV data
  • select - Select or re-order columns from CSV data.
  • slice - Slice rows from any part of a CSV file. When an index is present, this only has to parse the rows in the slice (instead of all rows leading up to the start of the slice).
  • search - Run a regex over CSV data. Applies the regex to each field individually and shows only matching rows.
  • table - Show aligned output of any CSV data using elastic tabstops.
  • flatten - A flattened view of CSV records. Useful for viewing one record at a time.
  continue reading

4110 episodes

All episodes

×
 
Loading …

Welcome to Player FM!

Player FM is scanning the web for high-quality podcasts for you to enjoy right now. It's the best podcast app and works on Android, iPhone, and the web. Signup to sync subscriptions across devices.

 

Quick Reference Guide