Release v1.6.0 Release · eBay/tsv-utils

To download and unpack prebuilt binaries:

$ # Linux
$ curl -L https://github.com/eBay/tsv-utils/releases/download/v1.6.0/tsv-utils-v1.6.0_linux-x86_64_ldc2.tar.gz | tar xz

$ # macOS
$ curl -L https://github.com/eBay/tsv-utils/releases/download/v1.6.0/tsv-utils-v1.6.0_osx-x86_64_ldc2.tar.gz | tar xz

Installation instructions are in the ReleasePackageReadme.txt file in the release package.

To be notified of new releases:

GitHub supports notification of new releases. Click the "Watch" button on the repository page and select "Releases Only".

Release 1.6.0 Changes:

Prebuilt binaries have been updated to use the latest LDC compiler (1.20.1).

tsv-select: New feature, the ability to exclude fields (PR #267).

Fields to exclude are specified with the --e|exclude option. Examples:

$ # Drop the first field, keep everything else.
$ # Equivalent to `cut -f 2- file.tsv`
$ tsv-select --exclude 1 file.tsv

$ # Drop fields 3-10, keep everything else
$ tsv-select --exclude 3-10 file.tsv

See the tsv-select reference for more information.
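
The exclusion semantics above can be tried out without tsv-select installed. The following is a minimal sketch using standard `cut`, assuming (as the first example states) that dropping fields is equivalent to keeping the complementary field list. The sample file and its 12 columns are hypothetical, made up for illustration.

```shell
# Build a small 12-field TSV file to experiment with (hypothetical sample data).
tmpdir=$(mktemp -d)
printf 'f1\tf2\tf3\tf4\tf5\tf6\tf7\tf8\tf9\tf10\tf11\tf12\n' > "$tmpdir/file.tsv"
printf 'a1\ta2\ta3\ta4\ta5\ta6\ta7\ta8\ta9\ta10\ta11\ta12\n' >> "$tmpdir/file.tsv"

# Excluding field 1 keeps fields 2 onward.
drop_first=$(cut -f 2- "$tmpdir/file.tsv")

# Excluding fields 3-10 keeps fields 1-2 and 11 onward.
drop_range=$(cut -f 1-2,11- "$tmpdir/file.tsv")

printf '%s\n' "$drop_first"
printf '%s\n' "$drop_range"
```

With `cut`, the fields to keep must be worked out by hand. The `--exclude` option avoids that step, which matters most when the number of fields is large or not known in advance.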

New tool: tsv-split (PR #270)

tsv-split is used to split one or more input files into multiple output files. There are three modes of operation:
- Fixed number of lines per file (--l|lines-per-file NUM): Each input block of NUM lines is written to a new file. This is similar to the Unix split utility.
- Random assignment (--n|num-files NUM): Each input line is written to a randomly selected output file. Random selection is from NUM files.
- Random assignment by key (--n|num-files NUM, --k|key-fields FIELDS): Input lines are written to output files using fields as a key. Each unique key is randomly assigned to one of NUM output files. All lines with the same key are written to the same file.

Examples:
```
$ # Split a file into files of 10,000 lines each.
$ tsv-split data.txt --lines-per-file 10000 --dir split_files

$ # Split a file into 1000 files with lines randomly assigned.
$ tsv-split data.txt --num-files 1000 --dir split_files

$ # Randomly assign lines to 1000 files using field 3 as a key.
$ tsv-split data.tsv --num-files 1000 --key-fields 3 --dir split_files
```
See the tsv-split reference for more information.
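
To make the three modes concrete, here is a rough sketch of their behavior using only standard `split` and `awk`. It illustrates the semantics described above and is not how tsv-split itself is implemented. The sample data, directory names, `part_` file prefix, and the choice of field 1 as the key are all assumptions made for this example.

```shell
# Sample input: 6 lines, field 1 is the key (hypothetical data).
workdir=$(mktemp -d)
printf 'k1\t1\nk2\t2\nk1\t3\nk3\t4\nk2\t5\nk1\t6\n' > "$workdir/data.tsv"
mkdir "$workdir/by_lines" "$workdir/random" "$workdir/by_key"

# Mode 1: fixed number of lines per file (2 lines each), via the Unix split utility.
split -l 2 "$workdir/data.tsv" "$workdir/by_lines/part_"

# Mode 2: each line goes to one of n files chosen at random.
awk -v n=3 -v dir="$workdir/random" 'BEGIN { srand() }
  { print > (dir "/part_" int(rand() * n) ".tsv") }' "$workdir/data.tsv"

# Mode 3: each unique key (field 1) is assigned a random file once, then reused,
# so all lines sharing a key end up in the same file.
awk -F '\t' -v n=3 -v dir="$workdir/by_key" 'BEGIN { srand() }
  { if (!($1 in file_for)) file_for[$1] = int(rand() * n)
    print > (dir "/part_" file_for[$1] ".tsv") }' "$workdir/data.tsv"
```

In the random modes some of the n files may receive no lines and so never get created, which is expected with an input this small. The key-based mode is the one to reach for when records that belong together, such as all rows for one user, must not be separated across files.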
