v1.6.0 Release
To download and unpack prebuilt binaries:
$ # Linux
$ curl -L https://github.com/eBay/tsv-utils/releases/download/v1.6.0/tsv-utils-v1.6.0_linux-x86_64_ldc2.tar.gz | tar xz
$ # macOS
$ curl -L https://github.com/eBay/tsv-utils/releases/download/v1.6.0/tsv-utils-v1.6.0_osx-x86_64_ldc2.tar.gz | tar xz
Installation instructions are in the ReleasePackageReadme.txt file in the release package.
To be notified of new releases:
GitHub supports notification of new releases. Click the "Watch" button on the repository page and select "Releases Only".
Release 1.6.0 Changes:
- Prebuilt binaries have been updated to use the latest LDC compiler (1.20.1).
- tsv-select: New feature, the ability to exclude fields (PR #267). Fields to exclude are specified with the --e|exclude option. Examples:
$ # Drop the first field, keep everything else.
$ # Equivalent to `cut -f 2- file.tsv`
$ tsv-select --exclude 1 file.tsv

$ # Drop fields 3-10, keep everything else
$ tsv-select --exclude 3-10 file.tsv
See the tsv-select reference for more information.
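The effect of excluding fields can be previewed with the standard `cut` utility, which the first example above names as an equivalent. The sketch below uses only `cut`, so it runs without tsv-utils installed. The file `sample.tsv` and its contents are hypothetical, made up for illustration, and the sketch says nothing about how tsv-select is implemented.

```shell
# Build a small 4-field TSV file (hypothetical sample data).
printf 'id\tname\tcolor\tsize\n1\tapple\tred\tS\n2\tplum\tblue\tM\n' > sample.tsv

# Drop field 1 and keep the rest. This is the cut form that the
# release notes give as equivalent to `tsv-select --exclude 1`.
cut -f 2- sample.tsv > drop_first.tsv

# Drop fields 2-3 and keep the rest. With cut, the fields to keep
# have to be listed explicitly instead of naming the ones to drop.
cut -f 1,4- sample.tsv > drop_2_3.tsv

cat drop_first.tsv
cat drop_2_3.tsv
```

Naming the fields to drop is most convenient when a file has many fields and only a few need to be removed.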
- New tool: tsv-split (PR #270)
tsv-split is used to split one or more input files into multiple output files. There are three modes of operation:
  - Fixed number of lines per file (--l|lines-per-file NUM): Each input block of NUM lines is written to a new file. This is similar to the Unix split utility.
  - Random assignment (--n|num-files NUM): Each input line is written to a randomly selected output file. Random selection is from NUM files.
  - Random assignment by key (--n|num-files NUM, --k|key-fields FIELDS): Input lines are written to output files using fields as a key. Each unique key is randomly assigned to one of NUM output files. All lines with the same key are written to the same file.
Examples:
$ # Split a file into files of 10,000 lines each.
$ tsv-split data.txt --lines-per-file 10000 --dir split_files

$ # Split a file into 1000 files with lines randomly assigned.
$ tsv-split data.txt --num-files 1000 --dir split_files

$ # Randomly assign lines to 1000 files using field 3 as a key.
$ tsv-split data.tsv --num-files 1000 --key-fields 3 --dir split_files
See the tsv-split reference for more information.
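To make the modes more concrete, here is a minimal sketch of two of them using only standard Unix tools. It is an illustration of the behavior described above, not tsv-split's actual implementation. The file `keyed.tsv`, the directory `split_demo`, and the output file names are all hypothetical.

```shell
# Hypothetical sample data: field 1 is the key.
printf 'a\t1\nb\t2\na\t3\nc\t4\nb\t5\na\t6\n' > keyed.tsv
rm -rf split_demo && mkdir split_demo

# Fixed number of lines per file, using the standard Unix split utility:
# each block of 2 input lines is written to a new file (block_aa, block_ab, ...).
split -l 2 keyed.tsv split_demo/block_

# Random assignment by key, sketched in awk: the first time a key is seen
# it is assigned a random file number in [0, num); the choice is remembered
# so that every later line with the same key goes to the same file.
awk -F '\t' -v num=3 -v dir=split_demo '
  BEGIN { srand() }
  {
    if (!($1 in bucket)) bucket[$1] = int(rand() * num)
    print > (dir "/part_" bucket[$1] ".tsv")
  }
' keyed.tsv
```

Because the assignment is random, the file a given key lands in can change between runs. Only the grouping of lines with equal keys is guaranteed.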
-