Short application that uploads binary files to a Cassandra cluster. Created to benchmark Cassandra's insert speed for raw data stored in a blob column.
The application depends on the gocql package.
The uploader writes files into a Cassandra table, which can be created with the following cqlsh command:
CREATE TABLE stat (first int PRIMARY KEY, second blob);
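A full cqlsh setup would also need the keyspace. A minimal sketch, assuming SimpleStrategy with a replication factor of 2 (the replication settings are an assumption, not from this README — pick what fits your cluster):

```sql
-- Keyspace for the benchmark; replication settings are an example only.
CREATE KEYSPACE simple_space
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};

USE simple_space;

-- Table the uploader writes to: integer key, binary payload.
CREATE TABLE stat (first int PRIMARY KEY, second blob);
```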
The number of files in the indicated folder (./files/ by default) does not matter, but to keep things simple it is best to create a few hundred pseudo-random files:
for i in {1..1000}; do dd if=/dev/urandom of=~/files/${i}.rand bs=5M count=1; done
this will create 1000 files of 5 MB each in ~/files/ (adjust the path if you use a different -path). Based on tests, 5 MB is an optimal size for writing to Cassandra: big enough to load the cluster, but not so big that you start seeing timeouts.
Now you are ready to load the files into your Cassandra cluster.
The application accepts the following command-line arguments:
-concurent=5: number of concurrent writes
-path="./files/": path to directory with blob files to upload
-servers_list="::1": comma-separated list of Cassandra servers to connect to, e.g.: 2001:db8:f:ffff:0:0:0:1,2001:db8:f:ffff:0:0:0:2
-keyspace="simple_space": keyspace where the target table is located
-table="stat": table where blobs will be saved. It should have the following structure: first:int, second:blob
If you are following this README, just do the following steps:
- scp the application to your Cassandra server
- generate 1000 pseudo-random files in the ./files/ directory
- create the simple_space keyspace and the stat table
- run application without any arguments
You should see verbose output about the upload process. By default, writes start with a concurrency of 5.
In a 2-node cluster (one VPS per DC) on commodity VPS instances (1 vCPU, 2 GB RAM, 20 GB HDD), I reached ~46 MB/s of writes (with QUORUM consistency).
Given Cassandra's design, if you have many more nodes, try running this test from half of the nodes at the same time; the aggregate throughput will be significantly higher.