Language comparisons

We are interested in comparing some simple algorithms, each implemented in several programming languages, by means of static metrics. A static metric is obtained by parsing source code and extracting information from it, without relying on anything that can only be determined at runtime.
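
To make the notion of a static metric concrete, here is a minimal sketch that counts branching constructs in a small Python snippet using the standard ast module. The source is only parsed, never executed. This is an illustration of the idea, not how rust-code-analysis works internally; the sample snippet and the name count_decision_points are hypothetical.

```python
import ast

# Hypothetical sample program; it is parsed below but never run.
SOURCE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"
'''


def count_decision_points(source: str) -> int:
    """Count branching constructs by walking the syntax tree of the source."""
    tree = ast.parse(source)
    branching = (ast.If, ast.For, ast.While, ast.BoolOp, ast.ExceptHandler)
    return sum(isinstance(node, branching) for node in ast.walk(tree))


if __name__ == "__main__":
    print(count_decision_points(SOURCE))
```

The count depends only on the text of the program, which is what makes the metric static.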

All the metrics considered here have been computed with rust-code-analysis, a tool written in Rust.
It accepts either single files or entire directories as input, detects whether they contain code written in one of its supported languages, and outputs the resulting static metrics in various formats: textual, json, yaml, toml, plus a binary one called cbor.

The next sections describe how the comparisons have been implemented and the choices made during development.

Why have we chosen software written in Rust?

Rust is an innovative programming language initially developed by Mozilla and currently maintained and improved by the Rust Foundation.
Its main goal consists of allowing everyone to build reliable and efficient software.

Rust is a multi-paradigm programming language focused on performance and safety, especially safe concurrency.
It can be used on different architectures with little effort and provides good documentation for anyone who wants to learn the language or deepen their knowledge of it.
It is also widely used in industry: hundreds of companies around the world use Rust in production for fast, low-resource, cross-platform solutions. For example, Firefox, Dropbox, and Cloudflare use Rust. From startups to large corporations, from embedded devices to scalable web services, Rust is a great fit.

For our part, we have decided to adopt, and personally extend, a project written in Rust because of the advantages listed below:

Guarantees memory-safety and thread-safety, eliminating many classes of bugs at compile-time
Fast and memory-efficient
Few runtime checks
No garbage collector
Easily integrates with other programming languages
Useful and clear error messages
Good documentation

Why have we decided to modify and extend the json files produced by rust-code-analysis?

We have decided to modify the output produced by rust-code-analysis for the following reasons:

Change the names of the metrics that are not consistent with those used in the scientific literature
Change the data type associated with a metric. rust-code-analysis returns floating-point values instead of integers because it aims to be as versatile as possible
Aggregate the metrics of every source file in a directory into a single json object that contains both the result of the aggregation and the metrics of each individual file. This additional data gives a more general view of the quality of a project written in a given programming language.
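
The first two changes can be sketched as a single recursive pass over a parsed json document. This is a minimal sketch, not the code of analyzer.py: the RENAMES mapping and the metric names in it are hypothetical, and the conversion assumes that only whole-number floats (such as 4.0) should become integers.

```python
# Hypothetical mapping from tool-specific names to names used in the literature.
RENAMES = {"cyclomatic": "CC", "sloc": "SLOC"}


def normalize(node):
    """Rename known keys and turn whole-number floats into integers."""
    if isinstance(node, dict):
        return {RENAMES.get(key, key): normalize(value) for key, value in node.items()}
    if isinstance(node, list):
        return [normalize(value) for value in node]
    if isinstance(node, float) and node.is_integer():
        return int(node)
    # Strings, booleans, integers, and fractional floats are left untouched.
    return node


if __name__ == "__main__":
    report = {"metrics": {"cyclomatic": {"sum": 4.0, "average": 1.33}}}
    print(normalize(report))
```

Fractional values such as averages are kept as floats, so no information is lost by the conversion.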

FIXME: In our experiment, data aggregation is NOT considered, so I don't know if we should mention it between the main reasons

Compared Algorithms and Languages

We are comparing 9 simple algorithms, each implemented in several programming languages. All implementations have been taken from the Energy-Languages repository (https://github.com/greensoftwarelab/Energy-Languages), which has been chosen because it is actively maintained and because its algorithms are adopted by a great variety of other projects for testing and benchmarking purposes.

The considered algorithms, in alphabetical order, are:

binarytrees
fannkuchredux
fasta
knucleotide
mandelbrot
nbody
regexredux
revcomp
spectralnorm

All of them are contained in the Assets directory of our repository.

FIXME: pidigits can't be added because it is not implemented in Javascript and TypeScript, and the same goes for bubble_sort which is implemented in C, C++, and Rust only ---> should we remove them?

As for the programming languages, we were restricted to a small number of them because rust-code-analysis currently parses only a few languages. They are listed below in alphabetical order:

C
C++
JavaScript
Python
Rust
TypeScript

Json Structure of Computed Metrics

The types of metrics computed for each algorithm are described in the README of our repository (TODO: explain metrics in a detailed way within the paper).

We have configured rust-code-analysis to export metrics as json files. Then, through a Python script called analyzer.py, we have enriched the structure of each json file so that it also contains the global metrics obtained by aggregating the metrics of the different files in the same directory.
In addition, this new json version contains a json array with all the metrics computed for each file of the directory.
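
The enriched structure can be sketched as follows. This is only an illustration of the idea: the key names "global" and "files", the flat "metrics" dictionary, and the use of a plain sum are assumptions, not the actual layout produced by analyzer.py. A sum is also only meaningful for additive metrics such as line counts.

```python
def aggregate(per_file):
    """Sum each metric across files and keep the per-file records alongside."""
    totals = {}
    for record in per_file:
        for name, value in record["metrics"].items():
            totals[name] = totals.get(name, 0) + value
    # Hypothetical layout: aggregated values plus the array of per-file metrics.
    return {"global": totals, "files": per_file}


if __name__ == "__main__":
    files = [
        {"name": "a.c", "metrics": {"sloc": 120, "functions": 4}},
        {"name": "b.c", "metrics": {"sloc": 80, "functions": 3}},
    ]
    print(aggregate(files)["global"])
```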

For our comparisons, however, the additional global data computed by the analyzer.py script is not needed: the analyzed algorithms are processed one at a time and are independent of each other.

Output json files are contained in the Results directory of our repository.

Comparison script structure

The Python script that executes the various comparisons is called compare.py. To simplify the entire comparison process, we have introduced the concept of a configuration.
A configuration is simply a pair of implementations of the same algorithm written in two different programming languages.
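
As an illustration, configurations for one algorithm can be enumerated as unordered pairs of languages. This is a minimal sketch under the assumption that every pair of listed languages is of interest; whether compare.py enumerates all pairs or only selected ones is not stated here, and the function name configurations is hypothetical.

```python
from itertools import combinations

# Languages as listed in this document.
LANGUAGES = ["C", "C++", "JavaScript", "Python", "Rust", "TypeScript"]


def configurations(algorithm, languages):
    """Return every unordered pair of languages for the given algorithm."""
    return [(algorithm, first, second) for first, second in combinations(languages, 2)]


if __name__ == "__main__":
    for config in configurations("fasta", LANGUAGES):
        print(config)
```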

For each configuration, the script runs the following steps:

Computes the metrics for the two files of a configuration by calling the analyzer.py script
Loads the two json files from the Results directory and compares them, producing a json file of differences
Deletes from the json file of differences all local metrics (the ones computed by rust-code-analysis for each subspace)
Saves the json file of differences, now containing only global file metrics, in the Compare directory

The json file of differences is produced using a JavaScript program called json-diff that can be easily downloaded and built using the npm package manager.
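
The compare-then-filter steps can be sketched in plain Python. Note that this sketch swaps in a small recursive dictionary diff in place of json-diff, so its output format is not the one json-diff produces. It also assumes that the per-subspace metrics live under a top-level "spaces" key; the function names and sample values are hypothetical.

```python
def diff(a, b):
    """Return {key: [a_value, b_value]} for values that differ, recursing into dicts."""
    out = {}
    for key in sorted(set(a) | set(b)):
        va, vb = a.get(key), b.get(key)
        if isinstance(va, dict) and isinstance(vb, dict):
            nested = diff(va, vb)
            if nested:
                out[key] = nested
        elif va != vb:
            out[key] = [va, vb]
    return out


def global_differences(a, b):
    """Diff two reports, then drop the local (per-subspace) differences."""
    differences = diff(a, b)
    differences.pop("spaces", None)  # assumed location of local metrics
    return differences


if __name__ == "__main__":
    c_report = {"name": "fasta.c", "metrics": {"sloc": 120}, "spaces": [{"name": "main"}]}
    rust_report = {"name": "fasta.rs", "metrics": {"sloc": 95}, "spaces": []}
    print(global_differences(c_report, rust_report))
```

Dropping the local entries after the diff mirrors the order of the steps listed above.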

Source Code Summary

Name	analyzer.py
Description	Runs `rust-code-analysis` to compute the various metrics, restructures the output json files, and saves them in a given directory
Reference	https://github.com/SoftengPoliTo/SoftwareMetrics/blob/master/analyzer.py
Characteristics	Validates the parameters passed as input and contains some debug code that makes implementation errors easier to detect

Name	compare.py
Description	Executes the comparisons between various language configurations
Reference	https://github.com/SoftengPoliTo/SoftwareMetrics/blob/master/compare.py
Characteristics	Computes the difference between two json files and outputs the resulting json file

Input Algorithms Summary

The implementations of the input algorithms, with the related code comments, can be found in the Energy-Languages repository, in the directories associated with the supported programming languages.

Name	binarytrees
Description	Allocate and deallocate many many binary trees
Reference	https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/binarytrees.html#binarytrees

Name	fannkuch-redux
Description	Indexed-access to tiny integer-sequence
Reference	https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fannkuchredux.html#fannkuchredux

Name	fasta
Description	Generate and write random DNA sequences
Reference	https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/fasta.html#fasta

Name	k-nucleotide
Description	Hashtable update and k-nucleotide strings
Reference	https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/knucleotide.html#knucleotide

Name	mandelbrot
Description	Generate Mandelbrot set portable bitmap file
Reference	https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/mandelbrot.html#mandelbrot

Name	n-body
Description	Double-precision N-body simulation
Reference	https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/nbody.html#nbody

Name	regex-redux
Description	Match DNA 8-mers and substitute magic patterns
Reference	https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/regexredux.html#regexredux

Name	reverse-complement
Description	Read DNA sequences - write their reverse-complement
Reference	https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/revcomp.html#revcomp

Name	spectral-norm
Description	Eigenvalue using the power method
Reference	https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/spectralnorm.html#spectralnorm
