Perf-test get_used_symbols and its variants #324

Adda0 · 2023-09-06T07:00:43Z

There are numerous variants of get_used_symbols(). We need to perf-test them and keep only the most performant one. The remaining variants should be removed.


koniksedy · 2024-11-21T16:08:10Z

We tested the different implementations of the method get_used_symbols (implemented by @kilohsakul) on real-world and random automata. Interestingly, performance improves when the method is called repeatedly, as is done to obtain an average, and in some cases this even changes which implementation is the fastest. We therefore measured both the average performance over repeated calls and the performance of the first call.
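The distinction between the first call and the average over repeated calls can be illustrated with a small timing helper. This is only a minimal sketch and not the harness used for these benchmarks: `Timing` and `measure` are hypothetical names, and `fn` stands for a call to any of the tested variants.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Timing summary for one implementation on one automaton.
struct Timing {
    double first_call_ns;  // duration of the very first call
    double average_ns;     // mean duration over all calls, the first one included
};

// Calls `fn` `repetitions` times and records the first-call and average durations.
// `fn` is a hypothetical callable wrapping one of the tested variants.
template <typename Fn>
Timing measure(Fn&& fn, std::size_t repetitions) {
    assert(repetitions > 0);
    using Clock = std::chrono::steady_clock;
    double total_ns = 0.0;
    double first_ns = 0.0;
    for (std::size_t i = 0; i < repetitions; ++i) {
        const auto start = Clock::now();
        fn();
        const auto stop = Clock::now();
        const double ns =
            std::chrono::duration<double, std::nano>(stop - start).count();
        if (i == 0) { first_ns = ns; }
        total_ns += ns;
    }
    return Timing{first_ns, total_ns / static_cast<double>(repetitions)};
}
```

In a real benchmark the result of `fn` should be consumed in some way so that the compiler cannot optimize the call away.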

Tested implementations:

vec - using get_used_symbols_vec() with std::vector
set - using get_used_symbols_set() with std::set
sps - using get_used_symbols_sps() with utils::SparseSet
bv1 - using get_used_symbols_bv() with utils::BitVector and without preallocation of the returned vector
bv2 - using get_used_symbols_bv() with utils::BitVector and with preallocation of the returned vector
chv - using get_used_symbols_ch() with BoolVector, which inherits from std::vector<uint8_t>
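To make the difference between the variants concrete, here is a minimal sketch of a set-based and a bit-vector-based collection of used symbols. It is not the library code: the `Transition` struct and both function names are hypothetical, and `std::vector<bool>` stands in for utils::BitVector.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using Symbol = unsigned;

// Hypothetical, simplified transition type; the library's real structure differs.
struct Transition {
    unsigned source;
    Symbol symbol;
    unsigned target;
};

// "set"-style variant: an ordered set deduplicates and sorts the symbols.
std::set<Symbol> used_symbols_set(const std::vector<Transition>& delta) {
    std::set<Symbol> symbols;
    for (const Transition& t : delta) { symbols.insert(t.symbol); }
    return symbols;
}

// "bv2"-style variant: mark symbols in a bit vector, then collect them into
// a preallocated result. `max_symbol` must be >= every symbol in `delta`.
std::vector<Symbol> used_symbols_bv(const std::vector<Transition>& delta,
                                    Symbol max_symbol) {
    std::vector<bool> seen(static_cast<std::size_t>(max_symbol) + 1, false);
    std::size_t count = 0;
    for (const Transition& t : delta) {
        assert(t.symbol <= max_symbol);
        if (!seen[t.symbol]) {
            seen[t.symbol] = true;
            ++count;
        }
    }
    std::vector<Symbol> symbols;
    symbols.reserve(count);  // preallocation of the returned vector
    for (std::size_t s = 0; s < seen.size(); ++s) {
        if (seen[s]) { symbols.push_back(static_cast<Symbol>(s)); }
    }
    return symbols;
}
```

The bit-vector variant trades memory proportional to the largest symbol for constant-time membership marking, whereas the set pays a logarithmic cost and a node allocation per distinct symbol.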

Tested automata:

Random Tree Automata
Automata for Regular Expression `.*{n}`
Tabakov-Vardi Random Automata
ARMC Automata
Netbench Symbolic Automata
Automata for Email Validation
Automata for Regular Expressions

`_STATIC_DATA_STRUCTURES_`

Static data structures drastically improve the performance of the sparse set. However, we present only the results of benchmarks without the static data structures, as the sparse set is not our preferred choice for the get_used_symbols implementation.
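One plausible reading of a static data structure is a scratch buffer that outlives a single call, so that its allocation is reused by later calls. The sketch below rests on that assumption and does not show what `_STATIC_DATA_STRUCTURES_` actually enables in the library: `distinct_symbols_static` is a hypothetical function and a plain byte vector stands in for the sparse set.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Symbol = unsigned;

// Collects the distinct symbols of `symbols_on_transitions`, each <= `max_symbol`.
// The `seen` scratch buffer is static and thread-local, so later calls on the
// same thread reuse its allocation instead of allocating again.
std::vector<Symbol> distinct_symbols_static(
        const std::vector<Symbol>& symbols_on_transitions, Symbol max_symbol) {
    static thread_local std::vector<unsigned char> seen;
    // Reset the buffer; the capacity is reused when it is already large enough.
    seen.assign(static_cast<std::size_t>(max_symbol) + 1, 0);

    std::vector<Symbol> result;
    for (Symbol s : symbols_on_transitions) {
        assert(s <= max_symbol);
        if (seen[s] == 0) {
            seen[s] = 1;
            result.push_back(s);
        }
    }
    return result;  // symbols in order of first occurrence
}
```

With this design the first call pays for the allocation and subsequent calls only pay for resetting the buffer.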

Cactus Plot With and Without `_STATIC_DATA_STRUCTURES_`

Overall

The vector implementation (vec) we currently use seems to be the slowest variant. The bitvector (bv1, bv2) and BoolVector (chv) implementations appear to be the fastest. This still needs to be confirmed by running experiments with Z3-Noodler.

Results

Random Tree Automata

Automata were generated with 20 branches, each containing $|\Sigma|$ symbols with $p$ symbols per transition. The depth of each branch was determined as $d \sim |\Sigma| / p$.

Runtime

Runtime based on the alphabet size $|\Sigma|$ and the number of symbols per transition $p$.

Cactus Plot

Scatter Plot Matrix

Automata for Regular Expression `.*{n}`

Automata were generated for different values of $n$ and different alphabet sizes $|\Sigma|$.

Runtime

Runtime based on the alphabet size $|\Sigma|$ and the number of self-loops.

Cactus Plot

Scatter Plot Matrix

Tabakov-Vardi Random Automata

Automata were generated using the function create_random_nfa_tabakov_vardi() with different numbers of states, alphabet sizes $|\Sigma|$, and transition densities.
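For readers unfamiliar with the Tabakov-Vardi model, the following is a minimal sketch of how its transitions are typically drawn: for each symbol, a number of transitions proportional to the number of states is chosen uniformly at random. It does not reflect the signature or internals of the library function. `random_transitions_tabakov_vardi` and `Transition` are hypothetical names, and the selection of initial and final states is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical, simplified transition type.
struct Transition {
    unsigned source;
    unsigned symbol;
    unsigned target;
};

// For each symbol, picks round(density * num_states) distinct (source, target)
// pairs uniformly at random (capped at num_states^2).
std::vector<Transition> random_transitions_tabakov_vardi(
        unsigned num_states, unsigned alphabet_size, double density,
        std::mt19937& rng) {
    assert(num_states > 0 && density >= 0.0);
    const std::size_t pairs = static_cast<std::size_t>(num_states) * num_states;
    const std::size_t per_symbol = std::min(
        pairs, static_cast<std::size_t>(std::llround(density * num_states)));

    std::vector<std::size_t> pair_ids(pairs);
    std::vector<Transition> delta;
    delta.reserve(per_symbol * alphabet_size);
    for (unsigned symbol = 0; symbol < alphabet_size; ++symbol) {
        // Shuffle all state pairs and keep the first `per_symbol` of them.
        std::iota(pair_ids.begin(), pair_ids.end(), std::size_t{0});
        std::shuffle(pair_ids.begin(), pair_ids.end(), rng);
        for (std::size_t i = 0; i < per_symbol; ++i) {
            delta.push_back({static_cast<unsigned>(pair_ids[i] / num_states),
                             symbol,
                             static_cast<unsigned>(pair_ids[i] % num_states)});
        }
    }
    return delta;
}
```

Shuffling all state pairs is quadratic in the number of states, which is acceptable for a sketch but would be replaced by sampling without replacement for large automata.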

Runtime

Runtime based on the alphabet size $|\Sigma|$, transition density, and the number of states.

Cactus Plot

Scatter Plot Matrix

ARMC Automata

Cactus Plot

Scatter Plot Matrix

Netbench Symbolic Automata

Cactus Plot

Scatter Plot Matrix

Automata for Email Validation

Cactus Plot

Scatter Plot Matrix

Automata for Regular Expressions

Cactus Plot

Here, the **vec** variant was the best on average, but it did not perform as well on the first call.

Scatter Plot Matrix

Adda0 added the labels For:library, Module:nfa, Type:required, and Priority:normal on Sep 6, 2023

koniksedy self-assigned this Nov 1, 2024
