This repository contains benchmark automata that can be used to evaluate various approaches to reasoning about regular properties. Our aim is to create a wide benchmark used in automata community for evaluating novel as well as past approaches with other tools.
⚠️ All of the benchmarks are kept in packed format (tgz). Github advises users to keep the repositories as small as possible and we decided to comply with this request. For a brief time we provide link to repository that contains unpacked benchmarks in several additionl files. We warn that this might be soon deleted (the repository is self hosted and takes more than 10 GB of data)
For unpacked benchmarks see: Benchmarks (mirror). The repository is a mirror of these benchmarks that were used in the comparison.
- Cloning our benchmarks from this mirror requires
git-lfs
support (see git-lfs) - Note that the benchmarks are quite extensive and occupies more than 10GB of storage.
Unpack the benchmarks:
./extract_benchmarks.sh
This will unpack the benchmarks into individual directories.
You can then use the benchmarks in your application, or you can use our replication package.
Note that in order to use the .emp
files you will need some kind
of interpreter of the file, since the files do not describe
automata, but programs with automata operations.
Similarly, to use .afa
or .mata
files you will need either some format converter or parser of
either of the formats.
.mata
is our own format (see AUTOMATAFORMAT.md) for various types of automata (AFA, NFA, DFA ,etc.).afa
is an intermediate format for AFA automata, that can be converted to other AFA formats supported by other tools (see v1_checker)..smt2
contains formulae in SMTLIB that are passed to SMT solvers, such as Z3 or CVC5..emp
is our own specification of simple programs, that contains automata operations.gen_aut/
directory contains source automata for.emp
files.ltlf
and.pltl
contains original formulae of LTL logic that were used to generate the benchmarks.
You can use generate_afa_benchmarks.sh
to generate additional formats. Either
(1) run the script without any parameters (then by default all of our selected
benchmarks will be transformed), (2) run the script as
./generate_afa_benchmarks.sh file.afa
to convert single .afa
benchmark to
four selected formats (.aig
, .ada
, .afasat
, .bisim
), or, (3) run the
script as ./generate_afa_benchmarks.sh dir
to generate all *.afa
files
found in the directory dir
.
This way, we additionaly support the following formats:
.aig
: format supported by bwIC3 (a checker based on IC3 algorithm build upon ABC model checker)..ada
: format supported by JAltImpact tool, an implementation o finterpolation-based algorithm..afasat
: format supported by Antisat, our own implementation of antichain AFA emptiness test integrated with SAT solver..bisim
: format supported by Bisim, an implementation of the AFA-emptiness check based on bisimulation.
There are no fixed naming conventions for the benchmarks. We only request to provide the benchmark
either in .afa
or .mata
file. This way we can convert them to other formats.
The benchmarks contain following formats and files:
We divided benchmarks into two sets: b-*
models boolean combinations of regular expression benchmarks,
while a-*
models AFA benchmarks.
Note that there are some formulae, that do not fit into one of these categories. We omitted them from our experiments, as they were not that interesting.
-
b-smt
contains 330 string constraints from the Norn and SyGuS-qgen, collected in SMT-LIB benchmark [8], that fall in BRE. This includes following directories:bool_comb/ere/QF_SLIA_Norn
bool_comb/ere/QF_S_sygus_qgen
-
b-regex
contains 500 problems, obtained analogously as in [30] and [77], from RegExLib [71]. This includes following directories:email_filter
-
b-hand-made
has 56 difficult handwritten problems from [73] containing membership in regular expressions extended with intersection and complement. They encode (1) date and password problems, (2) problems where Boolean operations interact with concatenation and iteration, and (3) problems with exponential determinization. This includes following directories:bool_comb/ere/boolean_and_loops
bool_comb/ere/date
bool_comb/ere/det_blowup
bool_comb/ere/password
-
b-armc-incl
contains 171 language inclusion problems from runs of abstract regular model checking tools (verification of the bakery algorithm, bubble sort, and a producer- consumer system) of [12]. This includes following directories:automata-inclusion
-
b-param
has 8 parametric problems. Four are from [40]. Another four are from [28]. This includes following directories:bool_comb/cox
bool_comb/intersect
-
a-ltl-rand
contains 300 LTLf formulae obtained with the random generator of [77]. The generator traverses the syntactic tree of the LTL grammar, and is controlled by the number of variables, probabilities of connectives, maximum depth, and average depth.We have set the parameters empirically in a way likely to generate examples difficult for the compared solvers (the formulae have 6 atomic propositions and maximum depth 16). This includes following directories:ltl_afa/random_ltl
-
a-ltl-param
has a pair of hand-made parametric LTLf formulae (160 formulae each) used in [30] and [77]: Lift [43] describes a simple lift operating on a parametric number of floors and Counter [72] describes a counter incremented modulo the parameter. This includes following directories:ltl_afa/parametric_ltl
-
a-ltl-spec
[60] contains 62 LTLf formulae that specify realistic systems, used by Boeing [14] and NASA [42]. The formulae represent specifications used for designing Boeing AIR 6110 wheel-braking system and for designing NASA NextGen air traffic control (ATC) system. This includes following directories:ltl_afa/created_ltl/nasa-boeing
-
a-ltlf-patterns
comes from transformation of linear temporal logic formulae over finite traces (LTLf) to AFA [34]. The 1699 formulae are from [60]8 and they represent common LTLf patterns which can be divided into two groups: (1) 7 parametric patterns (100 each) and (2) randomly generated conjunctions of simpler LTLf patterns (999 formulae). This includes following directories:ltl_afa/created_ltl/LTLf-specific
-
a-sloth
4062 AFA emptiness problems to which the string solver Sloth reduced string constraints [47]. The AFA have complex multi-track transitions encoding Boolean operations and transductions, and a special kind of synchronization of traces requiring complex initial and final conditions. This includes following directories:stranger_afa
-
a-noodler
13840 AFA emptiness problems that correspond to certain sub-problems solved within the string solver Noodler in [10]. The AFA were created similarly as those ofa-sloth
, but encode a different particular set of operations over different input automata. This includes following directories:noodler
- Lukáš Holík, Tomas Fiedor, Adam Rogalewicz, Pavol Vargovčík, Martin Hruska and Juraj Síč: Reasoning about Regular Properties: A Comparative Study. In Proc. of CADE'23, 2023.
- See the core site for more information, updates and many more.
- arXiv preprint: preprint of the paper.
- For more recent version of the paper, either check the official CADE'23 submission or newer version in the arXiv
- Replication package: A virtual machine for replication of our results
- The replication package is in
.ova
format and needs to be imported in some virtual image manager (see, e.g., VirtualBox or VMware. - The replication packages occupies about 10GB of data.
- The replication package is in
- Results: results of the comparison.
Our repositories are open for forking, enabling interested individuals to explore, modify, and build upon our codebase. We wish to extend this benchmark with other problems and sources that could enhance the comparison of all widely used and presented tools.
If you are interested in contributing your own benchmark we suggest following:
- Fork this repository.
- Add your own benchmark formulae, preferably in
.mata
or.afa
format. See our format description above. If you need help with converting your format into one of our formats contact us. We have scripts capable of converting between different formats supported by other tools. - Create a Pull Request from your fork. We will check your contribution, optionally request changes and if we are satisfied, we will merge the changes into our repository.
- Please, in your pull request include the following:
- Describe your benchmark: how it was obtained, whether it was generated, the origin of the problem it models, etc.
- Submit your benchmark in either
.zip
ortgz
. This is due to a fact, that Github asks developers to keep their repositories as small as possible. - If your benchmark is part of some paper or other work, please include citation to these works.
For other ways of contributing to our project refer to our main site
- 5.6.2023: Initial version of benchmarks
- Tomas Fiedor <[email protected]>
- Lukas Holik <[email protected]>
- Martin Hruska <[email protected]>
- Adam Rogalewicz <[email protected]>
- Juraj Sic <[email protected]>
- Pavol Vargovcik <[email protected]>
This work has been supported by the Czech Ministry of Education, Youth and Sports ERC.CZ project LL1908, and the FIT BUT internal project FIT-S-20-6427.