RandomQueryGeneratorReporters

philip-stoev edited this page Jul 17, 2012 · 1 revision

Category:RandomQueryGenerator

Reporters

Reporters are extra modules that perform checks on the server, either periodically while the test is running or after the test has terminated. Unlike Validators, they are not tied to the result of a particular query and run in a separate monitoring process. By default, the Deadlock, ErrorLog and Backtrace Reporters are enabled.

Available Reporters

The following Reporters are available:

Deadlock Detection

Deadlock - this Reporter checks periodically whether the server has deadlocked. A deadlock is declared if any one of the following situations occurs:

  • The server does not accept a new connection within 40 seconds. This usually means that the server is completely deadlocked and is unable to accept new connections because a core mutex is deadlocked. Similar situations also occur when the CSV log table is locked and the new connection attempt hangs while writing into the log table;
  • More than 5 queries from the SHOW PROCESSLIST are taking more than 600 seconds to complete. This usually happens if a timeout of some sort does not trigger properly and a transaction-level deadlock remains unresolved forever. Note that a grammar that contains long-running queries may cause the Reporter to declare false deadlocks;
  • The test itself took more than twice the desired running time, as specified via the --duration command-line option. This protection is in place to detect situations where a single query has hung and prevents the test from completing. A false alarm may be triggered if the grammar generates queries that take much longer to execute than the requested --duration.
Once a deadlock is declared, the server process is forced to produce a core file (either by killing it with SIGSEGV or by using cdb). This in turn triggers the Backtrace Reporter, which dumps the backtraces of the deadlocked threads.
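
The three conditions above can be sketched as a single predicate. This is only an illustration in Python, not the Reporter's actual code: the function name and its parameters are hypothetical, and it assumes the caller has already measured the connection time, collected the per-query times from SHOW PROCESSLIST, and knows the elapsed and requested test durations. Only the thresholds come from the description above.

```python
from typing import Sequence

# Thresholds quoted from the list above.
CONNECT_TIMEOUT_SECONDS = 40
STALLED_QUERY_SECONDS = 600
MAX_STALLED_QUERIES = 5


def is_deadlocked(connect_seconds: float,
                  query_seconds: Sequence[float],
                  elapsed_seconds: float,
                  duration_seconds: float) -> bool:
    """Return True if any of the three deadlock conditions holds."""
    # 1. A new connection was not accepted within 40 seconds.
    if connect_seconds > CONNECT_TIMEOUT_SECONDS:
        return True
    # 2. More than 5 queries in the process list have run for over 600 seconds.
    stalled = sum(1 for t in query_seconds if t > STALLED_QUERY_SECONDS)
    if stalled > MAX_STALLED_QUERIES:
        return True
    # 3. The test has run for more than twice the requested --duration.
    return elapsed_seconds > 2 * duration_seconds
```

Note that exactly 5 stalled queries, or a test that has run for exactly twice its duration, do not count as a deadlock in this sketch, because the text says "more than" in both cases.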

Server Debugging

  • Backtrace - if the RQG detects that the server has crashed, the Backtrace reporter will use whatever debugger is available on the system to produce a backtrace of both the crashing thread and all other threads in the server. On Windows, cdb is used; on Solaris, dbx or pstack; and on Linux, gdb.
  • ErrorLog - this reporter prints the last 100 lines from the server error log to STDOUT. Those lines are likely to contain the backtrace as produced by the server itself, and possibly deadlock information (for the Falcon storage engine) or other useful information. Printing those lines to STDOUT makes this information visible in the test output.
  • ErrorLogAlarm - periodically scans the server's error log for a specific pattern (regular expression) and returns a critical error (STATUS_ALARM) if this pattern is found. The default pattern is '^ERROR'. The pattern can be customized by editing the file lib/GenTest/Reporter/ErrorLogAlarm.pm. Potential uses include alerting the user to non-fatal errors mentioned in the error log, and alerting the user to specific valgrind warnings (the regex pattern may require customization).
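
As a rough illustration of what ErrorLog and ErrorLogAlarm do with the error log, the following Python sketch keeps the last 100 lines and searches for the default pattern. The function names are hypothetical and the real Reporters are Perl modules, so this shows the idea rather than their implementation.

```python
import re
from collections import deque
from typing import Iterable, List

DEFAULT_PATTERN = r"^ERROR"  # default pattern quoted in the text above


def last_lines(lines: Iterable[str], count: int = 100) -> List[str]:
    """Keep only the final `count` lines, as ErrorLog is described as doing."""
    return list(deque(lines, maxlen=count))


def matching_lines(lines: Iterable[str],
                   pattern: str = DEFAULT_PATTERN) -> List[str]:
    """Return every line matching the pattern.

    A non-empty result is what would raise the alarm in this sketch.
    """
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]
```

Because the default pattern is anchored with `^`, a line such as "no ERROR here" does not match; only lines that begin with ERROR do.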

Test Debugging

  • LockTableKiller - uses KILL against LOCK TABLE queries that take more than 10 seconds to complete. This is useful for moving the test forward in the face of many deadlocks that are not covered by a server-side timeout.
  • QueryTimeout - uses KILL against queries that take more than 5 seconds to run. This is useful in order to move the test forward in the face of slow, uninteresting queries, for example, when trying to debug a crash that happens later in the test but does not require all previous slow queries to run to completion.
  • WinPackage - on Windows, this Reporter moves the mysqld.exe and mysqld.cdb files from the bindir into the data dir, so that they sit together with the core file, which makes debugging in Visual Studio easier.
  • Shutdown - initiates a graceful shutdown at the end of the test. This is useful when running the RQG inside an automation framework that requires that all child processes (including mysqld) are terminated before the test is considered complete.
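
The selection logic shared by LockTableKiller and QueryTimeout can be sketched as follows. This is a hedged Python illustration, not the Reporters' code: `ProcessRow` is a hypothetical, simplified view of a SHOW PROCESSLIST row, and the use of `KILL QUERY` rather than plain `KILL` is an assumption, since the descriptions above only say "KILL".

```python
from typing import Iterable, List, NamedTuple


class ProcessRow(NamedTuple):
    """A hypothetical, simplified view of one SHOW PROCESSLIST row."""
    thread_id: int
    seconds: int   # the Time column
    info: str      # the Info column (query text); empty when idle


def kill_statements(rows: Iterable[ProcessRow],
                    limit_seconds: int,
                    prefix: str = "") -> List[str]:
    """Build KILL statements for queries that exceeded limit_seconds.

    If prefix is given, only queries whose text starts with it are targeted.
    """
    statements = []
    for row in rows:
        text = row.info.strip().upper()
        if not text:
            continue  # idle connection, nothing to kill
        if row.seconds > limit_seconds and text.startswith(prefix.upper()):
            # KILL QUERY is an assumption; the page only says "KILL".
            statements.append(f"KILL QUERY {row.thread_id}")
    return statements
```

With this helper, `kill_statements(rows, 5)` mirrors the QueryTimeout rule and `kill_statements(rows, 10, "LOCK TABLE")` mirrors the LockTableKiller rule.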

Replication Testing

  • ReplicationConnectionKiller - uses tcpkill to periodically disrupt the TCP connection between the master and the slave. In a properly running replication setup, such disruptions would not cause replication to break permanently;
  • ReplicationThreadRestarter - periodically issues START|STOP SLAVE IO_THREAD|SQL_THREAD in an attempt to disrupt replication. A similar effect can be achieved by putting appropriate SQL in the grammar itself;
  • ReplicationLogFlusher - periodically issues FLUSH LOGS on the master in order to cause log rotation while replication is running. A similar effect can be achieved by putting FLUSH LOGS in the grammar itself;
  • ReplicationSemiSync - tests semi-synchronous replication by periodically disturbing replication via START|STOP SLAVE for varying amounts of time, and makes sure that transactions on the master are not committed before the semi-synchronous timeout has expired. Because semi-synchronous replication provides relatively weak guarantees, the Reporter is fairly permissive: various situations between the master and the slave are considered acceptable.
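
To make the ReplicationThreadRestarter behaviour concrete, here is a small Python sketch that picks one of the four possible statements at random. The function name is hypothetical, and the way the real Reporter chooses and schedules statements is not described on this page; the statement text follows standard MySQL syntax, which includes the SLAVE keyword.

```python
import random


def random_thread_statement(rng: random.Random) -> str:
    """Pick one of the four START/STOP SLAVE thread statements at random."""
    action = rng.choice(["START", "STOP"])
    thread = rng.choice(["IO_THREAD", "SQL_THREAD"])
    return f"{action} SLAVE {thread}"
```

Passing in a seeded `random.Random` instance makes the sequence of statements reproducible, which helps when a replication failure needs to be replayed.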

Recovery Testing

  • Recovery - if enabled, this monitor will kill the server about 20 seconds before the expected end of the test (based on the --duration parameter). A recovery will then be initiated, and a failure will be reported if the recovery does not succeed. In addition, each table on the server will be read using various index access methods, and any discrepancies in the result sets will be reported. (Killing the server while the testing processes are still running will inevitably produce some "connection lost" messages. However, waiting for the testing processes to complete before killing the server would mean that no active transactions are present, possibly masking bugs in the recovery.)
  • RecoveryConsistency - works similarly to the Recovery reporter, except that consistency after recovery is checked by issuing the following query:
 SELECT (SUM(`int_key`)  + SUM(`int`)) / COUNT(*) FROM ...

against each table in the database. This query is expected to return 200.0000 if the database is consistent after recovery, which in turn implies a data set and a grammar that will maintain that value throughout the test. See RandomQueryGeneratorTests#Transactional_Integrity for an example.
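
The timing and the consistency check described above can be sketched in Python as follows. All function names are hypothetical, and treating a NULL result (for example, from an empty table) as a failure is an assumption that this page does not address.

```python
from decimal import Decimal
from typing import Optional

EXPECTED_AVERAGE = Decimal("200")   # value quoted in the text above
KILL_LEAD_SECONDS = 20              # "about 20 seconds before the expected end"


def kill_offset(duration_seconds: int) -> int:
    """Seconds into the test at which the server would be killed."""
    return max(duration_seconds - KILL_LEAD_SECONDS, 0)


def consistency_query(table: str) -> str:
    """Build the per-table check, quoting identifiers with backticks."""
    return f"SELECT (SUM(`int_key`) + SUM(`int`)) / COUNT(*) FROM `{table}`"


def is_consistent(value: Optional[Decimal]) -> bool:
    """True if the query returned 200; None (SQL NULL) counts as a failure."""
    return value is not None and value == EXPECTED_AVERAGE
```

Comparing with `Decimal` rather than a string means that 200, 200.0 and 200.0000 are all accepted as the same value.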

Reporter API