% running.tex (forked from ldbc/ldbc_snb_docs)
\section{Running the benchmark}
Running a benchmark workload involves a number of steps, including data import, driver configuration and workload execution.
The data import step involves loading the output of the data generator into the vendor database,
deploying the database and, optionally, performing a warm-up phase.
The other steps are explained in the following sections.
\subsection{Driver Configuration}
\label{sub:driver_configuration}
Before running a benchmark workload, the workload driver~\cite{test_driver} must be configured. This involves the following steps:
\begin{enumerate}
\item Programming: implement a vendor specific database connector (covered in \autoref{ssub:vendor_specific_database_connector})
\item Configuration: configure the general driver properties (covered in \autoref{ssub:general_driver_properties})
\item Configuration: configure the workload specific driver properties (covered in \autoref{ssub:workload_specific_driver_properties})
\end{enumerate}
\subsubsection{Vendor specific database connector}
\label{ssub:vendor_specific_database_connector}
Vendors must provide the workload driver with implementations of the Java classes \texttt{Db} and \texttt{OperationHandler}.
More specifically, they must provide one implementation of \texttt{Db},
(optionally) one implementation of \texttt{DbConnectionState} (used to manage resources shared among queries, \eg network connections),
and one implementation of \texttt{OperationHandler} per query in the workload.
The code snippet below sketches the necessary classes for a workload of two queries (\texttt{LdbcQuery1} and \texttt{LdbcQuery2});
in it, \texttt{Client} is a placeholder for the vendor's own database client.
For a more detailed explanation, refer to the workload driver documentation~\cite{test_driver}.
{\footnotesize
\begin{verbatim}
import java.util.Map;
// Db, DbConnectionState, DbException, OperationHandler, OperationResult and the
// LdbcQuery classes come from the workload driver.
// Client is a placeholder for the vendor's own database client class.

public class ExampleDbConnectionState extends DbConnectionState {
    private final Client client;

    public ExampleDbConnectionState(String url) {
        client = new Client(url);
    }

    public Client getClient() {
        return client;
    }
}

public class ExampleDb extends Db {
    private ExampleDbConnectionState state;

    @Override
    protected void onInit(Map<String, String> properties) throws DbException {
        // Register one handler class per query of the workload.
        registerOperationHandler(LdbcQuery1.class, LdbcQuery1Handler.class);
        registerOperationHandler(LdbcQuery2.class, LdbcQuery2Handler.class);
        // The connection address is taken from the properties given to the driver.
        state = new ExampleDbConnectionState(properties.get("url"));
    }

    @Override
    protected void onCleanup() throws DbException {}

    @Override
    protected DbConnectionState getConnectionState() throws DbException {
        return state;
    }

    public static class LdbcQuery1Handler extends OperationHandler<LdbcQuery1> {
        @Override
        protected OperationResult executeOperation(LdbcQuery1 operation) throws DbException {
            Client client = ((ExampleDbConnectionState) dbConnectionState()).getClient();
            LdbcQuery1Result result = client.execute(operation);
            // Typically used for debugging, e.g. to return error codes; 0 is a placeholder.
            int resultCode = 0;
            return operation.buildResult(resultCode, result);
        }
    }

    public static class LdbcQuery2Handler extends OperationHandler<LdbcQuery2> {
        @Override
        protected OperationResult executeOperation(LdbcQuery2 operation) throws DbException {
            Client client = ((ExampleDbConnectionState) dbConnectionState()).getClient();
            LdbcQuery2Result result = client.execute(operation);
            // Typically used for debugging, e.g. to return error codes; 0 is a placeholder.
            int resultCode = 0;
            return operation.buildResult(resultCode, result);
        }
    }
}
\end{verbatim}
}
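In the connector sketched above, \texttt{onInit} reads the connection address from the \texttt{url} property.
A vendor properties file supplying it might look as follows.
The file name follows the command line shown in \autoref{sub:running_workload}; the address is a made-up placeholder, not a value required by the driver.
{\footnotesize
\begin{verbatim}
# vendor/vendor.properties (illustrative; the address below is a placeholder)
url=http://localhost:8080/db
\end{verbatim}
}
According to the driver's usage text, the same property could instead be passed on the command line in the form \texttt{-p url=...}.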
\subsubsection{General driver properties}
\label{ssub:general_driver_properties}
Although a detailed explanation of the workload driver design is beyond the
scope of this document, it is worth noting that the driver has a number of
configuration parameters, which are summarized in
\autoref{tab:general_driver_parameters}.
\begin{table}[h!]
\begin{center}
\begin{tabular}{|l|l|l|}
\hline
Parameter & Type & Description \\
\hline
status & boolean & intermittently output workload status during execution \\
\hline
operationcount & integer & number of queries to generate \\
\hline
threadcount & integer & size of thread pool to use for query execution \\
\hline
resultfile & string & path to where workload results will be written \\
\hline
timeunit & enum & time unit to report result metrics in \\
\hline
timecompressionratio & double & adjust all query start times proportionally \\
\hline
gctdeltaduration & integer & min duration (ms) between dependent reads and writes \\
\hline
peeridentifiers & string[] & addresses of other driver processes (for distributed mode) \\
\hline
toleratedexecutiondelay & integer & max allowed time (ms) between scheduled and actual query start times \\
\hline
workload & string & class name of specific \texttt{Workload} implementation \\
\hline
database & string & class name of specific \texttt{Db} implementation \\
\hline
\end{tabular}
\end{center}
\caption{General Driver Parameters}
\label{tab:general_driver_parameters}
\end{table}
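To make \texttt{timecompressionratio} and \texttt{toleratedexecutiondelay} more concrete, the following Java sketch shows one plausible reading of their descriptions in \autoref{tab:general_driver_parameters}.
It assumes that compression scales the offset of each start time from the first scheduled start time, so that a ratio below 1 shortens the gaps between operations, and that an operation counts as late when it starts more than the tolerated delay after its scheduled time.
The class and method names are hypothetical and are not part of the driver API.
{\footnotesize
\begin{verbatim}
import java.util.Arrays;

/** Hypothetical helper, not part of the driver: illustrates two timing parameters. */
class TimingSketch {

    /** Scales every start time's offset from the first start time by the given ratio. */
    static long[] compress(long[] scheduledStartMs, double ratio) {
        if (scheduledStartMs.length == 0) {
            return new long[0];
        }
        long base = scheduledStartMs[0];
        long[] compressed = new long[scheduledStartMs.length];
        for (int i = 0; i < scheduledStartMs.length; i++) {
            compressed[i] = base + Math.round((scheduledStartMs[i] - base) * ratio);
        }
        return compressed;
    }

    /** True if the operation started later than the tolerated delay allows. */
    static boolean exceedsToleratedDelay(long scheduledMs, long actualMs, long toleratedDelayMs) {
        return actualMs - scheduledMs > toleratedDelayMs;
    }

    public static void main(String[] args) {
        long[] original = {1000, 1200, 1600};
        // With ratio 0.5 the gaps of 200 ms and 400 ms shrink to 100 ms and 200 ms.
        System.out.println(Arrays.toString(compress(original, 0.5)));
        // A 50 ms delay is within a 100 ms tolerance, a 150 ms delay is not.
        System.out.println(exceedsToleratedDelay(1100, 1150, 100));
        System.out.println(exceedsToleratedDelay(1100, 1250, 100));
    }
}
\end{verbatim}
}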
Note that default configuration files, with relevant parameter values set and required parameters commented,
are provided in the driver distribution.
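As an illustration, a general driver properties file using the parameters of \autoref{tab:general_driver_parameters} might look like the sketch below.
All values are arbitrary examples rather than recommended settings, the \texttt{key=value} layout is assumed from the \texttt{-p <key=value>} option of the driver, and the class names are placeholders.
The default configuration files shipped with the driver remain the authoritative reference.
{\footnotesize
\begin{verbatim}
# general driver properties (illustrative values only)
status=true
operationcount=100000
threadcount=4
resultfile=results/ldbc_results.json
timeunit=MILLISECONDS
timecompressionratio=1.0
toleratedexecutiondelay=1000
# replace both entries with the fully qualified class names of your setup
workload=<package>.LdbcInteractiveWorkload
database=com.vendor.VendorDb
\end{verbatim}
}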
\subsubsection{Workload specific driver properties}
\label{ssub:workload_specific_driver_properties}
In addition to the general driver properties, the driver supports the definition of arbitrary properties,
which are passed along to the pluggable driver components: \texttt{Db} and \texttt{Workload}.
The first, \texttt{Db}, was described in the previous section.
The second, \texttt{Workload}, is the class responsible for defining which operations the driver will generate, in which order and at what throughput.
From the point of view of the driver, the LDBC Social Network Benchmark is an implementation of the \texttt{Workload} class,
in this case named \texttt{LdbcInteractiveWorkload}.
For more details regarding the workload itself, see \autoref{section:workload}.
The configuration parameters for the \texttt{LdbcInteractiveWorkload} workload are summarized in \autoref{tab:snb_interactive_driver_parameters}.
\begin{table}[h!]
\begin{center}
\begin{tabular}{|l|l|l|}
\hline
Parameter & Type & Description \\
\hline
parameter\_dir & string & data generator parameters directory (read parameters) \\
\hline
data\_dir & string & data generator data directory (dataset and write parameters) \\
\hline
LdbcQuery1\_interleave & integer & interval between successive query 1 executions \\
\hline
LdbcQuery2\_interleave & integer & interval between successive query 2 executions \\
\hline
LdbcQuery3\_interleave & integer & interval between successive query 3 executions \\
\hline
LdbcQuery4\_interleave & integer & interval between successive query 4 executions \\
\hline
LdbcQuery5\_interleave & integer & interval between successive query 5 executions \\
\hline
LdbcQuery6\_interleave & integer & interval between successive query 6 executions \\
\hline
LdbcQuery7\_interleave & integer & interval between successive query 7 executions \\
\hline
LdbcQuery8\_interleave & integer & interval between successive query 8 executions \\
\hline
LdbcQuery9\_interleave & integer & interval between successive query 9 executions \\
\hline
LdbcQuery10\_interleave & integer & interval between successive query 10 executions \\
\hline
LdbcQuery11\_interleave & integer & interval between successive query 11 executions \\
\hline
LdbcQuery12\_interleave & integer & interval between successive query 12 executions \\
\hline
LdbcQuery13\_interleave & integer & interval between successive query 13 executions \\
\hline
LdbcQuery14\_interleave & integer & interval between successive query 14 executions \\
\hline
LdbcQuery1\_enable & boolean & enable/disable read query 1 \\
\hline
LdbcQuery2\_enable & boolean & enable/disable read query 2 \\
\hline
LdbcQuery3\_enable & boolean & enable/disable read query 3 \\
\hline
LdbcQuery4\_enable & boolean & enable/disable read query 4 \\
\hline
LdbcQuery5\_enable & boolean & enable/disable read query 5 \\
\hline
LdbcQuery6\_enable & boolean & enable/disable read query 6 \\
\hline
LdbcQuery7\_enable & boolean & enable/disable read query 7 \\
\hline
LdbcQuery8\_enable & boolean & enable/disable read query 8 \\
\hline
LdbcQuery9\_enable & boolean & enable/disable read query 9 \\
\hline
LdbcQuery10\_enable & boolean & enable/disable read query 10 \\
\hline
LdbcQuery11\_enable & boolean & enable/disable read query 11 \\
\hline
LdbcQuery12\_enable & boolean & enable/disable read query 12 \\
\hline
LdbcQuery13\_enable & boolean & enable/disable read query 13 \\
\hline
LdbcQuery14\_enable & boolean & enable/disable read query 14 \\
\hline
LdbcUpdate1AddPerson\_enable & boolean & enable/disable update query 1 \\
\hline
LdbcUpdate2AddPostLike\_enable & boolean & enable/disable update query 2 \\
\hline
LdbcUpdate3AddCommentLike\_enable & boolean & enable/disable update query 3 \\
\hline
LdbcUpdate4AddForum\_enable & boolean & enable/disable update query 4 \\
\hline
LdbcUpdate5AddForumMembership\_enable & boolean & enable/disable update query 5 \\
\hline
LdbcUpdate6AddPost\_enable & boolean & enable/disable update query 6 \\
\hline
LdbcUpdate7AddComment\_enable & boolean & enable/disable update query 7 \\
\hline
LdbcUpdate8AddFriendship\_enable & boolean & enable/disable update query 8 \\
\hline
\end{tabular}
\end{center}
\caption{LDBC Social Network Benchmark Parameters}
\label{tab:snb_interactive_driver_parameters}
\end{table}
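Assuming the workload receives its properties as strings, in the same \texttt{Map<String, String>} form that \texttt{onInit} receives in the connector example, the \texttt{\_enable} and \texttt{\_interleave} entries of \autoref{tab:snb_interactive_driver_parameters} could be parsed as sketched below.
This is not the code of \texttt{LdbcInteractiveWorkload}: the class name, the sample values and the choice to treat a missing \texttt{\_enable} flag as disabled are assumptions made for illustration.
{\footnotesize
\begin{verbatim}
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not driver code: parses per-query workload properties. */
class WorkloadPropertiesSketch {

    /** Returns the interleave of every enabled read query, keyed by query number. */
    static Map<Integer, Long> enabledInterleaves(Map<String, String> properties) {
        Map<Integer, Long> interleaves = new TreeMap<>();
        for (int query = 1; query <= 14; query++) {
            String prefix = "LdbcQuery" + query;
            // Assumption: a missing _enable flag means the query is disabled.
            boolean enabled = Boolean.parseBoolean(properties.get(prefix + "_enable"));
            if (!enabled) {
                continue;
            }
            String interleave = properties.get(prefix + "_interleave");
            if (interleave == null) {
                throw new IllegalArgumentException("missing " + prefix + "_interleave");
            }
            interleaves.put(query, Long.parseLong(interleave.trim()));
        }
        return interleaves;
    }

    public static void main(String[] args) {
        // Sample values only; they mirror lines such as LdbcQuery1_enable=true in a file.
        Map<String, String> properties = new HashMap<>();
        properties.put("LdbcQuery1_enable", "true");
        properties.put("LdbcQuery1_interleave", "30");
        properties.put("LdbcQuery2_enable", "false");
        properties.put("LdbcQuery2_interleave", "12");
        // Only query 1 is enabled, so only its interleave is reported.
        System.out.println(enabledInterleaves(properties));
    }
}
\end{verbatim}
}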
\subsection{Running Workload}
\label{sub:running_workload}
Finally, to run the workload, execute the \texttt{main} method of the \texttt{Client} class in the driver.
The general pattern for doing so is as follows:
{\footnotesize
\begin{verbatim}
usage: java -cp core-VERSION.jar com.ldbc.driver.Client [-db <classname>] [-del <duration>]
       [-gctd <duration>] [-oc <count>] [-P <file1:file2>] [-p <key=value>]
       [-pids <peerId1:peerId2>] [-rf <path>] [-s] [-tc <count>] [-tcr <ratio>] [-tu <unit>]
       [-w <classname>]
 -db,--database <classname>                 classname of the DB to use (e.g.
                                            com.ldbc.driver.workloads.simple.db.BasicDb)
 -del,--toleratedexecutiondelay <duration>  duration (ms) an operation handler may miss its scheduled
                                            start time by
 -gctd,--gctdeltaduration <duration>        safe duration (ms) between dependent operations
 -oc,--operationcount <count>               number of operations to execute (default: 0)
 -P <file1:file2>                           load properties from file(s) - files will be loaded in the
                                            order provided
                                            first files are highest priority; later values will not
                                            override earlier values
 -p <key=value>                             properties to be passed to DB and Workload - these will
                                            override properties loaded from files
 -pids,--peeridentifiers <peerId1:peerId2>  identifiers/addresses of other driver workers (for
                                            distributed mode)
 -rf,--resultfile <path>                    where benchmark results JSON file will be written (null =
                                            file will not be created)
 -s,--status                                show status during run
 -tc,--threadcount <count>                  number of worker threads to execute with (default: 2)
 -tcr,--timecompressionratio <ratio>        change duration between operations of workload
 -tu,--timeunit <unit>                      time unit to use when gathering metrics.
                                            default:MILLISECONDS, valid:[NANOSECONDS, MICROSECONDS,
                                            MILLISECONDS, SECONDS, MINUTES]
 -w,--workload <classname>                  classname of the Workload to use (e.g.
                                            com.ldbc.driver.workloads.simple.SimpleWorkload)
\end{verbatim}
}
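The precedence rules stated in the usage text (earlier \texttt{-P} files take priority over later ones, and \texttt{-p} values override anything loaded from files) can be sketched as a simple merge.
The code below illustrates the described behaviour and is not the driver's implementation; the class and method names are hypothetical, and each properties file is represented by an already parsed map.
{\footnotesize
\begin{verbatim}
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not driver code: merges properties by the stated precedence. */
class PropertyPrecedenceSketch {

    /** Files are given in -P order; commandLine holds the -p key=value pairs. */
    static Map<String, String> merge(List<Map<String, String>> files,
                                     Map<String, String> commandLine) {
        Map<String, String> merged = new HashMap<>();
        for (Map<String, String> file : files) {
            for (Map.Entry<String, String> entry : file.entrySet()) {
                // Later files never override a key set by an earlier file.
                merged.putIfAbsent(entry.getKey(), entry.getValue());
            }
        }
        // Command line properties override everything loaded from files.
        merged.putAll(commandLine);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> vendorFile = new HashMap<>();
        vendorFile.put("threadcount", "8");
        Map<String, String> defaultsFile = new HashMap<>();
        defaultsFile.put("threadcount", "2");
        defaultsFile.put("timeunit", "MILLISECONDS");
        Map<String, String> commandLine = new HashMap<>();
        commandLine.put("timeunit", "SECONDS");

        Map<String, String> merged = merge(Arrays.asList(vendorFile, defaultsFile), commandLine);
        System.out.println(merged.get("threadcount")); // 8: the first file wins
        System.out.println(merged.get("timeunit"));    // SECONDS: -p overrides the files
    }
}
\end{verbatim}
}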
For the Interactive workload of \ldbcsnb it is necessary to provide the driver with four configuration files, containing:
general driver properties,
workload specific driver properties (\eg database connection string),
a dataset specific value for one of the general driver properties (specifically, \texttt{gctdeltaduration}),
and vendor specific database properties.
The command line to execute the workload would look something like:
{\footnotesize
\begin{verbatim}
java -cp ldbc_driver/target/core-0.2-SNAPSHOT.jar com.ldbc.driver.Client \
  -db com.vendor.VendorDb \
  -P vendor/vendor.properties \
  -P data_generator/outputDir/updateStream_0.properties \
  -P ldbc_driver/workloads/ldbc/socnet/interactive/ldbc_socnet_interactive.properties \
  -P ldbc_driver/src/main/resources/ldbc_driver_default.properties
\end{verbatim}
}