ChangeLogP543.txt
2016-01-25
* d779d1172a6e4c73b5ece9939c4d067c2b3d7b8d Update libpfm4 current
with Jan 25 08:33:02 2016 version.
2016-01-07
* 0d9776b8 src/components/stealtime/linux-stealtime.c: Free allocated memory
in the stealtime component when the component is shut down. Thanks to William
Cohen for contributing this patch and the following explanation: Running
examples with "valgrind --leak-check=full ..." showed a number of items
allocated by the stealtime component were not freed when PAPI_shutdown() was
called. This patch frees those unused memory allocations.
2016-01-06
* de40668c src/papi_preset.c: Fixed memory leak in papi_preset.c by updating
the infix_to_postfix function. Thanks to William Cohen for discovering the
leak. The infix_to_postfix function was re-written and tested using user
defined events.
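For readers unfamiliar with the conversion, the following is a minimal sketch
of an infix-to-postfix pass (shunting-yard). It is not the code in
papi_preset.c. The function name sketch_infix_to_postfix, the space-separated
output format and the fixed-size operator stack are assumptions made for
illustration. Because the stack is a local array, nothing is heap-allocated
and nothing can leak.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Operator precedence: higher binds tighter; 0 means "not an operator". */
static int precedence(char op)
{
    if (op == '\0') return 0;
    if (strchr("*/", op)) return 2;
    if (strchr("+-", op)) return 1;
    return 0;
}

/* Operands are runs of letters, digits or '_' (e.g. N0, PAPI_TOT_CYC). */
static int is_operand_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Append one character, always leaving room for the final NUL. */
static int put(char *out, size_t outlen, size_t *o, char c)
{
    if (*o + 1 >= outlen) return -1;
    out[(*o)++] = c;
    return 0;
}

/*
 * Hypothetical helper: convert "A+B*(C-D)" to "A B C D - * +".
 * Returns 0 on success, -1 on an unknown character, unbalanced
 * parentheses, or an output buffer that is too small. It does not
 * validate operand/operator ordering.
 */
int sketch_infix_to_postfix(const char *in, char *out, size_t outlen)
{
    char stack[128];            /* operator stack, no heap allocation */
    size_t sp = 0, o = 0;
    const char *p = in;

    if (outlen == 0) return -1;

    while (*p != '\0') {
        if (isspace((unsigned char)*p)) {
            p++;
        } else if (is_operand_char(*p)) {
            /* Operands go straight to the output. */
            while (is_operand_char(*p))
                if (put(out, outlen, &o, *p++) < 0) return -1;
            if (put(out, outlen, &o, ' ') < 0) return -1;
        } else if (*p == '(') {
            if (sp == sizeof stack) return -1;
            stack[sp++] = *p++;
        } else if (*p == ')') {
            /* Pop operators until the matching '('. */
            while (sp > 0 && stack[sp - 1] != '(') {
                if (put(out, outlen, &o, stack[--sp]) < 0) return -1;
                if (put(out, outlen, &o, ' ') < 0) return -1;
            }
            if (sp == 0) return -1;     /* unmatched ')' */
            sp--;                       /* discard the '(' */
            p++;
        } else if (precedence(*p) > 0) {
            /* Left-associative: pop operators of equal or higher precedence. */
            while (sp > 0 && precedence(stack[sp - 1]) >= precedence(*p)) {
                if (put(out, outlen, &o, stack[--sp]) < 0) return -1;
                if (put(out, outlen, &o, ' ') < 0) return -1;
            }
            if (sp == sizeof stack) return -1;
            stack[sp++] = *p++;
        } else {
            return -1;                  /* unknown character */
        }
    }
    while (sp > 0) {
        if (stack[sp - 1] == '(') return -1;    /* unmatched '(' */
        if (put(out, outlen, &o, stack[--sp]) < 0) return -1;
        if (put(out, outlen, &o, ' ') < 0) return -1;
    }
    if (o > 0) o--;             /* drop the trailing space */
    out[o] = '\0';
    return 0;
}
```

Called with a caller-owned buffer, the sketch leaves all memory management to
the caller, which is one simple way to rule out leaks of the kind described.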
2015-12-30
* db37e115 src/utils/avail.c src/utils/native_avail.c: Added "-check" flag to
papi_avail and papi_native_avail to test counter availability/validity. This
patch updates the papi_avail and papi_native_avail utilities to use the
"-check" flag to test the actual availability of counters. There were
previously two different flags for this capability, papi_native_avail used
"-validate" and papi_avail used "-avail_test". Based on a mailing list
discussion these flags have been consolidated as "-check".
2015-12-29
* 72e0ffe8 src/components/lmsensors/linux-lmsensors.c: Fixed minor error with
multiple initializers for lmsensors_vector .default_granularity. Thanks to
William Cohen for the bug report.
2015-12-07
* ec3582d8 src/utils/avail.c: papi_avail to test actual availability of
counters using "papi_avail --avail-test". This problem and the associated
patch were detected and contributed by Harald Servat. Thanks. On an Intel(R)
Xeon(R) CPU E5-2660 v2 @ 2.20GHz system with PAPI 5.4.1 installed, the
papi_avail command indicates that both PAPI_LD_INS and PAPI_SR_INS are
available, however papi_event_chooser does not accept them (see below)
and returns -1. This problem can occur in kernels from version 3.1 through 4.1.
The kernel devs blocked all uses of the MEM_OPS events (including load and
store). The patch modifies papi_avail to test the counters to see if they
can be added.
papi_avail      # gets all PAPI counters
papi_avail -a   # gets all available PAPI counters
papi_avail -at  # shows all available PAPI counters that can be added
[Ptools-perfapi: Oct 14 2015]
2015-11-30
* 1fab922e src/components/libmsr/README src/components/libmsr/configure
src/components/libmsr/configure.in...: The libmsr component is updated to
match major changes in LLNL libmsr library and the LLNL msr-safe kernel
module
2015-11-18
* 242b16d3 src/Makefile.inc src/components/cuda/configure
src/components/cuda/configure.in...: Added papi_cuda_sampling utility in
/src/components/cuda/sampling. Changed src/Makefile.inc and
src/components/cuda/configure.in to build the utility during PAPI
installation. Added the -ldl switch in /src/components/cuda/tests/Makefile
because 3.10.0-229.14.1.el7.x86_64 had issues using libpapi.a during
compilation of cuda component test programs.
2015-10-21
* a10e8331 src/papi_events.csv: papi_events: add Intel Skylake presets. This
just shares all of the Broadwell events with Skylake. Some quick tests show
that this probably works. Someone with skylake hardware should validate this
at some point.
2015-10-08
* 91736851 src/papi_internal.c: Thanks to David Eberius of ICL for reporting
a bug in PAPI_get_event_info() in papi_internal.c, (info->component_index =
(unsigned int) cidx) was missing at line 2554, of papi_internal.c
2015-08-27
* 502df070 src/Makefile.inc: Thanks to Steve Kaufmann for reporting the
redundant () parameter in the OBJECTS expression of the src/Makefile.inc file.
Updated Makefile.inc by removing the redundant parameter.
2015-08-24
* 69fdc2e0 src/papi.c: Thanks to Harald Servat for reporting the
PAPI_overflow issue for multiple eventsets. The problem was in the
PAPI_start() function, in the if(is_dirty) branch at line 2166 of papi.c.
After update_control_state(), the overflow settings must be re-initialized
using set_overflow().
2015-07-29
* be81dc43 src/components/perf_event/perf_event.c: perf_event: update the ARM
domain workaround. Older ARM processors could not separate out KERNEL vs USER
events. ARMv7 starting with the Cortex A15 can, as can all ARMv8 (ARM64).
This updates the code with a whitelist to properly allow setting the domains.
* 43be2588 src/linux-common.c: linux-common: clean up ARM cpu detection.
Parsing cpuinfo is always a pain. Extra work because of Raspberry Pi
(ARM1176) lying and saying it's ARMv7 rather than ARMv6.
* 5a101a50 src/linux-common.c: linux-common: split up x86, power and arm
cpuinfo parsing
* 0d7772d9 src/linux-common.c: linux-common: clean up and comment the cpuinfo
parsing code
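The Raspberry Pi case above can be illustrated with a small sketch. This is
not the code in linux-common.c. It assumes the usual ARM /proc/cpuinfo fields
"CPU architecture" and "CPU part", and it assumes that the ARM1176 reports
part number 0xb76. The function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_ARM1176_PART 0xb76   /* assumed part number of the ARM1176 */

/* Return a pointer just past the ':' of the first line starting with key,
 * or NULL if no such line exists in the cpuinfo-style buffer. */
static const char *sketch_find_field(const char *cpuinfo, const char *key)
{
    const char *line = cpuinfo;
    size_t klen = strlen(key);

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, klen) == 0) {
            const char *colon = strchr(line, ':');
            const char *eol = strchr(line, '\n');
            /* Only accept a colon that belongs to this line. */
            if (colon != NULL && (eol == NULL || colon < eol))
                return colon + 1;
        }
        line = strchr(line, '\n');
        if (line != NULL) line++;
    }
    return NULL;
}

/*
 * Hypothetical helper: return the ARM architecture version from cpuinfo
 * text, correcting an ARM1176 that claims to be ARMv7 down to ARMv6.
 * Returns -1 if the architecture field is missing or not numeric.
 */
int sketch_arm_arch_version(const char *cpuinfo)
{
    const char *arch = sketch_find_field(cpuinfo, "CPU architecture");
    const char *part = sketch_find_field(cpuinfo, "CPU part");
    long version;

    if (arch == NULL) return -1;
    version = strtol(arch, NULL, 10);   /* skips leading whitespace */
    if (version <= 0) return -1;

    /* The part number is trusted over the self-reported architecture. */
    if (part != NULL && strtol(part, NULL, 16) == SKETCH_ARM1176_PART
        && version > 6)
        version = 6;
    return (int)version;
}
```

Keying on the part number rather than the architecture string is one way to
handle a core that misreports itself. Other cores would need their own entries.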
2015-07-16
* 59489b1f src/components/libmsr/Makefile.libmsr.in
src/components/libmsr/README src/components/libmsr/Rules.libmsr...: Create
libmsr component for reading power information and writing power constraints
using MSRs on some Intel processors. The PAPI libmsr component supports
measuring and capping power usage on recent Intel architectures using the
RAPL interface exposed through MSRs (model-specific registers). Lawrence
Livermore National Laboratory has released a library (libmsr) designed to
provide a simple, safe, consistent interface to several of the model-specific
registers (MSRs) in Intel processors. The problem is that permitting open
access to the MSRs on a machine can be a safety hazard, so access to MSRs is
usually limited. In order to encourage system administrators to give wider
access to the MSRs on a machine, LLNL has released a Linux kernel module
(msr_safe) which provides safer, white-listed access to the MSRs. PAPI has
created a libmsr component that can provide read and write access to the
information and controls exposed via the libmsr library. This PAPI component
introduces a new ability for PAPI; it is the first case where PAPI is writing
information to a counter as well as reading the data from the counter.
2015-07-13
* d326ecc9 src/components/perf_event/perf_event_lib.h src/papi_internal.c:
Thanks to Steve Kaufmann for providing a patch that increases the
PERF_EVENT_MAX_MPX_COUNTERS to 192 from 128 and enhances the corresponding
warning message in papi_internal.c
2015-06-29
* e829baa5 src/components/cuda/tests/Makefile
src/components/cuda/tests/cuda_ld_preload_example.README
src/components/cuda/tests/cuda_ld_preload_example.c: Example of using
LD_PRELOAD with the CUDA component. A short example of using LD_PRELOAD on a
Linux system to intercept function calls and PAPI-enable an un-instrumented
CUDA binary. Several CUDA events (e.g. SM PM counters) require a CUcontext
handle to be provided since they are context switched. This means that we
cannot use a PAPI_attach from an external process to measure those events in
a preexisting executable. These events can only be measured from within the
CUcontext, that is, within the CUDA enabled code we are trying to measure.
If the user is unable to change the source code, they may be able to use
LD_PRELOAD's ability to trap functions and measure the events from within the
executable. See src/components/cuda/tests/cuda_ld_preload_example.README for
details.
2015-06-26
* 0829a4f5 src/papi_events.csv: Add future broadwell-ep support. libpfm4
doesn't support it yet, but add it for when it appears.
2015-06-25
* 36c5b5b6 src/papi_events.csv: add broadwell predefined events For now they
are the same as Haswell, as that's what the Linux kernel does.
* f42eda64 src/papi_events.csv: Added definitions to Power8 for PAPI_SP_OPS,
PAPI_DP_OPS.
2015-06-18
* f87542f7 src/components/perf_event/tests/event_name_lib.c: Added the
[case 63: /*Haswell EP*/] line to the
src/components/perf_event/tests/event_name_lib.c file to support offcore
for Haswell EP
* fbfc641f src/components/perf_event_uncore/tests/perf_event_uncore_cbox.c
src/components/perf_event_uncore/tests/perf_event_uncore_lib.c: Added support
for Haswell-EP processor with model-63 in
src/components/perf_event_uncore/tests/perf_event_uncore_lib.c and
src/components/perf_event_uncore/tests/perf_event_uncore_cbox.c files. As a
result the perf_event_uncore, perf_event_uncore_multiple and
perf_event_uncore_cbox tests now pass. Tested and verified on Intel(R)
Xeon(R) CPU E5-2650 v3 @ 2.30GHz with linux kernel 4.0.4-1.el6.elrepo.x86_64
2015-06-17
* 56698211 src/components/lustre/linux-lustre.c: Thanks to Gary Mohr for the
patch that removes the error message (PAPI Error: Error Code -7,Event does
not exist) on executing papi_native_avail in PAPI built with lustre component
2015-06-16
* 1b9fd867 src/components/rapl/linux-rapl.c: rapl: allow DRAM to have
separate scaling factor from CPU. On Haswell-EP the DRAM scaling value is
different and cannot be detected. See https://lkml.org/lkml/2015/3/20/582
* 1aa74f85 src/components/rapl/linux-rapl.c: rapl: add support for Broadwell
2015-06-11
* a5ecda79 src/components/rapl/linux-rapl.c: Thanks to William Cohen for the
patch which does the following: Checking the cpu family and model number is
not sufficient to determine whether RAPL can be used. If PAPI is running
inside a guest VM, the MSR used by the PAPI RAPL component may not be
available. There should be a simple read test to verify the RAPL MSR
registers are available. This allows the component to more clearly report
that RAPL is unsupported rather than just exiting program when the RAPL
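A simple read test of the kind described might look like the sketch below. It
is not the patch itself. It assumes the Linux msr driver interface, where the
device node for a CPU (for example /dev/cpu/0/msr) is read 8 bytes at a time
at an offset equal to the register number, and it uses 0x606 as the RAPL
power-unit register. The function name is hypothetical.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_MSR_RAPL_POWER_UNIT 0x606

/*
 * Hypothetical probe: try to read one 64-bit MSR through msr_path.
 * Returns 0 and stores the value on success. Returns -1 if the device
 * cannot be opened or the read fails, which is what one would expect
 * inside a guest VM that does not expose the register.
 */
int sketch_probe_msr(const char *msr_path, off_t reg, uint64_t *value)
{
    uint64_t data = 0;
    ssize_t n;
    int fd = open(msr_path, O_RDONLY);

    if (fd < 0)
        return -1;
    n = pread(fd, &data, sizeof data, reg);
    close(fd);
    if (n != (ssize_t)sizeof data)
        return -1;              /* short read or error: MSR not usable */
    if (value != NULL)
        *value = data;
    return 0;
}
```

A component could call such a probe once at initialization and report itself
as disabled, with a clear reason, when the probe fails.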
2015-05-19
* 54c45107 src/components/rapl/utils/rapl_plot.c: Updated rapl_plot utility
so that the correct values/units are reported (e.g. scaled and fixed value
counts should not be converted)
2015-05-04
* a34fbc62 src/papi_events.csv: papi_events.csv: typo in the ARM Cortex A53
definitions
2015-04-30
* caa3af72 src/papi_events.csv: papi_events.csv: add preset events for ARM
Cortex A53 This is based purely on the names in the libpfm4 output, these
were not validated in any way.
2015-04-20
* 66553715 INSTALL.txt: added compile incantation for compiling programs that
offload code to MIC
2015-04-16
* 8914dcfc src/papi_events.csv: Bug reported by William Cohen in
papi_events.csv for the event PAPI_L1_TCM
2015-03-31
* 023af5ec src/components/nvml/configure: Updated the NVML configure script
which requires an autoconf and an updated configure script
* 2385c1b2 src/components/nvml/Makefile.nvml.in
src/components/nvml/Rules.nvml src/components/nvml/configure.in: Updated the
NVML configure script to allow separate include and library paths
2015-03-30
* 3d509095 src/components/infiniband_umad/linux-infiniband_umad.c: Bugfix
linux-infiniband_umad.c to include linux-infiniband_umad.h rather than
linux-infiniband.h. Thanks to Aurelien Bouteiller for pointing out this bug.
* b865f227 src/components/vmware/vmware.c: Corrected function name in
_vmware_vector from _vmware_init to _vmware_init_thread.
2015-03-24
* 2f58a4d8 src/configure: Regenerated configure to match the PAPI_GRN_SYS
patch
* 12e6ef31 src/components/perf_event/tests/perf_event_system_wide.c: Support
PAPI_GRN_SYS granularity for perf component, updating the system wide test
(patch 2 of 2). Thanks to William Cohen for this patch and the documentation.
Make sure that a sane cpu number is selected with PAPI_GRN_SYS. Corrections
to output and comments of perf_event_system_wide.c test.
* 42879693 src/components/perf_event/perf_event.c
src/components/perf_event/tests/perf_event_system_wide.c src/config.h.in...:
Support PAPI_GRN_SYS granularity for perf component, picking a sane CPU
number (patch 1 of 2). Thanks to William Cohen for this patch and the
documentation. The checks in the perf_event_open syscall cause a failure when
both pid=-1 and cpu=-1. The perf_event component was passing in pid=-1 and
cpu=-1 when PAPI_GRN_SYS was selected. If possible, the code should pick the
current processor that the command is running on so that the permission check
works properly when PAPI_GRN_SYS is used. The patch also makes the test fail
if PAPI_GRN_SYS is unable to add PAPI_TOT_CYC.
* 0ab9b0c8 src/ctests/krentel_pthreads.c: Added call to unregister the
overflow handler, plus small code cleanup
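The CPU selection described for PAPI_GRN_SYS can be sketched as a small pure
function. This is not the code in perf_event.c. The function name is
hypothetical, and falling back to CPU 0 when the current CPU is unknown is an
assumption. The caller is expected to supply the current CPU number, for
example from sched_getcpu() on Linux.

```c
/*
 * Hypothetical helper: choose the cpu argument to pass to perf_event_open().
 * The syscall rejects the combination pid == -1 and cpu == -1, so in that
 * case a concrete CPU is substituted. Any other combination is passed
 * through unchanged.
 */
int sketch_pick_cpu(int pid, int cpu, int current_cpu)
{
    if (pid == -1 && cpu == -1) {
        /* Prefer the CPU the caller is running on. CPU 0 is an assumed
         * fallback when the current CPU could not be determined. */
        return current_cpu >= 0 ? current_cpu : 0;
    }
    return cpu;
}
```

Keeping the decision separate from the syscall makes the rule easy to test
without needing permission to open system-wide events.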
2015-03-05
* d886c49c src/papi.c src/papi_libpfm4_events.c src/utils/avail.c: Clean
output from papi_avail tools when there are no user defined events. Thanks to
Gary Mohr for this patch. The changes in this patch improve the output from
the papi_avail tool. It was printing the user defined events header and a
PAPI Error message when no user defined events existed. These changes add
code in the enum call to return an error when trying to fetch the first user
defined event if no user events are defined. This allows the application to
detect that no user events are known and skip printing the user defined event
heading. It also prevents the application from calling PAPI_get_event_info
with a user defined event code that does not exist which avoids the PAPI
Error message. Also a one line change to modify a debug message type to make
the debug output produced by papi_libpfm4_events.c consistent.
2015-03-03
* ee0c58d7 src/components/cuda/linux-cuda.c: Do not generate an error if the
CUDA libraries cannot be loaded, just write a debug message
* 08bb9bf0 src/configure: Updating the number to 5.4.1
2015-03-02
* 01f742c1 release_procedure.txt: Minor change to specify locations of some
files