Batch mode
Since version 7.9 of the Java analyzer, source files are parsed in batches of a given size rather than file by file, as was the case in previous versions. Note that this applies only to parsing: the analysis (rule execution) is still done on a file-by-file basis.
The batch size is expressed in kilobytes (KB). The analyzer consumes files until this size is reached, parses them as one batch, and repeats the operation until all files have been processed.
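The grouping described above can be pictured with a minimal sketch. This is not the analyzer's actual code: the class and method names (`BatchGroupingSketch`, `group`) are hypothetical, and the choice to close a batch as soon as the cumulative size reaches the limit (so the file that crosses the threshold stays in the current batch) is an assumption about what "until the size is reached" means.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Java analyzer.
final class BatchGroupingSketch {

  /**
   * Groups files (represented by their sizes in KB) into batches.
   * Assumption: a batch is closed once its cumulative size reaches batchSizeKB.
   */
  static List<List<Long>> group(List<Long> fileSizesKB, long batchSizeKB) {
    List<List<Long>> batches = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long currentSizeKB = 0;
    for (long sizeKB : fileSizesKB) {
      current.add(sizeKB);
      currentSizeKB += sizeKB;
      if (currentSizeKB >= batchSizeKB) {
        // Size reached: this batch would now be parsed, then a new one started.
        batches.add(current);
        current = new ArrayList<>();
        currentSizeKB = 0;
      }
    }
    if (!current.isEmpty()) {
      // Remaining files form the last, possibly smaller, batch.
      batches.add(current);
    }
    return batches;
  }

  public static void main(String[] args) {
    System.out.println(group(List.of(40L, 30L, 50L, 10L, 20L), 100));
  }
}
```

With a limit of 100 KB and files of 40, 30, 50, 10 and 20 KB, this sketch produces two batches: `[40, 30, 50]` and `[10, 20]`.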
By default, an ideal batch size is computed dynamically, based on the total memory available. More precisely, the size is equal to 0.005% of the maximum memory (as configured through -Xmx). The dynamically computed size is capped at 500KB.
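As a worked example of that formula, here is a small sketch. The names (`BatchSizeSketch`, `computeBatchSizeKB`) are hypothetical, and reading the maximum memory through `Runtime.getRuntime().maxMemory()` and rounding down to a whole number of KB are assumptions, not a description of the analyzer's implementation.

```java
// Hypothetical illustration only; not part of the Java analyzer.
final class BatchSizeSketch {

  // Cap on the dynamically computed size, as stated in the text.
  static final long MAX_BATCH_SIZE_KB = 500;

  /** Returns 0.005% of the given maximum memory, in KB, capped at 500 KB. */
  static long computeBatchSizeKB(long maxMemoryBytes) {
    long maxMemoryKB = maxMemoryBytes / 1024;
    // 0.005% = 5 / 100_000; integer arithmetic rounds down (an assumption).
    long computedKB = maxMemoryKB * 5 / 100_000;
    return Math.min(computedKB, MAX_BATCH_SIZE_KB);
  }

  public static void main(String[] args) {
    // Runtime.maxMemory() reports the heap limit of the running JVM (set with -Xmx).
    long maxMemoryBytes = Runtime.getRuntime().maxMemory();
    System.out.println(computeBatchSizeKB(maxMemoryBytes) + " KB");
  }
}
```

Under these assumptions, a 2 GB heap gives a batch size of 104 KB, and the 500 KB cap is reached at roughly 10 GB of heap.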
The dynamic computation of the batch size is deliberately conservative: we have observed empirically that the performance benefits are already visible with a relatively small batch size, without taking any risk on the memory side. In certain situations, you may want to set the batch size manually.
You can do this by using the property sonar.java.experimental.batchModeSizeInKB.
Note that the ideal value depends on the project and the ecosystem setup. A bigger batch size will not necessarily improve performance, and can even slow things down if memory is limited.
When the batch size is set to -1, the parsing is done in a single batch, regardless of the available memory or the size of the project. This value is meant for internal use only and should not be used.
sonar.java.internal.batchMode is deprecated and should not be used for batch-mode-related configuration.
In certain situations, you may not want to run the analysis in batch mode.
You can do this by setting the property sonar.java.fileByFile=true.
Apart from memory usage and speed, batch mode and file-by-file mode should yield the same analysis results. Still, we have identified a small difference when the project is misconfigured (missing dependencies, source files not compiled, missing properties, ...).
The main benefit of batch mode is that it avoids computing the semantics of dependencies again and again. This means that the bigger the project, the bigger the possible performance gain, especially if the project has many dependencies. A secondary benefit can be observed when semantic information is missing (incorrectly configured projects): when multiple files are in the same batch, we can improve our partial knowledge about types.
Batch mode is unrelated to parallel execution. Files are still parsed sequentially; the performance improvements come from the fact that we can partially reuse the semantics already computed.
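The reuse argument can be illustrated with a toy model. This is a sketch under loud assumptions, not how the analyzer resolves types. Each file is reduced to the list of dependency types it references, a per-batch map stands in for "the semantics already computed", and all names (`SemanticReuseSketch`, `countResolutions`) are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model only; not part of the Java analyzer.
final class SemanticReuseSketch {

  /**
   * Counts how many times a type has to be resolved from scratch.
   * Each batch is a list of files; each file is a list of referenced type names.
   * Assumption: resolved types are shared within a batch, never across batches.
   */
  static int countResolutions(List<List<List<String>>> batches) {
    int[] resolutions = {0};
    for (List<List<String>> batch : batches) {
      Map<String, String> resolvedInBatch = new HashMap<>();
      // Files are still processed one after the other (no parallelism).
      for (List<String> referencedTypes : batch) {
        for (String type : referencedTypes) {
          resolvedInBatch.computeIfAbsent(type, name -> {
            resolutions[0]++; // costly resolution happens only on a cache miss
            return "resolved:" + name;
          });
        }
      }
    }
    return resolutions[0];
  }

  public static void main(String[] args) {
    List<String> a = List.of("java.util.List", "java.util.Map");
    List<String> b = List.of("java.util.List", "java.util.Optional");
    List<String> c = List.of("java.util.Map", "java.util.Optional");
    // File by file: every file is its own batch.
    System.out.println(countResolutions(List.of(List.of(a), List.of(b), List.of(c))));
    // Batch mode: the three files share one batch.
    System.out.println(countResolutions(List.of(List.of(a, b, c))));
  }
}
```

In this toy model, processing the three files one by one resolves 6 types, while processing them in a single batch resolves only 3, because each shared type is resolved once.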