In addition to the unit tests designed to verify correctness, Ginkgo also includes an extensive benchmark suite for checking its performance on all Ginkgo-supported systems. The purpose of Ginkgo's benchmarking suite is to allow easy and complete reproduction of Ginkgo's performance, and to facilitate performance debugging as well. Most results published in Ginkgo papers are generated with this benchmarking suite and are accessible online in the ginkgo-data repository. These results can also be used for performance comparison, to ensure that you get performance similar to what is published in that repository.
To compile the benchmarks, the flag -DGINKGO_BUILD_BENCHMARKS=ON has to be set during the cmake step. In addition, the ssget command-line utility has to be installed on the system. The purpose of this file is to explain in detail the capabilities of this benchmarking suite as well as how to properly set everything up.
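As a sketch, a typical configuration enabling the benchmarks could look as follows (the build directory name and any additional flags are assumptions; adapt them to your setup):

```bash
mkdir build && cd build
# Enable the benchmark suite in addition to the regular build.
cmake -DGINKGO_BUILD_BENCHMARKS=ON ..
make -j
```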
There are two ways to benchmark Ginkgo. When compiling the benchmark suite, executables are generated for collecting matrix statistics and for running sparse matrix-vector product (SpMV) and solver (possibly distributed) benchmarks. Another way to run benchmarks is through the convenience script run_all_benchmarks.sh, but not all features are exposed through this tool!
Here is a short description of the content of this file:
Before benchmarking Ginkgo, make sure that you follow the general guidelines in order to ensure best performance.
In addition, the following specific options can be considered:
To benchmark Ginkgo, matrices need to be provided as input in the Matrix Market format. A convenient way is to run the benchmarks with the SuiteSparse matrix collection. A helper tool, the ssget command-line utility, can be used to facilitate downloading and extracting matrices from the SuiteSparse collection. When running the benchmarks with the helper script run_all_benchmarks.sh (or calling make benchmark), the ssget tool is required.
To install ssget, access the repository and copy the file ssget into a directory present in your PATH variable, as per the tool's README.md instructions. The tool can be installed either in a global system path or in a local directory such as $HOME/.local/bin. After installing the tool, it is important to review the ssget script and configure the variable ARCHIVE_LOCATION on line 39 as needed. This is where the matrices will be stored.
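As a sketch, assuming the ssget repository is cloned from its GitHub location, a user-local installation could look like this:

```bash
# Clone the ssget repository and copy the script into a directory on PATH.
git clone https://github.com/ginkgo-project/ssget.git
mkdir -p "$HOME/.local/bin"
cp ssget/ssget "$HOME/.local/bin/"
chmod +x "$HOME/.local/bin/ssget"
# Afterwards, edit ARCHIVE_LOCATION on line 39 of the installed script.
```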
The Ginkgo benchmarks can be set to run on only a portion of the SuiteSparse matrix collection, as we will see in the following section. Please note that the entire collection requires roughly 100GB of disk storage in its compressed format, and roughly 25GB of additional disk space for intermediate data (such as uncompressing the archive). Additionally, the benchmark runs usually take a long time (SpMV benchmarks on the complete collection take roughly 24h on a K20 GPU) and will stress the system.
Before proceeding, it can be useful, in order to save time, to download the matrices ahead of time. This can be done with the ssget -f -i i command, where i is the ID of the matrix to be downloaded. A loop along the following lines downloads the full SuiteSparse matrix collection:
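This sketch assumes that ssget -n prints the total number of matrices in the collection, as described in the ssget documentation:

```bash
for i in $(seq 1 $(ssget -n)); do
    ssget -f -i ${i}   # download the matrix with ID ${i} into ARCHIVE_LOCATION
done
```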
Note that ssget can also be used to query properties of the matrices and to filter which matrices are downloaded. For example, the following will download only positive definite matrices with fewer than 500M nonzero elements and fewer than 10M columns. Please refer to the ssget documentation for more information.
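The sketch below illustrates such a filter; the property names posdef, cols, and nonzeros follow the SuiteSparse collection metadata, so check the ssget documentation for the exact spelling:

```bash
for i in $(seq 1 $(ssget -n)); do
    posdef=$(ssget -p posdef -i ${i})
    cols=$(ssget -p cols -i ${i})
    nonzeros=$(ssget -p nonzeros -i ${i})
    # Only download positive definite matrices below the size limits.
    if [ "${posdef}" -eq 1 ] && [ "${cols}" -lt 10000000 ] \
        && [ "${nonzeros}" -lt 500000000 ]; then
        ssget -f -i ${i}
    fi
done
```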
For extracting the matrices, ssget -f -i ${i} can be used.
When compiling Ginkgo with the flag -DGINKGO_BUILD_BENCHMARKS=ON, a suite of executables will be generated depending on the CMake configuration. These executables are the backbone of the benchmarking suite. Note that all of these executables describe the available options and the required input format when run with the --help option. All executables come in multiple variants depending on the precision: by default, double precision is used for the value type, but variants with single precision and complex (single and double precision) value types are also available. Here is a non-exhaustive list of the available benchmarks:
Optionally when compiling with MPI support:
All benchmarks require input data in JSON format. The JSON file has to consist of exactly one array, within which the test cases are defined. The exact syntax can change between executables; the --help option explains the necessary JSON input format. For example, for the spmv benchmark, as for many other benchmarks, the following minimal input should be provided:
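A minimal sketch, assuming a single test case and a placeholder matrix path:

```bash
# Each test case is an object inside the top-level JSON array.
cat > input.json <<'EOF'
[
    {"filename": "/path/to/your/matrix.mtx"}
]
EOF
```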
The files have to be in Matrix Market format.
Some benchmarks require extra fields. For example, the solver benchmarks require the field "optimal": {"spmv": "matrix format (such as csr)"}. This field is automatically populated when running the spmv benchmark, which finds the optimal (fastest) format among all requested formats.
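A solver input with this field filled in manually could look as follows (the matrix path is a placeholder, and csr is just one possible format):

```bash
cat > solver_input.json <<'EOF'
[
    {"filename": "/path/to/your/matrix.mtx", "optimal": {"spmv": "csr"}}
]
EOF
```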
After writing the necessary data to a JSON file, the benchmark can be called by passing the input via stdin, as shown below.
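For example, assuming the spmv executable is built under the benchmark/ directory of the build tree (following the same benchmark/solver/solver path pattern mentioned in the options section below):

```bash
${ginkgo_build_dir}/benchmark/spmv/spmv < input.json
```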
The output of our benchmarks is again JSON, printed to stdout, while status messages are printed to stderr. The output can thus be stored as follows.
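For example, following the same path convention as above:

```bash
# JSON results go to output.json; progress and status messages remain on stderr.
${ginkgo_build_dir}/benchmark/spmv/spmv < input.json > output.json
```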
Note that in most cases, the JSON output of one benchmark is valid input to another. It is therefore possible to first call the spmv benchmark, use the resulting JSON output as input to the solver benchmark, and finally use the resulting solver JSON output as input to the preconditioner benchmark.
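Such a pipeline might look like the following sketch (the executable paths follow the same convention as above and are assumptions):

```bash
${ginkgo_build_dir}/benchmark/spmv/spmv < input.json > spmv.json
${ginkgo_build_dir}/benchmark/solver/solver < spmv.json > solver.json
${ginkgo_build_dir}/benchmark/preconditioner/preconditioner < solver.json > precond.json
```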
The benchmark suite is invoked using the make benchmark command in the build directory. Under the hood, this command simply calls the script benchmark/run_all_benchmarks.sh, so it is possible to launch this script manually as well. The behavior of the suite can be modified using environment variables. Assuming the bash shell is used, these can either be specified via the export command to persist between multiple runs:
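For example, using the SYSTEM_NAME variable mentioned below (the value V100 is only an illustration):

```bash
export SYSTEM_NAME=V100
make benchmark
```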
or specified on the fly, on the same line as the make benchmark command:
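for example:

```bash
env SYSTEM_NAME=V100 make benchmark
```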
Since make sets any variables passed to it as temporary environment variables, the following shorthand can also be used:
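for example:

```bash
make benchmark SYSTEM_NAME=V100
```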
A combination of the above approaches is also possible (e.g. it may be useful to export the SYSTEM_NAME variable, and specify the others at every benchmark run).
The benchmark suite accepts a number of configuration parameters. Benchmarks can be run only for sparse matrix-vector products (spmv), for full solvers (with or without preconditioners), or for preconditioners only, when supported. The benchmark suite also allows targeting a subset of the SuiteSparse matrix collection. For details, see the available benchmark options below. Here are the most important options:
MATRIX_LIST_FILE=/path/to/matrix_list.file - lists the SuiteSparse matrix IDs or names to benchmark. As an example, a matrix list file containing the following entries will ensure that benchmarks are run for only those three matrices:
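The entries below are illustrative; any mixture of SuiteSparse IDs and matrix names works:

```
1903
Freescale/circuit5M
thermal2
```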
The previous experiments generate a JSON file for each matrix, containing timings, iteration counts, achieved precision, ..., depending on the type of benchmark run. These files are available in the directory ${ginkgo_build_dir}/benchmark/results/ and can be analyzed and processed with any tool (e.g. Python). In this section, we describe how to generate plots using Ginkgo's GPE tool. First, we need to publish the experiments to a GitHub repository, which will then be linked as source input to the GPE. For this, we can simply fork the ginkgo-data repository using the forking interface at https://github.com/ginkgo-project/ginkgo-data/
Once this is done, we want to clone the repository locally, put all results online, and access the GPE to plot the results. Here are the detailed steps:
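A hypothetical outline of these steps, assuming a fork at github.com/<username>/ginkgo-data and results generated in ${ginkgo_build_dir}/benchmark/results/, could be:

```bash
# Clone the fork and add the freshly generated results.
git clone https://github.com/<username>/ginkgo-data.git
cd ginkgo-data/data
cp -r ${ginkgo_build_dir}/benchmark/results/* .
# Run the repository's build-list/agregate scripts as described below, then publish.
git add . && git commit -m "Add new benchmark results" && git push
```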
Note that depending on which data is of interest, you may need to update the scripts build-list or agregate to change which files are agglomerated and summarized (depending on the system name), or which data is selected (solver results, spmv results, ...).
For generating the plots in the GPE, here are the steps to go through:
A detailed performance analysis can be run by passing the environment variable DETAILED=1 to the benchmarking script. This detailed run is available for solvers and logs the internal residual after every iteration as well as the time taken by all operations. These features are also available in the performance-debugging example, which can be used instead and modified as needed to analyze Ginkgo's performance.
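For example (BENCHMARK=solver selecting the solver benchmarks is an assumption about the script's options; see the list of environment variables at the end of this file):

```bash
DETAILED=1 BENCHMARK=solver make benchmark
```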
These features are implemented thanks to the loggers located in the file ${ginkgo_src_dir}/benchmark/utils/loggers.hpp. Ginkgo provides hooks at all important code locations, which can be inspected through these loggers. In this fashion, the loggers can easily also be used to track memory allocation sizes and other important library aspects.
A fixed set of options is available for benchmarking. The most important options can be configured through the benchmarking script itself via environment variables. Some specific options are not available through the benchmarking scripts, but can be configured directly when running the benchmarking executables themselves. For a list of all options, run, for example, ${ginkgo_build_dir}/benchmark/solver/solver --help.
The supported environment variables are described in the following list: