StarPU Handbook
StarPU can use Simgrid in order to simulate execution on an arbitrary platform.
There are a few technical details which need to be handled for an application to be simulated through Simgrid.
If the application uses gettimeofday to make its performance measurements, the real time will be used, which is meaningless in a simulation. To get the simulated time, it has to use starpu_timing_now(), which returns the virtual timestamp in microseconds.
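As a minimal sketch of the point above, the fragment below measures a region with starpu_timing_now() rather than gettimeofday. The macro SKETCH_WITH_STARPU and the helpers elapsed_ms() and measure_ms() are hypothetical names introduced only for this illustration. When the macro is not defined, a fake clock stands in for StarPU so that the fragment compiles on its own.

```c
#ifdef SKETCH_WITH_STARPU            /* hypothetical macro: build against StarPU */
#include <starpu.h>                  /* declares starpu_timing_now() */
#else
/* Stand-in for illustration only: a fake clock advancing 1000 us per call. */
static double starpu_timing_now(void)
{
    static double fake_us = 0.0;
    fake_us += 1000.0;
    return fake_us;
}
#endif

/* Hypothetical helper: elapsed milliseconds between two microsecond timestamps. */
static double elapsed_ms(double start_us, double end_us)
{
    return (end_us - start_us) / 1000.0;
}

/* Hypothetical helper: time a region with the StarPU clock, which is the
 * virtual clock when StarPU is built with simgrid support. */
static double measure_ms(void)
{
    double start = starpu_timing_now();
    /* ... submit tasks and wait for them here, e.g. starpu_task_wait_for_all() ... */
    double end = starpu_timing_now();
    return elapsed_ms(start, end);
}
```

The same measurement code can then be kept for native and simulated runs, since only the clock source differs.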
For technical reasons, the application's .c file which contains main() has to be recompiled with starpu_simgrid_wrap.h, which in the simgrid case will #define main() to starpu_main(); libstarpu then provides the real main() and calls the application's main().
To be able to test with very large data sizes, one may want to only allocate application data if STARPU_SIMGRID is not defined. Passing a NULL pointer to starpu_data_register functions is fine: data will never be read or written by StarPU in Simgrid mode anyway.
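The conditional allocation described above could look like the following sketch. The helper alloc_app_vector() is a hypothetical name. The sketch assumes that STARPU_SIMGRID is made visible by including starpu.h when StarPU was configured with --enable-simgrid; the registration call is shown only as a comment so that the fragment compiles without StarPU.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns a buffer of n floats in a native build, and
 * NULL in a simgrid build, where StarPU never reads or writes the data. */
static float *alloc_app_vector(size_t n)
{
#ifdef STARPU_SIMGRID
    (void)n;                 /* size is only used for registration */
    return NULL;
#else
    return malloc(n * sizeof(float));
#endif
}

/* In a StarPU program, the pointer (NULL or not) would then be registered
 * as usual, for instance (assuming a recent StarPU providing STARPU_MAIN_RAM):
 *
 *   starpu_vector_data_register(&handle, STARPU_MAIN_RAM,
 *                               (uintptr_t)v, n, sizeof(float));
 */
```

Since free(NULL) is a no-op, the release path does not need a separate simgrid case.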
To be able to run the application with e.g. CUDA simulation on a system which does not have CUDA installed, one can fill the cuda_funcs with (void*)1, to express that there is a CUDA implementation, even if one does not actually provide it. StarPU will never actually run it in Simgrid mode anyway.
The idea is to first compile StarPU normally, and run the application, so as to automatically benchmark the bus and the codelets.
$ ./configure && make
$ STARPU_SCHED=dmda ./examples/matvecmult/matvecmult
[starpu][_starpu_load_history_based_model] Warning: model matvecmult is not calibrated, forcing calibration for this run. Use the STARPU_CALIBRATE environment variable to control this.
$ ...
$ STARPU_SCHED=dmda ./examples/matvecmult/matvecmult
TEST PASSED
Note that we force the use of the dmda scheduler in order to generate performance models for the application. The application may need to be run several times before the model is calibrated.
Then, recompile StarPU, passing --enable-simgrid to ./configure.
$ ./configure --enable-simgrid
To specify the location of SimGrid, you can either set the environment variables SIMGRID_CFLAGS and SIMGRID_LIBS, or use the configure options --with-simgrid-dir, --with-simgrid-include-dir and --with-simgrid-lib-dir, for example:
$ ./configure --with-simgrid-dir=/opt/local/simgrid
You can then re-run the application.
$ make
$ STARPU_SCHED=dmda ./examples/matvecmult/matvecmult
TEST FAILED !!!
It is normal that the test fails: since the computations are not actually performed (that is the whole point of simgrid), the result is necessarily wrong.
If the performance model is not calibrated enough, the following error message will be displayed:
$ STARPU_SCHED=dmda ./examples/matvecmult/matvecmult
[starpu][_starpu_load_history_based_model] Warning: model matvecmult is not calibrated, forcing calibration for this run. Use the STARPU_CALIBRATE environment variable to control this.
[starpu][_starpu_simgrid_execute_job][assert failure] Codelet matvecmult does not have a perfmodel, or is not calibrated enough
The number of devices can be chosen as usual with STARPU_NCPU, STARPU_NCUDA, and STARPU_NOPENCL, and the amount of GPU memory with STARPU_LIMIT_CUDA_MEM, STARPU_LIMIT_CUDA_devid_MEM, STARPU_LIMIT_OPENCL_MEM, and STARPU_LIMIT_OPENCL_devid_MEM.
The simgrid support even makes it possible to run simulations on another machine, typically your desktop. To achieve this, one still needs to perform the Calibration step on the actual machine to be simulated, then copy the resulting performance models (the $STARPU_HOME/.starpu directory) to the desktop machine. One can then perform the Simulation step on the desktop machine, setting the environment variable STARPU_HOSTNAME to the name of the actual machine, so that StarPU uses the performance models of the simulated machine even on the desktop machine.
If the desktop machine does not have CUDA or OpenCL, StarPU is still able to use simgrid to simulate execution with CUDA/OpenCL devices, but the application source code will probably disable the CUDA and OpenCL codelets in that case. Since the functions of the codelet are not actually called during simgrid execution, one can use dummy functions such as the following to still permit CUDA or OpenCL execution:
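The listing that belongs here is missing from this copy of the text, so what follows is only a sketch of the idea, not the handbook's own example. To compile without StarPU it uses a hypothetical stand-in, struct codelet_sketch, that mirrors the cpu_funcs, cuda_funcs and opencl_funcs arrays of struct starpu_codelet. The names kernel_func_t, DUMMY_KERNEL, scal_cpu, scal_cl and has_cuda_impl are likewise invented for the illustration. In a real StarPU program the placeholder would be written (void*)1 in the codelet's cuda_funcs, as described earlier.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as a StarPU kernel: void func(void *buffers[], void *cl_arg). */
typedef void (*kernel_func_t)(void *buffers[], void *cl_arg);

/* Hypothetical stand-in for the relevant fields of struct starpu_codelet. */
struct codelet_sketch
{
    kernel_func_t cpu_funcs[4];
    kernel_func_t cuda_funcs[4];
    kernel_func_t opencl_funcs[4];
};

/* Non-NULL placeholder playing the role of (void*)1: it only signals that an
 * implementation exists, and must never be called. */
#define DUMMY_KERNEL ((kernel_func_t)(uintptr_t)1)

/* CPU kernel; a real application would do its computation here. */
static void scal_cpu(void *buffers[], void *cl_arg)
{
    (void)buffers;
    (void)cl_arg;
}

static struct codelet_sketch scal_cl =
{
    .cpu_funcs    = { scal_cpu },
    .cuda_funcs   = { DUMMY_KERNEL },   /* no CUDA toolkit needed to build */
    .opencl_funcs = { DUMMY_KERNEL },   /* no OpenCL needed either */
};

/* Hypothetical helper: does the codelet advertise a CUDA implementation? */
static int has_cuda_impl(const struct codelet_sketch *cl)
{
    return cl->cuda_funcs[0] != NULL;
}
```

The placeholder is safe only because the codelet functions are never invoked under simgrid; in a native build it would typically be guarded by a preprocessor test on STARPU_SIMGRID.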
By default, simgrid uses its own implementation of threads, which prevents gdb from being able to inspect stacks of all threads. To be able to fully debug an application running with simgrid, pass the --cfg=contexts/factory:thread option to the application, to make simgrid use system threads, which gdb will be able to manipulate as usual.