There are three methods of finding particle haloes in yt. The default method is HOP, described in Eisenstein and Hut (1998). A basic friends-of-friends halo finder (e.g. Efstathiou et al. (1985)) is also implemented. Finally, Rockstar (Behroozi et al. (2011)) is a 6D phase-space halo finder developed by Peter Behroozi that excels at finding subhalos and substructure, but does not allow multiple particle masses.
The version of HOP used in yt is an upgraded version of the publicly available HOP code. Support for 64-bit floats and integers has been added, as well as parallel analysis through spatial decomposition. HOP builds groups in this fashion:
Please see the HOP method paper for full details, as well as the HOPHalo and Halo classes.
Rockstar uses an adaptive hierarchical refinement of friends-of-friends groups in six phase-space dimensions and one time dimension, which allows for robust (grid-independent, shape-independent, and noise-resilient) tracking of substructure. The code is prepackaged with yt, but is also available separately. The lead developer is Peter Behroozi, and the methods are described in Behroozi et al. (2011). In order to run the Rockstar halo finder in yt, make sure you have installed it so that it can integrate with yt.
At the moment, Rockstar does not support multiple particle masses, instead using a fixed particle mass. This will not affect most dark matter simulations, but does make it less useful for finding halos from the stellar mass. In simulations where the highest-resolution particles all have the same mass (i.e. zoom-in grid-based simulations), one can set up a particle filter to select the lowest-mass particles and perform the halo finding only on those. See this cookbook recipe for an example: Running Rockstar to Find Halos on Multi-Resolution-Particle Datasets.
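The selection criterion behind such a filter can be sketched in plain Python. This is not yt's particle-filter API; the function name, the tolerance, and the mass values are all hypothetical and only illustrate "keep the particles whose mass equals the smallest mass present".

```python
def lowest_mass_indices(masses, rel_tol=1e-6):
    """Return the indices of particles whose mass matches the smallest mass.

    rel_tol is an assumed relative tolerance that guards against
    floating-point round-off in the stored masses.
    """
    m_min = min(masses)
    return [i for i, m in enumerate(masses) if abs(m - m_min) <= rel_tol * m_min]


# Hypothetical zoom-in run with three resolution levels (arbitrary mass units).
particle_masses = [8.0, 1.0, 1.0, 64.0, 1.0, 8.0]
high_res = lowest_mass_indices(particle_masses)
print(high_res)  # [1, 2, 4]
```

Only the particles at the returned indices would then be handed to the halo finder, so that the fixed-particle-mass assumption holds.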
To run the Rockstar halo finder, you must launch Python with MPI and parallelization enabled. While Rockstar itself does not require MPI to run, the MPI libraries allow yt to distribute particle information across multiple nodes.
Warning
At the moment, running Rockstar inside of yt on multiple compute nodes connected by an Infiniband network can be problematic. Therefore, for now we recommend forcing the use of the non-Infiniband network (e.g. Ethernet) using this flag: --mca btl ^openib.
For example, here is how Rockstar might be called using 24 cores:
mpirun -n 24 --mca btl ^openib python ./run_rockstar.py --parallel
The script above configures the halo finder and launches a server process, which disseminates run information and coordinates the reader and writer processes. It then launches reader and writer tasks, filling the available MPI slots, which alternately read particle information and analyze it for halo content.
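As a rough illustration of how the MPI slots might be divided, the following sketch applies the documented default for num_writers (MPI_TASKS-num_readers-1). The function name is hypothetical, and the check that every slot is used is an assumption based on the description above, not a statement about yt's internals.

```python
def rockstar_task_layout(mpi_tasks, num_readers=1, num_writers=None):
    """Split MPI slots into one server plus reader and writer tasks."""
    if num_writers is None:
        # Documented default: whatever is left after the server and readers.
        num_writers = mpi_tasks - num_readers - 1
    if num_readers < 1 or num_writers < 1:
        raise ValueError("need at least one reader and one writer")
    # Assumption: the server, readers, and writers fill all MPI slots.
    if 1 + num_readers + num_writers != mpi_tasks:
        raise ValueError("server + readers + writers must equal mpi_tasks")
    return {"server": 1, "readers": num_readers, "writers": num_writers}


layout = rockstar_task_layout(24)
print(layout)  # {'server': 1, 'readers': 1, 'writers': 22}
```

With the 24-core command line shown earlier and the default single reader, this leaves 22 slots for writers, which do most of the analysis.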
The RockstarHaloFinder class has these options that can be supplied to the halo catalog through the finder_kwargs argument:
dm_type: the index of the dark matter particle. Default is 1.
outbase: the directory where the out*list files that Rockstar makes should be placed. Default is 'rockstar_halos'.
num_readers: the number of reader tasks (which are idle most of the time). Default is 1.
num_writers: the number of writer tasks (which are fed particles and do most of the analysis). Default is MPI_TASKS-num_readers-1. If left undefined, the above options are automatically configured from the number of available MPI tasks.
force_res: the resolution that Rockstar uses for various calculations and smoothing lengths, in units of Mpc/h. If no value is provided, this parameter is automatically set to the width of the smallest grid element in the simulation from the last data snapshot (i.e. the one where time has evolved the longest) in the time series: ds_last.index.get_smallest_dx() * ds_last['Mpch'].
total_particles: if supplied, a pre-calculated total number of dark matter particles present in the simulation. For example, this is useful when analyzing a series of snapshots where the number of dark matter particles should not change, as it saves some disk access time. If left unspecified, it will be calculated automatically. Default: None.
dm_only: if set to True, it will be assumed that there are only dark matter particles present in the simulation. This option does not modify the halos found by Rockstar, but it can save disk access time if there are no star particles (or other non-dark matter particles) in the simulation. Default: False.
Rockstar dumps halo information in a series of text (halo*list and out*list) and binary (halo*bin) files inside the outbase directory.
We use the halo list classes to recover the information. Inside the outbase directory there is a text file named datasets.txt that records the connection between ds names and the Rockstar file names. For more information, see the RockstarHalo and Halo classes.
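Returning to the Rockstar options listed above, the sketch below works through the arithmetic of the default force_res and assembles a finder_kwargs dictionary by hand. The box size and cell width are hypothetical, and the sketch assumes code units in which the box has unit width, so that the Mpc/h conversion factor equals the box width in Mpc/h.

```python
# Hypothetical simulation parameters (assumptions for illustration only).
box_size_mpch = 64.0           # box width in Mpc/h
smallest_dx_code = 1.0 / 4096  # finest cell width as a fraction of the box

# Mirrors ds_last.index.get_smallest_dx() * ds_last['Mpch'] under the
# unit-box assumption stated above.
force_res = smallest_dx_code * box_size_mpch  # Mpc/h

# Option names are the ones documented above; the values are examples.
finder_kwargs = {
    "dm_type": 1,
    "outbase": "rockstar_halos",
    "num_readers": 1,
    "num_writers": 22,  # 24 MPI tasks - 1 reader - 1 server
    "force_res": force_res,
    "dm_only": False,
}
print(force_res)  # 0.015625
```

Setting force_res explicitly like this is only needed when the automatic value is not wanted; leaving it out lets it be derived from the last snapshot as described.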
Both the HOP and FoF halo finders can run in parallel using simple spatial decomposition. To run them in parallel, it helps to understand how the decomposition works. Below in the first plot (i) is a simplified depiction of three haloes labeled 1, 2 and 3:
Halo 3 is twice reflected around the periodic boundary conditions.
In (ii), the volume has been sub-divided into four equal subregions, A, B, C and D, shown with dotted lines. Notice that halo 2 is now in two different subregions, C and D, and that halo 3 is now in three, A, B and D. If the halo finder is run on these four separate subregions, halo 1 is identified as a single halo, but haloes 2 and 3 are split up into multiple haloes, which is incorrect. The solution is to give each subregion padding to oversample into neighboring regions.
In (iii), subregion C has oversampled into the other three regions, with the periodic boundary conditions taken into account, shown by dot-dashed lines. The other subregions oversample in a similar way.
The halo finder is then run on each padded subregion independently and simultaneously. By oversampling like this, haloes 2 and 3 will both be enclosed fully in at least one subregion and identified completely.
Haloes identified with centers of mass inside the padded part of a subregion are thrown out, eliminating the problem of halo duplication. The centers for the three haloes are shown with stars. Halo 1 will belong to subregion A, 2 to C and 3 to B.
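The padding and de-duplication logic described above can be sketched in one dimension. This is a toy illustration, not yt's implementation: the function names, the two-subregion layout, and the particle positions are hypothetical, and the center of mass is a plain average that ignores haloes wrapping around the box edge.

```python
def periodic_distance(a, b, box=1.0):
    """Shortest distance between two positions in a periodic box."""
    d = abs(a - b) % box
    return min(d, box - d)


def in_padded_region(x, lo, hi, pad, box=1.0):
    """True if x lies in [lo, hi) or within pad of either edge (periodic)."""
    x = x % box
    if lo <= x < hi:
        return True
    return (periodic_distance(x, lo, box) <= pad
            or periodic_distance(x, hi, box) <= pad)


def owning_region(center, n_regions, box=1.0):
    """Index of the unpadded subregion containing a halo's center of mass."""
    return int((center % box) / (box / n_regions))


# One halo straddling the boundary between two subregions of a unit box.
particles = [0.49, 0.495, 0.505]
regions = [(0.0, 0.5), (0.5, 1.0)]
pad = 0.02
center = sum(particles) / len(particles)  # naive center of mass

for idx, (lo, hi) in enumerate(regions):
    seen = [p for p in particles if in_padded_region(p, lo, hi, pad)]
    keeps = owning_region(center, len(regions)) == idx
    print(idx, len(seen), keeps)
```

Both padded subregions see all three particles, so each would identify the complete halo, but only the subregion whose unpadded volume contains the center keeps it.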
To run with parallel halo finding, you must supply a value for padding in the finder_kwargs argument. The padding parameter is in simulation units and defaults to 0.02. This parameter is how much padding is added to each of the six sides of a subregion. This value should be 2x-3x larger than the largest expected halo in the simulation. It is unlikely, of course, that the largest object in the simulation will be on a subregion boundary, but there is no way of knowing before the halo finder is run.
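import yt
from yt.analysis_modules.halo_analysis.api import HaloCatalog

# Load the dataset to analyze.
ds = yt.load("data0001")

# Use HOP, with padding (in simulation units) added to each subregion.
hc = HaloCatalog(data_ds=ds, finder_method="hop", finder_kwargs={"padding": 0.02})
# --or-- use friends-of-friends with the same padding.
hc = HaloCatalog(data_ds=ds, finder_method="fof", finder_kwargs={"padding": 0.02})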
In general, a little bit of padding goes a long way, and too much just slows down the analysis without improving (or changing) the answer. It may be worth your time to run the parallel halo finder at a few paddings to find the right amount, especially if you are analyzing many similar datasets.
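One way to pick starting values for such a scan is to convert the largest expected halo size into simulation units and apply the 2x-3x rule of thumb. The helper below is a hypothetical sketch; it assumes simulation units in which the box has unit width, and the halo extent and box size are made-up numbers.

```python
def suggest_padding(largest_halo_extent, box_size, factor=2.5):
    """Padding in simulation units, assuming the box has unit width.

    largest_halo_extent and box_size must share the same physical unit.
    """
    if largest_halo_extent <= 0 or box_size <= 0:
        raise ValueError("sizes must be positive")
    return factor * largest_halo_extent / box_size


# Hypothetical case: largest halo about 1 Mpc/h across in a 100 Mpc/h box.
candidates = [suggest_padding(1.0, 100.0, factor=f) for f in (2.0, 2.5, 3.0)]
print(candidates)
```

Each candidate could then be passed as the padding entry of finder_kwargs and the resulting catalogs compared; for this made-up case the 2x value coincides with the default of 0.02.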
Because of changes in the Rockstar API over time, yt currently works only with a slightly older version of Rockstar. This version of Rockstar has been slightly patched and modified to run as a library inside of yt. By default it is not installed with yt, but installation is very easy. The All-in-One Installation Script used to install yt from source has a line, INST_ROCKSTAR=0, that must be changed to INST_ROCKSTAR=1. You can rerun this installer script over the top of an existing installation, and it will only install components missing from the existing installation. You can do this as follows. Put your freshly modified install_script.sh in the parent directory of the yt installation directory (e.g. the parent of $YT_DEST, yt-x86_64, yt-i386, etc.), and rerun the installer:
cd $YT_DEST
cd ..
vi install_script.sh  # or your favorite editor; change the line to INST_ROCKSTAR=1
bash < install_script.sh
This will download Rockstar and install it as a library in yt. You should now be able to use Rockstar and yt together.