Data objects (also called Data Containers) are used in yt as convenience structures for grouping data in logical ways that make sense in the context of the dataset as a whole. Some of the data objects are geometrical groupings of data (e.g. sphere, box, cylinder, etc.). Others represent data products derived from your dataset (e.g. slices, streamlines, surfaces). Still other data objects group multiple objects together or filter them (e.g. data collection, cut region).
To generate standard plots, objects rarely need to be directly constructed. However, for detailed data inspection as well as hand-crafted derived data, objects can be exceptionally useful and even necessary.
To create an object, you usually only need a loaded dataset, the name of
the object type, and the relevant parameters for your object. Here is a common
example of creating a Region object that covers your entire data volume.
import yt
ds = yt.load("RedshiftOutput0005")
ad = ds.all_data()
Alternatively, we could create a sphere object of radius 1 kpc centered at [0.5, 0.5, 0.5]:
import yt
ds = yt.load("RedshiftOutput0005")
sp = ds.sphere([0.5, 0.5, 0.5], (1, 'kpc'))
After an object has been created, it can be used as a data_source for certain
tasks like ProjectionPlot (see ProjectionPlot), you can compute the
bulk quantities associated with that object (see Processing Objects: Derived Quantities),
or you can examine its data directly. For example, if you want to figure out
the temperature at all indexed locations in the central sphere of your
dataset you could:
import yt
ds = yt.load("RedshiftOutput0005")
sp = ds.sphere([0.5, 0.5, 0.5], (1, 'kpc'))
# Show all temperature values
print(sp["temperature"])
# Print things in a more human-friendly manner: one temperature at a time
print("(x, y, z) Temperature")
print("-----------------------")
for i in range(sp["temperature"].size):
    print("(%f, %f, %f) %f" % (sp["x"][i], sp["y"][i], sp["z"][i], sp["temperature"][i]))
As noted above, there are numerous types of objects. They are listed below, grouped by the kind of selection or construction they perform.
If you want to create your own custom data object type, see Creating Data Objects.
For 0D, 1D, and 2D geometric objects, if the extent of the object intersects a grid cell, then the cell is included in the object; however, for 3D objects the center of the cell must be within the object in order for the grid cell to be incorporated.
YTPointBase
point(coord, ds=None, field_parameters=None, data_source=None)
YTOrthoRayBase
ortho_ray(axis, coord, ds=None, field_parameters=None, data_source=None)
YTRayBase
ray(start_coord, end_coord, ds=None, field_parameters=None, data_source=None)
YTSliceBase
slice(axis, coord, center=None, ds=None, field_parameters=None, data_source=None)
YTCuttingPlaneBase
cutting(normal, coord, north_vector=None, ds=None, field_parameters=None, data_source=None)
all_data()
all_data(find_max=False)
all_data() is a wrapper on the Box Region class which defaults to
creating a Region covering the entire dataset domain. It is effectively
ds.region(ds.domain_center, ds.domain_left_edge, ds.domain_right_edge).
YTRegionBase
region(center, left_edge, right_edge, fields=None, ds=None, field_parameters=None, data_source=None)
box(left_edge, right_edge, fields=None, ds=None, field_parameters=None, data_source=None)
When using the box wrapper, the center is assumed to be the midpoint
between the left and right edges.
YTDiskBase
disk(center, normal, radius, height, fields=None, ds=None, field_parameters=None, data_source=None)
YTEllipsoidBase
ellipsoid(center, semi_major_axis_length, semi_medium_axis_length, semi_minor_axis_length, semi_major_vector, tilt, fields=None, ds=None, field_parameters=None, data_source=None)
YTSphereBase
sphere(center, radius, ds=None, field_parameters=None, data_source=None)
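All of the geometric objects above are constructed the same way, from a loaded dataset plus the parameters in their signatures. A brief sketch (the coordinates, radius, and height values are purely illustrative):
import yt
ds = yt.load("RedshiftOutput0005")
# A ray from one corner of the domain to the opposite corner
ray = ds.ray([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
# A disk (cylinder) aligned with the z-axis, with a radius of 10 kpc
# and a height parameter of 2 kpc (see the disk signature above)
dk = ds.disk([0.5, 0.5, 0.5], [0.0, 0.0, 1.0], (10, 'kpc'), (2, 'kpc'))
print(ray["density"])
print(dk["temperature"])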
See also the section on Filtering your Dataset.
For example, an existing sphere sph can be passed as the data_source to restrict another object to the cells it contains, e.g. slice(axis, coord, ds, data_source=sph).
YTCutRegionBase
cut_region(base_object, conditionals, ds=None, field_parameters=None)
cut_region is a filter which can be applied to any other data
object. The filter is defined by the conditionals present, which
apply cuts to the data in the object. A cut_region will work
for either particle fields or mesh fields, but not on both simultaneously.
For more detailed information and examples, see Cut Regions.
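As an illustrative sketch of the syntax (the temperature threshold here is arbitrary), a cut_region that keeps only hot gas might look like:
import yt
ds = yt.load("RedshiftOutput0005")
ad = ds.all_data()
# Keep only the cells whose temperature exceeds 1e6 K
hot = ad.cut_region(['obj["temperature"] > 1e6'])
print(hot["temperature"].min())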
YTDataCollectionBase
data_collection(center, obj_list, ds=None, field_parameters=None)
data_collection is a list of data objects that can be
sampled and processed as a whole in a single data object.
YTCoveringGridBase
covering_grid(level, left_edge, dimensions, fields=None, ds=None, num_ghost_zones=0, use_pbar=True, field_parameters=None)
smoothed_covering_grid(level, left_edge, dimensions, fields=None, ds=None, num_ghost_zones=0, use_pbar=True, field_parameters=None)
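A covering grid can be built directly from the signature above; here is a minimal sketch (the level and dimensions are illustrative), sampling the data onto a uniform 64^3 grid anchored at the domain's left edge:
import yt
ds = yt.load("RedshiftOutput0005")
# A uniform 64x64x64 grid at refinement level 2, starting at the domain's left edge
cg = ds.covering_grid(2, ds.domain_left_edge, [64, 64, 64])
print(cg["density"].shape)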
YTArbitraryGridBase
arbitrary_grid(left_edge, right_edge, dimensions, ds=None, field_parameters=None)
YTQuadTreeProjBase
proj(field, axis, weight_field=None, center=None, ds=None, data_source=None, method="integrate", field_parameters=None)
The projection can be restricted to a subset of the data (via the
data_source keyword). Alternatively, one can specify
a weight_field and different method values to change the nature
of the projection outcome. See Types of Projections for more information.
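As a short sketch of the signature above (the fields and axis are illustrative), a density-weighted projection of temperature restricted to a sphere could be made like this:
import yt
ds = yt.load("RedshiftOutput0005")
sp = ds.sphere([0.5, 0.5, 0.5], (1, 'kpc'))
# Density-weighted average temperature projected along the z-axis,
# using only the cells inside the sphere
prj = ds.proj("temperature", "z", weight_field="density", data_source=sp)
print(prj["temperature"])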
YTStreamlineBase
streamline(coord_list, length, fields=None, ds=None, field_parameters=None)
A streamline can be traced out by identifying a starting coordinate (or
list of coordinates) and allowing it to trace a vector field, like gas
velocity. See Streamlines: Tracking the Trajectories of Tracers in your Data for more information.
YTSurfaceBase
surface(data_source, field, field_value)
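The surface object extracts the isocontour of a field at a given value from the cells of a data_source, and the resulting surface can be sampled with other fields. A minimal sketch (the isodensity value of 1e-27 g/cm^3 is purely illustrative):
import yt
ds = yt.load("RedshiftOutput0005")
sp = ds.sphere([0.5, 0.5, 0.5], (1, 'kpc'))
# Extract the surface where density equals 1e-27 g/cm^3 within the sphere
surf = ds.surface(sp, "density", 1e-27)
# Sample the temperature field on that surface
print(surf["temperature"])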
Derived quantities are a way of calculating some bulk quantities associated
with all of the grid cells contained in a data object.
Derived quantities can be accessed via the quantities
interface.
Here is an example of how to get the angular momentum vector calculated from
all the cells contained in a sphere at the center of our dataset.
import yt
ds = yt.load("my_data")
sp = ds.sphere('c', (10, 'kpc'))
print(sp.quantities.angular_momentum_vector())
AngularMomentumVector
angular_momentum_vector(use_gas=True, use_particles=True)
BulkVelocity
bulk_velocity(use_gas=True, use_particles=True)
CenterOfMass
center_of_mass(use_cells=True, use_particles=False)
MaxLocation
max_location(fields)
MinLocation
min_location(fields)
SpinParameter
spin_parameter(use_gas=True, use_particles=True)
TotalMass
total_mass()
TotalQuantity
total_quantity(fields)
WeightedAverageQuantity
weighted_average_quantity(fields, weight)
The weighted average of one or more fields; to compute an unweighted average, use a weight field of ones.
WeightedVariance
weighted_variance(fields, weight)
The weighted variance of one or more fields; to compute an unweighted variance, use a weight field of ones.
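For example, the total gas mass and a mass-weighted average temperature of a sphere could be computed like this (a short sketch; the field names assume standard gas fields are defined for your dataset):
import yt
ds = yt.load("RedshiftOutput0005")
sp = ds.sphere('c', (10, 'kpc'))
# Total cell (gas) mass contained in the sphere
print(sp.quantities.total_quantity(["cell_mass"]))
# Mass-weighted average temperature of the sphere
print(sp.quantities.weighted_average_quantity("temperature", "cell_mass"))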
The covering grid and smoothed covering grid objects mandate that they be
exactly aligned with the mesh. This is a
holdover from the time when yt was used exclusively for data that came in
regularly structured grid patches, and does not necessarily work as well for
data that is composed of discrete objects like particles. To augment this, the
YTArbitraryGridBase
object
was created, which enables construction of meshes (onto which particles can be
deposited or smoothed) in arbitrary regions. This eliminates any assumptions
on yt’s part about how the data is organized, and will allow for more
fine-grained control over visualizations.
An example of creating an arbitrary grid would be to construct one, then query the deposited particle density, like so:
import yt
ds = yt.load("snapshot_010.hdf5")
obj = ds.arbitrary_grid([0.0, 0.0, 0.0], [0.99, 0.99, 0.99],
                        dims=[128, 128, 128])
print(obj["deposit", "all_density"])
While these cannot yet be used as input to projections or slices, slices and projections can be taken of the data in them and visualized by hand.
Note
Boolean Data Objects have not yet been ported to yt 3.0 from yt 2.x. If you are interested in aiding in this port, please contact the yt-dev mailing list. Until it is ported, the functionality described below will not work.
A special type of data object is the boolean data object. It works only on three-dimensional objects. It is built by relating already existing data objects with boolean operators. The boolean logic may be nested using parentheses, and it supports the standard “AND”, “OR”, and “NOT” operators:
Please see the The Cookbook for some examples of how to use the boolean data object.
The underlying machinery used in Clump Finding is accessible from any data object. This includes the ability to obtain and examine topologically connected sets. These sets are identified by examining cells between two threshold values and connecting them. What is returned to the user is a list of the intervals of values found, and extracted regions that contain only those cells that are connected.
To use this, call
extract_connected_sets()
on any 3D data object. This requires a field, the number of levels of level sets to
extract, the min and max values between which sets will be identified, and
whether or not to conduct it in log space.
sp = ds.sphere("max", (1.0, 'pc'))
contour_values, connected_sets = sp.extract_connected_sets(
    "density", 3, 1e-30, 1e-20)
The first item, contour_values, will be an array of the min value for each
set of level sets. The second, connected_sets, will be a dict of dicts.
The key for the first (outer) dict is the level of the contour, corresponding
to contour_values. The inner dict returned is keyed by the contour ID. It
contains YTCutRegionBase objects. These can be queried just as any other
data object. The clump finder
(Clump Finding) differs from the above method in that the contour
identification is performed recursively within each individual structure, and
structures can be kept or remerged later based on additional criteria, such as
gravitational boundedness.
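As a small sketch of how the returned structure can be traversed (continuing from the example above), each extracted region can be queried like any other data object:
for level, contours in connected_sets.items():
    for contour_id, region in contours.items():
        # Each region contains only the connected cells for this contour
        print(level, contour_id, region["density"].max())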
Often, when operating interactively or via the scripting interface, it is convenient to save an object or multiple objects out to disk and then restart the calculation later. For example, this is useful after clump finding (Clump Finding), which can be very time consuming. Typically, the save and load operations are used on 3D data objects. yt has a separate set of serialization operations for 2D objects such as projections.
yt will save out objects to disk under the presupposition that the construction of the objects is the difficult part, rather than the generation of the data – this means that you can save out an object as a description of how to recreate it in space, but not the actual data arrays affiliated with that object. The information that is saved includes the dataset off of which the object “hangs.” It is this piece of information that is the most difficult; the object, when reloaded, must be able to reconstruct a dataset from whatever limited information it has in the save file.
You can save objects to an output file using the function
save_object()
:
import yt
ds = yt.load("my_data")
sp = ds.sphere([0.5, 0.5, 0.5], (10.0, 'kpc'))
sp.save_object("sphere_name", "save_file.cpkl")
This will store the object as sphere_name in the file save_file.cpkl,
which will be created or accessed using the standard Python module shelve.
To re-load an object saved this way, you can use the shelve module directly:
import yt
import shelve
ds = yt.load("my_data")
saved_fn = shelve.open("save_file.cpkl")
ds, sp = saved_fn["sphere_name"]
Additionally, multiple objects can be stored in a single shelve file, which is why the sphere must be retrieved by name.
For certain data objects such as projections, serialization can be performed
automatically if the serialize option is set to True in the
configuration file or set directly in the script:
from yt.config import ytcfg; ytcfg["yt", "serialize"] = "True"
Note
Use serialization with caution. Enabling serialization means that once a projection of a dataset has been created (and stored in the .yt file in the same directory), any subsequent changes to that dataset will be ignored when attempting to create the same projection. So if you take a density projection of your dataset in the ‘x’ direction, then somehow tweak that dataset significantly, and take the density projection again, yt will default to finding the original projection and not your new one.
Note
It’s also possible to use the standard cPickle
module for
loading and storing objects – so in theory you could even save a
list of objects!
This method works for clumps, as well, and the entire clump index will be stored and restored upon load.