Subviews and Filters#
Central to pynbody is the ability to work with sub-views of the simulation, and treat them as if they were the full simulation. There are two key concepts:
Sub-views are slices of the simulation data that behave like the full simulation, but only reference a subset of the data. When you modify a sub-view, you modify the original data. Sub-views can be created using a slice, an list of particle indices, a particular particle family, or a filter.
Filters are objects that can be used to select particles based on some criterion. They can be combined using logical operators to create complex selections. Filters can be applied to simulations to create a sub-view.
Creating a simple sub-view#
We will use the testdata provided with pynbody to demonstrate how to create a sub-view of a simulation. If you do not have the testdata, see Obtaining test data.
First, let’s load a simulation to play with:
In [1]: import pynbody
In [2]: import numpy as np
In [3]: f = pynbody.load("testdata/gasoline_ahf/g15784.lr.01024")
In [4]: f.physical_units()
Now we can create a sub-view of the simulation. These are represented by the class
SubSnap
, so we somtimes call them subsnaps. To create a subsnap, we can use a slice
or a list of particle indices. (We can also use a filter, which we will discuss later.)
In [5]: subsnap_slice = f[100:200:2]
In [6]: subsnap_indexed = f[[0, 90, 200, 1000]]
These syntaxes are hopefully self-explanatory if you are familiar with Python and numpy indexing. subsnap_slice
represents every second particle between the 100th and 200th particles, while subsnap_indexed
represents the
particles at indices 0, 90, 200, and 1000.
You can verify that these sub-views behave like the full simulation:
In [7]: np.all(f['pos'][100] == subsnap_slice['pos'][0])
Out[7]: True
In [8]: np.all(f['pos'][[0, 90, 200, 1000]] == subsnap_indexed['pos'])
Out[8]: True
This becomes useful if the sub-view is a physically-meaningful selection of particles. For example, pynbody loads halos from various halo-finding algorithms as sub-views of the simulation.
In [9]: h = f.halos()
In [10]: subsnap_halo0 = h[0]
In [11]: (subsnap_halo0['mass'][:,np.newaxis]*subsnap_halo0['pos']).sum(axis=0)/subsnap_halo0['mass'].sum()
Out[11]: SimArray([ 1675.70819996, -2333.17011991, -8387.15939682], 'kpc')
Actually mass-weighted mean values are such a common thing to want to do that pynbody provides a convenience function
to do them from any subsnap, called mean_by_mass
. So the above code can be replaced with:
In [12]: subsnap_halo0.mean_by_mass('pos')
Out[12]: SimArray([ 1675.70819996, -2333.17011991, -8387.15939682], 'kpc')
In the above we have very quickly found the centre of mass of the first halo in the simulation. More information about halo catalogues can be found in the Halos in Pynbody, but for now we just note that all halos returned by a halo catalogue are a subsnap.
The relationship is two-way#
A subsnap is not just a static view of the data; it is a pointer to the original data. This means that if you modify the subsnap, you modify the original data. For example:
In [13]: subsnap_slice['pos'][0] = [1., 2., 3.]
In [14]: f['pos'][100]
Out[14]: SimArray([1., 2., 3.], 'kpc')
Similarly, if you modify the original data, the subsnap will reflect the change:
In [15]: f['pos'][100] = [4., 5., 6.]
In [16]: subsnap_slice['pos'][0]
Out[16]: SimArray([4., 5., 6.], 'kpc')
Filters#
pynbody.filt
defines abstract filters which can be used in place of
index lists. For instance,
In [17]: cen = subsnap_halo0.mean_by_mass('pos')
In [18]: sphere_filter = pynbody.filt.Sphere('200 kpc', cen)
In [19]: sphere_view = f[sphere_filter]
In [20]: f"DM, gas, star mass: {sphere_view.dm['mass'].sum():.1e}, \
....: {sphere_view.g['mass'].sum():.1e}, and {sphere_view.s['mass'].sum():.1e} Msol."
....:
Out[20]: 'DM, gas, star mass: 1.2e+12, 8.3e+10, and 1.1e+11 Msol.'
The above created a filter that selects all particles within 200 kpc of the centre of mass of the first halo, and then used that filter to create a sub-view of the simulation. The mass of dark matter, gas, and stars within 200 kpc of the halo’s centre of mass was then calculated.
Filters can be combined using logical operators &
, |
, and ~
to create complex selections. For example,
to select all particles within 200 kpc of the centre but outside 25 kpc, you can use:
In [21]: sphere_filter_outer = pynbody.filt.Sphere('200 kpc', cen) \
....: & ~pynbody.filt.Sphere('25 kpc', cen)
....:
In [22]: sphere_outer_view = f[sphere_filter_outer]
In [23]: f"DM, gas, star mass: \
....: {sphere_outer_view.dm['mass'].sum():.1e}, {sphere_outer_view.g['mass'].sum():.1e}, \
....: and {sphere_outer_view.s['mass'].sum():.1e} Msol."
....:
Out[23]: 'DM, gas, star mass: 9.2e+11, 7.4e+10, and 1.2e+10 Msol.'
Other than Sphere
, there are several other filters available in pynbody including
Disc
, Cuboid
, and Annulus
. Filters can also
be more abstract and select particles based on non-geometric criterion. For example, LowPass
selects particles with values below a certain threshold of a specified array, HighPass
selects
particles with values above a certain threshold, and BandPass
selects particles with values
within a certain range.
Note
For a full list of available filters, see the pynbody.filt
module documentation.
For example, to select all stars with age between 1 Gyr and 10 Gyr inside 25kpc:
In [24]: age_filter = pynbody.filt.BandPass('age', '1 Gyr', '10 Gyr')
In [25]: age_sphere_filter = age_filter & pynbody.filt.Sphere('25 kpc', cen)
In [26]: age_sphere_view = f.star[age_sphere_filter]
In [27]: f"Stellar mass in range: {age_sphere_view['mass'].sum():.1e} Msol."
Out[27]: 'Stellar mass in range: 3.5e+10 Msol.'
Performance implications#
Many filters are evaluated using OpenMP parallelism, so they can be very fast. Furthermore, if a KD tree has been
constructed for the simulation (via build_tree()
), then
Sphere
and Cuboid
filters automatically evaluate using that tree.
This can amount to a significant speedup for very large simulations, although the effect on smaller simulations is less
pronounced, especially if many CPU cores are available, making the brute force search pretty fast in the first place.