Automatically Derived Arrays#
Pynbody includes a system which automatically derives one array from others. There are two goals for this system:
Help make user code more independent of the underlying file format, by providing missing arrays. For example, some simulation formats store temperature while others store internal energy. By always presenting a
temp
array to you, and handling the conversion automatically, the need to explicitly distinguish between different scenarios is removed.Keep arrays up-to-date when they depend on other arrays. For example, if you are interested in the radius of particles from the origin, a derived
r
array is offered which is automatically recalculated if the position of particles changes.
Example: the radius array#
The quantities listed under derived
are calculated for
all simulation types. If you want, for example, to calculate the
radius of particles away from the origin, you can just access the r
array and
pynbody will calculate it:
In [1]: import pynbody
In [2]: s = pynbody.load('testdata/gasoline_ahf/g15784.lr.01024.gz')
In [3]: s['r']
Out[3]:
SimArray([0.17058903, 0.16980713, 0.17045854, ..., 0.12941349, 0.12942831,
0.12946079], '6.85e+04 kpc a')
Here, r
has been calculated in the same units as the position array,
which in turn are in the units of the simulation.
You can call physical_units()
as
normal to convert to more recognizable units.
Note that you cannot directly alter a derived array:
In [4]: s['r'][0] = 0
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[4], line 1
----> 1 s['r'][0] = 0
File ~/checkouts/readthedocs.org/user_builds/pynbody/envs/latest/lib/python3.11/site-packages/pynbody/array/__init__.py:809, in _dirty_fn.<locals>.q(a, *y, **kw)
807 return w(a, *y, **kw)
808 else:
--> 809 return w(a, *y)
File ~/checkouts/readthedocs.org/user_builds/pynbody/envs/latest/lib/python3.11/site-packages/pynbody/array/__init__.py:566, in SimArray.__setitem__(self, item, to)
564 np.ndarray.__setitem__(self, item, to.in_units(self.units))
565 else:
--> 566 np.ndarray.__setitem__(self, item, to)
ValueError: assignment destination is read-only
But the array updates itself when one of its dependencies changes:
In [5]: s['x']-=1
In [6]: s['r']
Out[6]:
SimArray([1.00383365, 1.00300699, 1.00292416, ..., 0.98382042, 0.98378768,
0.98368316], '6.85e+04 kpc a')
The recalculation only takes place when required – in the example
above the r
array is stored (just like a normal array) until the pos
array is updated,
at which point it is deleted. When you ask for the r
array again,
it is recalculated.
This is why you’re not allowed to change values in a derived array – your changes could be overwritten, leading to confusing bugs. However, if you want to make a derived array back into a normal array you can do so.
In [7]: s['r'].derived = False
In [8]: s['r'][0] = 0
In [9]: s['r']
Out[9]:
SimArray([0. , 1.00300699, 1.00292416, ..., 0.98382042, 0.98378768,
0.98368316], '6.85e+04 kpc a')
At this point you’ve taken full responsibility for the array, so if you update its dependencies the framework won’t help you out any longer:
In [10]: s['x']+=2
In [11]: s['r'] # NOT updated because we took control by setting derived = False
Out[11]:
SimArray([0. , 1.00300699, 1.00292416, ..., 0.98382042, 0.98378768,
0.98368316], '6.85e+04 kpc a')
To get the framework back on your side, you can delete the modified array:
In [12]: del s['r']
In [13]: s['r']
Out[13]:
SimArray([1.02494841, 1.02549787, 1.02579498, ..., 1.03227568, 1.0323106 ,
1.03241834], '6.85e+04 kpc a')
Here we’ve deleted then re-derived the r
array, so it’s now accurate (and will start
auto-updating again).
Derived functions for specific formats#
Some derived arrays are specific to certain simulation formats. For example, ramses simulations
need to derive masses for their gas cells and as such mass()
is registered
as a derived array specifically for the RamsesSnap
class.
Defining your own deriving functions#
You can easily define your own derived arrays. The easiest way to do
this is using the decorator pynbody.snapshot.simsnap.SimSnap.derived_array()
.
This is handily aliased to pynbody.derived_array
.
Here’s an example of a derived array that calculates the specific kinetic energy of a particle:
In [14]: @pynbody.derived_array
....: def specific_ke(sim):
....: return 0.5 * (sim['vel']**2).sum(axis=1)
....:
In [15]: s['specific_ke']
Out[15]:
SimArray([0.01443704, 0.01435218, 0.01545113, ..., 0.02690037, 0.02084463,
0.02266956], '2.98e+06 km**2 a**2 s**-2')
When your function is called, the framework monitors any arrays it retrieves from the simulation. It automatically marks the accessed arrays as dependencies for your function. So, if the velocities now get changed, your derived array will be recalculated:
In [16]: s['vel']*=2
In [17]: s['specific_ke']
Out[17]:
SimArray([0.05774817, 0.05740871, 0.0618045 , ..., 0.10760149, 0.08337851,
0.09067823], '2.98e+06 km**2 a**2 s**-2')
To create a derived array associated with a specific subclass, use the
derived_array()
method of that subclass,
e.g.
In [18]: @pynbody.snapshot.tipsy.TipsySnap.derived_array
....: def half_mass(sim):
....: return 0.5 * sim['mass']
....:
In [19]: s['half_mass'] # this is a TipsySnap, so this will work
Out[19]:
SimArray([1.59655292e-11, 1.59655292e-11, 1.59655292e-11, ...,
6.20879997e-12, 6.20879997e-12, 6.20879997e-12], '4.75e+16 Msol')
In [20]: another_snap = pynbody.new(dm=100) # NOT a TipsySnap!
In [21]: another_snap['half_mass']
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[21], line 1
----> 1 another_snap['half_mass']
File ~/checkouts/readthedocs.org/user_builds/pynbody/envs/latest/lib/python3.11/site-packages/pynbody/snapshot/simsnap.py:233, in SimSnap.__getitem__(self, i)
230 from . import subsnap
232 if isinstance(i, str):
--> 233 return self._get_array_with_lazy_actions(i)
234 elif isinstance(i, slice):
235 return subsnap.SubSnap(self, i)
File ~/checkouts/readthedocs.org/user_builds/pynbody/envs/latest/lib/python3.11/site-packages/pynbody/snapshot/simsnap.py:351, in SimSnap._get_array_with_lazy_actions(self, name)
347 self.__derive_if_required(name)
349 # At this point we've done everything we can to get the array into memory. If it's still not there, we'll
350 # get a KeyError from the below.
--> 351 return self._get_array(name)
File ~/checkouts/readthedocs.org/user_builds/pynbody/envs/latest/lib/python3.11/site-packages/pynbody/snapshot/simsnap.py:1226, in SimSnap._get_array(self, name, index, always_writable)
1216 def _get_array(self, name, index=None, always_writable=False):
1217 """Get the array of the specified *name*, optionally for only the particles specified by *index*.
1218
1219 If *always_writable* is True, the returned array is writable. Otherwise, it is still normally writable, but
(...)
1223 _get_array_with_lazy_actions.
1224 """
-> 1226 x = self._arrays[name]
1227 if x.derived and not always_writable:
1228 x = x.view()
KeyError: 'half_mass'
The derived array will only be available for the class it was defined for, so the final command raised an error.
Changed in version 2.0: The method name has been changed from derived_quantity
to derived_array
.
The old name is still available but will be removed in a future version.
Stable derived arrays#
Occasionally, you may want to define a derived array that is not automatically recalulated
when its underlying dependencies change. An example from within the framework is the
smoothing length (smooth
). This is expensive to calculate, and changes to the underlying position
coordinates – while in theory capable of changing the smoothing length – are far more commonly
associated with translations or rotations in the course of a normal analysis. In this case, it
would be wasteful to recalculate the smoothing length every time the position coordinates change.
To define a stable derived array, use stable_derived_array()
in place of derived_array()
. The array will be derived for
the first time, but will not automatically update:
In [22]: @pynbody.snapshot.simsnap.SimSnap.stable_derived_array
....: def stable_x_copy(sim):
....: return sim['x']
....:
In [23]: s['stable_x_copy']
Out[23]:
SimArray([1.01070931, 1.01140571, 1.01159962, ..., 1.02442261, 1.02445675,
1.02456377], '6.85e+04 kpc a')
In [24]: s['x']+=1
In [25]: s['x']
Out[25]:
SimArray([2.01070931, 2.01140571, 2.01159962, ..., 2.02442261, 2.02445675,
2.02456377], '6.85e+04 kpc a')
In [26]: s['stable_x_copy']
Out[26]:
SimArray([1.01070931, 1.01140571, 1.01159962, ..., 1.02442261, 1.02445675,
1.02456377], '6.85e+04 kpc a')
See also
More information about the derived array system can be found in the
method documentation for derived_array()
and stable_derived_array()
. Information about
built-in derived arrays can be found in the module pynbody.derived
. The module
pynbody.analysis.luminosity
also defines an entire class of derived arrays for
calculating magnitudes from stellar population tables.