Providing Datasets to Measurements through Blobs
A common pattern is to reduce a raw dataset once and share the reduced dataset among several measurements.
The measurement API allows such datasets to be expressed as ‘blobs’.
In the context of the measurement API, a blob is an object that contains Datum objects.
There are several advantages of storing input datasets in blob objects:
- Any pre-reduction of a raw dataset can be done in the blob object, keeping a codebase organized.
- Blobs can be passed to measurement objects, which simplifies the construction of measurements.
- Blobs are automatically serialized alongside measurements and are available to the SQUASH dashboard. Blobs can be shared among several measurements, with the blob data stored only once.
Template for a Blob Class
Blobs are subclasses of lsst.validate.BlobBase that register one or more Datum objects.
from lsst.validate.drp.base import BlobBase


class SimpleBlob(BlobBase):

    def __init__(self, gMags, iMags):
        BlobBase.__init__(self)

        # Datums registered with their initial values
        self.registerDatum('g', value=gMags, units='mag',
                           description='g-band magnitudes')
        self.registerDatum('i', value=iMags, units='mag',
                           description='i-band magnitudes')

        # Datum declared without a value; the value is assigned below
        self.registerDatum('gi', units='mag',
                           description='g-i colour')
        self.gi = self.g - self.i
In this example, the g and i attributes are initially registered with values.
A third blob attribute, gi, is also declared and its value is computed afterwards.
Notice that, like parameters and extras of measurement classes, the values of BlobBase-type objects can be accessed and updated directly through instance attributes.
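The attribute behaviour described above can be illustrated with a minimal, standard-library-only sketch. SketchDatum and SketchBlob are hypothetical stand-ins written for this illustration; they are not the library's BlobBase or Datum, and the real implementation may work differently. Plain lists are used in place of arrays so the sketch has no dependencies.

```python
class SketchDatum:
    """Hypothetical stand-in for Datum: a value plus descriptive metadata."""

    def __init__(self, label, value=None, units='', description=''):
        self.label = label
        self.value = value
        self.units = units
        self.description = description


class SketchBlob:
    """Hypothetical stand-in for BlobBase (an assumption, not the real class)."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the datum store.
        object.__setattr__(self, 'datums', {})

    def registerDatum(self, name, value=None, units='', description=''):
        # The label is taken from the registered name.
        self.datums[name] = SketchDatum(name, value, units, description)

    def __getattr__(self, name):
        # Called only when normal lookup fails, which covers datum names.
        datums = object.__getattribute__(self, 'datums')
        if name in datums:
            return datums[name].value
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Assigning to a registered name updates that datum's value.
        if name in self.datums:
            self.datums[name].value = value
        else:
            object.__setattr__(self, name, value)


blob = SketchBlob()
blob.registerDatum('g', value=[20.0, 21.5], units='mag',
                   description='g-band magnitudes')
blob.registerDatum('i', value=[19.0, 21.0], units='mag',
                   description='i-band magnitudes')
blob.registerDatum('gi', units='mag', description='g-i colour')

# Reading blob.g and blob.i returns datum values; assigning blob.gi sets one.
blob.gi = [g - i for g, i in zip(blob.g, blob.i)]
```

In this sketch the value lives only inside the datum, so attribute access and the datums mapping can never disagree.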
Accessing datum objects
Internally, blob attributes are stored as Datum objects, which can be accessed as items of the datums attribute.
blob = SimpleBlob(g, i)
blob.datums['gi'].value # == blob.gi
blob.datums['gi'].units # 'mag'
blob.datums['gi'].label # 'gi', this was automatically set from the name
blob.datums['gi'].description # 'g-i colour'
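Because each datum carries its value together with units, label and description, a datums mapping can be turned into a self-describing record. The sketch below is illustrative only: SketchDatum and datumsToJson are hypothetical names, and the JSON layout is an assumption, since the serialization format actually used for SQUASH is not shown here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class SketchDatum:
    """Hypothetical stand-in carrying the four fields shown above."""
    value: Any
    units: str
    label: str
    description: str


def datumsToJson(datums):
    """Serialize a name -> datum mapping to a JSON string (assumed layout)."""
    return json.dumps({name: asdict(datum) for name, datum in datums.items()},
                      sort_keys=True)


datums = {
    'gi': SketchDatum(value=[1.0, 0.5], units='mag', label='gi',
                      description='g-i colour'),
}
doc = datumsToJson(datums)
```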
Linking measurements to blobs
When a blob is used by a measurement, the measurement class should declare that usage so that the SQUASH dashboard can provide rich context to measurements. Measurement classes can accomplish this simply by making the blob an instance attribute. For example:
import numpy as np


class MeanColor(MeasurementBase):
    label = 'MeanColour'
    units = 'mag'

    def __init__(self, simpleBlob):
        self.metric = Metric.fromYaml(self.label)
        # Storing the blob as an instance attribute links it to the measurement
        self.simpleBlob = simpleBlob
        self.value = np.mean(self.simpleBlob.gi)
Accessing blobs in measurements
In addition to being accessible through the instance attribute, blobs associated with a measurement are also available as items of the measurement’s blobs attribute:
color = SimpleBlob(g, i)
meanColor = MeanColor(color)
meanColor.blobs['simpleBlob'].gi # array of g-i colours
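One way a measurement could pick up blobs from its instance attributes, and one way a shared blob could end up stored only once, is sketched below. All class and function names here (SketchBlob, SketchMeasurement, SketchMeanColor, SketchMaxColor, serialize) are hypothetical, and both the discovery mechanism and the keyed blob store are assumptions made for illustration rather than the library's actual behaviour.

```python
class SketchBlob:
    """Hypothetical stand-in for a BlobBase subclass holding g-i colours."""

    def __init__(self, gi):
        self.gi = gi


class SketchMeasurement:
    """Hypothetical stand-in for MeasurementBase."""

    def __init__(self):
        object.__setattr__(self, 'blobs', {})

    def __setattr__(self, name, value):
        # Record any blob assigned as an instance attribute under its name.
        if isinstance(value, SketchBlob):
            self.blobs[name] = value
        object.__setattr__(self, name, value)


class SketchMeanColor(SketchMeasurement):
    def __init__(self, simpleBlob):
        SketchMeasurement.__init__(self)
        self.simpleBlob = simpleBlob
        self.value = sum(simpleBlob.gi) / len(simpleBlob.gi)


class SketchMaxColor(SketchMeasurement):
    def __init__(self, simpleBlob):
        SketchMeasurement.__init__(self)
        self.simpleBlob = simpleBlob
        self.value = max(simpleBlob.gi)


def serialize(measurements):
    """Store each distinct blob once; measurements refer to blobs by key."""
    blobStore = {}
    records = []
    for measurement in measurements:
        refs = {}
        for name, blob in measurement.blobs.items():
            key = id(blob)  # same object, same key, so no duplicate entry
            blobStore[key] = {'gi': blob.gi}
            refs[name] = key
        records.append({'measurement': type(measurement).__name__,
                        'value': measurement.value,
                        'blobs': refs})
    return {'measurements': records, 'blobs': blobStore}


color = SketchBlob([1.0, 0.5, 0.75])
meanColor = SketchMeanColor(color)
maxColor = SketchMaxColor(color)
job = serialize([meanColor, maxColor])
```

Both measurement records in job point at the same key, so the colour data appears once in the blob store even though two measurements use it.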