Storing Image Data in NWB
This tutorial will demonstrate the usage of the pynwb.image module for adding images to an NWBFile.
Image data can be a collection of individual images or movie segments (a movie is simply a series of images) showing the subject, the environment, the presented stimuli, or other aspects of the experiment. This tutorial focuses in particular on the usage of:
OpticalSeries and AbstractFeatureSeries, for series of images that were presented as stimuli;
ImageSeries, for series of images (movie segments);
GrayscaleImage, RGBImage, and RGBAImage, for static images.
The following examples will reference variables that may not be defined within the block they are used in. For clarity, we define them here:
from datetime import datetime
import os
from uuid import uuid4
import numpy as np
from dateutil import tz
from PIL import Image
from pynwb import NWBHDF5IO, NWBFile
from pynwb.base import Images
from pynwb.image import GrayscaleImage, ImageSeries, OpticalSeries, RGBAImage, RGBImage
from pynwb.misc import AbstractFeatureSeries
# Define file paths used in the tutorial
nwbfile_path = os.path.abspath("images_tutorial.nwb")
moviefiles_path = [
    os.path.abspath("image/file_1.tiff"),
    os.path.abspath("image/file_2.tiff"),
    os.path.abspath("image/file_3.tiff"),
]
Create an NWB File
Create an NWBFile object with the required fields (session_description, identifier, session_start_time) and additional metadata.
session_start_time = datetime(2018, 4, 25, 2, 30, 3, tzinfo=tz.gettz("US/Pacific"))
nwbfile = NWBFile(
    session_description="my first synthetic recording",
    identifier=str(uuid4()),
    session_start_time=session_start_time,
    experimenter=[
        "Baggins, Bilbo",
    ],
    lab="Bag End Laboratory",
    institution="University of Middle Earth at the Shire",
    experiment_description="I went on an adventure to reclaim vast treasures.",
    session_id="LONELYMTN001",
)
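The session_start_time above carries an explicit time zone. As a minimal sketch using only the standard library, the snippet below builds a comparable timezone-aware datetime and checks it. The fixed UTC-7 offset is an assumption standing in for US/Pacific daylight time on that date, and is_timezone_aware is a hypothetical helper, not part of PyNWB; a full time zone database (as provided by dateutil) is still the better choice when daylight-saving transitions matter.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-7 offset stands in for US/Pacific daylight time in April 2018.
pacific_daylight = timezone(timedelta(hours=-7), name="PDT")
start = datetime(2018, 4, 25, 2, 30, 3, tzinfo=pacific_daylight)


def is_timezone_aware(dt):
    """Hypothetical helper: return True if dt has a usable UTC offset."""
    return dt.tzinfo is not None and dt.utcoffset() is not None


# Express the same instant in UTC, e.g. to compare sessions recorded in different places.
start_utc = start.astimezone(timezone.utc)
print(is_timezone_aware(start), start_utc.isoformat())
```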
nwbfile
See also
You can learn more about the NWBFile format in the NWB File Basics tutorial.
OpticalSeries: Storing series of images as stimuli
OpticalSeries is for time series of images that were presented to the subject as stimuli.
We will create an OpticalSeries object with the name "StimulusPresentation" representing what images were shown to the subject and at what times.
Image data can be stored either in the HDF5 file or as an external image file.
For this tutorial, we will use fake image data with shape ('time', 'x', 'y', 'RGB') = (200, 50, 50, 3).
As in all TimeSeries, the first dimension is time.
The second and third dimensions represent x and y.
The fourth dimension represents the RGB value (length of 3) for color images.
NWB differentiates between acquired data and data that was presented as stimulus.
We can add the OpticalSeries to the NWBFile object as stimulus data using the add_stimulus method.
# np.random.randint excludes `high`, so high=256 covers the full uint8 range (0-255)
image_data = np.random.randint(low=0, high=256, size=(200, 50, 50, 3), dtype=np.uint8)
optical_series = OpticalSeries(
    name="StimulusPresentation",  # required
    distance=0.7,  # required
    field_of_view=[0.2, 0.3, 0.7],  # required
    orientation="lower left",  # required
    data=image_data,
    unit="n.a.",
    format="raw",
    starting_frame=[0.0],
    rate=1.0,
    comments="no comments",
    description="The images presented to the subject as stimuli",
)
nwbfile.add_stimulus(stimulus=optical_series)
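Because the dimension order (time, x, y, RGB) is easy to get wrong, it can help to check an array before wrapping it in an OpticalSeries. The following is a minimal sketch using only NumPy; check_rgb_series is a hypothetical helper introduced for illustration and is not part of PyNWB.

```python
import numpy as np

# Same layout as the fake stimulus data in this tutorial: (time, x, y, RGB).
stimulus_frames = np.random.randint(low=0, high=256, size=(200, 50, 50, 3), dtype=np.uint8)


def check_rgb_series(data):
    """Hypothetical helper: validate a (time, x, y, RGB) array and return its frame count."""
    if data.ndim != 4:
        raise ValueError(f"expected 4 dimensions (time, x, y, RGB), got {data.ndim}")
    if data.shape[-1] != 3:
        raise ValueError(f"expected 3 color channels in the last dimension, got {data.shape[-1]}")
    return data.shape[0]  # the first dimension is time


n_frames = check_rgb_series(stimulus_frames)
print(n_frames)
```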
AbstractFeatureSeries: Storing features of visual stimuli
While it is usually recommended to store the entire image data as an OpticalSeries, sometimes it is useful to store features of the visual stimuli instead of or in addition to the raw image data. For example, you may want to store the mean luminance of the image, the contrast, or the spatial frequency. This can be done using an instance of AbstractFeatureSeries. This class is a general container for storing time series of features that are derived from the raw image data.
# Create some fake feature data
feature_data = np.random.rand(200, 3) # 200 time points, 3 features
# Create an AbstractFeatureSeries object
abstract_feature_series = AbstractFeatureSeries(
    name="StimulusFeatures",
    data=feature_data,
    timestamps=np.linspace(0, 1, 200),
    description="Features of the visual stimuli",
    features=["luminance", "contrast", "spatial frequency"],
    feature_units=["n.a.", "n.a.", "cycles/degree"],
)
# Add the AbstractFeatureSeries to the NWBFile
nwbfile.add_stimulus(abstract_feature_series)
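The feature data above is random. As a rough sketch of how such features might be derived from real stimulus frames, the snippet below computes two per-frame values with NumPy. The definitions used here (unweighted mean intensity as luminance, standard deviation of intensity as RMS contrast) are assumptions chosen for illustration, not definitions prescribed by NWB.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
frames = rng.integers(0, 256, size=(200, 50, 50, 3), dtype=np.uint8)  # (time, x, y, RGB)

# Scale to [0, 1] so that the features do not depend on the integer type.
as_float = frames.astype(np.float64) / 255.0

# Assumption: luminance is approximated by the unweighted mean over pixels and channels.
mean_luminance = as_float.mean(axis=(1, 2, 3))

# Assumption: contrast is the RMS contrast, i.e. the standard deviation of pixel intensity.
gray = as_float.mean(axis=3)
rms_contrast = gray.std(axis=(1, 2))

# One row per time point, one column per feature.
derived_features = np.column_stack([mean_luminance, rms_contrast])
print(derived_features.shape)  # (200, 2)
```

An array like derived_features has the same (time, feature) layout as the data passed to AbstractFeatureSeries above.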
Like all TimeSeries, an AbstractFeatureSeries specifies timing using either the rate and starting_time attributes or the timestamps attribute.
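The two ways of specifying time are related: with a constant rate, sample i occurs at starting_time + i / rate. The sketch below makes this explicit and shows one way to decide which form fits a given set of timestamps. Both helper functions are hypothetical and written only for illustration.

```python
import numpy as np


def timestamps_from_rate(n_samples, rate, starting_time=0.0):
    """Timestamps (s) implied by a constant sampling rate (Hz) and a starting time (s)."""
    return starting_time + np.arange(n_samples) / rate


def is_regularly_sampled(timestamps, rtol=1e-9):
    """Hypothetical helper: True if consecutive timestamps are equally spaced."""
    intervals = np.diff(np.asarray(timestamps, dtype=float))
    return bool(np.allclose(intervals, intervals[0], rtol=rtol, atol=0.0))


regular = timestamps_from_rate(n_samples=5, rate=2.0, starting_time=1.0)
irregular = [0.0, 0.04, 0.07, 0.1, 0.14, 0.16, 0.21]

print(regular)
print(is_regularly_sampled(regular), is_regularly_sampled(irregular))
```

Regularly spaced samples can be described compactly with rate and starting_time, while irregular ones need an explicit timestamps array.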
ImageSeries: Storing series of images as acquisition
ImageSeries is a general container for time series of images acquired during the experiment. Image data can be stored either in the HDF5 file or as an external image file.
When color images are stored in the HDF5 file, the color channel order is expected to be RGB.
We can add raw data to the NWBFile object as acquisition using the add_acquisition method.
# np.random.randint excludes `high`, so high=256 covers the full uint8 range (0-255)
image_data = np.random.randint(low=0, high=256, size=(200, 50, 50, 3), dtype=np.uint8)
behavior_images = ImageSeries(
    name="ImageSeries",
    data=image_data,
    description="Image data of an animal moving in environment.",
    unit="n.a.",
    format="raw",
    rate=1.0,
    starting_time=0.0,
)
nwbfile.add_acquisition(behavior_images)
External Files
External files (e.g. video files of the behaving animal) can be added to the NWBFile by creating an ImageSeries object whose external_file attribute specifies the path to the external file(s) on disk.
Each file path must be relative to the path of the NWB file.
Either external_file or data must be specified, but not both.
If the sampling rate is constant, use rate and starting_time to specify time.
For irregularly sampled recordings, use timestamps to specify the time of each sampled image.
Each external image file may contain one or more consecutive frames of the full ImageSeries.
The starting_frame attribute serves as an index to indicate the first frame that each file contains.
For example, if the external_file dataset has three paths to files, the first and second files have 2 frames each, and the third file has 3 frames, then this attribute will have values [0, 2, 4].
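# os.path.relpath treats its second argument as a directory,
# so pass the folder that contains the NWB file rather than the file itself
nwbfile_dir = os.path.dirname(nwbfile_path)
external_file = [
    os.path.relpath(movie_path, nwbfile_dir) for movie_path in moviefiles_path
]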
# The 3 movie files contain 2, 2, and 3 frames, so we specify one timestamp for each of the 7 frames.
timestamps = [0.0, 0.04, 0.07, 0.1, 0.14, 0.16, 0.21]
behavior_external_file = ImageSeries(
    name="ExternalFiles",
    description="Behavior video of animal moving in environment.",
    unit="n.a.",
    external_file=external_file,
    format="external",
    starting_frame=[0, 2, 4],
    timestamps=timestamps,
)
nwbfile.add_acquisition(behavior_external_file)
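The starting_frame values follow directly from the number of frames in each file: each entry is the running total of the frame counts of all preceding files. A minimal sketch using only the standard library (starting_frames is a hypothetical helper, not a PyNWB function):

```python
from itertools import accumulate


def starting_frames(frames_per_file):
    """Index of the first frame of each file within the full series."""
    # Running total of frame counts, shifted so that the first file starts at frame 0.
    return [0] + list(accumulate(frames_per_file))[:-1]


frames_per_file = [2, 2, 3]  # frame counts from the example above
starts = starting_frames(frames_per_file)
total_frames = sum(frames_per_file)  # number of timestamps needed
print(starts, total_frames)  # [0, 2, 4] 7
```

The total of 7 frames matches the 7 timestamps given in the example.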
Note
See the External Links in NWB and DANDI guidelines of the DANDI data archive for best practices on how to organize external files, e.g., movies and images.
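One detail worth checking when organizing external files is how the relative paths are computed. os.path.relpath interprets its start argument as a directory, so the folder containing the NWB file is the natural reference point. The paths below are hypothetical and exist only to show the difference.

```python
import os

# Hypothetical locations, for illustration only.
nwb_path = os.path.join("data", "session", "images_tutorial.nwb")
movie_path = os.path.join("data", "session", "image", "file_1.tiff")

# Relative to the folder that holds the NWB file.
relative_to_folder = os.path.relpath(movie_path, start=os.path.dirname(nwb_path))

# Passing the NWB file path itself is treated as if it were a folder,
# which adds a spurious ".." component.
relative_to_file = os.path.relpath(movie_path, start=nwb_path)

print(relative_to_folder)
print(relative_to_file)
```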
Static images
Static images can be stored in an NWBFile object by creating an RGBAImage, RGBImage or GrayscaleImage object with the image data.
Note
All basic image types (RGBAImage, RGBImage, and GrayscaleImage) provide two optional parameters: 1) description, to include a text description of the image, and 2) resolution, to specify the resolution of the image in pixels / cm.
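Image metadata often reports resolution in dots per inch (DPI) rather than pixels / cm. A small conversion sketch follows; dpi_to_pixels_per_cm is a hypothetical helper. The value 35.433071 used for the grayscale image later in this tutorial is consistent with 90 DPI, although the tutorial does not state how that value was obtained.

```python
CM_PER_INCH = 2.54


def dpi_to_pixels_per_cm(dpi):
    """Convert dots (pixels) per inch to pixels per centimeter."""
    return dpi / CM_PER_INCH


pixels_per_cm = dpi_to_pixels_per_cm(90)
print(round(pixels_per_cm, 6))  # 35.433071
```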
RGBAImage: for color images with transparency
RGBAImage is for storing color image data with transparency.
RGBAImage.data must be 3D, where the first and second dimensions represent x and y. The third dimension has length 4 and represents the RGBA value.
img = Image.open("docs/source/figures/logo_pynwb.png") # an example image
rgba_logo = RGBAImage(
    name="pynwb_RGBA_logo",
    data=np.array(img),
    resolution=70.0,  # in pixels / cm
    description="RGBA version of the PyNWB logo.",
)
RGBImage: for color images
RGBImage is for storing RGB color image data.
RGBImage.data must be 3D, where the first and second dimensions represent x and y. The third dimension has length 3 and represents the RGB value.
rgb_logo = RGBImage(
    name="pynwb_RGB_logo",
    data=np.array(img.convert("RGB")),
    resolution=70.0,  # in pixels / cm
    description="RGB version of the PyNWB logo.",
)
GrayscaleImage: for grayscale images
GrayscaleImage is for storing grayscale image data.
GrayscaleImage.data must be 2D, where the first and second dimensions represent x and y.
gs_logo = GrayscaleImage(
    name="pynwb_Grayscale_logo",
    data=np.array(img.convert("L")),
    description="Grayscale version of the PyNWB logo.",
    resolution=35.433071,  # in pixels / cm
)
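The three static image types differ only in the shape of their data. The sketch below shows the expected shapes with a small synthetic array in place of the logo file, using NumPy alone. The grayscale conversion uses ITU-R 601 luma weights, which approximate what Pillow's convert("L") does; treat the exact rounding as an assumption.

```python
import numpy as np

# Synthetic RGBA image: 4 x 6 pixels with channels (R, G, B, alpha).
rgba = np.zeros((4, 6, 4), dtype=np.uint8)
rgba[..., 0] = 255  # pure red
rgba[..., 3] = 128  # half-transparent

# Dropping the alpha channel gives RGB data with a third dimension of length 3.
rgb = rgba[..., :3]

# Assumption: ITU-R 601 luma weights approximate Pillow's "L" conversion.
luma_weights = np.array([0.299, 0.587, 0.114])
gray = np.rint(rgb @ luma_weights).astype(np.uint8)

print(rgba.shape, rgb.shape, gray.shape)  # (4, 6, 4) (4, 6, 3) (4, 6)
```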
Images: a container for images
Add the images to an Images container that accepts any of these image types.
images = Images(
    name="logo_images",
    images=[rgb_logo, rgba_logo, gs_logo],
    description="A collection of logo images presented to the subject.",
)
nwbfile.add_acquisition(images)
IndexSeries for repeated images
You may want to set up a time series of images where some images are repeated many times. You could create an ImageSeries that repeats the data each time the image is shown, but that would be inefficient, because it would store the same data multiple times. A better solution would be to store the unique images once and reference those images. This is how IndexSeries works. First, create an Images container with the order of images defined using an ImageReferences. Then create an IndexSeries that indexes into the Images.
from pynwb.base import ImageReferences
from pynwb.image import GrayscaleImage, Images, IndexSeries, RGBImage
images = Images(
    name="raccoons",
    images=[rgb_logo, gs_logo],
    description="A collection of raccoons.",
    order_of_images=ImageReferences("order_of_images", [rgb_logo, gs_logo]),
)
idx_series = IndexSeries(
    name="stimuli",
    data=[0, 1, 0, 1],
    indexed_images=images,
    unit="N/A",
    timestamps=[0.1, 0.2, 0.3, 0.4],
)
Here data contains the (0-indexed) indices of the displayed images, following the order in which the images are listed in the ImageReferences.
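The idea behind this indexing scheme can be illustrated with plain NumPy arrays, independent of PyNWB: the unique images are stored once, and an integer array selects which one was shown at each time point. The arrays below are synthetic stand-ins, not the logo images used above.

```python
import numpy as np

# Two distinct 50 x 50 RGB images standing in for the unique stimuli.
unique_images = np.stack([
    np.zeros((50, 50, 3), dtype=np.uint8),
    np.full((50, 50, 3), 255, dtype=np.uint8),
])
index_data = np.array([0, 1, 0, 1])  # same pattern as the IndexSeries data above

# Indexing with index_data reconstructs the sequence that was presented.
presented = unique_images[index_data]

# Storage needed when repeating frames versus storing unique images plus indices.
repeated_bytes = presented.nbytes
indexed_bytes = unique_images.nbytes + index_data.nbytes
print(presented.shape, repeated_bytes, indexed_bytes)
```

The saving grows with the number of repetitions, since each additional presentation costs one integer instead of one full image.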
Writing the images to an NWB File
As demonstrated in the Writing an NWB file tutorial, we will use NWBHDF5IO to write the file.
with NWBHDF5IO(nwbfile_path, "w") as io:
    io.write(nwbfile)
Reading and accessing data
To read the NWB file, use another NWBHDF5IO object, and use the read method to retrieve an NWBFile object.
We can access the data added as acquisition to the NWB File by indexing nwbfile.acquisition with the name of the ImageSeries object, "ImageSeries".
We can also access OpticalSeries data that was added to the NWB File as stimuli by indexing nwbfile.stimulus with the name of the OpticalSeries object, "StimulusPresentation".
Data arrays are read lazily from the file.
Accessing the data attribute of the OpticalSeries object does not read the data values into memory, but returns an HDF5 object that can be indexed to read data.
Use the [:] operator to read the entire data array into memory.
with NWBHDF5IO(nwbfile_path, "r") as io:
    read_nwbfile = io.read()
    print(read_nwbfile.acquisition["ImageSeries"])
    print(read_nwbfile.stimulus["StimulusPresentation"].data[:])
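For large image series, reading everything with [:] may not be desirable. Since the data attribute is an HDF5 dataset object, a slice reads only the requested frames. The sketch below shows this behavior with h5py directly on a small temporary file, assuming h5py is installed (PyNWB depends on it). The file name and dataset name are hypothetical and unrelated to the NWB file written above.

```python
import os
import tempfile

import h5py
import numpy as np

frames = np.random.default_rng(0).integers(0, 256, size=(200, 5, 5, 3), dtype=np.uint8)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "lazy_demo.h5")  # hypothetical file name
    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=frames)

    with h5py.File(path, "r") as f:
        dataset = f["data"]            # h5py.Dataset handle; pixel values are not loaded here
        dataset_shape = dataset.shape  # shape comes from metadata
        first_ten = dataset[:10]       # reads only the first 10 frames into a NumPy array
        everything = dataset[:]        # reads the whole array into memory

print(dataset_shape, first_ten.shape, type(first_ten).__name__)
```

The same slicing syntax applies to the data attribute of a series read with NWBHDF5IO, as long as the file is still open.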