Storing Image Data in NWB
This tutorial will demonstrate the usage of the pynwb.image module for adding
images to an NWBFile.
Image data can be a collection of individual images or movie segments (a movie is simply a series of images) showing the subject, the environment, the presented stimuli, or other parts of the experiment. This tutorial focuses in particular on the use of:
OpticalSeries and AbstractFeatureSeries, for series of images that were presented as stimulus;
ImageSeries, for series of images (movie segments);
GrayscaleImage, RGBImage, and RGBAImage, for static images.
The following examples will reference variables that may not be defined within the block they are used in. For clarity, we define them here:
from datetime import datetime
import os
from uuid import uuid4
import numpy as np
from dateutil import tz
from PIL import Image
from pynwb import NWBHDF5IO, NWBFile
from pynwb.base import Images
from pynwb.image import GrayscaleImage, ImageSeries, OpticalSeries, RGBAImage, RGBImage
from pynwb.misc import AbstractFeatureSeries
# Define file paths used in the tutorial
nwbfile_path = os.path.abspath("images_tutorial.nwb")
moviefiles_path = [
    os.path.abspath("image/file_1.tiff"),
    os.path.abspath("image/file_2.tiff"),
    os.path.abspath("image/file_3.tiff"),
]
Create an NWB File
Create an NWBFile object with the required fields
(session_description, identifier, session_start_time) and additional metadata.
session_start_time = datetime(2018, 4, 25, 2, 30, 3, tzinfo=tz.gettz("US/Pacific"))
nwbfile = NWBFile(
    session_description="my first synthetic recording",
    identifier=str(uuid4()),
    session_start_time=session_start_time,
    experimenter=[
        "Baggins, Bilbo",
    ],
    lab="Bag End Laboratory",
    institution="University of Middle Earth at the Shire",
    experiment_description="I went on an adventure to reclaim vast treasures.",
    session_id="LONELYMTN001",
)
nwbfile
See also
You can learn more about the NWBFile format in the NWB File Basics tutorial.
OpticalSeries: Storing series of images as stimuli
OpticalSeries is for time series of images that were presented
to the subject as stimuli.
We will create an OpticalSeries object with the name
"StimulusPresentation" representing what images were shown to the subject and at what times.
Image data can be stored either in the HDF5 file or as an external image file.
For this tutorial, we will use fake image data with a shape of ('time', 'x', 'y', 'RGB') = (200, 50, 50, 3).
As in all TimeSeries, the first dimension is time.
The second and third dimensions represent x and y.
The fourth dimension represents the RGB value (length of 3) for color images.
NWB differentiates between acquired data and data that was presented as stimulus.
We can add it to the NWBFile object as stimulus data using
the add_stimulus method.
# high is exclusive, so 256 covers the full uint8 range 0-255
image_data = np.random.randint(low=0, high=256, size=(200, 50, 50, 3), dtype=np.uint8)
optical_series = OpticalSeries(
    name="StimulusPresentation",  # required
    distance=0.7,  # required
    field_of_view=[0.2, 0.3, 0.7],  # required
    orientation="lower left",  # required
    data=image_data,
    unit="n.a.",
    format="raw",
    starting_frame=[0.0],
    rate=1.0,
    comments="no comments",
    description="The images presented to the subject as stimuli",
)
nwbfile.add_stimulus(stimulus=optical_series)
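The dimension order described above can be checked before a series is constructed. The following is a minimal sketch that relies only on NumPy; the helper name `describe_frames` is hypothetical and is not part of the PyNWB API.

```python
import numpy as np


def describe_frames(data):
    """Summarize an image stack laid out as (time, x, y[, channel])."""
    if data.ndim not in (3, 4):
        raise ValueError("expected (time, x, y) or (time, x, y, channel) data")
    n_frames, size_x, size_y = data.shape[:3]
    # Grayscale stacks have no channel axis; treat them as one channel.
    n_channels = data.shape[3] if data.ndim == 4 else 1
    return {"n_frames": n_frames, "x": size_x, "y": size_y, "channels": n_channels}


# Same layout as the fake stimulus in this tutorial: 200 RGB frames of 50 x 50 pixels.
stimulus = np.zeros((200, 50, 50, 3), dtype=np.uint8)
summary = describe_frames(stimulus)
```

A check like this catches a common mistake, such as placing the channel axis first, before the data is written to the file.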
AbstractFeatureSeries: Storing features of visual stimuli
While it is usually recommended to store the entire image data as an OpticalSeries, sometimes
it is useful to store features of the visual stimuli instead of or in addition to the raw image data. For example,
you may want to store the mean luminance of the image, the contrast, or the spatial frequency. This can be done using
an instance of AbstractFeatureSeries. This class is a general container for storing time
series of features that are derived from the raw image data.
# Create some fake feature data
feature_data = np.random.rand(200, 3) # 200 time points, 3 features
# Create an AbstractFeatureSeries object
abstract_feature_series = AbstractFeatureSeries(
    name="StimulusFeatures",
    data=feature_data,
    timestamps=np.linspace(0, 1, 200),
    description="Features of the visual stimuli",
    features=["luminance", "contrast", "spatial frequency"],
    feature_units=["n.a.", "n.a.", "cycles/degree"],
)
# Add the AbstractFeatureSeries to the NWBFile
nwbfile.add_stimulus(abstract_feature_series)
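The tutorial uses random numbers as feature data. As an illustration of how such features might be derived from real frames, the sketch below computes two of them with NumPy. The definitions are assumptions chosen for simplicity (luminance as the mean gray value scaled to [0, 1], contrast as the standard deviation of the gray values), and `frame_features` is a hypothetical helper, not a PyNWB function.

```python
import numpy as np


def frame_features(frames):
    """Return an (n_frames, 2) array of [mean luminance, RMS contrast]."""
    # Average the color channels to get one gray value per pixel, scaled to [0, 1].
    gray = frames.astype(np.float64).mean(axis=-1) / 255.0
    luminance = gray.mean(axis=(1, 2))  # one value per frame
    contrast = gray.std(axis=(1, 2))    # one value per frame
    return np.column_stack([luminance, contrast])


rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(200, 50, 50, 3), dtype=np.uint8)
features = frame_features(frames)  # shape (200, 2): time first, then features
```

The resulting array has time as its first dimension and one column per feature, which is the layout the `data` and `features` arguments above expect.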
Like all TimeSeries, an AbstractFeatureSeries specifies timing using
either the rate and starting_time attributes or the timestamps attribute.
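The two ways of specifying time describe the same information when sampling is regular. A small sketch of that relationship, assuming a rate in Hz and times in seconds; the helper name `timestamps_from_rate` is hypothetical.

```python
import numpy as np


def timestamps_from_rate(n_samples, rate, starting_time=0.0):
    """Expand a constant sampling rate (Hz) into one timestamp per sample (s)."""
    return starting_time + np.arange(n_samples) / rate


# Five samples at 2 Hz, starting one second into the session.
regular = timestamps_from_rate(n_samples=5, rate=2.0, starting_time=1.0)
```

Storing rate and starting_time is more compact for regularly sampled data, while explicit timestamps are needed when the intervals between samples vary.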
ImageSeries: Storing series of images as acquisition
ImageSeries is a general container for time series of images acquired during
the experiment. Image data can be stored either in the HDF5 file or as an external image file.
When color images are stored in the HDF5 file the color channel order is expected to be RGB.
We can add raw data to the NWBFile object as acquisition using
the add_acquisition method.
# high is exclusive, so 256 covers the full uint8 range 0-255
image_data = np.random.randint(low=0, high=256, size=(200, 50, 50, 3), dtype=np.uint8)
behavior_images = ImageSeries(
    name="ImageSeries",
    data=image_data,
    description="Image data of an animal moving in environment.",
    unit="n.a.",
    format="raw",
    rate=1.0,
    starting_time=0.0,
)
nwbfile.add_acquisition(behavior_images)
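Because the color channel order is expected to be RGB, frames that come from a source using a different order need to be rearranged first. The sketch below assumes, purely for illustration, frames stored in BGR order (as some video libraries produce); `bgr_to_rgb` is a hypothetical helper.

```python
import numpy as np


def bgr_to_rgb(frames):
    """Reverse the last (channel) axis of a (time, x, y, 3) array."""
    if frames.shape[-1] != 3:
        raise ValueError("expected three color channels in the last dimension")
    # Reversing the channel axis swaps B and R; G stays in the middle.
    return np.ascontiguousarray(frames[..., ::-1])


bgr = np.zeros((2, 4, 4, 3), dtype=np.uint8)
bgr[..., 0] = 255  # pure blue when interpreted in BGR order
rgb = bgr_to_rgb(bgr)
```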
External Files
External files (e.g. video files of the behaving animal) can be added to the NWBFile
by creating an ImageSeries object using the
external_file attribute that specifies
the path to the external file(s) on disk.
Each file path must be relative to the path of the NWB file.
Either external_file or data must be specified, but not both.
If the sampling rate is constant, use rate and
starting_time to specify time.
For irregularly sampled recordings, use timestamps to specify time for each sample
image.
Each external image file may contain one or more consecutive frames of the full ImageSeries.
The starting_frame attribute lists, for each file, the index of its first frame
within the full series.
For example, if the external_file dataset has three paths to files and the first and the second file have 2
frames, and the third file has 3 frames, then this attribute will have values [0, 2, 4].
# os.path.relpath treats its second argument as a directory, so pass the folder containing the NWB file
external_file = [
    os.path.relpath(movie_path, os.path.dirname(nwbfile_path)) for movie_path in moviefiles_path
]
# The 3 movie files contain 7 frames in total, so we specify one timestamp per frame.
timestamps = [0.0, 0.04, 0.07, 0.1, 0.14, 0.16, 0.21]
behavior_external_file = ImageSeries(
    name="ExternalFiles",
    description="Behavior video of animal moving in environment.",
    unit="n.a.",
    external_file=external_file,
    format="external",
    starting_frame=[0, 2, 4],
    timestamps=timestamps,
)
nwbfile.add_acquisition(behavior_external_file)
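The starting_frame values follow directly from the number of frames in each file. A minimal sketch of that calculation using only the standard library; `starting_frames` is a hypothetical helper, and the frame counts are the ones from the example above.

```python
from itertools import accumulate


def starting_frames(frames_per_file):
    """Return the index of the first frame stored in each external file."""
    if not frames_per_file:
        return []
    # Running total shifted by one position: the first file always starts at frame 0.
    return [0] + list(accumulate(frames_per_file))[:-1]


frames_per_file = [2, 2, 3]
starts = starting_frames(frames_per_file)
total_frames = sum(frames_per_file)  # number of timestamps needed
```

With two, two, and three frames per file, the files start at frames 0, 2, and 4, and the series needs seven timestamps in total.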
Note
See the External Links in NWB and DANDI guidelines of the DANDI data archive for best practices on how to organize external files, e.g., movies and images.
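When building paths relative to the NWB file, note that `os.path.relpath` interprets its second argument as a directory. The sketch below uses hypothetical locations to show the difference between passing the folder that contains the NWB file and passing the file path itself.

```python
import os

# Hypothetical locations, used only for illustration.
nwb_path = os.path.join("data", "session", "images_tutorial.nwb")
movie_path = os.path.join("data", "session", "image", "file_1.tiff")

# Relative to the folder that contains the NWB file.
relative_to_folder = os.path.relpath(movie_path, os.path.dirname(nwb_path))

# Passing the file path itself treats it like a folder and adds a spurious "..".
relative_to_file = os.path.relpath(movie_path, nwb_path)
```

Only the first form yields a path that resolves correctly when the NWB file and the movie folder are moved together.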
Static images
Static images can be stored in an NWBFile object by creating an
RGBAImage, RGBImage or
GrayscaleImage object with the image data.
Note
All basic image types (RGBAImage, RGBImage, and
GrayscaleImage) accept two optional parameters: 1) description, a
text description of the image, and 2) resolution, the resolution of the image
in pixels / cm.
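Image metadata often reports resolution in dots per inch (DPI), whereas the resolution parameter is given in pixels / cm. A small conversion sketch; the helper name `dpi_to_pixels_per_cm` and the 90 DPI input are assumptions chosen for illustration.

```python
CM_PER_INCH = 2.54


def dpi_to_pixels_per_cm(dpi):
    """Convert a resolution in pixels per inch to pixels per centimeter."""
    return dpi / CM_PER_INCH


# 90 DPI corresponds to roughly 35.43 pixels / cm.
resolution = round(dpi_to_pixels_per_cm(90), 6)
```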
RGBAImage: for color images with transparency
RGBAImage is for storing color image data with transparency.
RGBAImage.data must be 3D where the first and second dimensions
represent x and y. The third dimension has length 4 and represents the RGBA value.
img = Image.open("docs/source/figures/logo_pynwb.png") # an example image
rgba_logo = RGBAImage(
    name="pynwb_RGBA_logo",
    data=np.array(img),
    resolution=70.0,  # in pixels / cm
    description="RGBA version of the PyNWB logo.",
)
RGBImage: for color images
RGBImage is for storing RGB color image data.
RGBImage.data must be 3D where the first and second dimensions
represent x and y. The third dimension has length 3 and represents the RGB value.
rgb_logo = RGBImage(
    name="pynwb_RGB_logo",
    data=np.array(img.convert("RGB")),
    resolution=70.0,
    description="RGB version of the PyNWB logo.",
)
GrayscaleImage: for grayscale images
GrayscaleImage is for storing grayscale image data.
GrayscaleImage.data must be 2D where the first and second dimensions represent x and y.
gs_logo = GrayscaleImage(
    name="pynwb_Grayscale_logo",
    data=np.array(img.convert("L")),
    description="Grayscale version of the PyNWB logo.",
    resolution=35.433071,
)
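The shape requirements of the three static image types can be illustrated without loading a file. The sketch below builds a synthetic RGBA array with NumPy and derives RGB and grayscale versions from it. The unweighted channel mean used for the grayscale conversion is a simplifying assumption; it is not the luminance-weighted formula that `img.convert("L")` applies.

```python
import numpy as np

# Synthetic 32 x 24 RGBA image standing in for a loaded file.
rgba = np.zeros((32, 24, 4), dtype=np.uint8)
rgba[..., 0] = 200  # red channel
rgba[..., 3] = 255  # fully opaque alpha channel

# Dropping the alpha channel gives the 3D layout RGBImage expects.
rgb = rgba[..., :3]

# Collapsing the channel axis gives the 2D layout GrayscaleImage expects.
gray = rgb.mean(axis=-1).astype(np.uint8)
```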
Images: a container for images
Add the images to an Images container
that accepts any of these image types.
images = Images(
    name="logo_images",
    images=[rgb_logo, rgba_logo, gs_logo],
    description="A collection of logo images presented to the subject.",
)
nwbfile.add_acquisition(images)
IndexSeries for repeated images
You may want to set up a time series of images where some images are repeated many
times. You could create an ImageSeries that repeats the data
each time the image is shown, but that would be inefficient, because it would store
the same data multiple times. A better solution would be to store the unique images
once and reference those images. This is how IndexSeries
works. First, create an Images container with the order of
images defined using an ImageReferences. Then create an
IndexSeries that indexes into the
Images.
from pynwb.base import ImageReferences, Images
from pynwb.image import GrayscaleImage, IndexSeries, RGBImage
images = Images(
    name="raccoons",
    images=[rgb_logo, gs_logo],
    description="A collection of raccoons.",
    order_of_images=ImageReferences("order_of_images", [rgb_logo, gs_logo]),
)
idx_series = IndexSeries(
    name="stimuli",
    data=[0, 1, 0, 1],
    indexed_images=images,
    unit="N/A",
    timestamps=[0.1, 0.2, 0.3, 0.4],
)
Here data contains the (0-indexed) index of each displayed image, following the order
defined in the ImageReferences.
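The idea behind index-based storage can be sketched with plain NumPy arrays, independent of PyNWB: the unique images are kept once, and the presentation sequence is a short array of indices. The array sizes below are assumptions chosen for illustration.

```python
import numpy as np

# Two unique 50 x 50 grayscale images, stored once.
unique_images = np.stack([
    np.zeros((50, 50), dtype=np.uint8),
    np.full((50, 50), 255, dtype=np.uint8),
])

# Presentation order (0-indexed), analogous to the data of an IndexSeries.
presentation_index = np.array([0, 1, 0, 1])

# Expanding the indices reproduces the full sequence only when it is needed.
presented = unique_images[presentation_index]

stored_bytes = unique_images.nbytes + presentation_index.nbytes
expanded_bytes = presented.nbytes
```

Even in this tiny case the indexed form is smaller than the expanded sequence, and the gap grows with every additional repetition.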
Writing the images to an NWB File
As demonstrated in the Writing an NWB file tutorial, we will use NWBHDF5IO
to write the file.
with NWBHDF5IO(nwbfile_path, "w") as io:
    io.write(nwbfile)
Reading and accessing data
To read the NWB file, use another NWBHDF5IO object,
and use the read method to retrieve an
NWBFile object.
We can access the data added as acquisition to the NWB File by indexing nwbfile.acquisition
with the name of the ImageSeries object “ImageSeries”.
We can also access OpticalSeries data that was added to the NWB File
as stimuli by indexing nwbfile.stimulus with the name of the OpticalSeries
object “StimulusPresentation”.
Data arrays are read lazily from the file.
Accessing the data attribute of the OpticalSeries object
does not read the data values into memory, but returns an HDF5 object that can be indexed to read data.
Use the [:] operator to read the entire data array into memory.
with NWBHDF5IO(nwbfile_path, "r") as io:
    read_nwbfile = io.read()
    print(read_nwbfile.acquisition["ImageSeries"])
    print(read_nwbfile.stimulus["StimulusPresentation"].data[:])