NWB basics

This example covers the basics of working with an NWBFile object, including writing and reading an NWB file.

The NWB file

from datetime import datetime
from dateutil.tz import tzlocal
from pynwb import NWBFile
import numpy as np

start_time = datetime(2017, 4, 3, 11, tzinfo=tzlocal())
create_date = datetime(2017, 4, 15, 12, tzinfo=tzlocal())

nwbfile = NWBFile(session_description='demonstrate NWBFile basics',  # required
                  identifier='NWB123',  # required
                  session_start_time=start_time,  # required
                  file_create_date=create_date)  # optional

Time series data

PyNWB stores time series data using the TimeSeries class and its subclasses. The main components of a TimeSeries are the data and the timestamps. You will also need to supply the name and unit of measurement for data.

from pynwb import TimeSeries

data = list(range(100, 200, 10))
timestamps = list(range(10))
test_ts = TimeSeries(name='test_timeseries', data=data, unit='m', timestamps=timestamps)

Alternatively, if your recordings are sampled at a uniform rate, you can supply starting_time and rate.

rate_ts = TimeSeries(name='test_timeseries', data=data, unit='m', starting_time=0.0, rate=1.0)

With this scheme, the TimeSeries starts recording 0 seconds after the session start time (start_time above) stored in the NWBFile and is sampled once per second, since rate is given in Hz.
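To make the relationship between starting_time, rate and per-sample times concrete, here is a minimal plain-Python sketch. The helper implied_timestamps is hypothetical and not part of PyNWB; it only illustrates the arithmetic, assuming rate is in Hz and starting_time is in seconds.

```python
# Sketch: reconstruct the timestamps implied by starting_time and rate.
# `implied_timestamps` is a hypothetical helper, not a PyNWB function.

def implied_timestamps(n_samples, starting_time, rate):
    """Return the time in seconds of each sample of a uniformly sampled series."""
    return [starting_time + i / rate for i in range(n_samples)]


data = list(range(100, 200, 10))  # the same 10 samples used in the example above
times = implied_timestamps(len(data), starting_time=0.0, rate=1.0)
print(times)
```

With starting_time=0.0 and rate=1.0 the result matches the explicit timestamps passed to test_ts, which is why the two constructions describe the same recording.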

TimeSeries objects can be added directly to your NWBFile using the methods add_acquisition, add_stimulus and add_stimulus_template. Which method you use depends on the source of the data: use add_acquisition to indicate acquisition data, add_stimulus to indicate stimulus data, and add_stimulus_template to store stimulus templates.

nwbfile.add_acquisition(test_ts)

Access the TimeSeries object ‘test_timeseries’ from acquisition using

nwbfile.acquisition['test_timeseries']

or

nwbfile.get_acquisition('test_timeseries')

Writing an NWB file

NWB I/O is carried out using the NWBHDF5IO class [1]. This class is responsible for mapping an NWBFile object into HDF5 according to the NWB schema.

To write an NWBFile, use the write method.

from pynwb import NWBHDF5IO

io = NWBHDF5IO('example_file_path.nwb', mode='w')
io.write(nwbfile)
io.close()

You can also use NWBHDF5IO as a context manager:

with NWBHDF5IO('example_file_path.nwb', 'w') as io:
    io.write(nwbfile)

Reading an NWB file

As with writing, reading is also carried out using the NWBHDF5IO class. To read the NWB file we just wrote, use another NWBHDF5IO object, and use the read method to retrieve an NWBFile object.

io = NWBHDF5IO('example_file_path.nwb', 'r')
nwbfile_in = io.read()

Retrieving data from an NWB file

test_timeseries_in = nwbfile_in.acquisition['test_timeseries']
print(test_timeseries_in)
test_timeseries <class 'pynwb.base.TimeSeries'>
Fields:
  comments: no comments
  conversion: 1.0
  data: <HDF5 dataset "data": shape (10,), type "<i8">
  description: no description
  interval: 1
  resolution: 0.0
  timestamps: <HDF5 dataset "timestamps": shape (10,), type "<f8">
  timestamps_unit: Seconds
  unit: m

Accessing the data field, you will notice that it does not return the data values, but instead an HDF5 dataset.

print(test_timeseries_in.data)
<HDF5 dataset "data": shape (10,), type "<i8">

This object lets you read in only a section of the dataset without loading the entire thing.

print(test_timeseries_in.data[:2])
[100 110]

To load the entire dataset, use [:].

print(test_timeseries_in.data[:])
io.close()
[100 110 120 130 140 150 160 170 180 190]

If you use NWBHDF5IO as a context manager when reading, be aware that the NWBHDF5IO object is closed when the context exits, so data that has not already been loaded into memory will not be available outside of the context manager [2].

Adding More Data

The following illustrates basic data organizational structures that are used throughout NWB:N.

Reusing timestamps

When working with multi-modal data, it can be convenient and efficient to store timestamps once and associate multiple datasets with that single timestamps instance. PyNWB enables this by letting you reuse timestamps across TimeSeries objects. To reuse the timestamps of an existing TimeSeries in a new TimeSeries, pass the existing TimeSeries as the timestamps argument of the new one:

data = list(range(101, 201, 10))
reuse_ts = TimeSeries('reusing_timeseries', data, 'SIunit', timestamps=test_ts)

Data interfaces

NWB provides the concept of a data interface, an object that serves as a standard storage location for a specific type of data, through the NWBDataInterface class. For example, Position provides a container that holds one or more SpatialSeries objects. SpatialSeries is a subtype of TimeSeries that represents the spatial position of an animal over time. By putting your position data into a Position container, downstream users and tools know where to look to retrieve position data. For a comprehensive list of available data interfaces, see the overview page. Here is how to create a Position object named ‘Position’ [3].

from pynwb.behavior import Position

position = Position()

You can add objects to a data interface as a method of the data interface:

position.create_spatial_series(name='position1',
                               data=np.linspace(0, 1, 20),
                               rate=50.,
                               reference_frame='starting gate')

or you can add pre-existing objects:

from pynwb.behavior import SpatialSeries

spatial_series = SpatialSeries(name='position2',
                               data=np.linspace(0, 1, 20),
                               rate=50.,
                               reference_frame='starting gate')

position.add_spatial_series(spatial_series)

or include the object during construction:

spatial_series = SpatialSeries(name='position2',
                               data=np.linspace(0, 1, 20),
                               rate=50.,
                               reference_frame='starting gate')

position = Position(spatial_series=spatial_series)

Each data interface stores its own type of data. We suggest you read the documentation for the data interface of interest in the API documentation to figure out what data the data interface allows and/or requires and what methods you will need to call to add this data.

Processing modules

Processing modules store a set of data interfaces that are related to a particular processing workflow. For example, if you want to store the intermediate results of a spike sorting workflow, you could create a ProcessingModule that contains data interfaces representing the common first steps in spike sorting, such as EventDetection, EventWaveform and FeatureExtraction. The final results of the sorting could then be stored in the top-level Units table (see below). Derived preprocessed data should go in a processing module, which you can create using create_processing_module:

behavior_module = nwbfile.create_processing_module(name='behavior',
                                                   description='preprocessed behavioral data')

or by directly calling the constructor and adding to the NWBFile using add_processing_module:

from pynwb import ProcessingModule

ecephys_module = ProcessingModule(name='ecephys',
                                  description='preprocessed extracellular electrophysiology')
nwbfile.add_processing_module(ecephys_module)

Best practice is to use the NWB schema module names as processing module names where appropriate. These are: ‘behavior’, ‘ecephys’, ‘icephys’, ‘ophys’, ‘ogen’, ‘retinotopy’, and ‘misc’. You may also create a processing module with a custom name. Once these processing modules are added, access them with

nwbfile.processing

which returns a dict:

{'behavior':
 behavior <class 'pynwb.base.ProcessingModule'>
 Fields:
   data_interfaces: { Position <class 'pynwb.behavior.Position'> }
   description: preprocessed behavioral data, 'ecephys':
 ecephys <class 'pynwb.base.ProcessingModule'>
 Fields:
   data_interfaces: { }
   description: preprocessed extracellular electrophysiology}

NWBDataInterface objects can be added to the behavior ProcessingModule.

nwbfile.processing['behavior'].add(position)

Epochs

Epochs can be added to an NWB file using the method add_epoch. The first and second arguments are the start and stop times, respectively. The third argument is one or more tags for labelling the epoch, and the fourth argument is a list of all the TimeSeries that the epoch applies to.

nwbfile.add_epoch(2.0, 4.0, ['first', 'example'], [test_ts, ])
nwbfile.add_epoch(6.0, 8.0, ['second', 'example'], [test_ts, ])
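To illustrate what it means for an epoch to apply to a TimeSeries, the plain-Python sketch below selects the samples of test_ts that fall inside each epoch. The helper samples_in_epoch is hypothetical and not part of PyNWB, and treating both epoch bounds as inclusive is an assumption made for this illustration.

```python
# Hypothetical helper: pick the samples of a series that fall inside an epoch.
# Both bounds are treated as inclusive, which is an assumption of this sketch.
def samples_in_epoch(timestamps, data, start_time, stop_time):
    return [value for t, value in zip(timestamps, data)
            if start_time <= t <= stop_time]


timestamps = list(range(10))       # same timestamps as test_ts
data = list(range(100, 200, 10))   # same data as test_ts

first = samples_in_epoch(timestamps, data, 2.0, 4.0)
second = samples_in_epoch(timestamps, data, 6.0, 8.0)
print(first, second)
```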

Trials

Trials can be added to an NWB file using the methods add_trial and add_trial_column. Together, these methods maintain a table-like structure that can define arbitrary columns without having to go through the extension process.

By default, NWBFile only requires a start time and a stop time for each trial. Additional columns can be added using add_trial_column. This method takes a name for the column and a description of what the column stores. You do not need to supply a data type, as this will be inferred. Once all columns have been added, trial data can be populated using add_trial.

Let's add an additional column and some trial data.

nwbfile.add_trial_column(name='stim', description='the visual stimuli during the trial')

nwbfile.add_trial(start_time=0.0, stop_time=2.0, stim='person')
nwbfile.add_trial(start_time=3.0, stop_time=5.0, stim='ocean')
nwbfile.add_trial(start_time=6.0, stop_time=8.0, stim='desert')

Tabular data such as trials can be converted to a pandas.DataFrame.

print(nwbfile.trials.to_dataframe())
    start_time  stop_time    stim
id
0          0.0        2.0  person
1          3.0        5.0   ocean
2          6.0        8.0  desert

Units

Units are putative cells in your analysis. Unit metadata can be added to an NWB file using the methods add_unit and add_unit_column. These methods work like the methods for adding trials described above.

A unit is only required to contain a unique integer identifier in the ‘id’ column (this will be automatically assigned if not provided). Additional optional values for each unit include: spike_times, electrodes, electrode_group, obs_intervals, waveform_mean, and waveform_sd. Additional user-defined columns can be added using add_unit_column. Like add_trial_column, this method also takes a name for the column, a description of what the column stores and does not need a data type. Once all columns have been added, unit data can be populated using add_unit.

When providing spike_times, you may also wish to specify the time intervals during which the unit was being observed, so that it is possible to distinguish times when the unit was silent from times when the unit was not being recorded (and thus correctly compute firing rates, for example). This information should be provided as a list of [start, end] time pairs in the obs_intervals field. If obs_intervals is provided, then all entries in spike_times should occur within one of the listed intervals. In the example below, all 3 units are observed during the time period from 1 to 10 s and fire spikes during that period. Units 2 and 3 are also observed during the time period from 20 to 30 s, but only unit 2 fires spikes in that period.

Let's specify some unit metadata and then add some units:

nwbfile.add_unit_column('location', 'the anatomical location of this unit')
nwbfile.add_unit_column('quality', 'the quality for the inference of this unit')

nwbfile.add_unit(id=1, spike_times=[2.2, 3.0, 4.5],
                 obs_intervals=[[1, 10]], location='CA1', quality=0.95)
nwbfile.add_unit(id=2, spike_times=[2.2, 3.0, 25.0, 26.0],
                 obs_intervals=[[1, 10], [20, 30]], location='CA3', quality=0.85)
nwbfile.add_unit(id=3, spike_times=[1.2, 2.3, 3.3, 4.5],
                 obs_intervals=[[1, 10], [20, 30]], location='CA1', quality=0.90)

Now we overwrite the file with all of the data:

with NWBHDF5IO('example_file_path.nwb', 'w') as io:
    io.write(nwbfile)

Note

The Units table has some predefined optional columns. Please review the documentation for add_unit before adding custom columns.
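The role of obs_intervals in computing firing rates, described earlier in this section, can be sketched in plain Python using the same spike times and intervals as the three example units. The helpers spikes_are_observed and mean_firing_rate are hypothetical and not part of PyNWB; treating interval bounds as inclusive is an assumption of this sketch.

```python
# Spike times and observation intervals copied from the three example units.
units = {
    1: {'spike_times': [2.2, 3.0, 4.5], 'obs_intervals': [[1, 10]]},
    2: {'spike_times': [2.2, 3.0, 25.0, 26.0], 'obs_intervals': [[1, 10], [20, 30]]},
    3: {'spike_times': [1.2, 2.3, 3.3, 4.5], 'obs_intervals': [[1, 10], [20, 30]]},
}


def spikes_are_observed(spike_times, obs_intervals):
    """Check that every spike lies inside at least one observation interval."""
    return all(any(start <= t <= end for start, end in obs_intervals)
               for t in spike_times)


def mean_firing_rate(spike_times, obs_intervals):
    """Spikes per second, counting only the time the unit was observed."""
    observed_seconds = sum(end - start for start, end in obs_intervals)
    return len(spike_times) / observed_seconds


rates = {}
for unit_id, unit in units.items():
    if not spikes_are_observed(unit['spike_times'], unit['obs_intervals']):
        raise ValueError(f'unit {unit_id} has spikes outside obs_intervals')
    rates[unit_id] = mean_firing_rate(unit['spike_times'], unit['obs_intervals'])
    print(unit_id, round(rates[unit_id], 3))
```

Unit 1 is observed for 9 s, while units 2 and 3 are observed for 19 s, so the rates are 3/9 and 4/19 spikes per second. Without obs_intervals, the silent 20 to 30 s period of unit 3 could not be told apart from a period with no recording.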

Appending to an NWB file

Using the functionality discussed above, NWB allows appending to files. To append to a file, you must read the file, add new components, and then write the file. Reading and writing are carried out using NWBHDF5IO. When reading the NWBFile, you must specify that you intend to modify it by setting the mode argument in the NWBHDF5IO constructor to 'a'. After you have read the file, you can add [4] new data to it using the standard write/add functionality demonstrated above.

Let’s see how this works by adding another SpatialSeries to the Position interface we created above.

First, read the file and get the interface object.

io = NWBHDF5IO('example_file_path.nwb', mode='a')
nwbfile = io.read()
position = nwbfile.processing['behavior'].data_interfaces['Position']

Next, add a new SpatialSeries.

data = list(range(300, 400, 10))
timestamps = list(range(10))
test_spatial_series = SpatialSeries('test_spatialseries2', data,
                                    reference_frame='starting_gate',
                                    timestamps=timestamps)
position.add_spatial_series(test_spatial_series)

Finally, write the changes back to the file and close it.

io.write(nwbfile)
io.close()

[1] HDF5 is currently the only backend supported by NWB.
[2] Neurodata sets can be very large, so individual components of the dataset are only loaded into memory when you request them. This functionality is only possible if an open file handle is kept around until users want to load data.
[3] Some data interface objects have a default name. This default name is the type of the data interface. For example, the default name for ImageSegmentation is “ImageSegmentation” and the default name for EventWaveform is “EventWaveform”.
[4] NWB only supports adding to files. Removal and modifying of existing data is not allowed.
