Extracellular Electrophysiology Data

The following tutorial describes storage of extracellular electrophysiology data in NWB. The workflow demonstrated here involves four main steps:

  1. Create the electrodes table

  2. Add acquired raw voltage data

  3. Add LFP data

  4. Add spike data

This tutorial assumes that transforming data between these states is done by users–PyNWB does not provide analysis functionality. It is recommended to cover NWB File Basics before this tutorial.

The following examples will reference variables that may not be defined within the block they are used in. For clarity, we define them here:

from datetime import datetime
from dateutil.tz import tzlocal

import numpy as np
from pynwb import NWBFile, NWBHDF5IO
from pynwb.ecephys import ElectricalSeries, LFP

Creating and Writing NWB files

When creating a NWB file, the first step is to create the NWBFile.

nwbfile = NWBFile(
    session_description="my first synthetic recording",
    identifier="EXAMPLE_ID",
    session_start_time=datetime.now(tzlocal()),
    experimenter="Dr. Bilbo Baggins",
    lab="Bag End Laboratory",
    institution="University of Middle Earth at the Shire",
    experiment_description="I went on an adventure with thirteen dwarves "
    "to reclaim vast treasures.",
    session_id="LONELYMTN",
)

Electrodes Table

In order to store extracellular electrophysiology data, you first must create an electrodes table describing the electrodes that generated this data. Extracellular electrodes are stored in an "electrodes" table, which is a DynamicTable.

electrodes table UML diagram

Before creating an ElectrodeGroup, you need to provide some information about the device that was used to record from the electrode. This is done by creating a Device object using the instance method create_device.

device = nwbfile.create_device(
    name="array", description="the best array", manufacturer="Probe Company 9000"
)

Once you have created the Device, you can create an ElectrodeGroup. Then you can add electrodes one-at-a-time with add_electrode. add_electrode has two required arguments, group, which takes an ElectrodeGroup, and location, which takes a string. It also has a number of optional metadata fields for electrode features (e.g, x, y, z, imp, and filtering). Since this table is a DynamicTable, we can add additional user-specified metadata fields as well. We will be adding a "label" column to the table. Use the following code to add electrodes for an array with 4 shanks and 3 channels per shank.

nwbfile.add_electrode_column(name="label", description="label of electrode")

nshanks = 4
nchannels_per_shank = 3
electrode_counter = 0

for ishank in range(nshanks):
    # create an electrode group for this shank
    electrode_group = nwbfile.create_electrode_group(
        name="shank{}".format(ishank),
        description="electrode group for shank {}".format(ishank),
        device=device,
        location="brain area",
    )
    # add electrodes to the electrode table
    for ielec in range(nchannels_per_shank):
        nwbfile.add_electrode(
            group=electrode_group,
            label="shank{}elec{}".format(ishank, ielec),
            location="brain area",
        )
        electrode_counter += 1

Similarly to the trials table, we can view the electrodes table in tabular form by converting it to a pandas DataFrame.

nwbfile.electrodes.to_dataframe()

Note

When we added an electrode with the add_electrode method, we passed in the ElectrodeGroup object for the "group" argument. This creates a reference from the "electrodes" table to the individual ElectrodeGroup objects, one per row (electrode).

Extracellular recordings

Raw voltage traces and local-field potential (LFP) data are stored in ElectricalSeries objects. ElectricalSeries is a subclass of TimeSeries specialized for voltage data. To create the ElectricalSeries objects, we need to reference a set of rows in the "electrodes" table to indicate which electrodes were recorded. We will do this by creating a DynamicTableRegion, which is a type of link that allows you to reference :py:meth:~pynwb.file.NWBFile.create_electrode_table_region` is a convenience function that creates a DynamicTableRegion which references the "electrodes" table.

all_table_region = nwbfile.create_electrode_table_region(
    region=list(range(electrode_counter)),  # reference row indices 0 to N-1
    description="all electrodes",
)

Raw voltage data

Now create an ElectricalSeries object to store raw data collected during the experiment, passing in this "all_table_region" DynamicTableRegion reference to all rows of the electrodes table.

electrical series UML diagram
raw_data = np.random.randn(50, 4)
raw_electrical_series = ElectricalSeries(
    name="ElectricalSeries",
    data=raw_data,
    electrodes=all_table_region,
    starting_time=0.0,  # timestamp of the first sample in seconds relative to the session start time
    rate=20000.0,  # in Hz
)

NWB organizes data into different groups depending on the type of data. Groups can be thought of as folders within the file. Here are some of the groups within an NWBFile and the types of data they are intended to store:

  • acquisition: raw, acquired data that should never change

  • processing: processed data, typically the results of preprocessing algorithms and could change

  • analysis: results of data analysis

  • stimuli: stimuli used in the experiment (e.g., images, videos, light pulses)

Since this ElectricalSeries represents raw data from the data acquisition system, we will add it to the acquisition group of the NWBFile.

nwbfile.add_acquisition(raw_electrical_series)

LFP

Now create an ElectricalSeries object to store LFP data collected during the experiment, again passing in the DynamicTableRegion reference to all rows of the "electrodes" table.

lfp_data = np.random.randn(50, 4)
lfp_electrical_series = ElectricalSeries(
    name="ElectricalSeries",
    data=lfp_data,
    electrodes=all_table_region,
    starting_time=0.0,
    rate=200.0,
)

To help data analysis and visualization tools know that this ElectricalSeries object represents LFP data, store the ElectricalSeries object inside of an LFP object. This is analogous to how we can store the SpatialSeries object inside of a Position object.

LFP UML diagram
lfp = LFP(electrical_series=lfp_electrical_series)

Unlike the raw data, which we put into the acquisition group of the NWBFile, LFP data is typically considered processed data because the raw data was filtered and downsampled to generate the LFP.

Create a processing module named "ecephys" and add the LFP object to it. This is analogous to how we can store the Position object in a processing module created with the create_processing_module method.

ecephys_module = nwbfile.create_processing_module(
    name="ecephys", description="processed extracellular electrophysiology data"
)
ecephys_module.add(lfp)

Spike Times

Spike times are stored in the Units table, which is a subclass of DynamicTable. Adding columns to the Units table is analogous to how we can add columns to the "electrodes" and "trials" tables.

We will generate some random spike data and populate the Units table using the add_unit method. Then we can display the Units table as a pandas DataFrame.

nwbfile.add_unit_column(name="quality", description="sorting quality")

poisson_lambda = 20
firing_rate = 20
n_units = 10
for n_units_per_shank in range(n_units):
    n_spikes = np.random.poisson(lam=poisson_lambda)
    spike_times = np.round(
        np.cumsum(np.random.exponential(1 / firing_rate, n_spikes)), 5
    )
    nwbfile.add_unit(
        spike_times=spike_times, quality="good", waveform_mean=[1.0, 2.0, 3.0, 4.0, 5.0]
    )

nwbfile.units.to_dataframe()

Designating electrophysiology data

As mentioned above, ElectricalSeries objects are meant for storing specific types of extracellular recordings. In addition to this TimeSeries class, NWB provides some Processing Modules for designating the type of data you are storing. We will briefly discuss them here, and refer the reader to API documentation and NWB File Basics for more details on using these objects.

For storing spike data, there are two options. Which one you choose depends on what data you have available. If you need to store the complete, continuous raw voltage traces, you should store your the traces with ElectricalSeries objects as acquisition data, and use the EventDetection class for identifying the spike events in your raw traces. If you do not want to store the raw voltage traces and only the waveform ‘snippets’ surrounding spike events, you should use the EventWaveform class, which can store one or more SpikeEventSeries objects.

The results of spike sorting (or clustering) should be stored in the top-level Units table. Note that it is not required to store spike waveforms in order to store spike events or waveforms–if you only want to store the spike times of clustered units you can use only the Units table.

For local field potential data, there are two options. Again, which one you choose depends on what data you have available. With both options, you should store your traces with ElectricalSeries objects. If you are storing unfiltered local field potential data, you should store the ElectricalSeries objects in LFP data interface object(s). If you have filtered LFP data, you should store the ElectricalSeries objects in FilteredEphys data interface object(s).

Writing electrophysiology data

Once you have finished adding all of your data to the NWBFile, write the file with NWBHDF5IO.

with NWBHDF5IO("ecephys_tutorial.nwb", "w") as io:
    io.write(nwbfile)

For more details on NWBHDF5IO, see the Writing an NWB file tutorial.

Reading electrophysiology data

We can access the raw data by indexing acquisition with the name of the ElectricalSeries, which we named "ElectricalSeries". We can also access the LFP data by indexing processing with the name of the processing module "ecephys". Then, we can access the LFP object inside of the "ecephys" processing module by indexing it with the name of the LFP object. The default name of LFP objects is "LFP". Finally, we can access the ElectricalSeries object inside of the LFP object by indexing it with the name of the ElectricalSeries object, which we named "ElectricalSeries".

with NWBHDF5IO("ecephys_tutorial.nwb", "r") as io:
    read_nwbfile = io.read()
    print(read_nwbfile.acquisition["ElectricalSeries"])
    print(read_nwbfile.processing["ecephys"])
    print(read_nwbfile.processing["ecephys"]["LFP"])
    print(read_nwbfile.processing["ecephys"]["LFP"]["ElectricalSeries"])

Accessing your data

Data arrays are read passively from the file. Calling the data attribute on a TimeSeries such as a ElectricalSeries does not read the data values, but presents an h5py object that can be indexed to read data. You can use the [:] operator to read the entire data array into memory.

Load and print all the data values of the ElectricalSeries object representing the LFP data.

with NWBHDF5IO("ecephys_tutorial.nwb", "r") as io:
    read_nwbfile = io.read()
    print(read_nwbfile.processing["ecephys"]["LFP"]["ElectricalSeries"].data[:])

Accessing data regions

It is often preferable to read only a portion of the data. To do this, index or slice into the data attribute just like if you were indexing or slicing a numpy array.

The following code prints elements 0:10 in the first dimension (time) and 0:3 in the second dimension (electrodes) from the LFP data we have written.

with NWBHDF5IO("ecephys_tutorial.nwb", "r") as io:
    read_nwbfile = io.read()

    print("section of LFP:")
    print(read_nwbfile.processing["ecephys"]["LFP"]["ElectricalSeries"].data[:10, :3])
    print("")
    print("spike times from 0th unit:")
    print(read_nwbfile.units["spike_times"][0])

Gallery generated by Sphinx-Gallery