.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/general/plot_external_resources.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_tutorials_general_plot_external_resources.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_general_plot_external_resources.py:

.. _external_resources:

Linking to External Resources (HERD)
====================================

The :py:class:`~pynwb.resources.HERD` (HDMF External Resources Data Structure) class lets you
map terms used in your data to entities defined in external, web-accessible resources such as
ontologies. For example, you may store a species name ``"Mus musculus"`` on a
:py:class:`~pynwb.file.Subject` and want to link it to the corresponding NCBI Taxonomy term so
that the value is standardized and easy to query.

From a user's perspective, a HERD can be treated as a single table that associates a ``key``
(a term used on an ``object``, i.e. a dataset or attribute in the file) with an ``entity``
(a term in an external resource, identified by an ``entity_id`` and an ``entity_uri``).
Internally, HERD stores this in six interlinked tables (``keys``, ``files``, ``entities``,
``entity_keys``, ``objects``, and ``object_keys``) and provides convenience methods so you
rarely need to interact with those tables directly.

This tutorial shows how to create a HERD, annotate objects in an NWB file, store the HERD in
the file, and inspect the annotations after reading the file back. For the full HERD API
(including ``add_ref_termset`` for validating terms against a
:py:class:`~hdmf.term_set.TermSet`, ``get_key``, and compound-data references), see the
`HDMF HERD tutorial `_.

.. GENERATED FROM PYTHON SOURCE LINES 26-35

.. code-block:: Python

    from datetime import datetime
    from uuid import uuid4

    from dateutil.tz import tzlocal

    from pynwb import NWBHDF5IO, NWBFile
    from pynwb.file import Subject

.. GENERATED FROM PYTHON SOURCE LINES 37-41

Create an NWB file
------------------
Start with an :py:class:`~pynwb.file.NWBFile` that has a :py:class:`~pynwb.file.Subject`.
The subject's species is the value we will annotate with an external resource.

.. GENERATED FROM PYTHON SOURCE LINES 41-49

.. code-block:: Python

    nwbfile = NWBFile(
        session_description="a demonstration of external resources",
        identifier=str(uuid4()),
        session_start_time=datetime(2018, 4, 25, 2, 30, 3, tzinfo=tzlocal()),
        subject=Subject(subject_id="001", species="Mus musculus"),
    )

.. GENERATED FROM PYTHON SOURCE LINES 50-57

Get the file's HERD
-------------------
Use :py:meth:`~pynwb.file.NWBFile.get_external_resources` to get the file's
:py:class:`~pynwb.resources.HERD`. A file has at most one HERD, so this returns the existing
HERD if the file already has one (for example, when the file was read from disk) and creates
and attaches a new empty HERD otherwise. The
:py:attr:`~pynwb.file.NWBFile.external_resources` attribute returns the HERD without creating
one, returning ``None`` when the file has no external resources.

.. GENERATED FROM PYTHON SOURCE LINES 57-60

.. code-block:: Python

    herd = nwbfile.get_external_resources()

.. GENERATED FROM PYTHON SOURCE LINES 61-72

Add references with ``add_ref``
-------------------------------
Use :py:meth:`~hdmf.common.resources.HERD.add_ref` to add a row that links a key on an object
to an external entity. Here we link the subject's species to the NCBI Taxonomy entry for
*Mus musculus*. The subject must be part of a file before a reference is added to it.

An entity is identified by an ``entity_id`` and an ``entity_uri``.
The ``entity_id`` is a compact URI (CURIE) of the form ``prefix:identifier`` whose prefix is
registered with `bioregistry.io <https://bioregistry.io/>`_, such as ``NCBITaxon`` for the
NCBI Taxonomy. The ``entity_uri`` is the persistent URL the CURIE resolves to, which you can
look up at ``https://bioregistry.io/``.

.. GENERATED FROM PYTHON SOURCE LINES 72-80

.. code-block:: Python

    herd.add_ref(
        container=nwbfile.subject,
        key=nwbfile.subject.species,
        entity_id="NCBITaxon:10090",
        entity_uri="http://purl.obolibrary.org/obo/NCBITaxon_10090",
    )

.. GENERATED FROM PYTHON SOURCE LINES 81-96

References can also point to an attribute of an object, such as a column of a table. Here we
record the brain region of a set of electrodes in the electrodes table and link the region to
the corresponding structure in the `Allen Mouse Brain Atlas `_. When the target is a column,
pass the table as the ``container`` and the column name as the ``attribute``; HERD resolves
the reference to the column object itself.

.. note::

    This same ``container`` plus ``attribute`` form also works for ragged columns (those
    backed by a :py:class:`~hdmf.common.table.VectorIndex`):
    ``add_ref(container=table, attribute="col", ...)`` annotates the column's
    :py:class:`~hdmf.common.table.VectorData`, which holds the actual values used as keys.

    Do not annotate the column with ``add_ref(container=table["col"], attribute=None, ...)``:
    for a ragged column, ``table["col"]`` is the :py:class:`~hdmf.common.table.VectorIndex`
    (the integer offsets into the ``VectorData``), so HERD would annotate the index instead
    of the values.

.. GENERATED FROM PYTHON SOURCE LINES 96-115

.. code-block:: Python

    device = nwbfile.create_device(name="probe")
    electrode_group = nwbfile.create_electrode_group(
        name="shank0",
        description="a shank of the recording probe",
        location="VISp",
        device=device,
    )
    for _ in range(4):
        nwbfile.add_electrode(location="VISp", group=electrode_group)

    herd.add_ref(
        container=nwbfile.electrodes,
        attribute="location",
        key="VISp",
        entity_id="MBA:385",
        entity_uri="https://purl.brain-bican.org/ontology/mbao/MBA_385",
    )

.. GENERATED FROM PYTHON SOURCE LINES 116-120

Inspect the HERD
----------------
:py:meth:`~hdmf.common.resources.HERD.to_dataframe` flattens the interlinked tables into a
single :py:class:`~pandas.DataFrame`, with one row per (object, key, entity) association.

.. GENERATED FROM PYTHON SOURCE LINES 120-123

.. code-block:: Python

    herd.to_dataframe()

.. raw:: html

    <table>
      <thead>
        <tr><th></th><th>file_object_id</th><th>objects_idx</th><th>object_id</th><th>files_idx</th><th>object_type</th><th>relative_path</th><th>field</th><th>keys_idx</th><th>key</th><th>entities_idx</th><th>entity_id</th><th>entity_uri</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>7b57d1b9-79a0-467f-972e-dec54780330e</td><td>0</td><td>cdd979e3-132c-45de-a128-aa1c4b5c221a</td><td>0</td><td>Subject</td><td></td><td></td><td>0</td><td>Mus musculus</td><td>0</td><td>NCBITaxon:10090</td><td>http://purl.obolibrary.org/obo/NCBITaxon_10090</td></tr>
        <tr><th>1</th><td>7b57d1b9-79a0-467f-972e-dec54780330e</td><td>1</td><td>3cea8c50-d9ca-497d-a43e-891de7676ae1</td><td>0</td><td>VectorData</td><td></td><td></td><td>1</td><td>VISp</td><td>1</td><td>MBA:385</td><td>https://purl.brain-bican.org/ontology/mbao/MBA...</td></tr>
      </tbody>
    </table>

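To make the "one row per (object, key, entity) association" idea concrete, the following is a
minimal sketch in plain Python. It is not HDMF's implementation: it models simplified
stand-ins for five of the interlinked tables as lists (the ``files`` table and several columns
are omitted), and the ``flatten`` helper and the placeholder object IDs are hypothetical names
introduced only for illustration.

.. code-block:: Python

    # Simplified stand-ins for HERD tables; the list position acts as the row index.
    keys = ["Mus musculus", "VISp"]
    entities = [
        ("NCBITaxon:10090", "http://purl.obolibrary.org/obo/NCBITaxon_10090"),
        ("MBA:385", "https://purl.brain-bican.org/ontology/mbao/MBA_385"),
    ]
    # (object_id, object_type); the IDs are placeholders, not real UUIDs.
    objects = [
        ("subject-object-id", "Subject"),
        ("location-column-object-id", "VectorData"),
    ]

    # Link tables: each row pairs a row index from one table with a key index.
    object_keys = [(0, 0), (1, 1)]  # (objects_idx, keys_idx)
    entity_keys = [(0, 0), (1, 1)]  # (entities_idx, keys_idx)


    def flatten():
        """Join the tables into one row per (object, key, entity) association."""
        rows = []
        for objects_idx, keys_idx in object_keys:
            object_id, object_type = objects[objects_idx]
            for entities_idx, linked_keys_idx in entity_keys:
                if linked_keys_idx != keys_idx:
                    continue  # this entity is linked to a different key
                entity_id, entity_uri = entities[entities_idx]
                rows.append(
                    {
                        "object_id": object_id,
                        "object_type": object_type,
                        "key": keys[keys_idx],
                        "entity_id": entity_id,
                        "entity_uri": entity_uri,
                    }
                )
        return rows


    flat_rows = flatten()
    for row in flat_rows:
        print(row["object_type"], row["key"], row["entity_id"])

Because keys and entities are linked through separate index tables, one key can in principle
map to several entities; in this sketch that would simply yield additional rows for that key.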

.. GENERATED FROM PYTHON SOURCE LINES 124-126

You can also view the individual tables. Each is a
:py:class:`~hdmf.common.table.DynamicTable` and has its own ``to_dataframe`` method.

.. GENERATED FROM PYTHON SOURCE LINES 126-129

.. code-block:: Python

    herd.keys.to_dataframe()

.. raw:: html

    <table>
      <thead>
        <tr><th></th><th>key</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>Mus musculus</td></tr>
        <tr><th>1</th><td>VISp</td></tr>
      </tbody>
    </table>


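.. GENERATED FROM PYTHON SOURCE LINES 130-133

.. code-block:: Python

    herd.entities.to_dataframe()

.. raw:: html

    <table>
      <thead>
        <tr><th></th><th>entity_id</th><th>entity_uri</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>NCBITaxon:10090</td><td>http://purl.obolibrary.org/obo/NCBITaxon_10090</td></tr>
        <tr><th>1</th><td>MBA:385</td><td>https://purl.brain-bican.org/ontology/mbao/MBA...</td></tr>
      </tbody>
    </table>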

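The two columns of the entities table are related: each ``entity_id`` is a
``prefix:identifier`` CURIE, and the ``entity_uri`` is the URL it resolves to. The sketch
below illustrates that relationship for the two entities used in this tutorial. The
``URI_PREFIXES`` mapping and the ``expand_curie`` helper are hypothetical and hard-coded from
the URIs passed to ``add_ref`` earlier; they are not part of PyNWB or HDMF, and in practice
the URI should be looked up at ``https://bioregistry.io/`` rather than constructed.

.. code-block:: Python

    # Hypothetical prefix map, hard-coded from the two URIs used in this tutorial.
    URI_PREFIXES = {
        "NCBITaxon": "http://purl.obolibrary.org/obo/NCBITaxon_",
        "MBA": "https://purl.brain-bican.org/ontology/mbao/MBA_",
    }


    def expand_curie(curie):
        """Expand a ``prefix:identifier`` CURIE using the hard-coded prefix map."""
        prefix, separator, identifier = curie.partition(":")
        if not separator or not prefix or not identifier:
            raise ValueError(f"not a prefix:identifier CURIE: {curie!r}")
        if prefix not in URI_PREFIXES:
            raise ValueError(f"unknown prefix: {prefix!r}")
        return URI_PREFIXES[prefix] + identifier


    print(expand_curie("NCBITaxon:10090"))
    print(expand_curie("MBA:385"))

A check like this can help catch a mismatched ``entity_id`` and ``entity_uri`` pair before
calling ``add_ref``, under the assumption that the prefix map is kept in sync with the
registry.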

.. GENERATED FROM PYTHON SOURCE LINES 134-136

:py:meth:`~hdmf.common.resources.HERD.get_object_type` returns all annotations for objects of
a given type, for example every annotated :py:class:`~pynwb.file.Subject`.

.. GENERATED FROM PYTHON SOURCE LINES 136-139

.. code-block:: Python

    herd.get_object_type(object_type="Subject")

.. raw:: html

    <table>
      <thead>
        <tr><th></th><th>file_object_id</th><th>objects_idx</th><th>object_id</th><th>files_idx</th><th>object_type</th><th>relative_path</th><th>field</th><th>keys_idx</th><th>key</th><th>entities_idx</th><th>entity_id</th><th>entity_uri</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>7b57d1b9-79a0-467f-972e-dec54780330e</td><td>0</td><td>cdd979e3-132c-45de-a128-aa1c4b5c221a</td><td>0</td><td>Subject</td><td></td><td></td><td>0</td><td>Mus musculus</td><td>0</td><td>NCBITaxon:10090</td><td>http://purl.obolibrary.org/obo/NCBITaxon_10090</td></tr>
      </tbody>
    </table>


.. GENERATED FROM PYTHON SOURCE LINES 140-144

Write and read the NWB file
---------------------------
Writing the file stores the HERD inside it. Reading the file back makes the HERD available
again through the ``external_resources`` field.

.. GENERATED FROM PYTHON SOURCE LINES 144-153

.. code-block:: Python

    filename = "external_resources_tutorial.nwb"
    with NWBHDF5IO(filename, mode="w") as io:
        io.write(nwbfile)

    read_io = NWBHDF5IO(filename, mode="r")
    read_nwbfile = read_io.read()
    read_herd = read_nwbfile.external_resources

.. GENERATED FROM PYTHON SOURCE LINES 154-160

Access the loaded data
-----------------------
The loaded HERD provides the same accessors as before. In a Jupyter notebook, displaying the
HERD renders the flattened references as a table, and
:py:meth:`~hdmf.common.resources.HERD.to_dataframe` returns that same table as a
:py:class:`~pandas.DataFrame`. The individual tables give a more focused view.

.. GENERATED FROM PYTHON SOURCE LINES 160-163

.. code-block:: Python

    read_herd.to_dataframe()

.. raw:: html

    <table>
      <thead>
        <tr><th></th><th>file_object_id</th><th>objects_idx</th><th>object_id</th><th>files_idx</th><th>object_type</th><th>relative_path</th><th>field</th><th>keys_idx</th><th>key</th><th>entities_idx</th><th>entity_id</th><th>entity_uri</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>7b57d1b9-79a0-467f-972e-dec54780330e</td><td>0</td><td>cdd979e3-132c-45de-a128-aa1c4b5c221a</td><td>0</td><td>Subject</td><td></td><td></td><td>0</td><td>Mus musculus</td><td>0</td><td>NCBITaxon:10090</td><td>http://purl.obolibrary.org/obo/NCBITaxon_10090</td></tr>
        <tr><th>1</th><td>7b57d1b9-79a0-467f-972e-dec54780330e</td><td>1</td><td>3cea8c50-d9ca-497d-a43e-891de7676ae1</td><td>0</td><td>VectorData</td><td></td><td></td><td>1</td><td>VISp</td><td>1</td><td>MBA:385</td><td>https://purl.brain-bican.org/ontology/mbao/MBA...</td></tr>
      </tbody>
    </table>


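.. GENERATED FROM PYTHON SOURCE LINES 164-165

View the individual tables, for example:

.. GENERATED FROM PYTHON SOURCE LINES 165-168

.. code-block:: Python

    read_herd.keys.to_dataframe()

.. raw:: html

    <table>
      <thead>
        <tr><th></th><th>key</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>Mus musculus</td></tr>
        <tr><th>1</th><td>VISp</td></tr>
      </tbody>
    </table>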


.. GENERATED FROM PYTHON SOURCE LINES 169-172

:py:meth:`~hdmf.common.resources.HERD.get_object_entities` returns the entities annotated on
a single object as a :py:class:`~pandas.DataFrame`. Here we view the species annotation
stored for the subject:

.. GENERATED FROM PYTHON SOURCE LINES 172-175

.. code-block:: Python

    read_herd.get_object_entities(container=read_nwbfile.subject)

.. raw:: html

    <table>
      <thead>
        <tr><th></th><th>entity_id</th><th>entity_uri</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>NCBITaxon:10090</td><td>http://purl.obolibrary.org/obo/NCBITaxon_10090</td></tr>
      </tbody>
    </table>


.. GENERATED FROM PYTHON SOURCE LINES 176-177

Close the file once you are done reading from it.

.. GENERATED FROM PYTHON SOURCE LINES 177-180

.. code-block:: Python

    read_io.close()

.. GENERATED FROM PYTHON SOURCE LINES 181-189

Alternative: store a HERD outside an NWB file
---------------------------------------------
A HERD can also be saved independently of an NWB file as a zip archive of the underlying
tables using :py:meth:`~hdmf.common.resources.HERD.to_zip`, and read back with
:py:meth:`~hdmf.common.resources.HERD.from_zip`. This is useful when external resources span
multiple files; see :ref:`external_resources_streaming` for an example that annotates many
NWB files with a single HERD.

For the full HERD API, see the `HDMF HERD tutorial `_.

.. _sphx_glr_download_tutorials_general_plot_external_resources.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_external_resources.ipynb <plot_external_resources.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_external_resources.py <plot_external_resources.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_external_resources.zip <plot_external_resources.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_