Main Page: Difference between revisions
Revision as of 22:04, 18 March 2024
Introduction
This project aims to describe symbiotic interactions between protists and prokaryotes as Linked Open Data.
Envisioned use cases include:
- Search and browse symbiotic interactions by biological taxonomy, leveraging cross-references to external taxonomies (e.g. NCBI Taxonomy, Catalogue of Life)
- Find interactions that are described in earlier literature but not yet studied with modern methods
- Programmatically find new literature to update the database, by querying the NCBI databases using linked NCBI taxon IDs
- Share data with GloBI through periodic data exports
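The literature-update use case above could be approached through the NCBI E-utilities. The following is a minimal sketch, not this project's actual tooling: it only builds request URLs and makes no network call. The function names and the example taxon ID are illustrative assumptions, and whether a given taxon has linked PubMed records depends on NCBI's own data.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# Illustrative value only; substitute the NCBI taxon ID linked from an item.
EXAMPLE_TAXON_ID = 5878


def taxon_sequence_search_url(taxon_id: int, retmax: int = 20) -> str:
    """Build an ESearch URL for nucleotide records filed under a taxon ID.

    The ":exp" suffix asks Entrez to include descendant taxa.
    """
    params = {
        "db": "nuccore",
        "term": f"txid{taxon_id}[Organism:exp]",
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def taxon_literature_link_url(taxon_id: int) -> str:
    """Build an ELink URL asking for PubMed records linked to a taxon ID."""
    params = {
        "dbfrom": "taxonomy",
        "db": "pubmed",
        "id": taxon_id,
        "retmode": "json",
    }
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    print(taxon_sequence_search_url(EXAMPLE_TAXON_ID))
    print(taxon_literature_link_url(EXAMPLE_TAXON_ID))
```

Fetching these URLs and comparing the returned record IDs against references already in the database would be the next step; that comparison is left out here because it depends on how references are stored.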
Interactions are described with the following statements, roughly aligned with the GloBI terms:
- Taxonomy of hosts and symbionts, with links to external databases (primarily NCBI)
- Localization of symbionts in cellular compartments of the host cell, using Gene Ontology terms
- Nature of biotic interactions, if this is known, using Relation Ontology terms (although there are some limitations in this ontology for describing mutualistic interactions)
- Analytical methods used to study the symbioses
This project originated as part of my doctoral dissertation (2017).
Similar projects elsewhere:
- Global Biotic Interactions (GloBI), an aggregator for biotic interactions datasets across all domains of life
- Protist Interaction Database (PIDA), Bjorbækmo et al., 2019 (last updated 2018)
- Diatom Interaction Database (DIDB) (last updated 2019)
Explore the data
Some example entries to see how the data are modeled:
- Parduczia sp. from Santa Barbara Basin ("brown ciliate"), a marine ciliate
- Pelomyxa palustris, a freshwater amoebozoan
- Mixotricha paradoxa, itself a hindgut symbiont of the termite Mastotermes darwiniensis
Each interaction statement is supported by one or more references to the scientific literature, linked by their DOI.
Use the Query Service (link on menu bar) to launch SPARQL queries; try the example queries to get started.
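As a rough illustration of what such a query might look like, the sketch below assembles a SPARQL string listing the items that a host is linked to by an 'interacts with' statement. It rests on several assumptions: the property ID `P999` is a hypothetical stand-in (look up the real ID on this site), the `wd:` and `wdt:` prefixes are assumed to be predefined by the Query Service as they are on Wikidata, and the label service is assumed to be enabled. Only the item ID Q22 ("brown ciliate") comes from this page.

```python
def _check_id(value: str, letter: str) -> str:
    """Validate an entity ID such as 'Q22' or 'P28' before interpolating it."""
    if not (value.startswith(letter) and value[1:].isdigit()):
        raise ValueError(f"expected an ID like {letter}123, got {value!r}")
    return value


def interactions_query(host_qid: str, interacts_with_pid: str, limit: int = 50) -> str:
    """Return a SPARQL query listing what a host item 'interacts with'.

    interacts_with_pid is a parameter because the real property ID is
    not stated here; any value passed in the examples is hypothetical.
    """
    host = _check_id(host_qid, "Q")
    prop = _check_id(interacts_with_pid, "P")
    return (
        "SELECT ?symbiont ?symbiontLabel WHERE {\n"
        f"  wd:{host} wdt:{prop} ?symbiont .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # P999 is a hypothetical placeholder for the 'interacts with' property.
    print(interactions_query("Q22", "P999", limit=10))
```

The printed text can be pasted into the Query Service editor once the placeholder property ID has been replaced.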
Q & A
- Why use a single 'interacts with' statement, with qualifiers for interaction type, instead of different properties for each interaction type?
- The nature of an interaction is often not fully understood, or may have multiple facets. Coding interaction types as qualifiers allows us to stack multiple functional roles on a single interaction.
- What is a placeholder taxon?
- We would like to model taxonomic relationships ("find taxa that are members of Ciliophora") and also link out to external databases, particularly NCBI. However, there is often a discrepancy between NCBI Taxonomy and the "actual" taxonomy.
- For example, the brown ciliate is reported as a Parduczia sp. based on sequence analysis, but the sequences from that study are published under an environmental "ciliate metagenome" identifier on NCBI that is used for convenience to organize records comprising sequences from multiple taxa, rather than a legitimate taxon in the biological sense.
- Therefore, the item "brown ciliate" is described here as a placeholder taxon, because it does not have an exact equivalent in the NCBI Taxonomy, although we think it represents a legitimate taxon, based on the information in the cited references.
- For Bacteroidales sp. Cc3-010 ectosymbiont of Caduceia versatilis, an SSU rRNA sequence has been published, but it is currently placed in a provisional "taxon" in the NCBI Taxonomy that is again a convenience label for records where taxonomic information is lacking. The property P28 is used to link our placeholder taxon to a representative sequence from the cited reference, for disambiguation should the NCBI Taxonomy be updated in the future.
- Why do some 'interacts with' statements link to "unknown value"?
- If a symbiont is reported only on the basis of microscopy, without any information on its phylogenetic affiliation, "unknown value" is used for the object of the statement.
- If some information is known about its likely taxonomy, e.g. through use of group-specific FISH probes, then a placeholder taxon is created with a temporary name.
- The entry Metopus contortus has both an "unknown value" statement and a placeholder symbiont taxon Q442.
- Why host this on Wikibase?
- This database has gone through several iterations: it started as a table in a word processor file, then moved to spreadsheets, a custom SQLite database, and an attempt to homebrew a structured database with XML files and Python scripts. After gaining some experience with Wikidata, I found that Wikibase offers the key features that I wanted: flexible and extensible schemata, a graphical frontend for manual data entry, options for programmatic data import from tables, integration with external databases, and a sophisticated query interface.
- What is the beautiful organism depicted in the logo?
- Kentrophoros sp. H
- (The logo may not be visible in the mobile version of this site.)
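The modeling choices discussed in the Q & A (a single 'interacts with' statement carrying interaction types as qualifiers, and "unknown value" as a possible object) can be pictured with a small data structure. This is only a conceptual sketch in Python, not how Wikibase stores statements internally; the class name, field names, and the interaction-type labels are all hypothetical. The host name and the item ID Q442 are taken from the Metopus contortus example above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Interaction:
    """One 'interacts with' statement and its qualifiers."""

    host: str
    # None stands in for the Wikibase "unknown value" object.
    symbiont: Optional[str]
    # Several functional roles can be stacked on the same statement.
    interaction_types: List[str] = field(default_factory=list)
    # DOIs of the supporting references.
    reference_dois: List[str] = field(default_factory=list)

    def symbiont_is_unknown(self) -> bool:
        return self.symbiont is None


# A host can carry both kinds of statement at once.
statements = [
    Interaction(host="Metopus contortus", symbiont=None),
    Interaction(
        host="Metopus contortus",
        symbiont="Q442",  # placeholder symbiont taxon
        interaction_types=["hypothetical role A", "hypothetical role B"],
    ),
]

unknown_count = sum(s.symbiont_is_unknown() for s in statements)
```

Keeping the roles in a list on one statement, rather than one property per role, means a new role can be added later without changing which property links host and symbiont.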