Biodiversity Literature Repository: Building the customized FAIR repository by using custom metadata

Author(s):  
Alexandros Ioannidis-Pantopikos ◽  
Donat Agosti

In the landscape of general-purpose repositories, Zenodo was built at the European Laboratory for Particle Physics' (CERN) data center to facilitate the sharing and preservation of the long tail of research across all disciplines and scientific domains. Despite Zenodo's long tradition of making research artifacts FAIR (Findable, Accessible, Interoperable, and Reusable), challenges remain in applying these principles effectively when serving the needs of specific research domains. Plazi's biodiversity taxonomic literature processing pipeline liberates data from publications, making it FAIR via extensive metadata, the minting of a DataCite Digital Object Identifier (DOI), a licence, and both human- and machine-readable output provided by Zenodo, and making it accessible via the Biodiversity Literature Repository community at Zenodo. The deposits (e.g., taxonomic treatments, figures) are an example of how local networks of information can be formally linked to explicit resources in the broader context of other platforms such as GBIF (Global Biodiversity Information Facility). In the context of biodiversity taxonomic literature data workflows, a general-purpose repository's traditional submission approach is not enough to preserve rich metadata and to capture highly interlinked objects, such as taxonomic treatments and digital specimens. As a prerequisite to serving these use cases and ensuring that the artifacts remain FAIR, Zenodo introduced the concept of custom metadata, which allows submissions such as figures or taxonomic treatments (see, for example, the treatment of Eurygyrus peloponnesius) to be enhanced with custom keywords based on terms from common biodiversity vocabularies such as Darwin Core and Audubon Core, each with an explicit link to the respective vocabulary term. The aforementioned pipelines and features are designed to be served first and foremost by public Representational State Transfer Application Programming Interfaces (REST APIs) and open web technologies such as webhooks. This approach allows researchers and platforms to integrate existing and new automated workflows into Zenodo and thus empowers research communities to create self-sustained cross-platform ecosystems. The BiCIKL project (Biodiversity Community Integrated Knowledge Library) exemplifies how repositories and tools can become building blocks for broader adoption of the FAIR principles. Starting with the above literature processing pipeline, the concepts behind the pipeline and the resulting FAIR data will be explained, with a focus on the custom metadata used to enhance the deposits.
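
Since the submission pipeline is REST-first, a custom-metadata deposit can be sketched against Zenodo's public deposition API. A minimal sketch, assuming a personal access token and Darwin Core-style custom keys of the form dwc:...; the exact custom-field names and the publication subtype string are assumptions to be checked against Zenodo's current documentation:

```python
import requests

ZENODO_API = "https://zenodo.org/api/deposit/depositions"
TOKEN = "..."  # personal access token (placeholder)

# Standard deposition metadata plus custom biodiversity keywords.
# The "custom" keys below (dwc:genus, dwc:kingdom) follow the Darwin
# Core terms mentioned above; treat them and the publication subtype
# as assumptions, not a verified schema.
deposition = {
    "metadata": {
        "title": "Taxonomic treatment of Eurygyrus peloponnesius",
        "upload_type": "publication",
        "publication_type": "taxonomictreatment",
        "description": "Treatment extracted by the Plazi pipeline.",
        "creators": [{"name": "Doe, Jane"}],
        "custom": {
            "dwc:genus": ["Eurygyrus"],
            "dwc:kingdom": ["Animalia"],
        },
    }
}

resp = requests.post(ZENODO_API, params={"access_token": TOKEN}, json=deposition)
resp.raise_for_status()
print("Created deposition", resp.json()["id"])
```

A platform integrating via webhooks would then listen for events on such deposits rather than polling the API.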

2011 ◽  
Vol 18 (5) ◽  
pp. 563-572 ◽  
Author(s):  
G. Balasis ◽  
C. Papadimitriou ◽  
I. A. Daglis ◽  
A. Anastasiadis ◽  
I. Sandberg ◽  
...  

Abstract. The dynamics of complex systems are founded on universal principles that can be used to describe disparate problems ranging from particle physics to the economies of societies. A corollary is that transferring ideas and results between investigators in hitherto disparate areas will cross-fertilize and lead to important new results. In this contribution, we investigate the existence of universal behavior, if any, in solar flares, magnetic storms, earthquakes and pre-seismic electromagnetic (EM) emissions, extending the work recently published by Balasis et al. (2011a). A common characteristic in the dynamics of the above-mentioned phenomena is that their energy release is basically fragmentary, i.e. the associated events are composed of elementary building blocks. By analogy with earthquakes, the magnitude of magnetic storms, solar flares and pre-seismic EM emissions can be appropriately defined. The key question we can then ask in the framework of complexity is whether the magnitude distribution of earthquakes, magnetic storms, solar flares and pre-fracture EM emissions obeys the same law. We show that these apparently different extreme events, which occur in the solar-terrestrial system, follow the same energy distribution function, originally derived for earthquake dynamics in the framework of nonextensive Tsallis statistics.
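
For reference, the energy distribution function referred to here is, in this line of work, usually the fragment-asperity relation derived from nonextensive Tsallis statistics (Sotolongo-Costa and Posadas 2004, as revised by Silva et al. 2006). A sketch of its cumulative magnitude form, with q the Tsallis entropic index, a a constant setting the energy scale, and N_{>M} the number of events with magnitude greater than M:

```latex
\log\left(\frac{N_{>M}}{N}\right)
  = \frac{2-q}{1-q}\,
    \log\!\left[1 - \left(\frac{1-q}{2-q}\right)\frac{10^{2M}}{a^{2/3}}\right]
```

Fitting this single two-parameter form to the magnitude distributions of all four phenomena is what makes the claimed universality testable.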


Author(s):  
Roy Gelbard ◽  
Israel Spiegler

The research proposes a model for the representation and storage of motion data that enables the communication, storage, and analysis of patterns of motion, as with spoken and written languages. The basic problem is the lack of a machine-readable motion alphabet. We thus set out to define the elemental components and building blocks of motion, coming up with what we call the motion byte as the basis for a motion language that has words, phrases, and sentences. The binary-based model we develop, which is significantly different from the common “key frames” approach, is also a method of storing motion data. Comparison with a standard motion system, based on key frames, indicates a significant advantage for our binary model.
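
The abstract does not spell out the bit layout of the motion byte; as a purely illustrative sketch of the binary, non-key-frame idea, consider one byte per joint per time slice, with bits marking elementary movements (the flag names and layout are assumptions, not the paper's actual encoding):

```python
from enum import IntFlag

# Hypothetical "motion byte": bits mark elementary movements of one
# joint during one time slice. Layout is illustrative only.
class MotionByte(IntFlag):
    FLEX    = 0b00000001
    EXTEND  = 0b00000010
    ABDUCT  = 0b00000100
    ADDUCT  = 0b00001000
    ROT_IN  = 0b00010000
    ROT_OUT = 0b00100000

def encode(frames):
    """Pack a sequence of MotionByte flags into raw bytes for storage."""
    return bytes(int(f) for f in frames)

def decode(blob):
    """Recover the MotionByte sequence from stored bytes."""
    return [MotionByte(b) for b in blob]

# A two-slice "motion word": flex a joint, then extend it.
word = encode([MotionByte.FLEX, MotionByte.EXTEND])
assert decode(word) == [MotionByte.FLEX, MotionByte.EXTEND]
```

Under such a scheme, "words" and "phrases" of motion are simply byte strings, which is what makes the representation both storable and machine-analyzable.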


2020 ◽  
Vol 165 (12) ◽  
pp. 631-638
Author(s):  
Maximilian Haas

Abstract. CERN (Conseil Européen pour la Recherche Nucléaire, the European Laboratory for Particle Physics) is a world-leading international research facility in the field of high-energy and particle physics. Research into the fundamental building blocks of the universe and their interactions has yielded groundbreaking insights over recent decades, culminating in the experimental confirmation of the Higgs boson in July 2012. To deepen these findings and to address still-unanswered questions about the origin and workings of the universe, an international community of more than 150 institutes worldwide has initiated a study at CERN for a research programme built around a new, more powerful particle-accelerator infrastructure. The Future Circular Collider (FCC) study covers the required underground tunnels, caverns, and shafts, together with the associated surface structures. The infrastructure is designed to operate in conjunction with the existing accelerators at CERN (e.g., PSB, PS, SPS, LHC). Since 2014, the project has carried out the first technical feasibility studies in a wide range of fields, including the geology and construction of the tunnel, which extends over roughly 100 km through the Molasse basin, partly in western Switzerland and partly in France, so that, according to the current planning status, the FCC could begin operation around 2040. A geological subsurface model is indispensable for ensuring the safe construction of underground infrastructure and for matching the construction method to the geology. Alongside the geological model, a decisive factor is the reusability of the excavated molasse material, a volume of about 9 million m³, from technical as well as legal, socio-political, and socio-economic perspectives. This article offers insight into these two feasibility studies of the FCC project and presents approaches to the geological, petrophysical, geotechnical, and mineralogical-chemical analyses that serve to answer the reuse question and will subsequently feed into the geological subsurface model.


1997 ◽  
Vol 3 (S2) ◽  
pp. 1131-1132
Author(s):  
P.L. Jansma ◽  
M.A. Landis ◽  
L.C. Hansen ◽  
N.C. Merchant ◽  
N.J. Vickers ◽  
...  

We are using Data Explorer (DX), a general-purpose, interactive visualization program developed by IBM, to perform three-dimensional reconstructions of neural structures from microscopic or optical sections. We use the program on a Silicon Graphics workstation; it also can run on Sun, IBM RS/6000, and Hewlett Packard workstations. DX comprises modular building blocks that the user assembles into data-flow networks for specific uses. Many modules come with the program, but others, written by users (including ourselves), are continually being added and are available at the DX ftp site, http://www.tc.cornell.edu/DX. Initially, our efforts were aimed at developing methods for isosurface- and volume-rendering of structures visible in three-dimensional stacks of optical sections of insect brains gathered on our Bio-Rad MRC-600 laser scanning confocal microscope. We also wanted to be able to merge two 3-D data sets (collected on two different photomultiplier channels) and to display them at various angles of view.
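
DX wires such steps together as modules in a data-flow network; the two core operations described here, merging two photomultiplier channels and extracting an isosurface, can be sketched with modern Python tooling (the file names and iso level below are placeholders):

```python
import numpy as np
from skimage import measure

# Two 3-D stacks of optical sections, one per photomultiplier
# channel (placeholder files; array shape: z x y x x).
ch1 = np.load("channel1_stack.npy")
ch2 = np.load("channel2_stack.npy")

# Merge the channels by voxel-wise maximum so structures from both
# channels remain visible in a single volume.
merged = np.maximum(ch1, ch2)

# Extract an isosurface mesh at an illustrative threshold; the
# resulting mesh can then be rendered from any angle of view.
verts, faces, normals, values = measure.marching_cubes(merged, level=0.5)
print(f"{len(verts)} vertices, {len(faces)} faces")
```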


Symmetry ◽  
2020 ◽  
Vol 12 (5) ◽  
pp. 700
Author(s):  
Marina Prvan ◽  
Arijana Burazin Mišura ◽  
Zoltan Gecse ◽  
Julije Ožegović

This paper deals with the problem of packing polyhex clusters in a regular hexagonal container. It is a common problem in many applications, with various cluster shapes used, but the symmetric polyhex is the most useful in engineering due to its geometrical properties. Hence, we concentrate on mathematical modeling in one such application, where the "bee" tetrahex is chosen for the new Compact Muon Solenoid (CMS) design upgrade, one of the four detectors used in the Large Hadron Collider (LHC) experiment at the European Laboratory for Particle Physics (CERN). We start from the existing hexagonal containers with hexagonal cells packed inside and uniform clustering applied. We compare the center-aligned (CA) and vertex-aligned (VA) models, analyzing cluster rotations that increase packing efficiency. We formally describe the geometrical properties of the clustering approaches and show that cluster sharing is inevitable at the container border with uniform clustering. In addition, we propose a new vertex-aligned model that decreases the number of shared clusters in the uniform scenario, at the cost of a smaller number of clusters contained inside the container. We also describe a non-uniform tetrahex cluster packing scheme in the proposed container model. With the proposed cluster packing solution, all clusters are contained inside the container region. Since cluster sharing is completely avoided at the container border, maximal packing efficiency is obtained compared with the existing models.
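
To make the bookkeeping concrete, here is a minimal sketch of hexagonal cells in axial coordinates, with a tetrahex cluster tested for containment against the border; the cluster offsets are an illustration, not the paper's exact CA or VA construction:

```python
# Hexagonal cells in axial coordinates (q, r).

def hex_container(radius):
    """All cells of a regular hexagonal container of the given radius."""
    return {
        (q, r)
        for q in range(-radius, radius + 1)
        for r in range(max(-radius, -q - radius),
                       min(radius, -q + radius) + 1)
    }

# One tetrahex cluster: an anchor cell plus three connected
# neighbours (assumed shape, standing in for the "bee" tetrahex).
TETRAHEX = [(0, 0), (1, 0), (0, 1), (1, 1)]

def place(anchor, shape=TETRAHEX):
    """Cells occupied by a cluster placed at the anchor cell."""
    aq, ar = anchor
    return {(aq + dq, ar + dr) for dq, dr in shape}

container = hex_container(3)
cluster = place((0, 0))
# Cluster sharing at the border shows up as cells outside the container:
print("fully inside:", cluster <= container)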


Robotica ◽  
1993 ◽  
Vol 11 (2) ◽  
pp. 119-128
Author(s):  
David Bar-On ◽  
Shaul Gutman ◽  
Amos Israeli

SUMMARY. A modular hierarchical model for controlling robots is presented. This model is targeted mainly at research and development; it enables researchers to concentrate on a specific task of robotics while using existing building blocks for the rest of their applications. The presentation begins by discussing the problems that researchers and engineers of robotics face whenever they try to use existing commercial robots. Based on this discussion we propose a new general model for robot control, to be referred to as TERM (TEchnion Robotic Model). The viability of the new model is demonstrated by implementing a general-purpose robot controller.
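
The abstract does not enumerate TERM's layers; as a generic illustration of the modular-hierarchy idea, the sketch below stacks a task layer on a planner on a joint driver, so any one layer can be swapped while the rest are reused (the layer names and interfaces are assumptions):

```python
from abc import ABC, abstractmethod

# Each layer talks only to the layer below it, so a researcher can
# replace a single layer (e.g. the planner) and reuse the rest as
# building blocks. Names are illustrative, not TERM's decomposition.

class JointDriver(ABC):
    @abstractmethod
    def set_joint_targets(self, angles: list[float]) -> None: ...

class TrajectoryPlanner:
    def __init__(self, driver: JointDriver):
        self.driver = driver

    def move_through(self, waypoints: list[list[float]]) -> None:
        for point in waypoints:
            self.driver.set_joint_targets(point)

class TaskLayer:
    def __init__(self, planner: TrajectoryPlanner):
        self.planner = planner

    def pick(self, approach: list[float], grasp: list[float]) -> None:
        self.planner.move_through([approach, grasp])

class LoggingDriver(JointDriver):
    """Stand-in driver that logs instead of moving hardware."""
    def set_joint_targets(self, angles):
        print("joint targets:", angles)

TaskLayer(TrajectoryPlanner(LoggingDriver())).pick([0.0, 0.5], [0.1, 0.7])
```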


2004 ◽  
Vol 5 (2) ◽  
pp. 271-301 ◽  
Author(s):  
Henrik Hautop Lund ◽  
Patrizia Marti

I-BLOCKS are an innovative concept of building blocks that allow users to manipulate conceptual structures and compose atomic actions while building physical constructions. They are an example of enabling technologies for tangible interfaces, since they emphasise physicality of interaction through the use of spatial and kinaesthetic knowledge. The technology presented in this paper is integrated in physical building blocks augmented with embedded and invisible microprocessors. Connectivity and behaviour of such structures are defined by the physical connectivity between the blocks. These are general-purpose, constructive, tangible user interface devices that can have a variety of applications. Unlike other approaches, I-BLOCKS do not merely specify a computation to be performed by a target system; they perform the computation and the associated action/functionality at the same time. Manipulating I-BLOCKS does not only mean constructing physical or conceptual structures but also composing atomic actions into complex behaviours. To illustrate this concept, the paper presents different scenarios in which the technology has been applied: storytelling performed through the construction of physical characters exhibiting emotional states, and learning activities for speech therapy in cases of dyslexia and aphasia. The scenarios are presented with a discussion of both the features of the technology used and the related interaction design issues. The paper concludes by reporting on informal trials that have been conducted with children. It should be noted that, even if both trials represent application scenarios for children, the I-BLOCKS technology is in principle open to different kinds of applications and target users, for example games for adults or brainstorming activities.
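
As a software analogy for behaviour defined by physical connectivity, the sketch below composes each block's atomic action along the connection topology; the actions and the composition rule are assumptions for illustration, not the actual I-BLOCKS firmware:

```python
# Each block carries one atomic action; a structure's behaviour is
# the composition of actions along its physical connections, so
# rebuilding the structure changes the behaviour.

class Block:
    def __init__(self, name, action):
        self.name = name
        self.action = action   # atomic action: value -> value
        self.children = []     # physically attached blocks

    def attach(self, block):
        self.children.append(block)
        return block

    def run(self, signal):
        signal = self.action(signal)
        for child in self.children:
            signal = child.run(signal)
        return signal

root = Block("sensor", lambda v: v + 1)
root.attach(Block("amplifier", lambda v: v * 2)).attach(
    Block("actuator", lambda v: print(f"output: {v}") or v)
)
root.run(0)  # prints "output: 2" for this particular construction
```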


2021 ◽  
Vol 8 (2) ◽  
pp. 180-185
Author(s):  
Anna Tolwinska

This article aims to explain the key metadata elements listed in Participation Reports, why it's important to check them regularly, and how Crossref members can improve their scores. Crossref members register a lot of metadata in Crossref. That metadata is machine-readable, standardized, and then shared across discovery services and author tools. This is important because richer metadata makes content more discoverable and useful to the scholarly community. It's not always easy to know what metadata Crossref members register in Crossref. This is why Crossref created an easy-to-use tool called Participation Reports to show editors and researchers the key metadata elements Crossref members register to make their content more useful. The key metadata elements include references and whether they are set to open, ORCID iDs, funding information, Crossmark metadata, licenses, full-text URLs for text-mining, and Similarity Check indexing, as well as abstracts. ROR IDs (Research Organization Registry Identifiers), which identify institutions, will be added in the future. This data was always available through the Crossref REST API (Representational State Transfer Application Programming Interface) but is now visualized in Participation Reports. To improve scores, editors should encourage authors to submit ORCID iDs in their manuscripts, and publishers should register as much metadata as possible to help drive research further.
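
The coverage figures behind Participation Reports can also be retrieved programmatically from the public Crossref REST API. A minimal sketch using a hypothetical member ID; the exact coverage keys should be verified against the live response:

```python
import requests

MEMBER_ID = 1234  # hypothetical Crossref member ID

# The /members/{id} route includes a "coverage" object with the
# per-element registration shares that Participation Reports render.
resp = requests.get(
    f"https://api.crossref.org/members/{MEMBER_ID}",
    params={"mailto": "you@example.org"},  # polite-pool etiquette
    timeout=30,
)
resp.raise_for_status()
coverage = resp.json()["message"]["coverage"]

# Keys assumed from the API's current shape; adjust as needed.
for key in ("references-current", "orcids-current", "abstracts-current"):
    share = coverage.get(key)
    if share is not None:
        print(f"{key}: {share:.0%}")
```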


2018 ◽  
Author(s):  
Jianfu Zhou ◽  
Alexandra E. Panaitiu ◽  
Gevorg Grigoryan

Abstract. The ability to routinely design functional proteins, in a targeted manner, would have enormous implications for biomedical research and therapeutic development. Computational protein design (CPD) offers the potential to fulfill this need, and though recent years have brought considerable progress in the field, major limitations remain. Current state-of-the-art approaches to CPD aim to capture the determinants of structure from physical principles. While this has led to many successful designs, it has strong limitations associated with inaccuracies in physical modeling, such that a robust general solution to CPD has yet to be found. Here we propose a fundamentally novel design framework, one based on identifying and applying patterns of sequence-structure compatibility found in known proteins, rather than approximating them from models of inter-atomic interactions. Specifically, we systematically decompose the target structure to be designed into structural building blocks we call TERMs (tertiary motifs) and use rapid structure search against the Protein Data Bank (PDB) to identify sequence patterns associated with each TERM from known protein structures that contain it. These results are then combined to produce a sequence-level pseudo-energy model that can score any sequence for compatibility with the target structure. This model can then be used to extract the optimal-scoring sequence via combinatorial optimization, or otherwise to sample the sequence space predicted to be compatible with folding to the target. Here we carry out extensive computational analyses, showing that our method, which we dub dTERMen (design with TERM energies): 1) produces native-like sequences given native crystallographic or NMR backbones, 2) produces sequence-structure compatibility scores that correlate with thermodynamic stability, 3) is able to predict the experimental success of designed sequences generated with other methods, and 4) designs sequences that are found by structure prediction to fold to the desired target more frequently than sequences designed with an atomistic method. As an experimental validation of dTERMen, we perform a total surface redesign of the Red Fluorescent Protein mCherry, marking a total of 64 residues as variable. The single sequence identified as optimal by dTERMen harbors 48 mutations relative to mCherry, but nevertheless folds, is monomeric in solution, exhibits stability to chemical denaturation similar to that of mCherry, and even preserves the fluorescence property. Our results strongly argue that the PDB is now sufficiently large to enable proteins to be designed using only examples of structural motifs from unrelated proteins. This is highly significant, given that the structural database will only continue to grow, and it signals the possibility of a whole host of novel data-driven CPD methods. Because such methods are likely to have orthogonal strengths relative to existing techniques, they could represent an important step towards removing the remaining barriers to robust CPD.
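
Schematically, the scoring step amounts to a sequence-level pseudo-energy with self and pair contributions looked up from TERM-derived statistics. A toy sketch with random placeholder tables, and a greedy sweep standing in for the paper's combinatorial optimization:

```python
import random

# E(S) = sum_i e_self(i, S_i) + sum_{(i,j) in contacts} e_pair(i, j, S_i, S_j).
# In dTERMen the tables come from TERM matches in the PDB; here they
# are random placeholders so the sketch runs self-contained.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
LENGTH = 8
CONTACTS = [(0, 4), (2, 6), (3, 7)]  # assumed contact pairs

rng = random.Random(0)
e_self = {(i, a): rng.uniform(-1, 1)
          for i in range(LENGTH) for a in AMINO_ACIDS}
e_pair = {(i, j, a, b): rng.uniform(-0.5, 0.5)
          for (i, j) in CONTACTS
          for a in AMINO_ACIDS for b in AMINO_ACIDS}

def pseudo_energy(seq):
    """Score a sequence (list of residues) against the target backbone."""
    e = sum(e_self[(i, a)] for i, a in enumerate(seq))
    e += sum(e_pair[(i, j, seq[i], seq[j])] for i, j in CONTACTS)
    return e

# Toy optimization: repeated greedy single-position sweeps.
seq = ["A"] * LENGTH
for _ in range(5):
    for i in range(LENGTH):
        seq[i] = min(AMINO_ACIDS,
                     key=lambda a: pseudo_energy(seq[:i] + [a] + seq[i + 1:]))
print("".join(seq), pseudo_energy(seq))
```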

