Large-scale Integrative Taxonomy (LIT): resolving the data conundrum for dark taxa

2021 ◽  
Author(s):  
Emily Hartop ◽  
Amrita Srivathsan ◽  
Fredrik Ronquist ◽  
Rudolf Meier

Abstract New, rapid, accurate, scalable, and cost-effective species discovery and delimitation methods are needed for tackling “dark taxa”, which we here define as clades for which <10% of all species are described and the estimated diversity exceeds 1000 species. Species delimitation should be based on multiple data sources (“integrative taxonomy”), but collecting several types of data for the same specimens risks impeding a discovery process that is already too slow. We here show how this can be avoided with Large-scale Integrative Taxonomy (LIT). Preliminary species hypotheses are generated based on inexpensive data that are obtained quickly and cost-effectively in a technical exercise. The validation step is then based on a more expensive type of data that are only obtained for a few specimens selected based on objective criteria. We here use this approach to sort 18 000 scuttle flies (Diptera: Phoridae) from Sweden into 315 preliminary species hypotheses based on NGS barcode (313 bp) clusters. These clusters went through subsequent validation based on morphology and were then used to develop quantitative indicators for predicting which barcode clusters are in conflict with morphospecies. For this purpose, we first randomly selected 100 clusters for in-depth validation with morphology. Afterwards, we used a linear model to demonstrate that the best predictors for conflict between barcode clusters and morphology are maximum p-distance within the cluster and cluster stability across different clustering thresholds. A test of these indicators using the 215 remaining clusters reveals that these predictors correctly identify all clusters that conflict with morphology.
The morphological validation step in our study involved just 1 039 specimens (5.8% of all specimens), but a newly proposed simplified protocol would only require the study of 915 (5.1%: 2.5 specimens per species) as we show that clusters without signatures of incongruence can be validated by only studying two specimens representing the most divergent haplotypes. To test the generality of our results across different barcode clustering techniques, we establish that the levels of conflict are similar across Objective Clustering (OC), Automatic Barcode Gap Discovery (ABGD), Poisson Tree Processes (PTP) and Refined Single Linkage (RESL) (used by Barcode of Life Data System (BOLD) to assign Barcode Index Numbers (BINs)). OC and ABGD achieved a maximum match score with morphology of 89% while PTP was slightly less effective (84%). RESL could only be tested for a subset of the specimens because the algorithm is not public. BINs based on 277 of the original 1 714 haplotypes were 86% congruent with morphology while the values were 89% for OC, 74% for PTP, and 72% for ABGD.

2019 ◽  
Author(s):  
Darren Yeo ◽  
Amrita Srivathsan ◽  
Rudolf Meier

Abstract New techniques for the species-level sorting of millions of specimens are needed in order to accelerate species discovery, determine how many species live on earth, and develop efficient biomonitoring techniques. These sorting methods should be reliable, scalable and cost-effective, as well as being largely insensitive to low-quality genomic DNA, given that this is usually all that can be obtained from museum specimens. Mini-barcodes seem to satisfy these criteria, but it is unclear how well they perform for species-level sorting when compared to full-length barcodes. This is here tested based on 20 empirical datasets covering ca. 30,000 specimens and 5,500 species, as well as six clade-specific datasets from GenBank covering ca. 98,000 specimens for over 20,000 species. All specimens in these datasets had full-length barcodes and had been sorted to species-level based on morphology. Mini-barcodes of different lengths and positions were obtained in silico from full-length barcodes using a sliding window approach (3 windows: 100-bp, 200-bp, 300-bp) and by excising nine mini-barcodes with established primers (length: 94–407 bp). We then tested whether barcode length and/or position reduces species-level congruence between morphospecies and molecular Operational Taxonomic Units (mOTUs) that were obtained using three different species delimitation techniques (PTP, ABGD, objective clustering). Surprisingly, we find no significant differences in performance for either species- or specimen-level identification between full-length and mini-barcodes as long as they are of moderate length (>200-bp). Only very short mini-barcodes (<200-bp) perform poorly, especially when they are located near the 5’ end of the Folmer region. The mean congruence between morphospecies and mOTUs is ca. 75% for barcodes >200-bp and the congruent mOTUs contain ca. 75% of all specimens. Most conflict is caused by ca. 10% of the specimens that can be identified and should be targeted for re-examination in order to efficiently resolve conflict. Our study suggests that large-scale species discovery, identification, and metabarcoding can utilize mini-barcodes without any demonstrable loss of information compared to full-length barcodes.


2020 ◽  
Vol 69 (5) ◽  
pp. 999-1015 ◽  
Author(s):  
Darren Yeo ◽  
Amrita Srivathsan ◽  
Rudolf Meier

Abstract New techniques for the species-level sorting of millions of specimens are needed in order to accelerate species discovery, determine how many species live on earth, and develop efficient biomonitoring techniques. These sorting methods should be reliable, scalable, and cost-effective, as well as being largely insensitive to low-quality genomic DNA, given that this is usually all that can be obtained from museum specimens. Mini-barcodes seem to satisfy these criteria, but it is unclear how well they perform for species-level sorting when compared with full-length barcodes. This is here tested based on 20 empirical data sets covering ca. 30,000 specimens (5500 species) and six clade-specific data sets from GenBank covering ca. 98,000 specimens (>20,000 species). All specimens in these data sets had full-length barcodes and had been sorted to species-level based on morphology. Mini-barcodes of different lengths and positions were obtained in silico from full-length barcodes using a sliding window approach (three windows: 100 bp, 200 bp, and 300 bp) and by excising nine mini-barcodes with established primers (length: 94–407 bp). We then tested whether barcode length and/or position reduces species-level congruence between morphospecies and molecular operational taxonomic units (mOTUs) that were obtained using three different species delimitation techniques (Poisson Tree Process, Automatic Barcode Gap Discovery, and Objective Clustering). Surprisingly, we find no significant differences in performance for both species- or specimen-level identification between full-length and mini-barcodes as long as they are of moderate length (>200 bp). Only very short mini-barcodes (<200 bp) perform poorly, especially when they are located near the 5′ end of the Folmer region. The mean congruence between morphospecies and mOTUs was ca. 75% for barcodes >200 bp and the congruent mOTUs contain ca. 75% of all specimens. Most conflict is caused by ca. 10% of the specimens that can be identified and should be targeted for re-examination in order to efficiently resolve conflict. Our study suggests that large-scale species discovery, identification, and metabarcoding can utilize mini-barcodes without any demonstrable loss of information compared to full-length barcodes. [DNA barcoding; metabarcoding; mini-barcodes; species discovery.]
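The in silico sliding-window extraction of mini-barcodes described in the abstract can be sketched as follows. This is an illustration only: the function name `sliding_minibarcodes`, the 60-bp step, and the 658-bp synthetic sequence standing in for a full-length barcode are all assumptions, since the abstract gives only the window lengths (100, 200, and 300 bp).

```python
def sliding_minibarcodes(barcode: str, window: int, step: int) -> list[tuple[int, str]]:
    """Return (start, subsequence) pairs for every full window along the barcode."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [(start, barcode[start:start + window])
            for start in range(0, len(barcode) - window + 1, step)]


# Synthetic stand-in for a full-length barcode (length 658 bp is an assumption).
full_length = ("ACGT" * 165)[:658]

for window in (100, 200, 300):  # window lengths named in the abstract
    minis = sliding_minibarcodes(full_length, window, step=60)  # step is hypothetical
    print(window, len(minis), [start for start, _ in minis][:3])
```

Each extracted fragment could then be clustered in the same way as the full-length barcode, so that congruence with morphospecies can be compared across lengths and positions.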


BMC Biology ◽  
2019 ◽  
Vol 17 (1) ◽  
Author(s):  
Amrita Srivathsan ◽  
Emily Hartop ◽  
Jayanthi Puniamoorthy ◽  
Wan Ting Lee ◽  
Sujatha Narayanan Kutty ◽  
...  

Abstract Background More than 80% of all animal species remain unknown to science. Most of these species live in the tropics and belong to animal taxa that combine small body size with high specimen abundance and large species richness. For such clades, using morphology for species discovery is slow because large numbers of specimens must be sorted based on detailed microscopic investigations. Fortunately, species discovery could be greatly accelerated if DNA sequences could be used for sorting specimens to species. Morphological verification of such “molecular operational taxonomic units” (mOTUs) could then be based on dissection of a small subset of specimens. However, this approach requires cost-effective and low-tech DNA barcoding techniques because well-equipped, well-funded molecular laboratories are not readily available in many biodiverse countries. Results We here document how MinION sequencing can be used for large-scale species discovery in a specimen- and species-rich taxon like the hyperdiverse fly family Phoridae (Diptera). We sequenced 7059 specimens collected in a single Malaise trap in Kibale National Park, Uganda, over the short period of 8 weeks. We discovered > 650 species which exceeds the number of phorid species currently described for the entire Afrotropical region. The barcodes were obtained using an improved low-cost MinION pipeline that increased the barcoding capacity sevenfold from 500 to 3500 barcodes per flowcell. This was achieved by adopting 1D sequencing, resequencing weak amplicons on a used flowcell, and improving demultiplexing. Comparison with Illumina data revealed that the MinION barcodes were very accurate (99.99% accuracy, 0.46% Ns) and thus yielded very similar species units (match ratio 0.991). Morphological examination of 100 mOTUs also confirmed good congruence with morphology (93% of mOTUs; > 99% of specimens) and revealed that 90% of the putative species belong to the neglected, megadiverse genus Megaselia. 
We demonstrate for one Megaselia species how the molecular data can guide the description of a new species (Megaselia sepsioides sp. nov.). Conclusions We document that one field site in Africa can be home to an estimated 1000 species of phorids and speculate that the Afrotropical diversity could exceed 200,000 species. We furthermore conclude that low-cost MinION sequencers are very suitable for reliable, rapid, and large-scale species discovery in hyperdiverse taxa. MinION sequencing could quickly reveal the extent of the unknown diversity and is especially suitable for biodiverse countries with limited access to capital-intensive sequencing facilities.


Author(s):  
Yan Pan ◽  
Shining Li ◽  
Qianwu Chen ◽  
Nan Zhang ◽  
Tao Cheng ◽  
...  

Stimulated by the dramatic growth of service demand in the logistics industry, logistics trucks employed in last-mile parcel delivery raise critical public concerns, such as heavy cost burden, traffic congestion and air pollution. Unmanned Aerial Vehicles (UAVs) are a promising alternative for last-mile delivery, but they are limited by insufficient flight range and load capacity. This paper presents an innovative scheduling approach for energy-limited logistics UAVs using crowdsourced buses. Specifically, when a UAV delivers a parcel, it first lands on a crowdsourced social bus heading toward the parcel destination, gets recharged by the wireless recharger deployed on the bus, and then flies from the bus to the parcel destination. This novel approach not only increases the delivery range and load capacity of battery-limited UAVs, but is also much more cost-effective and environment-friendly than traditional methods. New challenges emerge, however, because the buses, with their spatiotemporal mobility, become the bottleneck during delivery. For UAVs that land on buses, an Energy-Neutral Flight Principle and a delivery scheduling algorithm are proposed. Using the Energy-Neutral Flight Principle, each UAV can plan a flying path without depleting its energy, given buses with uncertain velocities. In addition, the delivery scheduling algorithm optimizes the delivery time and the number of delivered parcels given the warehouse location, logistics UAVs, parcel locations and buses. Comprehensive evaluations using a large-scale bus dataset demonstrate the superiority of the proposed logistics UAV scheduling approach.


Water ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 899
Author(s):  
Djordje Mitrovic ◽  
Miguel Crespo Chacón ◽  
Aida Mérida García ◽  
Jorge García Morillo ◽  
Juan Antonio Rodríguez Diaz ◽  
...  

Studies have shown micro-hydropower (MHP) opportunities for energy recovery and CO2 reductions in the water sector. This paper conducts a large-scale assessment of this potential using a dataset amassed across six EU countries (Ireland, Northern Ireland, Scotland, Wales, Spain, and Portugal) for the drinking water, irrigation, and wastewater sectors. Extrapolating the collected data, the total annual MHP potential was estimated between 482.3 and 821.6 GWh, depending on the assumptions, divided among Ireland (15.5–32.2 GWh), Scotland (17.8–139.7 GWh), Northern Ireland (5.9–8.2 GWh), Wales (10.2–8.1 GWh), Spain (375.3–539.9 GWh), and Portugal (57.6–93.5 GWh) and distributed across the drinking water (43–67%), irrigation (51–30%), and wastewater (6–3%) sectors. The findings demonstrated reductions in energy consumption in water networks between 1.7 and 13.0%. Forty-five percent of the energy estimated from the analysed sites was associated with just 3% of their number, having a power output capacity >15 kW. This demonstrated that a significant proportion of energy could be exploited at a small number of sites, with a valuable contribution to net energy efficiency gains and CO2 emission reductions. This also demonstrates cost-effective, value-added, multi-country benefits to policy makers, establishing the case to incentivise MHP in water networks to help achieve the desired CO2 emissions reductions targets.
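The country figures quoted above can be checked against the reported totals with a few lines of arithmetic. The sketch below reads the first and second number of each country range as belonging to two separate assumption sets; that reading is an interpretation made here (it would explain why the second Wales figure is lower than the first), not something the abstract states explicitly.

```python
# Annual MHP potential in GWh, copied from the abstract.
# Assumption: first value = first assumption set, second value = second assumption set.
estimates_gwh = {
    "Ireland": (15.5, 32.2),
    "Scotland": (17.8, 139.7),
    "Northern Ireland": (5.9, 8.2),
    "Wales": (10.2, 8.1),
    "Spain": (375.3, 539.9),
    "Portugal": (57.6, 93.5),
}

total_first = round(sum(first for first, _ in estimates_gwh.values()), 1)
total_second = round(sum(second for _, second in estimates_gwh.values()), 1)

print(total_first, total_second)  # 482.3 821.6
```

Under this reading, the two sums reproduce the stated overall range of 482.3 to 821.6 GWh.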


Author(s):  
Paul Oehlmann ◽  
Paul Osswald ◽  
Juan Camilo Blanco ◽  
Martin Friedrich ◽  
Dominik Rietzel ◽  
...  

Abstract With industries pushing towards digitalized production, adaptation to expectations and increasing requirements for modern applications has brought additive manufacturing (AM) to the forefront of Industry 4.0. In fact, AM is a main accelerator for digital production with its possibilities in structural design (such as topology optimization), production flexibility, customization, and product development, to name a few. Fused Filament Fabrication (FFF) is a widespread and practical tool for rapid prototyping that also demonstrates the importance of AM technologies through its accessibility to the general public by creating cost-effective desktop solutions. An increasing integration of systems in an intelligent production environment also enables the generation of large-scale data to be used for process monitoring and process control. Deep learning, as a form of artificial intelligence (AI) and, more specifically, a method of machine learning (ML), is ideal for handling big data. This study uses a trained artificial neural network (ANN) model as a digital shadow to predict the force within the nozzle of an FFF printer using filament speed and nozzle temperatures as input data. After the ANN model was tested using data from a theoretical model, it was implemented to predict the behavior using real-time printer data. For this purpose, an FFF printer was equipped with sensors that collect real-time printer data during the printing process. The ANN model reflected the kinematics of melting and flow predicted by models currently available for various speeds of printing. The model allows for a deeper understanding of the influencing process parameters, which ultimately results in the determination of the optimum combination of process speed and print quality.


2020 ◽  
Vol 30 (Supplement_5) ◽  
Author(s):  
D Panatto ◽  
P Landa ◽  
D Amicizia ◽  
P L Lai ◽  
E Lecini ◽  
...  

Abstract Background Invasive disease due to Neisseria meningitidis (Nm) is a serious public health problem even in developed countries, owing to its high lethality rate (8–15%) and the disabling sequelae suffered by many (up to 60%) survivors. As the microorganism is transmitted via the airborne route, the only available weapon in the fight against Nm invasive disease is vaccination. Our aim was to carry out a health technology assessment (HTA) to evaluate the costs and benefits of anti-meningococcal B (MenB) vaccination with Trumenba® in adolescents in Italy, while also considering the impact of this new vaccination strategy on organizational and ethical aspects. Methods A lifetime Markov model was developed. MenB vaccination with the two-dose schedule of Trumenba® in adolescents was compared with 'non-vaccination'. Two perspectives were considered: the National Health Service (NHS) and society. Three disease phases were defined: acute, post-acute and long-term. Epidemiological, economic and health utilities data were taken from Italian and international literature. The analysis was conducted by means of Microsoft Excel 2010®. Results Our study indicated that vaccinating adolescents (11th year of life) with Trumenba® was cost-effective, with an ICER of €7,912/QALY from the NHS perspective and €7,758/QALY from the perspective of society. Vaccinating adolescents reduces the number of cases of disease due to meningococcus B in one of the periods of highest incidence of the disease, resulting in significant economic and health savings. Conclusions This is the first study to evaluate the overall impact of free MenB vaccination in adolescents both in Italy and in the international setting. Although cases of invasive disease due to meningococcus B are few, if the overall impact of the disease is adequately considered, it becomes clear that including anti-meningococcal B vaccination in the immunization program for adolescents is strongly recommended from the health and economic standpoints.
Key messages Free, large-scale MenB vaccination is key to strengthening the global fight against invasive meningococcal disease. Anti-meningococcal B vaccination in adolescents is a cost-effective health opportunity.
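The incremental cost-effectiveness ratio (ICER) reported in the results is, by its standard definition, the difference in cost between two strategies divided by the difference in quality-adjusted life years (QALYs). The sketch below shows only that ratio; it does not reproduce the study's Markov model, and the cost and QALY inputs are invented round numbers chosen for illustration.

```python
def icer(cost_new: float, cost_old: float, qaly_new: float, qaly_old: float) -> float:
    """Incremental cost per QALY gained when moving from the old to the new strategy."""
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the strategies yield equal QALYs")
    return (cost_new - cost_old) / delta_qaly


# Hypothetical inputs (NOT from the study): vaccination vs. non-vaccination.
example = icer(cost_new=1_500_000.0, cost_old=1_000_000.0,
               qaly_new=1_050.0, qaly_old=1_000.0)
print(f"{example:,.0f} EUR per QALY")  # 10,000 EUR per QALY
```

A strategy is usually judged cost-effective when its ICER falls below a willingness-to-pay threshold; the abstract does not state which threshold was applied.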


Water ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 661
Author(s):  
Luigi Piazzi ◽  
Stefano Acunto ◽  
Francesca Frau ◽  
Fabrizio Atzori ◽  
Maria Francesca Cinti ◽  
...  

Seagrass planting techniques have been shown to be an effective tool for restoring degraded meadows and ecosystem function. In the Mediterranean Sea, most restoration efforts have been directed at the endemic seagrass Posidonia oceanica, but cost-benefit analyses have shown unpromising results. This study aimed at evaluating the effectiveness of environmental engineering techniques generally employed in terrestrial systems to restore P. oceanica meadows: two different restoration efforts were considered, using either non-degradable mats or, for the first time, degradable mats. Both provided encouraging results, as the loss of transplanting plots was null or very low and the survival of cuttings stabilized at about 50%. The data collected are considered positive, as the surviving cuttings are enough to allow the future spread of the patches. The techniques used provided a cost-effective restoration tool likely affordable for large-scale projects, as the methods allowed a wide bottom surface to be set up for restoration in a relatively short time without any particularly expensive device. Moreover, the mats, compared with other anchoring methods, enhanced colonization by other organisms such as macroalgae and sessile invertebrates, contributing to the generation of a natural habitat.


Author(s):  
Luis A Leiva ◽  
Asutosh Hota ◽  
Antti Oulasvirta

Abstract Designers are increasingly using online resources for inspiration. How can design exploration best be supported without compromising creativity? We introduce and study Design Maps, a class of point-cloud visualizations that makes large user interface datasets explorable. Design Maps are computed using dimensionality reduction and clustering techniques, which we analyze thoroughly in this paper. We present concepts for integrating Design Maps into design tools, including interactive visualization, local neighborhood exploration and functionality to integrate existing solutions into the design at hand. These concepts were implemented in a wireframing tool for mobile apps, which was evaluated with actual designers performing realistic tasks. Overall, designers found that Design Maps supported their creativity (avg. CSI score of 74/100) and indicated that the maps producing consistent whitespacing within cloud points are the most informative ones.


Nanomaterials ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 1646
Author(s):  
Jingya Xie ◽  
Wangcheng Ye ◽  
Linjie Zhou ◽  
Xuguang Guo ◽  
Xiaofei Zang ◽  
...  

In the last couple of decades, terahertz (THz) technologies, which lie in the frequency gap between the infrared and microwaves, have been greatly enhanced and investigated due to possible opportunities in a plethora of THz applications, such as imaging, security, and wireless communications. Photonics has led the way in the generation, modulation, and detection of THz waves, for example through the photomixing technique. In tandem with these investigations, researchers have been exploring ways to use silicon photonics technologies for THz applications to leverage the cost-effective large-scale fabrication and integration opportunities that they would enable. Although silicon photonics has enabled the implementation of a large number of optical components for practical use, THz integrated systems still face several challenges associated with high-quality hybrid silicon lasers, conversion efficiency, device integration, and fabrication. This paper provides an overview of recent progress in THz technologies based on silicon photonics or hybrid silicon photonics, including THz generation, detection, phase modulation, intensity modulation, and passive components. As silicon-based electronic and photonic circuits further approach THz frequencies, one single chip with electronics, photonics, and THz functions seems inevitable, resulting in the ultimate dream of a THz electronic–photonic integrated circuit.

