A technical framework for automatic perceptual evaluation of singing quality

Author(s):  
Chitralekha Gupta ◽  
Haizhou Li ◽  
Ye Wang

Human experts evaluate singing quality based on many perceptual parameters such as intonation, rhythm, and vibrato, with reference to music theory. We previously proposed the Perceptual Evaluation of Singing Quality (PESnQ) framework, which evaluates singing quality by combining acoustic features related to these perceptual parameters with the cognitive modeling concept of the telecommunication standard Perceptual Evaluation of Speech Quality. In this study, we extend our investigation of the PESnQ framework to approximate human judgments. First, we find that a linear combination of the human scores for the individual perceptual parameters can predict the overall human singing quality judgment. This provides us with a human parametric judgment equation. Next, the individual perceptual parameter scores predicted from the PESnQ acoustic features show a high correlation with the respective human scores, which enables more meaningful feedback to learners. Finally, we compare the performance of early fusion and late fusion of the acoustic features in predicting the overall human scores. We find that the late fusion method is superior to the early fusion method. This work underlines the importance of modeling human perception in automatic singing quality assessment.
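
The difference between early and late fusion described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (`linear_model`, `early_fusion`, `late_fusion`), the feature groups, and all weights are assumptions made up for the example. Early fusion concatenates every acoustic feature and applies one model; late fusion first predicts a score per perceptual parameter and then combines those scores linearly, in the spirit of the human parametric judgment equation.

```python
from typing import Callable, Dict, Sequence

Vector = Sequence[float]
Model = Callable[[Vector], float]


def linear_model(weights: Vector, bias: float = 0.0) -> Model:
    """Return a function computing bias + sum(w_i * x_i). Weights are hypothetical."""
    def predict(x: Vector) -> float:
        return bias + sum(w * xi for w, xi in zip(weights, x))
    return predict


def early_fusion(features: Dict[str, Vector], model: Model) -> float:
    """Concatenate all feature groups (in sorted name order), then apply one model."""
    fused = [value for name in sorted(features) for value in features[name]]
    return model(fused)


def late_fusion(features: Dict[str, Vector],
                param_models: Dict[str, Model],
                combiner: Model) -> float:
    """Predict one score per perceptual parameter, then combine the scores."""
    names = sorted(features)
    param_scores = [param_models[name](features[name]) for name in names]
    return combiner(param_scores)


# Hypothetical acoustic features grouped by perceptual parameter.
example_features = {"intonation": [0.8, 0.6], "rhythm": [0.5], "vibrato": [0.9]}
example_early = early_fusion(example_features, linear_model([0.25] * 4))
example_late = late_fusion(
    example_features,
    {
        "intonation": linear_model([0.5, 0.5]),
        "rhythm": linear_model([1.0]),
        "vibrato": linear_model([1.0]),
    },
    linear_model([0.5, 0.3, 0.2]),  # illustrative combination weights
)
```

In practice the weights of each model would be fitted to human ratings; the fixed numbers here only show where the fusion happens.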

1989 ◽  
Vol 35 (3) ◽  
pp. 373-378
Author(s):  
Richard A. Nolan

The patterns of protein synthesis associated with three sequential stages in protoplast morphogenesis (spindle-shaped, early fusion sphere, and late fusion sphere protoplasts) of the fungus Entomophaga aulicae were studied using both one-dimensional gels with general protein staining and two-dimensional gels with [35S]methionine protein labelling and fluorography. A total of 332 proteins were observed with 63.5% (211) common to all three developmental stages. Of the individual totals, 3.3% (8 out of 245), 7.3% (22 out of 301), and 4.5% (13 out of 286) of the proteins were unique to the spindle-shaped, early fusion sphere, and late fusion sphere protoplasts, respectively. The molecular mass and pI distribution profiles for early fusion sphere protoplast proteins are discussed.

Key words: protein synthesis, stage-specific proteins, fungal protoplasts, Entomophaga aulicae.


2018 ◽  
Vol 2018 ◽  
pp. 1-11 ◽  
Author(s):  
Yongkai Ye ◽  
Xinwang Liu ◽  
Qiang Liu ◽  
Xifeng Guo ◽  
Jianping Yin

In real-world applications of multiview clustering, some views may be incomplete due to noise, sensor failure, etc. Most existing studies in the field of incomplete multiview clustering have focused on early fusion strategies, for example, learning subspace from multiple views. However, these studies overlook the fact that clustering results with the visible instances in each view could be reliable under the random missing assumption; accordingly, it seems that learning a final clustering decision via late fusion of the clustering results from incomplete views would be more natural. To this end, we propose a late fusion method for incomplete multiview clustering. More specifically, the proposed method performs kernel k-means clustering on the visible instances in each view and then performs a late fusion of the clustering results from different views. In the late fusion step of the proposed method, we encode each view’s clustering result as a zero-one matrix, of which each row serves as a compressed representation of the corresponding instance. We then design an alternate updating algorithm to learn a unified clustering decision that can best group the visible compressed representations in each view according to the k-means clustering objective. We compare the proposed method with several commonly used imputation methods and a representative early fusion method on six benchmark datasets. The superior clustering performance observed validates the effectiveness of the proposed method.
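
The encoding step mentioned in this abstract, where each view's clustering result becomes a zero-one matrix over the visible instances, might look like the sketch below. It covers only that encoding; the alternate updating algorithm that learns the unified clustering decision is not reproduced. The function name `encode_view` and the use of `None` to mark an instance missing from a view are assumptions for illustration.

```python
from typing import List, Optional


def encode_view(labels: List[Optional[int]],
                n_clusters: int) -> List[Optional[List[int]]]:
    """One-hot encode a view's cluster labels.

    labels[i] is the cluster index of instance i in this view, or None when
    the instance is not visible in the view. Visible instances get a
    zero-one row of length n_clusters; missing instances stay None.
    """
    rows: List[Optional[List[int]]] = []
    for label in labels:
        if label is None:
            rows.append(None)  # instance absent from this view
            continue
        if not 0 <= label < n_clusters:
            raise ValueError(f"cluster label {label} out of range")
        row = [0] * n_clusters
        row[label] = 1
        rows.append(row)
    return rows


# Hypothetical view with five instances, two of which are missing.
example_view = encode_view([0, None, 2, 1, None], n_clusters=3)
```

Each non-missing row then serves as the compressed representation of the corresponding instance in that view.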


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ilya A. Surov ◽  
E. Semenenko ◽  
A. V. Platonov ◽  
I. A. Bessmertny ◽  
F. Galofaro ◽  
...  

The paper presents a quantum model of subjective text perception based on binary cognitive distinctions corresponding to words of natural language. The result of perception is a quantum cognitive state represented by a vector in the qubit Hilbert space. The complex-valued structure of the quantum state space extends the standard vector-based approach to semantics, making it possible to account for the subjective dimension of human perception, in which the result is constrained, but not fully predetermined, by the input information. In the case of two distinctions, the perception model generates a two-qubit state, the entanglement of which quantifies the semantic connection between the corresponding words. This two-distinction perception case is realized in an algorithm for detection and measurement of semantic connectivity between pairs of words. The algorithm is experimentally tested with positive results. The developed approach to cognitive modeling unifies neurophysiological, linguistic, and psychological descriptions in the mathematical and conceptual structure of quantum theory, extending the horizons of machine intelligence.
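
To make the idea of quantifying entanglement of a two-qubit state concrete, the sketch below computes the concurrence of a pure state a|00> + b|01> + c|10> + d|11>, which equals 2|ad - bc| after normalization. Concurrence is one standard entanglement measure; the abstract does not say which measure the authors use, so this choice and the function name `concurrence` are assumptions.

```python
import math


def concurrence(a: complex, b: complex, c: complex, d: complex) -> float:
    """Concurrence of the pure two-qubit state a|00> + b|01> + c|10> + d|11>.

    The amplitudes are normalized first. The result ranges from 0 for a
    product (unentangled) state to 1 for a maximally entangled state.
    """
    norm = math.sqrt(sum(abs(z) ** 2 for z in (a, b, c, d)))
    if norm == 0:
        raise ValueError("state vector must be non-zero")
    a, b, c, d = (z / norm for z in (a, b, c, d))
    return 2 * abs(a * d - b * c)


# A Bell-type state (|00> + |11>) and a product state |00>.
bell_value = concurrence(1, 0, 0, 1)
product_value = concurrence(1, 0, 0, 0)
```

Under this measure, a pair of words whose joint state is closer to a Bell state would be read as more strongly connected than a pair whose state factorizes.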


2018 ◽  
Vol 7 (9) ◽  
pp. 364 ◽  
Author(s):  
Helena Merschdorf ◽  
Thomas Blaschke

Although place-based investigations into human phenomena have been widely conducted in the social sciences over the last decades, this notion has only recently crossed over into Geographic Information Science (GIScience). Such a place-based GIS comprises research from computational place modeling on one end of the spectrum, to purely theoretical discussions on the other end. Central to all research that is concerned with place-based GIS is the notion of placing the individual at the center of the investigation, in order to assess human-environment relationships. This requires the formalization of place, which poses a number of challenges. The first challenge is unambiguously defining place, to subsequently be able to translate it into binary code, which computers and geographic information systems can handle. This formalization poses the next challenge, due to the inherent vagueness and subjectivity of human data. The last challenge is ensuring the transferability of results, requiring large samples of subjective data. In this paper, we re-examine the meaning of place in GIScience from a 2018 perspective, determine what is special about place, and how place is handled both in GIScience and in neighboring disciplines. We, therefore, adopt the view that space is a purely geographic notion, reflecting the dimensions of height, depth, and width in which all things occur and move, while place reflects the subjective human perception of segments of space based on context and experience. Our main research questions are whether place is or should be a significant (sub)topic in GIScience, whether it can be adequately addressed and handled with established GIScience methods, and, if not, which other disciplines must be considered to sufficiently account for place-based analyses. Our aim is to conflate findings from a vast and dynamic field in an attempt to position place-based GIS within the broader framework of GIScience.


2018 ◽  
Author(s):  
Σπυρίδωνας Σταθόπουλος

This dissertation investigates the problem of retrieving and classifying multimedia content. The first part explores the application of Latent Semantic Analysis (LSA) to image retrieval in large-scale collections. An efficient approach to applying LSA is presented that bypasses the Singular Value Decomposition (SVD) of the feature matrix, thereby overcoming the problem of applying the method to large-scale datasets. The study investigates combining different image representations either at an early stage (early fusion) or at a later stage (late fusion), with the goal of more effective image retrieval. In addition, an LSA-based kernel function is proposed that relates features from different sources in a common latent space. The proposed approach combines classification with retrieval, representing images by a composite vector that incorporates the information obtained from classification. Experimental results show that it outperforms the latent indexing obtained by applying SVD. For image representation, a generalization of the Bag-of-Colors (BoC) model is proposed. The new algorithm, referred to as QBoC, is based on decomposing images into a quadtree, thereby encoding spatial information in the final image representation. In combination with the Bag-of-Visual-Words (BoVW) model, it is used for the effective classification of medical images. Finally, a new approach is presented for combining LSA with Convolutional Neural Networks (CNNs) for content-based image classification. To this end, an optimized latent semantic space is constructed that captures the correlation of images within each category using a pre-trained neural network. Image features are projected through a weighted Latent Semantic Tensor into a lower-dimensional space and are used to train a CNN that performs the final classification. Experimental results demonstrate the effectiveness of this approach in terms of classification accuracy, achieving results comparable to those of corresponding state-of-the-art approaches.


2016 ◽  
Vol 5 (1) ◽  
pp. 16-28
Author(s):  
Noha Saleeb

3D virtual building models are used to help clients reach decisions during concept and detailed design phases. However, previously published research provides evidence for discrepancies between human perception of virtual and physical spaces: each virtual dimension (height, width, depth) is perceived differently from its physical counterpart, by varying percentages. This can affect clients' effective decision-making during coordination if 3D virtual representations are not perceived as identical to their physical equivalents. This paper discusses the impact of these discrepancies beyond the design phases and into the whole lifecycle, including construction and operations. Moreover, descriptive and inferential statistical analysis provides evidence of relationships between the physical and virtual perception differences in dimension, and the paper discusses possible factors contributing to perception discrepancies affecting the individual viewer in two main areas: (1) 3D authoring software and (2) psychophysical factors. Possible solutions are also proposed to accommodate the discrepancy between physical and virtual spaces.


Buildings ◽  
2020 ◽  
Vol 10 (10) ◽  
pp. 174 ◽  
Author(s):  
Prageeth Jayathissa ◽  
Matias Quintana ◽  
Mahmoud Abdelrahman ◽  
Clayton Miller

Evaluating and optimising human comfort within the built environment is challenging due to the large number of physiological, psychological and environmental variables that affect occupant comfort preference. Human perception could be helpful to capture these disparate phenomena and interpret their impact; the challenge is collecting spatially and temporally diverse subjective feedback in a scalable way. This paper presents a methodology to collect intensive longitudinal subjective feedback of comfort-based preference using micro ecological momentary assessments on a smartwatch platform. An experiment with 30 occupants over two weeks produced 4378 field-based surveys for thermal, light, and noise preference. The occupants and the spaces in which they left feedback were then clustered according to these preference tendencies. These groups were used to create different feature sets with combinations of environmental and physiological variables, for use in a multi-class classification task. These classification models were trained on a feature set that was developed from time-series attributes, environmental and near-body sensors, heart rate, and the historical preferences of both the individual and the comfort group assigned. The most accurate model had multi-class classification F1 micro scores of 64%, 80% and 86% for thermal, light, and noise preference, respectively. The discussion outlines how these models can enhance comfort preference prediction when supplementing data from installed sensors. The approach presented prompts reflection on how the building analysis community evaluates, controls, and designs indoor environments through balancing the measurement of variables with occupant preferences in an intensive longitudinal way.
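
The F1 micro score reported in this abstract pools true positives, false positives, and false negatives over all classes before computing F1. A minimal sketch is given below; the function name `micro_f1` and the preference labels are assumptions for illustration, not the paper's code or data.

```python
from typing import Hashable, Sequence


def micro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Micro-averaged F1 for single-label multi-class predictions.

    Counts are pooled across all classes: F1 = 2*TP / (2*TP + FP + FN).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1  # predicted this class, truth was another
            elif truth == label:
                fn += 1  # truth was this class, predicted another
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical thermal-preference votes and model predictions.
votes = ["cooler", "no change", "warmer", "no change"]
predicted = ["cooler", "no change", "no change", "no change"]
example_score = micro_f1(votes, predicted)
```

When every sample has exactly one true label and one predicted label, as here, each misclassification counts once as a false positive and once as a false negative, so the micro F1 coincides with plain accuracy.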


1987 ◽  
Vol 33 (9) ◽  
pp. 808-811 ◽  
Author(s):  
Richard A. Nolan

During morphogenesis the fungus Entomophaga aulicae produces both normal size (designated macrostages) and a smaller size class (designated microstages) of the spindle-shaped, early fusion sphere, and late fusion sphere protoplasts. Two additional stages (which lack macrostage counterparts), the granular microstage and the solid microstage (which can germinate to form a hypha), are also produced. The sizes, developmental sequences, and growth rates of the macro- and micro-stages are compared. The microstages are produced in both highly enriched and minimal growth media and by several isolates and are, therefore, considered normal components of E. aulicae protoplast morphogenesis. Some possible implications of microstage occurrence are mentioned.


2006 ◽  
Vol 14 (2) ◽  
pp. 275-285 ◽  
Author(s):  
Angelo Cangelosi

The double function of language, as a social/communicative means, and as an individual/cognitive capability, derives from its fundamental property that allows us to internally re-represent the world we live in. This is possible through the mechanism of symbol grounding, i.e., the ability to associate entities and states in the external and internal world with internal categorical representations. The symbol grounding mechanism, as language, has both an individual and a social component. The individual component, called the “Physical Symbol Grounding”, refers to the ability of each individual to create an intrinsic link between world entities and internal categorical representations. The social component, called “Social Symbol Grounding”, refers to the collective negotiation for the selection of shared symbols (words) and their grounded meanings. The paper discusses these two aspects of symbol grounding in relation to distributed cognition, using examples from cognitive modeling research on grounded agents and robots.


Author(s):  
Attila Zoltán Jenei ◽  
Gábor Kiss

In the present study, we attempt to estimate the severity of depression using a Convolutional Neural Network (CNN). The method is distinctive in that an auto- and cross-correlation structure, rather than an actual image, is crafted as the input of the network. This research is important because depression has become one of the leading mental disorders in the world. Once it appears, it can significantly reduce an individual's quality of life even at an early stage, and in severe cases it may carry a risk of suicide. It is therefore important that the disorder be recognized as early as possible. Furthermore, it is also important to determine the severity of the individual's disorder, so that a treatment order can be established. During the examination, speech acoustic features were obtained from recordings. Among the features, MFCC coefficients and formant frequencies were used based on preliminary studies. From subsets of these features, a correlation structure was created. We applied this quadratic structure to the input of a convolutional network. Two models were crafted: single-input and double-input versions. Altogether, the lowest RMSE value (10.797) was achieved using the two features, with a moderate-strength correlation of 0.61 (between estimated and original values).
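
One way to picture a square correlation structure built from acoustic feature tracks, and the RMSE used to score the estimates, is sketched below. This is an assumption-laden illustration: it uses zero-lag Pearson correlation between every pair of feature tracks, whereas the abstract does not specify how the authors construct their auto- and cross-correlation input. The names `pearson`, `correlation_matrix`, and `rmse` and the sample values are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length tracks (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0
    return cov / (std_x * std_y)


def correlation_matrix(tracks: Sequence[Sequence[float]]) -> List[List[float]]:
    """Square matrix whose (i, j) entry correlates feature track i with track j."""
    return [[pearson(a, b) for b in tracks] for a in tracks]


def rmse(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Root-mean-square error between predicted and reference severity scores."""
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical per-frame values of three acoustic features (e.g. MFCCs, formants).
feature_tracks = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
structure = correlation_matrix(feature_tracks)
```

The resulting matrix is symmetric with ones on the diagonal and can be fed to a convolutional network in place of an image.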

