Music Representation: Recently Published Documents

Total documents: 76 (five years: 21)
H-index: 8 (five years: 2)

Mathematics, 2021, Vol. 9 (18), pp. 2274
Authors: Lvyang Qiu, Shuyu Li, Yunsick Sung

With unlabeled music data widely available, an unsupervised extractor of latent music representations is needed to improve the performance of classification models. This paper proposes an unsupervised latent music representation learning method based on a deep 3D convolutional denoising autoencoder (3D-DCDAE) for music genre classification, which learns common representations from a large amount of unlabeled data. Specifically, unlabeled MIDI files are fed to the 3D-DCDAE, which extracts latent representations by denoising and reconstructing its input; a decoder assists the 3D-DCDAE during training. After training, the decoder is replaced by a multilayer perceptron (MLP) classifier for music genre classification. Because the representations are learned without labels, unlabeled data can be exploited for classification tasks, alleviating the performance limit imposed by insufficient labeled data. In addition, the unsupervised 3D-DCDAE can take musicological structure into account, broadening the model's grasp of the musical domain and improving genre classification performance. In experiments on the Lakh MIDI dataset, a large amount of unlabeled data was used to train the 3D-DCDAE, which reached a denoising and reconstruction accuracy of approximately 98%. A small amount of labeled data was then used to train a classification model consisting of the trained 3D-DCDAE and the MLP classifier, which achieved a classification accuracy of approximately 88%. These results show that the model achieves state-of-the-art performance and significantly outperforms other music genre classification methods when only a small amount of labeled data is available.
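The core of the abstract above is the denoising-autoencoder objective: reconstruct the clean input from a corrupted copy, so the latent code must capture structure rather than noise. The following is a minimal, hypothetical sketch of that objective only; a tiny linear encoder/decoder on random "piano-roll" data stands in for the actual 3D convolutional model, and all sizes and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, h = 64, 32, 8                       # samples, input dim, latent dim (illustrative)
X = rng.random((n, d))                    # stand-in for flattened MIDI piano-roll windows
X_noisy = X + 0.1 * rng.standard_normal((n, d))

W_enc = 0.1 * rng.standard_normal((d, h))
W_dec = 0.1 * rng.standard_normal((h, d))

losses, lr = [], 0.05
for _ in range(300):
    Z = X_noisy @ W_enc                   # latent representation of the NOISY input
    X_hat = Z @ W_dec                     # reconstruction
    err = X_hat - X                       # ...compared against the CLEAN input (denoising)
    losses.append(float((err ** 2).mean()))
    g_dec = (Z.T @ err) / n               # gradient of the squared error w.r.t. W_dec
    g_enc = (X_noisy.T @ (err @ W_dec.T)) / n
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc
```

In the paper's pipeline, the decoder would then be discarded and the encoder's output fed to an MLP classifier trained on the small labeled set.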


2021, Vol. 11 (8), pp. 3621
Authors: María Alfaro-Contreras, Jose J. Valero-Mas

State-of-the-art Optical Music Recognition (OMR) techniques follow an end-to-end, or holistic, approach: a single stage processes a single-staff section image and retrieves the symbols that appear in it. Such recognition systems do not require an exact alignment between each staff and its corresponding labels, which facilitates the creation and retrieval of labeled corpora. Most commonly, these approaches use an agnostic music representation, which characterizes each music symbol by its shape and its height (vertical position on the staff). However, this dual nature is ignored in the learning process, where the two features are treated as a single symbol. This work exploits this characteristic, which differentiates music notation from similar domains such as text, by introducing a novel end-to-end approach that solves the OMR task at the staff-line level. We consider two Convolutional Recurrent Neural Network (CRNN) schemes trained to extract shape and height information simultaneously, and we propose different policies for merging the two at the neural level. Results on two corpora of monophonic early music manuscripts show that, in the best-case scenarios, our proposal reduces the recognition error by between 14.4% and 25.6% relative to the baseline.
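One simple way to picture the "merge" step described above: each head emits a per-frame distribution (one over shapes, one over heights), and a policy combines them into joint symbol predictions. The sketch below uses the crudest such policy, an outer product under a conditional-independence assumption followed by a joint argmax; the label vocabularies and frame count are invented for illustration and are not the paper's actual label sets or merging policies.

```python
import numpy as np

rng = np.random.default_rng(1)

SHAPES = ["note.quarter", "note.half", "rest.quarter"]   # illustrative shape labels
HEIGHTS = ["L1", "S1", "L2", "S2"]                       # illustrative staff lines/spaces
T = 5                                                    # frames along the staff image

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# stand-ins for the two CRNN heads' per-frame outputs
p_shape = softmax(rng.standard_normal((T, len(SHAPES))))
p_height = softmax(rng.standard_normal((T, len(HEIGHTS))))

# joint distribution per frame, shape (T, |SHAPES|, |HEIGHTS|),
# assuming shape and height are independent given the frame
p_joint = p_shape[:, :, None] * p_height[:, None, :]

decoded = []
for t in range(T):
    s, h = np.unravel_index(p_joint[t].argmax(), p_joint[t].shape)
    decoded.append(f"{SHAPES[s]}@{HEIGHTS[h]}")
print(decoded)
```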


2020, Vol. 9, pp. 82-100
Authors: Arry Maulana Syarif, Azhari Azhari, Suprapto Suprapto, Khafiizh Hastuti

A public database containing representative data of karawitan, the traditional Javanese gamelan music, is needed as a resource for researchers who study computer music and karawitan. As a first step toward such a database, we propose a text-based pitch model for music representation that is readable by both humans and computers. The model is inspired by Helmholtz Notation and Scientific Pitch Notation, both well-established text-based pitch representation systems. It encodes not only the pitch number and octave information (low, middle, or high), but also musical elements found in gamelan sheet music, including pitch values and legato signs. The model is named Gendhing Scientific Pitch Notation (GSPN); gendhing is a Javanese word meaning "song". GSPN represents music by formulating the musical elements of a sheet music piece, and data in the format can be converted automatically to other music representation formats. In the experiment, data in the GSPN format was used to automatically convert sheet music to a binary code with a localist representation technique.
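The final conversion step the abstract mentions, from a text-based pitch representation to a binary code with a localist scheme, amounts to mapping each token to a one-hot vector. The sketch below illustrates the idea with an invented token set; the actual GSPN grammar also carries octave, pitch-value, and legato information, so its vocabulary is richer than this.

```python
# Illustrative token set, NOT the actual GSPN grammar
VOCAB = ["1", "2", "3", "5", "6",   # slendro-style pitch numbers
         "1^", "2^",                # high-octave variants (illustrative)
         ".", "_"]                  # rest and legato marks (illustrative)
INDEX = {tok: i for i, tok in enumerate(VOCAB)}

def to_localist(tokens):
    """Map each token to a one-hot binary row vector (localist code)."""
    code = []
    for tok in tokens:
        row = [0] * len(VOCAB)
        row[INDEX[tok]] = 1         # exactly one active unit per token
        code.append(row)
    return code

binary = to_localist(["3", "5", "_", "6", "."])
print(binary)
```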


2020, Vol. 10 (9), pp. 3053
Authors: Daniel Rivero, Iván Ramírez-Morales, Enrique Fernandez-Blanco, Norberto Ezquerra, Alejandro Pazos

This paper proposes a new model for music prediction based on Variational Autoencoders (VAEs). VAEs are used in a novel way to address two different issues: representing music in a latent space, and using this representation to predict the future note events of a musical piece. The approach was trained on different songs by Handel. The resulting system can represent music in the latent space and make accurate predictions, so it can be used to compose new music either from an existing piece or from a random starting point. Although only a small dataset was used for training, the results show that the system returns accurate representations and predictions on unseen data.
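The two VAE ingredients the abstract combines can be sketched in a few lines: encoding an input into a latent Gaussian via the reparameterization trick, and feeding the sampled latent to a predictor of the next event. Everything below is a toy stand-in, with random weights and illustrative dimensions, not the paper's trained model.

```python
import numpy as np

rng = np.random.default_rng(2)

d_in, d_z = 16, 4
x = rng.random(d_in)                        # one encoded note-event window (stand-in)

# linear "encoder" heads producing the latent Gaussian's parameters
W_mu = 0.1 * rng.standard_normal((d_in, d_z))
W_logvar = 0.1 * rng.standard_normal((d_in, d_z))
mu = x @ W_mu
logvar = x @ W_logvar

# reparameterization trick: sample z = mu + sigma * eps, eps ~ N(0, I)
eps = rng.standard_normal(d_z)
z = mu + np.exp(0.5 * logvar) * eps

# closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian (always >= 0)
kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

# linear "predictor" of the next note event from the sampled latent
W_out = 0.1 * rng.standard_normal((d_z, d_in))
next_event_logits = z @ W_out
print(kl, next_event_logits.shape)
```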


2019, Vol. 24 (02), pp. 205-216
Authors: Danilo Rossetti, Jônatas Manzolli

Analysing electroacoustic music is a challenging task that can be approached by different strategies. In the last few decades, newly emerging computer environments have enabled analysts to examine the sound spectrum content in greater detail. This has resulted in new graphical representations of features extracted from audio recordings. In this article, we propose the use of representations from complex dynamical systems, such as phase space graphics, in musical analysis to reveal emergent timbre features in granular technique-based acousmatic music. It is known that granular techniques applied to musical composition generate considerable sound flux, regardless of the adopted procedures and available technological equipment. We investigate points of convergence between different aesthetics of the so-called Granular Paradigm in electroacoustic music, and consider compositions employing different methods and techniques. We analyse three works: Concret PH (1958) by Iannis Xenakis, Riverrun (1986) by Barry Truax, and Schall (1996) by Horacio Vaggione. In our analytical methodology, we apply such concepts as volume and emergence, as well as their graphical representations, to the pieces. In conclusion, we compare our results and discuss how they relate to the three composers' specific procedures for creating sound flux, as well as to their compositional epistemologies and ontologies.
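A common way to obtain the phase space graphics mentioned above is a time-delay embedding, which plots a signal against delayed copies of itself. The sketch below shows that construction on a synthetic sine wave; it is a generic illustration of the technique, not the authors' specific analysis pipeline or parameter choices.

```python
import numpy as np

def delay_embed(x, dim=2, tau=10):
    """Time-delay (phase-space) embedding: each row is
    (x[t], x[t + tau], ..., x[t + (dim-1)*tau])."""
    N = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + N] for i in range(dim)])

t = np.linspace(0, 1, 2000)
x = np.sin(2 * np.pi * 5 * t)        # stand-in for an audio excerpt
pts = delay_embed(x, dim=2, tau=50)  # 2-D phase portrait of the signal
print(pts.shape)
```

For a pure sine, the resulting points trace an ellipse; richer, fluctuating sound flux produces correspondingly more complex phase portraits, which is what makes the representation useful for revealing emergent timbre features.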

