Abstract

Machine learning models are increasingly incorporated into the technological applications we use every day. As technology adopts machine learning algorithms to automate prediction and decision making by mimicking human operations, it shifts the ways in which we, as users or receivers of information, attach meaning and significance to what we consume.

This study focuses on meaning-making in sound experience and music. It explores the notion of the label as an interface through the lens of contemporary music streaming services, and traces the shift of this semantics from musical genre to mood. It examines the impact that services incorporating machine learning algorithms have on the semantics of tagging as identity in music and sound experience. It studies the notion of musical genre as a label and its gradual transformation into a fluid concept, namely the description of a mental or emotional state. At the same time, it analyses in depth the ways in which we evaluate sound experience and how auditory perception is formed in humans. This research aspires to introduce a new identity for the signification of sound experience, the Vibe Caption: an interface that describes the possible emotional state or atmosphere of a place, as it could be transmitted to and felt by others.

The solution proposed in this study is an audio classifier built with deep learning algorithms, specifically convolutional neural networks. The study confirms that convolutional neural networks can predict features in music in a manner analogous to human auditory perception. The contribution of the research centres on the use of convolutional networks to extract features directly from the audio signal. In the course of developing the necessary software, the research also contributes an integrated software pipeline for the repeated execution of machine learning experiments with artificial neural networks. The development methodology follows the continuous integration software design technique, and the software is intended to be published for free use as open source. Owing to major logistical constraints, the methodological part of this study implemented a sound classifier that identifies musical genre as its prediction class, reaching 95% accuracy on the training dataset.
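To make the approach concrete, the sketch below illustrates one plausible form of such a classifier: audio clips are converted into log-scaled mel spectrograms and fed to a small convolutional network that learns features from the signal. This is a minimal illustration only; it assumes a GTZAN-style setup (10 genres, 30-second clips), and the file paths, label arrays, and hyperparameters are hypothetical placeholders rather than the implementation described in the thesis.

    # Minimal sketch: mel-spectrogram + small CNN genre classifier.
    # Assumes a GTZAN-style dataset; paths and labels below are illustrative.
    import numpy as np
    import librosa
    import tensorflow as tf
    from tensorflow.keras import layers

    def clip_to_mel(path, sr=22050, duration=30.0, n_mels=128):
        """Load an audio clip and convert it to a log-scaled mel spectrogram."""
        y, _ = librosa.load(path, sr=sr, duration=duration)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
        return librosa.power_to_db(mel, ref=np.max)  # shape: (n_mels, time_frames)

    def build_model(input_shape, n_classes=10):
        """Small CNN that learns features directly from the mel spectrogram."""
        return tf.keras.Sequential([
            layers.Input(shape=input_shape),            # (n_mels, time_frames, 1)
            layers.Conv2D(16, 3, activation="relu"),
            layers.MaxPooling2D(2),
            layers.Conv2D(32, 3, activation="relu"),
            layers.MaxPooling2D(2),
            layers.Flatten(),
            layers.Dense(64, activation="relu"),
            layers.Dense(n_classes, activation="softmax"),
        ])

    # Hypothetical usage: audio_paths and genre_labels would come from the dataset.
    # X = np.stack([clip_to_mel(p) for p in audio_paths])[..., np.newaxis]
    # model = build_model(X.shape[1:])
    # model.compile(optimizer="adam",
    #               loss="sparse_categorical_crossentropy",
    #               metrics=["accuracy"])
    # model.fit(X, np.array(genre_labels), epochs=30, validation_split=0.2)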

Keywords: Convolutional neural network, Mel spectrogram, Audio classifier, Machine learning, Artificial intelligence, Digital audio signal processing, Deep learning