Discussion

The theoretical exploration of sound signals, music and the use of modern technologies highlighted the complexity behind the interpretation that we, as users-listeners, give to the sound experience. It also showed how the interpretive demands placed on a performance change and are transformed according to culture and to the relationship between the medium and the listener. Finally, the study highlighted that the attribution of meaning and significance to the sound experience is a multifactorial process that operates like a language and evolves alongside social constructs.

The implementation and successful results of the DL model confirm that extracting features from music and sound signals is feasible. They also show that the user-listener's perception of sound can be modelled in an AI environment with high accuracy. Even though the primary goal of VC generation was not achieved, the prediction of musical genre is a key foundation for subsequent research in this direction.
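To make this concrete, the sketch below illustrates the general approach of predicting a genre from extracted audio features. It uses MFCC statistics and a small feed-forward classifier as a stand-in for the study's DL model; the file paths and genre labels are placeholders, not the actual dataset.

```python
# Minimal sketch: genre prediction from audio features.
# File paths, labels and the classifier are placeholders for the study's actual model.
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def extract_features(path, sr=22050, n_mfcc=20):
    """Summarise a track as the mean and standard deviation of its MFCCs."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical training data: audio paths paired with genre labels.
train_paths = ["track_01.wav", "track_02.wav"]
train_labels = ["rock", "jazz"]

X = np.vstack([extract_features(p) for p in train_paths])
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500).fit(X, train_labels)

# Predict the genre of an unseen track.
print(clf.predict(extract_features("new_track.wav").reshape(1, -1)))
```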

An important finding is the introduction of the ultrasonic part of the spectrum into the discussion of the sensory perception of the sound experience. As I discussed in the body of the study, even though we cannot distinguish these frequencies through hearing, they still affect us as vibration. While such frequencies are not audible to the human ear, they are detectable in a digital audio file. I therefore believe that the combination of DL and audio signals that include these ultrasonic frequencies is worthy of further investigation.
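As an illustration of that last point, the sketch below inspects a digital audio file for spectral energy above the audible range. The 96 kHz recording is a hypothetical example: a 44.1 kHz file cannot contain content above roughly 22 kHz, so a higher sample rate is assumed.

```python
# Minimal sketch: measuring how much of a file's spectral energy lies above 20 kHz.
# Assumes a hypothetical 96 kHz recording; the file name is a placeholder.
import numpy as np
import librosa

y, sr = librosa.load("recording_96k.wav", sr=None, mono=True)  # keep the native sample rate

spectrum = np.abs(np.fft.rfft(y)) ** 2             # power spectrum of the whole signal
freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)        # frequency of each bin in Hz

audible = spectrum[freqs <= 20_000].sum()
ultrasonic = spectrum[freqs > 20_000].sum()
print(f"Share of energy above 20 kHz: {ultrasonic / (audible + ultrasonic):.4%}")
```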

This research, even at the point it has reached given the limitations it encountered, is a contribution to the community working on DL with audio signals, both in the design of a prototype software pipeline and in the creation of a customized dataset. The DL model produced through this study can be exploited in many ways. For example, it can generate genre tags as captions in films, and it can be used by music streaming services to automate tag labeling. The prediction process may also work in reverse: since the dataset infrastructure and an API have been created, an app can be implemented that generates playlists according to the term entered by the user-listener.
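A minimal sketch of this reverse direction is shown below. The CSV export and the column names predicted_genre and title are assumptions about how the dataset could be exposed; the actual API may differ.

```python
# Minimal sketch of the "reverse" direction: given a term from the user-listener,
# return matching tracks. The export file and column names are hypothetical.
import pandas as pd

tracks = pd.read_csv("dataset_export.csv")   # hypothetical export of the custom dataset

def playlist_for(term, limit=20):
    """Return up to `limit` track titles whose predicted genre matches the user's term."""
    hits = tracks[tracks["predicted_genre"].str.contains(term, case=False, na=False)]
    return hits["title"].head(limit).tolist()

print(playlist_for("jazz"))
```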

Through the study of the correlations between musical genres, it was found that DL implementations generate new musical genres, in the form of sub-genres, which complement the interpretation of the sound experience. From this point, the research could continue and shift towards the identification of new features in music. Just as the identification of tonality was presented in feature extraction, exploring how tonality varies within a piece of music could lead to the extraction of features such as the musical scale. The musical scale, in turn, can convey aspects of mental state such as sadness and joy.
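As an illustration of how such a feature might be extracted, the sketch below estimates a track's key and mode (major or minor) by correlating averaged chroma features with the Krumhansl-Schmuckler key profiles. The file path is a placeholder, and this is one possible approach rather than the pipeline's actual implementation.

```python
# Minimal sketch: estimating key and mode (major/minor) from chroma features
# using the Krumhansl-Schmuckler key profiles. The file path is a placeholder.
import numpy as np
import librosa

MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17])
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

y, sr = librosa.load("track.wav", mono=True)
chroma = librosa.feature.chroma_cqt(y=y, sr=sr).mean(axis=1)  # average pitch-class energy

best = None
for mode, profile in (("major", MAJOR), ("minor", MINOR)):
    for shift in range(12):                          # try every possible tonic
        r = np.corrcoef(np.roll(profile, shift), chroma)[0, 1]
        if best is None or r > best[0]:
            best = (r, NOTES[shift], mode)

print(f"Estimated key: {best[1]} {best[2]} (r={best[0]:.2f})")
```

A major or minor estimate obtained this way could then be mapped to affective labels such as joy or sadness.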

The dataset created during this research also includes a descriptive text for each musical piece. Future research could apply NLP techniques to extract characteristics of the musical piece from this description. In addition, the dataset contains other variables such as tag_list, a set of tags unrelated to the musical genre. Exploiting these variables could enrich each music track in the existing dataset with new attributes.
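The sketch below illustrates both ideas under assumed column names (description, tag_list) and an assumed ';' delimiter: TF-IDF keywords are drawn from each description, and tag_list is split into individual tags that could become new attributes.

```python
# Minimal sketch: deriving extra attributes from descriptions and tag_list.
# Column names, the CSV export and the ';' delimiter are assumptions about the schema.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

tracks = pd.read_csv("dataset_export.csv")           # hypothetical export of the dataset

# TF-IDF over the free-text descriptions; top terms can serve as candidate attributes.
vectorizer = TfidfVectorizer(stop_words="english", max_features=5000)
tfidf = vectorizer.fit_transform(tracks["description"].fillna(""))
vocab = vectorizer.get_feature_names_out()

def top_terms(row_index, k=5):
    """Return the k highest-weighted terms in one track's description."""
    row = tfidf[row_index].toarray().ravel()
    return [vocab[i] for i in row.argsort()[::-1][:k] if row[i] > 0]

# Split the delimited tag_list string into usable individual tags.
tracks["tags"] = tracks["tag_list"].fillna("").str.split(";")

print(top_terms(0), tracks.loc[0, "tags"])
```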

I intend to turn MLVC into open-source software and make it freely and publicly available to the DL community. MLVC could serve as a framework for DL model development by offering an integrated software pipeline. This pipeline automates, to a large extent, the individual steps of a DL project architecture.
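Purely as an illustration of the idea, and not the actual MLVC code, the following sketch shows how the individual steps of a DL project could be chained through a small pipeline interface; all step bodies are placeholders.

```python
# Hypothetical sketch of a pipeline-style interface chaining the usual DL project steps.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Step:
    name: str
    run: Callable[[Dict], Dict]   # each step receives and returns a shared context

def run_pipeline(steps: List[Step], context: Dict) -> Dict:
    for step in steps:
        print(f"Running step: {step.name}")
        context = step.run(context)
    return context

# Placeholder step bodies standing in for data collection, feature extraction and training.
pipeline = [
    Step("collect_audio",    lambda ctx: {**ctx, "files": ["a.wav", "b.wav"]}),
    Step("extract_features", lambda ctx: {**ctx, "features": len(ctx["files"])}),
    Step("train_model",      lambda ctx: {**ctx, "model": "trained"}),
]

print(run_pipeline(pipeline, {}))
```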