
Design

For the proof of concept of this project, it was crucial to collect a large number of tunes. These tunes are available on streaming platforms, where they are labeled or tagged. Tagging and labeling are crucial for the performance and legitimacy of model training later on. However, tagging on these platforms is not always accurate, especially in terms of genre. The main decision was therefore to extract the super genre from each track found on online streaming services and set it as the label in this dataset.
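
As an illustration of the labeling decision described above, the following is a minimal sketch of how a noisy platform tag might be reduced to one super genre. The nine super genres are the ones listed for the genre field of this dataset; the ALIASES table and the super_genre function are hypothetical and only show the idea, they are not the project's actual extraction logic.

```python
from typing import Iterable, Optional

# The nine super genres listed for the dataset's genre field.
SUPER_GENRES = [
    "Classical", "Techno", "Grime", "Jazz", "House",
    "Drum & bass", "Jungle", "Ambient", "Hip Hop & Rap",
]

# Hypothetical alias table (an assumption for illustration only):
# maps common spellings or sub-genres to one of the super genres.
ALIASES = {
    "dnb": "Drum & bass",
    "drum and bass": "Drum & bass",
    "hip hop": "Hip Hop & Rap",
    "hip-hop": "Hip Hop & Rap",
    "rap": "Hip Hop & Rap",
    "deep house": "House",
}


def _normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


# Lookup from normalized spelling to canonical super genre.
_LOOKUP = {_normalize(g): g for g in SUPER_GENRES}
_LOOKUP.update({_normalize(k): v for k, v in ALIASES.items()})


def super_genre(genre: Optional[str], tag_list: Iterable[str] = ()) -> Optional[str]:
    """Return the super genre for a track, or None if nothing matches.

    The platform's genre field is checked first, then each tag in order.
    """
    candidates = ([genre] if genre else []) + list(tag_list)
    for candidate in candidates:
        match = _LOOKUP.get(_normalize(candidate))
        if match is not None:
            return match
    return None
```

Tracks for which the function returns None would have to be reviewed manually or left out, since an unlabeled track cannot be used for supervised training.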

Shape

Key               Type      Value
id                Long      0011101101
artist            String
artwork_url       String
audio_file        File
title             String
caption           String
comments_count    Number
likes_count       Number
reposts_count     Number
playback_count    Number
description       Number
tag_list          String[]
genre             String    Classical, Techno, Grime, Jazz, House, Drum & bass, Jungle, Ambient, Hip Hop & Rap
record label      String
track_type        Enum
track_id          Long
transcoding_id    Long
class_id          Integer
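
The shape above could be mirrored in code roughly as follows. This is a sketch covering only a subset of the keys (the ones most relevant for training), not the full schema. The class name TrackRecord is hypothetical, and deriving class_id from the position of the genre in the list is an assumption, since the source does not state how class_id is assigned.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List

# Genre values as listed in the table, in the same order.
GENRES = [
    "Classical", "Techno", "Grime", "Jazz", "House",
    "Drum & bass", "Jungle", "Ambient", "Hip Hop & Rap",
]


@dataclass
class TrackRecord:
    """Subset of the dataset shape; the remaining keys are omitted here."""

    id: int
    artist: str
    title: str
    audio_file: Path
    genre: str
    tag_list: List[str] = field(default_factory=list)
    # Not passed to the constructor; filled in by __post_init__.
    class_id: int = field(init=False)

    def __post_init__(self) -> None:
        if self.genre not in GENRES:
            raise ValueError(f"unknown super genre: {self.genre!r}")
        # Assumption: class_id is the index of the genre in GENRES.
        self.class_id = GENRES.index(self.genre)
```

For example, TrackRecord(id=11101101, artist="a", title="t", audio_file=Path("a.mp3"), genre="Jazz") would receive class_id 3 under this assumption, and a genre outside the nine listed values raises a ValueError instead of producing a silent mislabel.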