Challenges and Opportunities. Are there new challenges or opportunities that you experienced this year that may require significant attention, resources, or organizational effort in the coming year? San Diego CPPS Fund Evaluation:
Challenges and Opportunities. The main challenge in unsupervised learning is the amount of data required. This project will use the DoReMir dataset for this purpose. Its recordings were collected from users in over 100 countries through a mobile music transcription application, which gives the dataset great potential to represent real-world singing data. The DAMP singing datasets (available for research, released by Xxxxx) [4-7] are also promising for this task.
Challenges and Opportunities. The only user-driven opportunity related to this project is to let the user choose which instruments to extract from the recording, or which instruments to transcribe and thereby have their notes recognised.
Challenges and Opportunities. A major challenge in this direction is gathering user data, which can be difficult depending on how sensitive the collected information is and on the concerns of the user. A second challenge is ensuring the quality of the collected data, since it is not manually labelled by experts. A third is designing the interactions and the way in which the model uses the data. We will investigate the performance of music alignment with an “online” learning approach in which, at test time, the model is continuously adapted to a stream of incoming alignment corrections. In machine learning, online learning is the task of using data that becomes available sequentially to update a predictor step by step for future data. In our case, the new data are the incoming alignment corrections. Ideally these corrections would come from the user, but we can also explore methods that generate them (semi-)automatically. We will thus use online learning to exploit manual alignment corrections in a continuous learning framework. We could explore an automatic alignment correction approach in resource-scarce conditions, especially when generating robust offline alignments is preferable, for instance for quick deployment on an iPad. This can be thought of more as a domain adaptation process. Continuous learning is ideal in a resource-intensive scenario, where we have a constant flux of new incoming labels. Our model would constantly improve itself using active learning and online learning strategies.
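The step-wise update described above can be illustrated with a minimal sketch. It is not the project's alignment model: it assumes, purely for illustration, that each alignment correction arrives as a pair of a numeric feature vector and a corrected time offset, and that a linear predictor updated by stochastic gradient descent is an acceptable stand-in. The names `OnlineLinearPredictor` and `run_stream` are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


class OnlineLinearPredictor:
    """Linear predictor updated one example at a time (hypothetical stand-in)."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights: List[float] = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features: Sequence[float]) -> float:
        return self.bias + sum(w * x for w, x in zip(self.weights, features))

    def update(self, features: Sequence[float], target: float) -> float:
        """Take one SGD step on the squared error; return the error before the step."""
        error = self.predict(features) - target
        for i, x in enumerate(features):
            self.weights[i] -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error
        return error


def run_stream(
    model: OnlineLinearPredictor,
    corrections: Iterable[Tuple[Sequence[float], float]],
) -> List[float]:
    """Feed corrections to the model in arrival order and log absolute errors."""
    return [abs(model.update(features, target)) for features, target in corrections]


if __name__ == "__main__":
    # Synthetic corrections (assumption): offset = 0.3 * feature + 0.1.
    stream = [([x], 0.3 * x + 0.1) for x in [0.0, 0.5, 1.0] * 1000]
    model = OnlineLinearPredictor(n_features=1)
    errors = run_stream(model, stream)
    print(f"first error: {errors[0]:.4f}, last error: {errors[-1]:.6f}")
```

Each correction is used once, as it arrives, and then discarded, which is what distinguishes this setting from retraining on a stored dataset.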
Challenges and Opportunities. We believe that deep learning-based timbral transformations can help existing user-driven models converge faster to the imitated sound. Because each user imitates sounds in a distinct way, it would also be worthwhile to personalise these routines by modelling the user’s vocal imitation style. Combining robust timbre analysis with efficient interactive learning techniques such as active learning [11] appears to be a promising way to deal with users’ idiosyncrasies in this respect.
Challenges and Opportunities. The main challenge of our work is to identify the best form of data augmentation for a given task and to explain why that particular technique suits the task. As described before, no specific user-driven challenge will be addressed.
Challenges and Opportunities. A review of previous studies shows that there is no common definition or taxonomy of the user’s relevant contextual information in music consumption. Hence, one important challenge in this project is identifying the relevant contexts and building a taxonomy of those contexts and of how they relate to each other. This would help formalize the problem in the research community and allow future work to build on previous work. Similarly, there are no standard datasets for this problem in which tracks are labelled with their context classes. Providing one is another important challenge, so that future research has a baseline and a dataset against which to compare new results. Relying on user-created data such as playlists is a suitable basis for a semi-automated procedure that collects tracks and labels them with the context classes.
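One way such a playlist-based labelling procedure might look is sketched below. Everything specific in it is an assumption made for illustration: the context classes, the keyword lists, the substring matching on playlist titles, and the `min_votes` threshold are hypothetical and do not come from the project.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Optional, Sequence, Tuple

# Hypothetical context classes and title keywords (assumptions, not a taxonomy).
CONTEXT_KEYWORDS: Dict[str, List[str]] = {
    "workout": ["gym", "workout", "running"],
    "sleep": ["sleep", "bedtime"],
    "party": ["party", "dance"],
}


def playlist_context(title: str) -> Optional[str]:
    """Return the first context whose keyword occurs in the title, else None."""
    lowered = title.lower()
    for context, keywords in CONTEXT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return context
    return None


def label_tracks(
    playlists: Iterable[Tuple[str, Sequence[str]]], min_votes: int = 2
) -> Dict[str, str]:
    """Label a track with its most frequent context if it has enough votes.

    Each playlist with a recognised context casts one vote per distinct track.
    Ties are resolved by the order in which contexts were first seen.
    """
    votes: Dict[str, Counter] = defaultdict(Counter)
    for title, track_ids in playlists:
        context = playlist_context(title)
        if context is None:
            continue  # playlists without a recognised context are skipped
        for track_id in set(track_ids):
            votes[track_id][context] += 1

    labels: Dict[str, str] = {}
    for track_id, counter in votes.items():
        context, count = counter.most_common(1)[0]
        if count >= min_votes:
            labels[track_id] = context
    return labels


if __name__ == "__main__":
    example = [
        ("Morning Gym Mix", ["a", "b"]),
        ("Running hits", ["a", "c"]),
        ("Sleep sounds", ["c"]),
        ("Random", ["a"]),
    ]
    print(label_tracks(example))
```

The vote threshold is what makes the procedure semi-automated rather than fully automatic: tracks below it remain unlabelled and could be passed to a human for review.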
Challenges and Opportunities. In this project, the user has a fundamental role. Music is considered not in an abstract way, but as a cultural and social phenomenon that directly involves the listener. This opens up many possible future developments of the project in real applications, but it also brings many problems related to the lack of data. Collecting human data is very expensive and time-consuming, and involves many more variables. This is the main reason why only a few small datasets are available in this field.
Challenges and Opportunities. High-quality music style transfer would open the possibility for user-dependent applications by transferring the style from music coming from a similar context. However, transforming or generating audio using deep learning techniques (which is a natural choice for this task) is still very challenging and resource-intensive. One alternative is to work with music in a symbolic representation, which is more abstract than audio waveforms or spectrograms and, due to its discrete nature, easy to generate using RNN or transformer models.
Challenges and Opportunities. Although there have been considerable advances in user-driven approaches for generative models of audio, the field still lags behind Computer Vision research. This gap might seem to offer many opportunities, but it is probably due to the multi-scale complexity and sequential nature of music audio, as opposed to images, which makes the field far more difficult.