Student Work

Auditory Grouping: Using Machine Learning to Predict Locations of Groups in Music Clips

Humans perceive a variety of features in an auditory stream, including frequency, pitch, and dynamics, and we process music in several different ways based on these features. Doing the same remains difficult for machines. Previous models already achieve state-of-the-art performance in predicting acoustic boundaries, but audio segmentation that reflects human perception has yet to be accomplished. Our project aims to use machine learning to build a model that separates music into segments as humans do. The model we built produced clear grouping distinctions for audio clips of the same musical genre as its training data, but generalized poorly to other genres. We believe the model can be improved with a larger and broader set of training data and with higher-quality grouping boundary labels.

  • This report represents the work of one or more WPI undergraduate students submitted to the faculty as evidence of completion of a degree requirement. WPI routinely publishes these reports on its website without editorial or peer review.
Creator
Publisher
Identifier
  • E-project-050222-172615
  • 67286
Advisor
Year
  • 2022
Date created
  • 2022-05-02
Resource type
Major
Rights statement

Permanent link to this page: https://digital.wpi.edu/show/x346d738t