Whether high or low, sweet sounding or not-so-sweet sounding, we all have a vocal range. Within this range there’s a point where one’s vocal register shifts from the chest voice (the lower range) to the head voice (the higher range). This transition point is different for every singer, making it a difficult part of the voice to master.
Alex Kim '25
Majors: Mathematics, Music
Hometown: Wellesley, Mass.
High school: Wellesley High School
“This bridge between registers is tricky to navigate for less experienced singers,” explained Kim, who saw a research opportunity. “My goal was to create something that musicians of varying skill and technique can use to better understand and utilize their vocal registers, particularly that transition between chest and head voice.”
Kim’s idea was to take a dataset of vocal samples and, using machine learning models, develop user-friendly software that could concretely identify those registers.
“While there have been theoretical models and studies on vocal registers,” he said, “a practical tool specifically designed for this purpose hasn’t been created.” Admittedly, this came with challenges. “It was difficult having to plan, write code, and juggle a lot of things at once. The hardest thing, though, was making it user-accessible. It required a lot of checking for things that could go wrong,” he said.
Kim used a deep learning model called a convolutional neural network (CNN), along with a support vector machine (SVM), to analyze the dataset and refine the tool he was concurrently creating: AVRA, or Automatic Vocal Register Analysis.
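The article doesn’t describe AVRA’s internals, but a pipeline like the one named here typically converts audio into a spectrogram (the image-like input a CNN works on) and then scores each time frame with a classifier such as an SVM. The sketch below is a minimal, hypothetical illustration of that idea using only NumPy; the classifier weights are random placeholders, not AVRA’s trained model.

```python
import numpy as np

def spectrogram(signal, frame_len=512, hop=256):
    """Magnitude spectrogram via a simple STFT with a Hann window.

    Returns an array of shape (n_frames, frame_len // 2 + 1) — the
    time-frequency representation a CNN would take as input.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

# Toy input: one second of a 220 Hz tone at 16 kHz, standing in for a
# recorded vocal sample.
sr = 16000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 220 * t)

spec = spectrogram(audio)

# A linear classifier (the SVM stage) would assign each frame a
# chest-vs-head score. These weights are random placeholders only.
rng = np.random.default_rng(0)
w = rng.normal(size=spec.shape[1])
frame_scores = spec @ w  # one register score per time frame
print(spec.shape, frame_scores.shape)
```

In a real system, the per-frame scores would be what lets a user "zoom in on a specified time unit" and see which register it falls in.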
“Having had this time to sit down and just try to build something from my brain has been the nicest thing.”
“Ideally, the user would be able to choose an audio file [and] zoom in on a specified time unit they want to analyze the register of, and AVRA will do that,” he said. Kim is the music director of the Buffers, an a cappella group on campus, making the ability to analyze vocal registers very important: “I can check my own ear with another reference, but also I can refer it to other people, just as a basis to say this is what you should be listening for.”
Kim feels this project was personally impactful, too: “Having had this time to sit down and just try to build something from my brain has been the nicest thing.” His coding skills have also grown substantially. “In addition to working on my machine learning skills from the ground up, I also built a full-stack application … it’s extended my interest in the intersection of audio and visual data to different machine and deep learning frameworks,” he said.
Looking ahead, Kim hopes to expand beyond male pop voices — the registers he used for his dataset — and develop AVRA further, potentially as a thesis project in the Music Department.