So I've been meaning to clean up and publish this for a long time, and I guess I finally realized that perfect is the enemy of good enough. This is a cool little project I did back in college with Haowei Lu, Chris LaFriniere, and Nathan Yu (TODO: links to respective githubs?). I thought of making it all pretty and changing the way it worked now that I have infinite time to muck with it, but each time I bit off more than I could chew and ended up getting distracted and working on something else. Now that I'm in the process of moving data off my old VPS, I figured, fuck it, I'll publish it as is. As such, this code isn't very well-structured and hasn't been cleaned up for release. If you guys have trouble with it, message me and I'll figure it out.
Singing is great, isn't it? You open your mouth and suddenly emotion comes out and fills the room. Sadness, happiness, sultriness, all encoded in a series of loud and soft sine waves of higher and lower frequency generated by your mouth. There's one problem:
How the fuck do you know if you're singing well? And, once you know you don't (not you personally, though I have seen you at karaoke once and MAN ALIVE), how do you improve?
Enter pitch coach.
Pitch coach is a kinda-gamified experience centering on improving your ability to accurately reproduce pitches you hear and hold a note for a certain amount of time. It's fairly simple: You hear a pitch, you sing into your phone, and your phone tells you how much you suck. Love it, right? There are some cool action items I hope having this published will kick my ass to do, like a song mode where you're doing a song rather than random pitches, but we'll see.
So I didn't do the FFT side of things myself. For that we have to thank the nice folks of JTransforms, but basically, the high level overview of the algorithm is this:
- The program reads sound samples and stores them in a circular buffer.
- The program takes some samples off the circular buffer and FFTs them.
- It then goes over the FFT, compresses it by an integer amount, and multiplies it by itself. What this does is help detect things that have overtones. Like, a random peak in your frequency domain data probably won't have an integer overtone to amplify it if you compressed the frequency domain by 2 and multiplied it by itself. It's called "harmonic product spectrum" and an explanation can be found near the middle of the page here (give anchor links to your header tags, folks!)
- Finally, it detects the lowest peak in the resulting modified frequency domain signal and returns it as the most likely pitch. This has served me pretty well overall.
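For the curious, here's a rough sketch of the last two steps (harmonic product spectrum, then picking the lowest peak). This is not the code from the repo. It's a minimal standalone illustration with a bunch of assumptions: the class and method names (`HpsSketch`, `lowestPeakBin`, and so on) are made up, a naive DFT stands in for the JTransforms FFT so there are no dependencies, the circular buffer is left out, and "lowest peak" is interpreted as "lowest local maximum that reaches some fraction of the biggest value", which may not be what the app actually does.

```java
class HpsSketch {
    // Naive O(n^2) DFT magnitudes for bins 0..n/2-1.
    // Stand-in for the JTransforms FFT so this sketch has no dependencies.
    static double[] magnitudeSpectrum(double[] samples) {
        int n = samples.length;
        double[] mag = new double[n / 2];
        for (int k = 0; k < mag.length; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle);
                im -= samples[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    // Multiply the spectrum by copies of itself compressed by 2, 3, ..., harmonics.
    // A bin only stays large if its integer multiples are large too.
    static double[] harmonicProductSpectrum(double[] mag, int harmonics) {
        double[] hps = new double[mag.length / harmonics];
        for (int k = 0; k < hps.length; k++) {
            double product = mag[k];
            for (int h = 2; h <= harmonics; h++) {
                product *= mag[k * h];
            }
            hps[k] = product;
        }
        return hps;
    }

    // Assumed reading of "lowest peak": the lowest local maximum that reaches
    // `threshold` times the largest value. Returns -1 if nothing qualifies.
    static int lowestPeakBin(double[] hps, double threshold) {
        double max = 0;
        for (int k = 1; k < hps.length; k++) max = Math.max(max, hps[k]);
        if (max <= 0) return -1; // silence
        for (int k = 1; k < hps.length - 1; k++) {
            boolean localMax = hps[k] >= hps[k - 1] && hps[k] >= hps[k + 1];
            if (localMax && hps[k] >= threshold * max) return k;
        }
        return -1;
    }

    // Strongest bin of the raw spectrum (skipping DC), for comparison only.
    static int strongestBin(double[] mag) {
        int best = 1;
        for (int k = 2; k < mag.length; k++) if (mag[k] > mag[best]) best = k;
        return best;
    }

    // Full pipeline: spectrum -> HPS -> lowest peak -> frequency in Hz (-1 if none).
    static double detectPitchHz(double[] samples, double sampleRate, int harmonics) {
        double[] hps = harmonicProductSpectrum(magnitudeSpectrum(samples), harmonics);
        int bin = lowestPeakBin(hps, 0.5); // 0.5 is an arbitrary threshold for this sketch
        return bin < 0 ? -1 : bin * sampleRate / samples.length;
    }

    // Hypothetical test-signal helper: a sum of sine waves.
    static double[] synth(double sampleRate, int n, double[] freqs, double[] amps) {
        double[] out = new double[n];
        for (int t = 0; t < n; t++)
            for (int i = 0; i < freqs.length; i++)
                out[t] += amps[i] * Math.sin(2 * Math.PI * freqs[i] * t / sampleRate);
        return out;
    }

    public static void main(String[] args) {
        double rate = 8000;
        int n = 800; // 10 Hz per bin
        // A 200 Hz note with two overtones, plus a louder stray 330 Hz tone with none.
        double[] samples = synth(rate, n,
                new double[]{200, 400, 600, 330},
                new double[]{1.0, 0.5, 0.25, 2.0});
        System.out.println("strongest raw bin: "
                + strongestBin(magnitudeSpectrum(samples)) * rate / n + " Hz");
        System.out.println("HPS pitch: " + detectPitchHz(samples, rate, 3) + " Hz");
    }
}
```

The test signal in `main` is the "random peak" case from above: the stray 330 Hz tone is the loudest thing in the raw spectrum, but it has nothing at 660 or 990 Hz to back it up, so it gets multiplied down to basically nothing and the 200 Hz note wins.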
So pitch coach is an Android app. I have no idea how you guys would import it into Eclipse and compile it, but once I get my new VPS up I'll put up a link to the compiled APK so you can check it out without having to figure that out. Also, it's probably not that hard, and I plan to come back and write a guide eventually.
There are two subprojects in this project:
- PitchRecognizeAlgorithm is a java project that can run without anything android related. All it does is take some sound data and recognize pitches.
- PitchRecognize is the Android app.
In the true spirit of not cleaning shit up, I left in place the egregious hack where the JTransforms jar lives in the Android project, and the algorithm project is listed as depending on the jar in the Android project. Yes, it's dumb. I'll fix it later.
I might try to do a better overview of the code later, but it's been over a year since I've touched it.
The JTransforms code is under MPL/LGPL/GPL. Since I merely link with it, and don't include it directly, I can have most any license I want (?). Since that is the case, I prefer the famous "beer license": I hope you enjoy our code, and if you ever see us around, you're welcome to buy us a beer sometime.