Pitch and MFCC output lengths differ for same input audio #4960

Open
scottbreyfogle opened this issue Nov 21, 2024 · 0 comments

For certain input audio lengths, pitch calculation and MFCC calculation will produce different length results. See below for a minimal repro. Here is the output of this script: output.txt

#include "feat/feature-mfcc.h"
#include "feat/pitch-functions.h"
#include "feat/wave-reader.h"

int main() {
  using namespace kaldi;
  // Default options: 16 kHz audio, 25 ms frames (400 samples), 10 ms shift (160 samples).
  PitchExtractionOptions pitch_options;
  ProcessPitchOptions opp;
  MfccOptions mfcc_options;
  Mfcc mfcc(mfcc_options);

  // Try every input length and report those where the frame counts disagree.
  for (int i = 400; i < 4000; i++) {
    Vector<BaseFloat> waveform(i);  // all-zero signal of length i
    Matrix<BaseFloat> m1, m2;       // m1: pitch features, m2: MFCC features
    ComputeAndProcessKaldiPitch(pitch_options, opp, waveform, &m1);
    mfcc.Compute(waveform, 1.0, &m2);
    if (m1.NumRows() != m2.NumRows()) {
      KALDI_LOG << "I: " << i << " Pitch " << m1.NumRows() << " MFCC " << m2.NumRows();
    }
  }
  return 0;
}

Note that the mismatch occurs just below 560, 720, and 880 samples. I think the pattern is
(len(input) - frame_size) % frame_shift > 156
i.e. (len(input) - 400) % 160 > 156
i.e. when the input ends in the last 3 samples before the next frame boundary (?)
Assuming these values are right, I think the MFCC lengths are as expected, and pitch extraction sometimes returns one more frame than expected.
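
For reference, the suspected pattern can be written out as a small standalone sketch that does not depend on Kaldi. It assumes the usual edge-snipping frame count, 1 + (len - frame_size) / frame_shift, and the default 400-sample frame and 160-sample shift. The helper names `ExpectedNumFrames` and `InSuspectedMismatchRange` are hypothetical, made up for illustration, and the "last 3 samples" rule is only the guess stated above, not confirmed behavior.

```cpp
// Hypothetical helper: number of frames expected when only complete frames
// are kept (edge snipping), given the input length in samples.
int ExpectedNumFrames(int num_samples, int frame_size, int frame_shift) {
  if (num_samples < frame_size) return 0;
  return 1 + (num_samples - frame_size) / frame_shift;
}

// Hypothetical helper: true when the input length falls in the last 3 samples
// before the next frame boundary, i.e. the range where the pitch output is
// suspected to contain one extra frame. This encodes the guess from the
// issue text, not verified library behavior.
bool InSuspectedMismatchRange(int num_samples, int frame_size, int frame_shift) {
  if (num_samples < frame_size) return false;
  return (num_samples - frame_size) % frame_shift > frame_shift - 4;
}
```

With a frame size of 400 and a shift of 160, lengths 557 through 559 fall in the suspected range and have an expected count of 1 frame, while 556 and 560 do not fall in the range.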

This doesn't seem like a major bug, but it caused some headaches for me when using the MFA library (ref).
