Partial results on the gpu batch recognizer #1539

Open
starfurylab opened this issue Mar 14, 2024 · 7 comments
Comments

@starfurylab

Hi! Great project; I'm especially excited about the GPU support.
I have a question: is it possible to use something like PartialResult() when running on the GPU (RTX 2080 Ti, CUDA 12.3), as is done in websocket/asr_server.py?
My use case is real-time audio stream analysis. The ASR server running on the CPU handles it perfectly, but I would like more performance than a CPU can provide.
Best regards.
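
For reference, the CPU-side pattern in question looks roughly like the loop below. This is a minimal sketch, assuming a recognizer object that exposes AcceptWaveform(), Result(), PartialResult() and FinalResult() the way vosk's KaldiRecognizer does; the helper name stream_results and the chunk source are hypothetical and not taken from asr_server.py.

```python
import json


def stream_results(rec, chunks):
    """Feed audio chunks to a recognizer and yield (kind, payload) pairs.

    `rec` is assumed to behave like vosk's KaldiRecognizer:
    AcceptWaveform(bytes) returns a truthy value when an utterance has ended,
    and Result()/PartialResult()/FinalResult() return JSON strings.
    """
    for chunk in chunks:
        if rec.AcceptWaveform(chunk):
            # Utterance boundary: a finished result is available.
            yield "result", json.loads(rec.Result())
        else:
            # Mid-utterance: report the current hypothesis.
            yield "partial", json.loads(rec.PartialResult())
    # End of stream: flush whatever is still buffered.
    yield "final", json.loads(rec.FinalResult())
```

With vosk installed, rec would typically be built as KaldiRecognizer(Model("model"), 16000), where the model path and sample rate are placeholders. The question is whether the GPU batch recognizer can offer an equivalent of the "partial" branch.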

@nshmyrev
Collaborator

Hello. It is possible, but not implemented.

@starfurylab
Author

Thanks! It's good to know that it's possible in principle. Can you give me a hint? Would such an implementation affect only vosk-api, or Kaldi too? And could you point me in a direction to look? I want to try to implement this feature.

@nshmyrev nshmyrev transferred this issue from alphacep/vosk-server Mar 16, 2024
@starfurylab
Author

Thanks, I'll let you know when I have something.

@starfurylab
Author

starfurylab commented Apr 8, 2024

Hi. Sorry for the delay. I have created a pull request: #1554

I added partial results, but I don't know how to expose them to the other language bindings, so they are available only in C. I also added an example.

In tests with several test files, I reached a limit of about 510-530 real-time streams on the RTX 2080 Ti, with the i7-8700 at about 15-20% load.

One problem I noticed: it crashes when freeing the model, at the point where the CUDA pipeline instance is deleted. I haven't looked deeply into Kaldi.

delete cuda_pipeline_;

ASSERTION_FAILED ([5.5.1094~1-2b69ae]:~BatchedThreadedNnet3CudaOnlinePipeline():batched-threaded-nnet3-cuda-online-pipeline.cc:60) Assertion failed: (available_channels_.empty() || available_channels_.size() == num_channels_)

[ Stack-Trace: ]
../src/libvosk.so(kaldi::MessageLogger::LogMessage() const+0x7f6) [0x7bf5e0db5076]
../src/libvosk.so(kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)+0x75) [0x7bf5e0db5ae5]
../src/libvosk.so(kaldi::cuda_decoder::BatchedThreadedNnet3CudaOnlinePipeline::~BatchedThreadedNnet3CudaOnlinePipeline()+0xb1e) [0x7bf5e0951f3e]
../src/libvosk.so(BatchModel::~BatchModel()+0x1d3) [0x7bf5e094af63]
../src/libvosk.so(vosk_batch_model_free+0x12) [0x7bf5e0917842]
./test_vosk_gpu_batch(+0x150b) [0x65475b2dd50b]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7bf5e0229d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7bf5e0229e40]
./test_vosk_gpu_batch(+0x12a5) [0x65475b2dd2a5]

Aborted (core dumped)
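
Read literally, the asserted condition passes only when the pool of available channels is either empty or completely full, so it fails when some, but not all, channels are back in the pool at teardown. The toy model below illustrates that reading. It is a sketch and not Kaldi code: the class ChannelPool and its methods are hypothetical, and this is an interpretation of the assertion text, not a diagnosis of the crash.

```python
class ChannelPool:
    """Toy model of a fixed pool of decoding channels (not Kaldi code)."""

    def __init__(self, num_channels):
        self.num_channels = num_channels
        self.available = list(range(num_channels))

    def acquire(self):
        # Hand a free channel to a stream.
        return self.available.pop()

    def release(self, channel):
        # Return a channel once its stream has finished.
        self.available.append(channel)

    def close(self):
        # Mirrors the condition quoted in the assertion message:
        # available_channels_.empty() || available_channels_.size() == num_channels_
        assert (not self.available
                or len(self.available) == self.num_channels), \
            "some channels are still held at teardown"
```

In this model, closing the pool while one stream still holds a channel trips the check, and releasing that channel first lets it pass. If the real pipeline behaves similarly, finishing every stream before calling vosk_batch_model_free might avoid the abort, but that is untested.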

@SIG777

SIG777 commented Dec 11, 2024

Hello! I have a question on the topic of this issue.

Are there any plans to develop the Python code?
I also need to get PartialResult() results when running on the GPU, as is done in websocket/asr_server.py.

@nshmyrev
Collaborator

> Are there any plans to develop the Python code?

Hi. No plans for that. We are moving to PyTorch models, and they don't have partial results either.
