Releases: Uberi/speech_recognition
Version 3.7.1
As usual, get it with pip install --upgrade SpeechRecognition
- New grammar parameter for recognizer_instance.recognize_sphinx - now, you can specify a JSGF or FSG grammar to PocketSphinx (thanks @aleneum!).
- Update PyAudio to version 0.2.11 - this fixes a couple memory management issues users have been experiencing.
- Update FLAC to 1.3.2 on all platforms - this will make it easier to support more audio formats in the near future.
- Fixes for various APIs on Python 3.6+ - small changes in urllib.request behavior made requests fail in certain situations.
- Fixes for Bing Speech API timing out due to some backwards incompatible changes to their API.
- Restore original IBM audio segmentation behaviour - previously, it would stop recognizing after the first pause. Now, it will recognize all speech in the input audio, as it did before IBM's changes.
- Fix links in PocketSphinx docs and library reference. Add-on language models now available from Google Drive, including the now-officially-supported Italian model.
- New troubleshooting entries for JACK server in README.
- Documentation and build process updates.
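To illustrate the new grammar parameter, here is a minimal sketch that builds a small JSGF grammar and passes its path to recognize_sphinx. The helper names build_jsgf, recognize_with_grammar and demo, the file names, and the phrases are made up for this example. It also assumes that the grammar, its public rule, and the file all share one name, and that PocketSphinx is installed.

```python
def build_jsgf(name, phrases):
    """Return JSGF source with one public rule that matches any of the phrases."""
    alternatives = " | ".join(phrases)
    # Assumption: grammar name and public rule name are kept identical.
    return "#JSGF V1.0;\ngrammar {0};\npublic <{0}> = {1};\n".format(name, alternatives)


def recognize_with_grammar(wav_path, grammar_path):
    """Transcribe a WAV file, restricting PocketSphinx to the given grammar."""
    # Imported lazily so build_jsgf works without the library installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    return recognizer.recognize_sphinx(audio, grammar=grammar_path)


def demo():
    # Hypothetical file names; "commands.wav" must already exist.
    with open("commands.gram", "w") as grammar_file:
        grammar_file.write(build_jsgf("commands", ["turn on", "turn off"]))
    print(recognize_with_grammar("commands.wav", "commands.gram"))
```

Restricting recognition to a grammar like this tends to help when only a handful of fixed commands are expected.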
Version 3.6.5
Quick bugfix for PortableNamedTemporaryFile:
- Fix file descriptor opening on Python 2.
- Add tests for Sphinx keyword matching.
Version 3.6.4
Bugfix release!
- Fix tempfile.NamedTemporaryFile on Windows, by replacing it with a PortableNamedTemporaryFile class. Previously, it didn't necessarily support the file being re-opened after it was originally opened.
- Documentation/troubleshooting improvements (thanks @hassanmian!).
- Add support for 24-bit FLAC audio files (thanks @sudevschiz!).
- Fix phrase_time_limit being ignored for listen_in_background (thanks @dodysw!).
- Added lots of new audio regression tests.
- Code cleanup for tests and examples.
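For anyone curious about the temporary file fix, the idea can be sketched as follows. This is not the library's PortableNamedTemporaryFile itself; ReopenableTempFile is a hypothetical name, and the sketch only assumes that a file created with tempfile.mkstemp can be opened a second time by name while the first handle is still open.

```python
import os
import tempfile


class ReopenableTempFile:
    """Temporary file that can be re-opened by name while it is still open."""

    def __init__(self, mode="w+b"):
        self.mode = mode
        self.name = None
        self._file = None

    def __enter__(self):
        # mkstemp returns an open descriptor plus the path of the new file.
        descriptor, self.name = tempfile.mkstemp()
        self._file = os.fdopen(descriptor, self.mode)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Close first, then delete, so removal also works on Windows.
        self._file.close()
        os.remove(self.name)
        return False

    def write(self, data):
        return self._file.write(data)

    def flush(self):
        self._file.flush()
```

Inside the with block, other code (for example an external FLAC converter) can open the path in the name attribute; the file is removed when the block exits.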
Version 3.6.3
Version 3.6.0
This is more of a maintenance release, but a few features slipped in as well:
- Support for the Google Cloud Speech API with recognizer_instance.recognize_google_cloud (thanks @Thynix!), plus documentation and examples.
- Automatic sample rate detection in speech_recognition.Microphone - this should fully resolve all the "Invalid sample rate" issues from PyAudio.
- Project now has automated tests and continuous integration with TravisCI. It's pretty nifty, and has already caught a few things during development!
- Keywords example for recognizer_instance.recognize_sphinx.
- Documentation improvements and updated advice in troubleshooting and library reference.
- Bugfix - Google Speech Recognition sometimes didn't return the text with the highest confidence (thanks @akabraham!).
- Bugfix - EOFError upon encountering malformed audio files; a proper exception message is now given.
- Updated FLAC binaries for OS X.
- Bugfix - invalid FLAC binary path on OS X (thanks @akabraham!).
- Code cleanup.
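The highest-confidence bugfix boils down to choosing among several candidate transcriptions. A rough sketch of that selection is below; the response layout (an "alternative" list of dictionaries with "transcript" and an optional "confidence") is an assumption made for illustration, and best_transcript is a hypothetical helper, not part of the library.

```python
def best_transcript(result):
    """Pick the transcript with the highest confidence from a result dict."""
    alternatives = result.get("alternative", [])
    if not alternatives:
        return None
    # Only some alternatives may carry a confidence score.
    scored = [item for item in alternatives if "confidence" in item]
    if scored:
        best = max(scored, key=lambda item: item["confidence"])
    else:
        # Without scores, fall back to the first alternative.
        best = alternatives[0]
    return best["transcript"]
```

Simply taking the first alternative would give a different answer whenever a later entry has a higher score, which is the kind of mismatch this fix addresses.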
Version 3.5.0
- Support for the Houndify API with recognizer_instance.recognize_houndify (thanks @tb0hdan!).
- recognize_sphinx now supports keyword-based matching via the keywords=[("cat", 30), ("potato", 45)] parameter. The second number in each pair is the sensitivity, which determines how loosely Sphinx will interpret speech to be those keywords - higher numbers mean more false positives, while lower numbers mean a lower detection rate.
- A new example for keyword matching is now available.
- BREAKING CHANGE: API.AI STT API IS BEING SHUT DOWN SOON. (source)
- For now, the recognize_api function will keep working if you're on a paid API.AI plan, and we will not be removing it until the service is shut down entirely.
- It is best to transition to another backend as soon as possible. I recommend Microsoft Bing Voice Recognition or Wit.ai for previous API.AI users.
- New phrase_time_limit option for listening functions, to limit phrase lengths to a certain number of seconds.
- Support for operation timeouts with recognizer_instance.operation_timeout - this can be used to ensure long requests always take finite time.
- recognize_ibm now opts out of request logging by default, for improved user privacy (thanks @michellemorales!). This is a breaking change if you previously relied on request logging behaviour.
- Bugfix - listen() sometimes didn't terminate on finite-length streams.
- Bugfix - Microsoft Bing Voice Recognition changed their authentication API endpoint, so that required some small code updates (thanks @tmator!).
- Bugfix - 24-bit audio now works correctly on Python 2.
- Update Wit.ai API version from deprecated version.
- A bunch of documentation updates, fixes, and improvements.
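As a quick illustration of phrase_time_limit and operation_timeout together, here is a minimal sketch. The helper names transcribe_phrase and demo and the default values are made up for this example, and recognize_google stands in for whichever backend you actually use; the recognizer is passed in so the helper can be tried without a microphone.

```python
def transcribe_phrase(recognizer, source, phrase_seconds=5, timeout_seconds=10):
    """Record one phrase of bounded length and send it for recognition."""
    # Cap how long any single recognition request may take.
    recognizer.operation_timeout = timeout_seconds
    # Stop recording once the phrase has run for phrase_seconds.
    audio = recognizer.listen(source, phrase_time_limit=phrase_seconds)
    return recognizer.recognize_google(audio)


def demo():
    # Imported lazily; needs SpeechRecognition, PyAudio and a microphone.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        print(transcribe_phrase(recognizer, source, phrase_seconds=3))
```

With both limits set, neither recording nor the request to the backend should be able to run indefinitely.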
Version 3.4.6
Bugfix release.
Changes:
- api.ai now requires the sessionId field, so we'll just add that in (thanks @jhoelzl!).
- Improve documentation a bit.
- Various other small fixes.
Version 3.4.5
Changes:
- Bug fix: non-24-bit audio wasn't converted properly to 16-bit audio on Python 2, due to the new 24-bit audio shim. Thanks to @jhoelzl for reporting!
Version 3.4.4
Maintenance release:
- Python versions less than 3.4 don't support 24-bit audio properly. We now have pure-Python shims that will allow 24-bit audio to work on those old Python versions, though they will be somewhat slower. Thanks to @danse for reporting the issue!
- Added updated Pocketsphinx binaries and Pocketsphinx installation procedures to match improvements on their end.
- Fix Unicode file paths on Windows.
- Fix caching in recognizer_instance.recognize_bing.
- We now use the Manylinux Docker image for building FLAC. Hopefully, this will make building universal Linux binaries easier for packagers.
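To give a feel for what handling 24-bit audio in pure Python involves, here is a small sketch that narrows 24-bit little-endian PCM samples to 16-bit by dropping the least significant byte of each sample. It is not the shim used in the library; pcm24_to_pcm16 is a hypothetical helper written only for illustration.

```python
def pcm24_to_pcm16(raw):
    """Convert 24-bit little-endian PCM bytes to 16-bit little-endian PCM."""
    if len(raw) % 3 != 0:
        raise ValueError("24-bit PCM data must be a multiple of 3 bytes long")
    # Each sample is 3 bytes, least significant first; keep the top two.
    return b"".join(raw[i + 1:i + 3] for i in range(0, len(raw), 3))
```

Slicing byte by byte like this is simple but slow on long recordings, which fits the note above that the shims are somewhat slower.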
Version 3.4.3
Bugfix release:
- Thanks to @jhoelzl, api.ai language support works again for non-English languages.
We're now GPG signing all our release tags. Under the releases page, you should see the following:
This tells you that GitHub thinks the Git tag is the same as the one we intended to release.
This key can also be found on the SKS keyservers, and you can import it with the following command:
gpg --keyserver x-hkp://pool.sks-keyservers.net --recv-keys 0x5F56B350
The packages on PyPI are signed as well - the signature can be downloaded under the "pgp" link on the SpeechRecognition PyPI page.