
Speech Note: An Offline Speech Recognition, Text-to-Speech and Translation App for Linux

By sk

Speech Note is an open-source, privacy-focused application that offers offline Speech to Text (STT), Text to Speech (TTS), and Machine Translation (MT) capabilities. With Speech Note, you can take, read, and translate notes in multiple languages.

Speech Note works entirely offline: all text and voice processing happens locally on your device. This protects your privacy because no data is ever sent over the internet.

It supports a wide range of languages for speech recognition, synthesis, and translation, with new languages being added regularly.

Speech Note utilizes various processing engines for each function, providing flexibility and choice to users. Currently, Speech Note uses the following processing engines:

  • Speech to Text (STT)
    • Coqui STT (a fork of Mozilla DeepSpeech)
    • Vosk
    • whisper.cpp
    • Faster Whisper
    • april-asr
  • Text to Speech (TTS)
    • espeak-ng
    • MBROLA
    • Piper
    • RHVoice
    • Coqui TTS
    • Mimic 3
    • WhisperSpeech
  • Machine Translation (MT)
    • Bergamot Translator

Advanced users can enable custom models compatible with supported engines by editing the configuration file and restarting the application.

Speech Note is completely free to use and the source code is publicly available under the Mozilla Public License Version 2.0.

Written in C++, Speech Note is available for Linux and Sailfish OS.

In the following sections, we will discuss how to install and use Speech Note in Linux.

Install Speech Note in Linux

Speech Note is available for various Linux distributions and Sailfish OS.

For Linux platforms, Speech Note is available on Flathub. Make sure Flatpak is installed on your Linux system.
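
If you are not sure whether Flatpak is already set up, a quick check like the one below can help. This is only a minimal sketch: the variable name flatpak_status is made up for illustration, and the Flathub remote command is printed rather than executed so that nothing on your system changes until you run it yourself.

```shell
# Sketch: report whether the flatpak command is on PATH before installing.
if command -v flatpak >/dev/null 2>&1; then
    flatpak_status="installed"
else
    flatpak_status="missing"
fi
echo "flatpak is $flatpak_status"

# With Flatpak present, the Flathub remote is usually added like this
# (printed here, not executed, so nothing on the system is changed):
echo "flatpak remote-add --if-not-exists flathub https://dl.flathub.org/repo/flathub.flatpakrepo"
```

If the check reports "missing", install Flatpak from your distribution's package manager first.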

Once Flatpak is installed, you can install Speech Note using the following command:

flatpak install flathub net.mkiol.SpeechNote

Speech Note's Flatpak distribution offers flexibility through different packages catering to specific needs:

  • Base Package (net.mkiol.SpeechNote): Contains all dependencies for full functionality, including "heavy" libraries. It requires a significant amount of disk space after installation.
  • Add-on Packages: Provide GPU acceleration for AMD (net.mkiol.SpeechNote.Addon.amd) and NVIDIA (net.mkiol.SpeechNote.Addon.nvidia), speeding up certain operations.
  • Tiny Package: A smaller alternative offering only basic features, ideal for users with limited disk space. It can also be combined with GPU acceleration add-ons.

A comparison table outlining the sizes and features of each package is available in Speech Note's official GitHub repository.
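
The add-ons are installed the same way as the base package, using the add-on IDs listed above. The sketch below maps a GPU vendor to its add-on ID and prints the matching install command. The helper name addon_for_gpu is hypothetical, and I am assuming the add-ons are installed from the same flathub remote as the base package.

```shell
# Map a GPU vendor name to the add-on ID mentioned above.
# addon_for_gpu is an illustrative helper, not part of Speech Note.
addon_for_gpu() {
    case "$1" in
        amd)    echo "net.mkiol.SpeechNote.Addon.amd" ;;
        nvidia) echo "net.mkiol.SpeechNote.Addon.nvidia" ;;
        *)      return 1 ;;   # no add-on is listed for other vendors
    esac
}

# Print (rather than run) the install command so it can be reviewed first.
if addon=$(addon_for_gpu nvidia); then
    echo "flatpak install flathub $addon"
fi
```

Replace nvidia with amd if you have an AMD card, then run the printed command.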

If you're on Arch Linux or one of its variants, such as EndeavourOS or Manjaro Linux, Speech Note is available in the Arch User Repository (AUR) as dsnote and dsnote-git. You can install it with any AUR helper, such as Paru:

paru -S dsnote

or Yay:

yay -S dsnote
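
If you don't remember which AUR helper is installed, a small check such as the following can tell you. It is only a sketch: it looks for the two helpers named above, in that order, and prints the command to run instead of running it. The variable name aur_helper is made up for illustration.

```shell
# Pick the first AUR helper found on PATH; paru and yay are the two named above.
aur_helper=""
for candidate in paru yay; do
    if command -v "$candidate" >/dev/null 2>&1; then
        aur_helper="$candidate"
        break
    fi
done

if [ -n "$aur_helper" ]; then
    echo "Run: $aur_helper -S dsnote"
else
    echo "No AUR helper found; install paru or yay first."
fi
```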

How to Use Speech Note

After installing it, you can launch Speech Note from the application menu or by running the following command:

flatpak run net.mkiol.SpeechNote

On first launch, Speech Note will prompt you to choose the languages you want to use:

Speech Note Welcome Message Wizard

Click the Close button to close the welcome wizard. Then go to the Languages tab and select a language of your choice.

Select Language

Next, you will need to download the model files for the Speech to Text, Text to Speech, and Text Translator engines. Please ensure that you download at least one model for each engine. If you're unsure which model to choose, simply click the info button to view more details about each model.

Download Models

You can download multiple models, experiment with different options, and choose the one that works best for you.

Once you have downloaded the models for each engine, you can begin using Speech Note for Text-to-Speech, Speech-to-Text, or translation.

Convert Text-to-Speech or Speech-to-Text

To test the Text-to-Speech functionality, type some text in the "Notepad" section and click the "Read" button. The application will read the text aloud for you.

Text to Speech Testing with Speech Note

Similarly, you can click the "Listen" button to try the "Speech to Text" feature. Speak into your microphone, and Speech Note will recognize your voice and convert it to text in real time.

Again, if you have downloaded multiple models, select your preferred model from the drop-down menu.

Translate Text

To translate text from one language to another, go to the "Translator" section in the top right corner. Enter the text on the left side and click the Translate (arrow) button. Speech Note will then translate it into the corresponding output language.

Translate Text using Speech Note

As shown in the screenshot, I have translated the English text into German.

If you have downloaded multiple models, remember to select your desired model from the drop-down menu.

To enable real-time translation, simply toggle the "Translate as you type" option. Once enabled, Speech Note will translate the text as you type, eliminating the need to click the arrow button for translation.

My Verdict

I tested Speech Note on my Debian 12 desktop, which has 32 GB of RAM and an 11th Gen Intel Core i3 processor, but no dedicated GPU.

I downloaded the "English (Piper Bryce Medium Male)" model for the Text-to-Speech feature and the "English Indian (Vosk Small)" model for Speech-to-Text. For translation, I downloaded the "English to German" model.

While the Text-to-Speech and translation features worked as expected, the Speech-to-Text function did not perform as intended. This may be due to the model I chose. I plan to try different models later to see if that improves the results.

I will continue testing in the coming days and update this post accordingly.

Conclusion

Speech Note is a powerful and versatile TTS, STT and translator application that prioritizes user privacy. Its offline capabilities, multilingual support, and open-source nature make it an excellent choice for Linux and Sailfish OS users.

If you're looking for a secure, offline TTS or speech recognition app, Speech Note is definitely worth checking out.
