Voice-to-text Python script for Ubuntu 24.04 using OpenAI's Whisper

Created mostly by ChatGPT.

Update

Here is a link to the new code: https://mega.nz/file/OdlHwZLT#q2-t0WIVPPY4l0VVpTshKN19BgHWSFfKPKquoP_5WXU

You’ll have to go into the main Python script and change the path to ydotool to match your own system.

You may also have to tweak the SILENCE_THRESHOLD constant to suit your microphone.
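One rough way to pick a starting value is to measure how loud your microphone is when nobody is speaking. The sketch below assumes SILENCE_THRESHOLD is compared against the RMS amplitude of float samples between -1.0 and 1.0. That is an assumption about the script, not something taken from it, and the helper names (rms_level, suggest_threshold, record_ambient) and the 2x margin are made up for illustration.

```python
import math


def rms_level(samples):
    """Root-mean-square amplitude of a sequence of float samples."""
    samples = list(samples)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def suggest_threshold(ambient_samples, margin=2.0):
    """Suggest a threshold a bit above the measured background noise."""
    # margin is a hypothetical safety factor, not a value from the script
    return rms_level(ambient_samples) * margin


def record_ambient(seconds=2.0, samplerate=16000):
    """Record a few seconds of room noise (needs sounddevice and a microphone)."""
    import sounddevice as sd  # imported here so the helpers above work without it

    frames = int(seconds * samplerate)
    recording = sd.rec(frames, samplerate=samplerate, channels=1, dtype="float32")
    sd.wait()  # block until the recording has finished
    return recording[:, 0].tolist()
```

Calling suggest_threshold(record_ambient()) while staying quiet prints nothing by itself, so wrap it in print() to see the number. If the script measures loudness on a different scale (for example 16-bit integers), the value will need rescaling.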

Old Code

Download the old code here: https://mega.nz/file/zEcGXLQa#hdFt1DO6MRslbmjzk0dmUY6sAkw6crdFCvpmibVHy7k

Install the prerequisites in your Python virtual environment:

  • pip install openai-whisper sounddevice pystray pillow plyer

Install the other prerequisites on Ubuntu:

  • ydotool
  • aplay

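I’m not sure what the best way to install these is. aplay is part of the alsa-utils package, and Ubuntu’s repositories also carry a ydotool package, so “sudo apt install alsa-utils ydotool” should cover both, though the packaged ydotool may be older than a build from source.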
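A quick way to check that everything in the two lists above is in place is to ask Python which modules and command-line tools it can find. This is only a sketch: the function name missing_prerequisites is made up, and it should be run with the virtual environment's Python so that it looks in the right place.

```python
import importlib.util
import shutil

# import names for the pip packages listed above
# (openai-whisper installs "whisper", pillow installs "PIL")
PYTHON_MODULES = ["whisper", "sounddevice", "pystray", "PIL", "plyer"]
SYSTEM_TOOLS = ["ydotool", "aplay"]


def missing_prerequisites(modules=PYTHON_MODULES, tools=SYSTEM_TOOLS):
    """Return (missing_modules, missing_tools) for the current environment."""
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    # shutil.which only searches the directories on PATH
    missing_tools = [t for t in tools if shutil.which(t) is None]
    return missing_modules, missing_tools


mods, tools = missing_prerequisites()
print("missing python modules:", mods or "none")
print("missing system tools:", tools or "none")
```

If ydotool shows up as missing even though it is installed, it is probably in a directory that is not on your PATH.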

Now manually create a desktop launcher for the “voice_to_text.py” script (there should be YouTube videos showing how to create one).

If you have ydotool installed in a non-standard directory (mine, for example, is at “/home/<username>/.bin/PATH/ydotool”), the launcher’s command may look something like this:

bash -c "export PATH=$PATH:/home/<username>/.bin/PATH; /home/<username>/my-venv/bin/python '/home/<username>/voice_to_text/voice_to_text.py'"

“/home/<username>/my-venv/bin/python” is the path to the Python interpreter inside the virtual environment on my computer.

If you can figure out how to install ydotool system-wide, you won’t need such a complex command for the launcher.
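If you would rather generate the launcher than click it together, a .desktop file is just a small text file. The sketch below builds one for the simple case where ydotool is already on the PATH and none of the paths contain spaces. The helper name desktop_entry, the launcher name and the file name in the final comment are all made up for illustration.

```python
from pathlib import Path


def desktop_entry(python_path, script_path, name="Voice to Text"):
    """Build the text of a minimal .desktop launcher file."""
    return "\n".join([
        "[Desktop Entry]",
        "Type=Application",
        f"Name={name}",
        # assumes neither path contains spaces or other characters that need quoting
        f"Exec={python_path} {script_path}",
        "Terminal=false",
        "",
    ])


home = Path.home()
entry = desktop_entry(
    home / "my-venv/bin/python",
    home / "voice_to_text/voice_to_text.py",
)
print(entry)
# to install it, write the text to a .desktop file, for example:
# (home / ".local/share/applications/voice-to-text.desktop").write_text(entry)
```

Launchers in ~/.local/share/applications are only visible to your own user, which is usually what you want for a script that lives in your home directory.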

Now create a GNOME keyboard shortcut for the “voice_to_text_toggle.py” file. On my system the command for the shortcut looks like this:

python3 /home/<username>/voice_to_text/voice_to_text_toggle.py

Pressing this keyboard shortcut starts or stops the transcription.

If you have a faster computer than mine (my CPU was first released about 15 years ago), you can change the line “model = whisper.load_model("tiny.en")” in the Python script to use a larger Whisper model instead. The README on the openai/whisper GitHub page has a list of the available models.
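To try different models without editing the script each time, the model name could be read from an environment variable. This is only a sketch of that idea, not how the script currently works: the WHISPER_MODEL variable and the function names are made up, and the list of model names is not exhaustive.

```python
import os

# some of the model names from the openai/whisper README, smallest first;
# the ".en" variants are English-only (this list is not exhaustive)
COMMON_MODELS = ["tiny.en", "tiny", "base.en", "base", "small.en", "small",
                 "medium.en", "medium", "large"]


def pick_model_name(env=os.environ, default="tiny.en"):
    """Read the model name from the (hypothetical) WHISPER_MODEL variable."""
    name = env.get("WHISPER_MODEL", default)
    if name not in COMMON_MODELS:
        raise ValueError(f"unknown whisper model: {name!r}")
    return name


def load_model(name=None):
    """Load the chosen model (needs openai-whisper installed)."""
    import whisper  # imported here so pick_model_name works without it

    return whisper.load_model(name or pick_model_name())
```

With this in place, starting the script with WHISPER_MODEL=base.en set would load the next size up, and leaving the variable unset keeps the current “tiny.en” behaviour. Larger models are slower and need more memory, so it is worth moving up one step at a time.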