vx

In 2017, I wanted to be a fast typist, so I practiced typing a lot. Then one day I started feeling pain in my finger joints whenever I typed. Unfortunately, the pain has never gone away since.

One of the ways I manage this is to type less, which means using my voice more. Hence, I started the vx project.

This project began as a CLI application, vxcli, and then became an Electron application, vxtron. Its current form is the vx Chrome extension, which is what I use to type on a day-to-day basis.

Get the Chrome extension https://chrome.google.com/webstore/detail/vx/obopnfigmanifpiojfhebcegjepgaiif

# Problems with existing offerings

I use a MacBook, and macOS comes with Dictation. Google Docs also has Voice Typing. You press a key to make it start listening. You speak. It types what you say. Pretty simple, right?

Here’s the problem: I am not a native speaker. A lot of times the computer would misrecognize what I said. And each time I had to hit the undo button.

This is me struggling to voice-type “Here’s the problem” in the paragraph above… ugh!

Just the punim, she has the problem, she has to pop him, he is the problem, ...

macOS’s Dictation also seems to have a few problems with rich text fields and web applications.

# Maybe I can do something to solve this…

I had an idea. Why not make the computer just listen to me, and then copy what I said to the clipboard? Don't try to be smart by typing into the fields directly.

  • If it misrecognizes what I said, I can just say it again. No need to press Undo.

  • If it gets the result right, I just hit Paste. This way I can use it with most applications.
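As a rough illustration of this workflow (not the actual vx code), each recognized phrase can simply overwrite the clipboard, so a misrecognition is fixed by speaking again. The names `createDictationSession` and `copyToClipboard` are hypothetical; the clipboard writer is passed in so the sketch does not depend on any particular clipboard library.

```javascript
// Minimal sketch of "listen, then copy to the clipboard".
// `copyToClipboard` is a hypothetical function supplied by the caller.
function createDictationSession(copyToClipboard) {
  let lastText = null; // most recent text placed on the clipboard

  return {
    // Call this each time the recognizer produces a final transcript.
    // A new result replaces the previous one, so no Undo is needed.
    onRecognized(text) {
      const cleaned = text.trim();
      if (cleaned === '') return lastText; // ignore empty results
      lastText = cleaned;
      copyToClipboard(cleaned);
      return lastText;
    },
    get lastText() {
      return lastText;
    },
  };
}

// Usage with an in-memory stand-in for the clipboard.
const clipboardWrites = [];
const session = createDictationSession((text) => clipboardWrites.push(text));
session.onRecognized('she has the problem'); // misrecognized
session.onRecognized('Here’s the problem');  // said again, replaces it
console.log(session.lastText); // the text a Paste would now insert
```

Because nothing is typed into the focused field, the target application never sees the wrong attempts; it only receives whatever is pasted.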

With the plan in mind, I got to work.

# vxcli

vxcli demo

vxcli is a simple command line application written in Node.js. To run it I just type vx into the terminal.

Speech recognition is done by Google Cloud’s Speech-to-Text API. The accuracy is very high, but it’s also quite expensive.

vxcli’s GitHub repository https://github.com/dtinth/vxcli

The result was so great that it led me to pursue this project further. But having to run a command in the terminal every time I want to write something… that just isn’t very convenient!

# vxtron

vxtron is, you guessed it, an Electron application.

Now, instead of running a command, I can just run vxtron and leave it in the background. It registers a global hotkey which I can press to make it start/stop listening.

I developed it live on stream (Thai language).

  • Part 1 — Prototyping as a web application
  • Part 2 — Porting to Electron application

vxtron’s GitHub repository https://github.com/dtinth/vxtron

I also made it able to listen to 2 different languages (a hotkey for each language). I also had my sister try out vxtron. She told me that because of it she can input text almost twice as fast.
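The hotkey behavior described above (press to start or stop listening, with one hotkey per language) can be sketched as a small state machine. This is an illustration under assumptions, not vxtron's actual code: `createHotkeyToggle` and the `recognizer` object are hypothetical, the language codes are only examples, and switching languages when the other hotkey is pressed mid-session is an assumed design choice.

```javascript
// Sketch of a start/stop toggle with one hotkey per language.
// `recognizer` is any object with start(lang) and stop() methods (hypothetical).
function createHotkeyToggle(recognizer) {
  let activeLang = null; // null means "not listening"

  // Call this from the hotkey handler for the given language.
  return function onHotkey(lang) {
    if (activeLang === lang) {
      // Same hotkey pressed again: stop listening.
      recognizer.stop();
      activeLang = null;
    } else {
      // Assumption: the other language's hotkey switches languages.
      if (activeLang !== null) recognizer.stop();
      recognizer.start(lang);
      activeLang = lang;
    }
    return activeLang;
  };
}

// Usage with a recognizer that only logs what it is asked to do.
const onHotkey = createHotkeyToggle({
  start: (lang) => console.log('start listening in', lang),
  stop: () => console.log('stop listening'),
});
onHotkey('en-US'); // starts listening
onHotkey('en-US'); // stops listening
```

In an Electron main process, a function like `onHotkey` could be wired to Electron's `globalShortcut.register(accelerator, callback)`, with one accelerator per language; the accelerator strings would be up to the user.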

vxtron was designed for macOS. But I also had a Chromebook, and Chromebooks don’t run Electron apps…

# vxchrome

vxchrome is a Chrome extension. I was able to turn it into a Chrome extension because:

  1. Chrome extensions can register keyboard shortcuts, and a user can configure these shortcuts to be system-wide. That means you can use it in any application, not just Google Chrome.

  2. Google Chrome implements the Web Speech API. It allows web applications (and Chrome extensions) to perform speech recognition for free. (This isn't available in Electron.)
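To give a feel for the Web Speech API mentioned in the second point, here is a minimal sketch, assuming the standard `SpeechRecognition` interface (`lang`, `interimResults`, `onresult`, `onerror`, `start()`). It is not taken from vxchrome. The helper name `listenOnce` is hypothetical, and the recognition constructor is passed in as a parameter so the function can also be exercised outside a browser; in Chrome it would be `webkitSpeechRecognition`.

```javascript
// Sketch: listen for one utterance and report the final transcript.
// `Recognition` is the recognition constructor (webkitSpeechRecognition in Chrome).
function listenOnce(Recognition, lang, onFinal, onError) {
  const recognition = new Recognition();
  recognition.lang = lang;           // e.g. 'en-US'
  recognition.interimResults = true; // also receive partial results

  recognition.onresult = (event) => {
    // Only results from resultIndex onward changed in this event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      // Skip interim results; report the top alternative of final ones.
      if (result.isFinal) onFinal(result[0].transcript);
    }
  };
  recognition.onerror = (event) => onError(event.error);

  recognition.start();
  return recognition; // the caller can call .stop() on it later
}
```

In an extension page, this might be called as `listenOnce(webkitSpeechRecognition, 'en-US', copyText, console.error)`, where `copyText` is a hypothetical function that writes to the clipboard. The keyboard shortcut from the first point would arrive through a `chrome.commands.onCommand` listener, which could trigger the call.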

vxchrome’s GitHub repository https://github.com/dtinth/vxchrome

Being a Chrome extension, it works on macOS, Windows, Linux, as well as Chrome OS.

In April 2020, I polished the extension and published it to the Chrome Web Store.

Get the Chrome extension
