This is a collection of command-line and GUI tools for capturing and analyzing audio data. The most interesting tool is called keytap: it can guess which keyboard keys were pressed solely by analyzing the audio captured from the computer's microphone.
Build instructions
Dependencies:
- SDL2 - used to capture audio and to open GUI windows (libsdl)
- FFTW3 - some of the helper tools perform Fourier transforms (fftw)
git clone https://github.com/ggerganov/kbd-audio
cd kbd-audio
git submodule update --init
mkdir build && cd build
cmake ..
make
Windows (todo, PRs welcome)
Tools
record-full
Record audio to a raw binary file on disk
Usage: ./record-full output.kbd
play-full
Play back a recording captured via the record-full tool
Usage: ./play-full input.kbd
record
Record audio only while typing. Useful for collecting training data for keytap
Usage: ./record output.kbd
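How the record tool decides that typing is in progress is not described here. As a rough illustration of the general idea only, the sketch below flags audio frames whose mean energy exceeds a threshold. The function name `active_frames`, the frame size and the threshold value are all hypothetical and are not taken from the tool.

```python
def active_frames(samples, frame_size=512, threshold=0.01):
    """Return indices of frames whose mean squared amplitude exceeds threshold.

    Illustrative assumption: `samples` is a sequence of floats in [-1, 1].
    This is NOT the detection logic used by the record tool.
    """
    active = []
    # Walk over non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energy = sum(x * x for x in frame) / frame_size
        if energy > threshold:
            active.append(start // frame_size)
    return active
```

A recorder built on this idea would keep only the flagged frames (plus some padding around them) and discard the silence in between.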
play
Play back a recording created via the record tool
Usage: ./play input.kbd
keytap
Detect pressed keys via microphone audio capture in real time. Uses training data captured via the record tool.
Usage: ./keytap-gui input0.kbd [input1.kbd] [input2.kbd] ...
Live demo (WebAssembly threads required)
keytap2 (work in progress)
Detect pressed keys via microphone audio capture. Uses statistical information (n-gram frequencies) about the language. No training data is required. The 'recording.kbd' input file has to be generated via the record-full tool and contains the audio data that will be analyzed. The 'n-gram.txt' file has to contain n-gram probabilities for the corresponding language.
Usage: ./keytap2-gui recording.kbd n-gram.txt
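To make "n-gram probabilities" concrete, here is a minimal sketch that estimates character n-gram relative frequencies from a text sample. The function name `ngram_probabilities` is hypothetical, and the exact file format that keytap2 expects for 'n-gram.txt' is not specified here, so this only illustrates the statistic itself, not how to produce that file.

```python
from collections import Counter

def ngram_probabilities(text, n=2):
    """Return {ngram: relative frequency} for character n-grams in text.

    Illustrative only: the layout of the n-gram file read by keytap2
    is an unknown here and is not reproduced by this function.
    """
    # Lowercase the text and keep only alphabetic characters and spaces.
    cleaned = "".join(ch for ch in text.lower() if ch.isalpha() or ch == " ")
    # Count every window of n consecutive characters.
    counts = Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}
```

For example, `ngram_probabilities("abab", 2)` sees the bigrams "ab", "ba", "ab", so "ab" gets probability 2/3 and "ba" gets 1/3. A large corpus in the target language gives estimates that are more useful than a short sample.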
Feedback
Any feedback about the performance of the tools is highly appreciated. Please drop a comment here.