Browser-Based Voice Composer Converts Humming to Code

Key Facts

✓ Uses TensorFlow.js for real-time pitch detection.
✓ Outputs MIDI files and Strudel/TidalCycles code.
✓ Runs entirely client-side using Web Audio API.
✓ Utilizes four algorithms: CREPE, YIN, FFT/HPS, and AMDF.
✓ Built to bridge the gap for users without music theory knowledge.

Quick Summary

A new browser-based application called Voice Composer has been released, designed to convert voice input into usable musical data. The tool targets live coding and live DJing communities by allowing users to hum melodies and instantly generate code for pattern-based music systems.

The application detects pitch in real time using a TensorFlow.js model alongside several classical signal-processing algorithms. It runs entirely in the browser via the Web Audio API, so raw audio never leaves the user's machine. The tool outputs MIDI files, visual piano rolls, and code compatible with Strudel and TidalCycles.

Core Functionality and Algorithms

The Voice Composer addresses a specific problem for aspiring live coders: the difficulty of translating melodic ideas into code without extensive music theory knowledge. By capturing audio in real-time, the tool converts vocal input into algorithmic patterns immediately.
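The source does not show how the tool turns a detected pitch into a note. A minimal sketch of the standard equal-temperament mapping (A4 = 440 Hz = MIDI note 69) could look like the following; the function names `frequencyToMidi` and `midiToName` are hypothetical and not taken from the project.

```typescript
// Hypothetical helpers, not the tool's actual code.
// Standard mapping: A4 = 440 Hz = MIDI 69, 12 semitones per octave.
function frequencyToMidi(hz: number): number {
  if (!(hz > 0)) throw new RangeError("frequency must be positive");
  return Math.round(69 + 12 * Math.log2(hz / 440));
}

const NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

// Name a MIDI note (0..127) using the convention where MIDI 60 is C4.
function midiToName(midi: number): string {
  const octave = Math.floor(midi / 12) - 1;
  return `${NOTE_NAMES[midi % 12]}${octave}`;
}
```

Rounding to the nearest semitone is the simplest form of quantization; a hummed pitch that drifts slightly sharp or flat still lands on a single note.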

The application employs four distinct pitch detection methods to handle various audio inputs:

CREPE: A deep learning model via TensorFlow.js, noted for high accuracy but higher computational cost.
YIN: An autocorrelation-based fundamental frequency estimation method, fast and effective for clean monophonic input.
FFT with Harmonic Product Spectrum: Optimized for handling harmonic-rich sounds.
AMDF: Average Magnitude Difference Function, a lightweight option for quick processing.

Users can switch between these algorithms based on their specific use case and input quality.
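To illustrate the lightest of the four methods, here is a rough sketch of AMDF-based pitch estimation for a single audio frame. It is not the tool's implementation: the function name `amdfPitch`, the 80 to 1000 Hz search range, and the 0.3 threshold are all assumptions. The "first dip below a threshold" rule is a common practical tweak to reduce octave errors rather than part of the basic AMDF definition.

```typescript
// Hypothetical sketch of AMDF pitch estimation (not the tool's actual code).
// For each candidate lag, average |x[i] - x[i + lag]|; a periodic signal
// produces a deep dip when the lag equals one period.
function amdfPitch(
  frame: Float32Array,
  sampleRate: number,
  minHz = 80, // assumed lower bound for a hummed voice
  maxHz = 1000, // assumed upper bound
  threshold = 0.3, // assumed fraction of the peak difference
): number | null {
  const minLag = Math.max(1, Math.floor(sampleRate / maxHz));
  const maxLag = Math.floor(sampleRate / minHz);
  if (frame.length <= maxLag) return null;

  const diffs: number[] = [];
  let peak = 0;
  for (let lag = minLag; lag <= maxLag; lag++) {
    const n = frame.length - lag;
    let sum = 0;
    for (let i = 0; i < n; i++) sum += Math.abs(frame[i] - frame[i + lag]);
    const d = sum / n;
    diffs.push(d);
    if (d > peak) peak = d;
  }

  // Take the first lag that dips below the threshold, then walk down to
  // the bottom of that dip. Silence has no dip and yields null.
  let k = diffs.findIndex((d) => d < threshold * peak);
  if (k < 0) return null;
  while (k + 1 < diffs.length && diffs[k + 1] < diffs[k]) k++;
  return sampleRate / (minLag + k);
}
```

Because the lag is a whole number of samples, the estimate is only as fine as the sample grid allows; real implementations often interpolate around the minimum.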

Technical Architecture

Built using React, the tool operates entirely within the browser. It leverages the Canvas API to provide real-time waveform rendering and visual feedback through a piano roll interface.
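As a rough illustration of how a piano roll might be laid out on a canvas, the sketch below maps a detected note to a rectangle. The `NoteEvent` and `Rect` types, the `noteToRect` function, and the default pitch range and time window are all hypothetical; the source does not describe the tool's actual drawing code.

```typescript
// Hypothetical types and layout function (not from the project).
interface NoteEvent {
  midi: number; // MIDI note number
  startSec: number; // onset time in seconds
  durationSec: number; // note length in seconds
}

interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Map a note to canvas coordinates: time runs left to right,
// and higher pitches sit nearer the top of the canvas.
function noteToRect(
  note: NoteEvent,
  canvasWidth: number,
  canvasHeight: number,
  lowMidi = 36, // assumed lowest visible note
  highMidi = 84, // assumed highest visible note
  windowSec = 8, // assumed visible time span
): Rect {
  const rows = highMidi - lowMidi + 1;
  const rowHeight = canvasHeight / rows;
  const pxPerSec = canvasWidth / windowSec;
  return {
    x: note.startSec * pxPerSec,
    y: (highMidi - note.midi) * rowHeight,
    width: note.durationSec * pxPerSec,
    height: rowHeight,
  };
}
```

In a browser, the resulting rectangle could be passed to a 2D canvas context's `fillRect(x, y, width, height)` on each animation frame.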

Keeping all processing client-side preserves privacy and avoids network latency. The creator envisions the tool evolving into a full-featured Digital Audio Workstation (DAW) over time. It is currently optimized for desktop use.

Integration with Live Coding

The primary output targets the live coding environment. By generating Strudel/TidalCycles code, the tool allows immediate integration into existing performance setups. This removes the barrier of manually writing syntax for complex patterns.
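The exact code the tool emits is not shown in the source. As a minimal sketch, assuming the melody has already been quantized into equal steps, a Strudel-style pattern string could be built like this; the function name `toStrudel` is hypothetical. Strudel's `note()` accepts MIDI note numbers in its mini-notation, with `~` marking a rest.

```typescript
// Hypothetical generator (not the tool's actual output format).
// Each step is a MIDI note number, or null for a rest.
function toStrudel(steps: Array<number | null>): string {
  const body = steps.map((s) => (s === null ? "~" : String(s))).join(" ");
  return `note("${body}")`;
}

// Example: C4, D4, a rest, then E4 spread evenly across one cycle.
const pattern = toStrudel([60, 62, null, 64]);
```

The resulting string can be pasted into a Strudel editor and chained with a sound, for example by appending `.sound("piano")`.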

The creator built the application over a weekend to solve their own challenge of learning live coding without a musical background. The resulting software makes it "trivial to capture melodic ideas and immediately use them in pattern-based music systems."

Availability and Future Development

The tool is currently available for testing via a hosted link. The source code has been made available on a public repository, inviting community contributions and feedback.

Future updates aim to expand the application's capabilities, moving closer to the functionality of a standard DAW. This suggests potential support for multi-track recording, effects processing, and broader file format compatibility in subsequent releases.
