The Goal: Click a Button, Get Markers
The idea behind BeatMarker is simple enough to explain in one sentence: you select an audio clip in your Adobe timeline, click a button, and the plugin places colored markers on every beat of the music. No manual tapping, no external tools, no subscription.
The implementation is the interesting part. Claude Code needed to decode audio files — WAV and MP3 — entirely inside an Adobe plugin environment, extract beats with high accuracy, and do it fast enough that the editor doesn't sit waiting. No server. No AI model. No cloud API. Just JavaScript running locally in a sandboxed plugin panel.
This is the story of how that pipeline was built, what worked, what catastrophically failed, and what the final numbers look like.
Why Not Just Use the Web Audio API?
The obvious approach is AudioContext.decodeAudioData() — it's the standard Web Audio way to decode audio files. It handles WAV, MP3, AAC, everything. You pass it an ArrayBuffer, it gives you back a decoded PCM AudioBuffer. Clean, fast, battle-tested.
In UXP (Adobe Premiere Pro plugins), AudioContext doesn't exist. The entire Web Audio API is absent. Claude Code confirmed this by checking:
typeof AudioContext // "undefined"
typeof webkitAudioContext // "undefined"
This meant building a complete audio decoding pipeline from scratch, in pure JavaScript, with no browser APIs. Every byte of the audio file has to be read and interpreted manually.
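The check above can be wrapped in a small guard so a plugin picks its decode path at runtime. This is only a sketch: the name `pickDecodePath` and the returned labels are illustrative and not taken from BeatMarker.

```javascript
// Hypothetical helper: choose a decoding strategy from what the host exposes.
// Pass a different object than globalThis to simulate other environments.
function pickDecodePath(globalObj = globalThis) {
  const hasWebAudio =
    typeof globalObj.AudioContext !== 'undefined' ||
    typeof globalObj.webkitAudioContext !== 'undefined';
  return hasWebAudio ? 'web-audio' : 'manual';
}
```

In a host without Web Audio, as described here for UXP, this would select the manual path.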
Decoding WAV Files by Hand
WAV files follow the RIFF container format. The file starts with a header that describes the audio: sample rate, bit depth, number of channels, encoding type. The raw PCM samples sit inside the data chunk, right after its 8-byte chunk header. Claude Code implemented a DataView-based decoder to parse this:
function decodeWav(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  // Verify RIFF header
  const riff = String.fromCharCode(view.getUint8(0), view.getUint8(1),
    view.getUint8(2), view.getUint8(3));
  if (riff !== 'RIFF') throw new Error('Not a WAV file');
  // Fixed offsets assume the 'fmt ' chunk directly follows the RIFF header
  const audioFormat = view.getUint16(20, true); // 1=PCM, 3=float32
  const numChannels = view.getUint16(22, true);
  const sampleRate = view.getUint32(24, true);
  const bitsPerSample = view.getUint16(34, true);
  // Find the 'data' chunk (its samples may not start at byte 44)
  let dataOffset = 12;
  let found = false;
  while (dataOffset + 8 <= view.byteLength) {
    const chunkId = String.fromCharCode(
      view.getUint8(dataOffset), view.getUint8(dataOffset + 1),
      view.getUint8(dataOffset + 2), view.getUint8(dataOffset + 3)
    );
    const chunkSize = view.getUint32(dataOffset + 4, true);
    if (chunkId === 'data') { found = true; break; }
    dataOffset += 8 + chunkSize + (chunkSize % 2); // chunks are padded to an even size
  }
  if (!found) throw new Error('No data chunk found');
  dataOffset += 8; // skip 'data' + size fields
  // Decode samples to Float32 [-1.0, 1.0]
  const samples = extractSamples(view, dataOffset, bitsPerSample, audioFormat);
  const mono = numChannels > 1 ? mixToMono(samples, numChannels) : samples;
  return { sampleRate, samples: mono };
}
The format supports 8-bit unsigned, 16-bit signed, 24-bit signed, and 32-bit float samples. Each encoding requires different math to convert to the normalized float range expected by the beat detection library.
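The decoder calls two helpers, extractSamples and mixToMono, that the article does not show. The versions below are a minimal sketch of what they might look like. They assume little-endian data, interleaved channels, and samples that run to the end of the buffer; the real plugin may handle these differently.

```javascript
// Illustrative helper (not the plugin's actual code): convert raw sample
// bytes to floats in [-1, 1]. Assumes samples run to the end of the buffer.
function extractSamples(view, offset, bitsPerSample, audioFormat) {
  const bytesPerSample = bitsPerSample / 8;
  const count = Math.floor((view.byteLength - offset) / bytesPerSample);
  const out = new Float32Array(count);
  for (let i = 0; i < count; i++) {
    const pos = offset + i * bytesPerSample;
    if (audioFormat === 3 && bitsPerSample === 32) {
      out[i] = view.getFloat32(pos, true);          // float data is already normalized
    } else if (bitsPerSample === 8) {
      out[i] = (view.getUint8(pos) - 128) / 128;    // unsigned, silence at 128
    } else if (bitsPerSample === 16) {
      out[i] = view.getInt16(pos, true) / 32768;
    } else if (bitsPerSample === 24) {
      // DataView has no getInt24: assemble three bytes, then sign-extend
      const raw = view.getUint8(pos) |
        (view.getUint8(pos + 1) << 8) |
        (view.getUint8(pos + 2) << 16);
      out[i] = ((raw << 8) >> 8) / 8388608;
    } else {
      throw new Error(`Unsupported sample format: ${bitsPerSample}-bit`);
    }
  }
  return out;
}

// Illustrative helper: average interleaved channels into one mono signal.
function mixToMono(interleaved, numChannels) {
  const frames = Math.floor(interleaved.length / numChannels);
  const mono = new Float32Array(frames);
  for (let f = 0; f < frames; f++) {
    let sum = 0;
    for (let c = 0; c < numChannels; c++) sum += interleaved[f * numChannels + c];
    mono[f] = sum / numChannels;
  }
  return mono;
}
```

The 24-bit branch is the only one that needs manual bit work, because DataView offers no 3-byte integer getter.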
MP3 Support: WASM Crashes Premiere Pro
WAV is straightforward because it's uncompressed PCM. MP3 is an entirely different beast — it uses psychoacoustic compression, Huffman coding, and MDCT transforms. Implementing an MP3 decoder from scratch is a weeks-long project.
The obvious solution is a WASM (WebAssembly) based decoder — fast, accurate, widely used. Claude Code tried mpg123-decoder, which is one of the best JS MP3 decoders available. It crashed Premiere Pro immediately. Not a JavaScript error. An actual application crash. No log, no error message — the host app just died.
The reason: WASM with pthreads (multi-threading) is not supported in UXP. Even single-threaded WASM variants caused crashes in testing. This ruled out virtually every high-performance audio decoder.
The solution was js-mp3 — a pure JavaScript MP3 decoder with zero native dependencies. Slower than WASM, but it doesn't crash the host. It works. For 3–5 minute tracks at 44.1 kHz, decoding takes 1–2 seconds — acceptable for a tool that's replacing manual marker placement.
The MP3 Timing Problem: Phantom Samples
MP3 decoders don't start cleanly at sample 0. Two sources of phantom samples at the beginning of every MP3 decode push the real audio later in the decoded stream, so every detected beat (and every marker) lands slightly after the actual beat:
- Encoder delay — recorded in the Xing/LAME header of the MP3 file. Varies per encoder and encode settings.
- MDCT startup delay — specific to js-mp3's implementation. After testing, this is consistently 2,070 samples at 44,100 Hz regardless of the source file.
Both delays must be stripped before passing audio to the beat detector. Claude Code calibrated the final offset against mpg123-decoder as a reference: the result was 0 frames offset at any standard video frame rate. Markers land exactly on the beat.
// Read encoder delay from Xing/LAME header
function getEncoderDelay(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  // LAME header starts after Xing/Info tag
  // Encoder delay is stored at a known byte offset in the LAME tag
  // Returns 0 if no LAME tag is found (e.g. the encoder did not write one)
  return parseLameHeader(view) ?? 0;
}
// Strip phantom samples before analysis
const encoderDelay = getEncoderDelay(mp3Buffer);
const mdctDelay = 2070; // js-mp3 specific, measured at 44,100 Hz
const totalDelay = encoderDelay + mdctDelay;
const cleanSamples = decodedSamples.slice(totalDelay);
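The article does not show parseLameHeader, so the following is a hedged sketch of how such a reader might work, under a different name (`readLameDelay`) to make clear it is not the plugin's code. It relies on the commonly documented layout: a Xing or Info tag, a flags word that says which optional fields follow, then the LAME tag with encoder delay and padding packed as two 12-bit values at bytes 21 to 23. It also assumes the tag sits within the first few kilobytes, which fails for files with a large ID3v2 tag in front. The second helper, `samplesToFrames`, is likewise illustrative and shows why an uncorrected delay matters.

```javascript
// Hypothetical LAME-tag reader. Returns { delay, padding } in samples,
// or null when no Xing/Info + LAME tag is found in the searched range.
function readLameDelay(arrayBuffer, searchLimit = 4096) {
  const length = Math.min(arrayBuffer.byteLength, searchLimit);
  const bytes = new Uint8Array(arrayBuffer, 0, length);
  const matches = (pos, text) =>
    pos + text.length <= length &&
    [...text].every((ch, j) => bytes[pos + j] === ch.charCodeAt(0));

  // Locate the Xing (VBR) or Info (CBR) tag inside the first frame
  let tag = -1;
  for (let i = 0; i + 8 <= length; i++) {
    if (matches(i, 'Xing') || matches(i, 'Info')) { tag = i; break; }
  }
  if (tag < 0) return null;

  // Flags (big-endian) say which optional Xing fields precede the LAME tag
  const flags = new DataView(arrayBuffer).getUint32(tag + 4, false);
  let lame = tag + 8;
  if (flags & 1) lame += 4;    // frame count
  if (flags & 2) lame += 4;    // byte count
  if (flags & 4) lame += 100;  // seek table (TOC)
  if (flags & 8) lame += 4;    // quality indicator
  if (!matches(lame, 'LAME') || lame + 24 > length) return null;

  // Bytes 21..23 of the LAME tag: 12 bits of delay, then 12 bits of padding
  const b0 = bytes[lame + 21];
  const b1 = bytes[lame + 22];
  const b2 = bytes[lame + 23];
  return { delay: (b0 << 4) | (b1 >> 4), padding: ((b1 & 0x0f) << 8) | b2 };
}

// Convert a sample offset into video frames at a given frame rate.
function samplesToFrames(samples, sampleRate, fps) {
  return (samples / sampleRate) * fps;
}
```

For scale: the 2,070-sample decoder delay alone is about 47 ms at 44,100 Hz, which `samplesToFrames(2070, 44100, 30)` puts at roughly 1.4 frames at 30 fps. That is enough for a marker to look visibly late.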
The Beat Detector: music-tempo
With clean PCM samples in hand, the next step is extracting beats. BeatMarker uses music-tempo, a pure JavaScript library that implements the BeatRoot algorithm: onset detection followed by tempo induction and beat tracking. It takes a Float32Array of audio samples and exposes the detected tempo plus a list of beat timestamps in seconds. Its default timing parameters assume 44.1 kHz audio.
import MusicTempo from 'music-tempo';
// Resample to 44100 Hz if needed; music-tempo's defaults assume that rate
const normalizedSamples = resample(samples, sampleRate, 44100);
const result = new MusicTempo(normalizedSamples);
// result.beats: array of beat timestamps in seconds
// result.tempo: detected tempo in BPM
The library handles the heavy lifting: onset detection, tempo estimation, beat phase alignment. What it doesn't provide is a confidence metric — how reliable is this BPM estimate? That had to be built.
Confidence Scoring: The CV Formula
A metronome click track would give perfectly even beat intervals. Real music has swing, rubato, fills, and tempo variation. The beat detector still finds beats, but they're spaced less evenly. The consistency of spacing is a reliable proxy for detection confidence.
Claude Code implemented confidence scoring using the Coefficient of Variation (CV) — a normalized measure of how much beat intervals vary relative to their mean:
function computeConfidence(beats) {
  // Fewer than two beats gives no intervals; avoid dividing by zero (NaN)
  if (beats.length < 2) return 0;
  const intervals = [];
  for (let i = 1; i < beats.length; i++) {
    intervals.push(beats[i] - beats[i - 1]);
  }
  const mean = intervals.reduce((a, b) => a + b, 0) / intervals.length;
  const variance = intervals.reduce((a, b) => a + Math.pow(b - mean, 2), 0) / intervals.length;
  const stddev = Math.sqrt(variance);
  const cv = stddev / mean;
  // Scale to 0–100: CV of 0 = 100%, CV of 0.25 = 0%
  return Math.max(0, Math.min(100, Math.round((1 - cv * 4) * 100)));
}
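As a worked example of the formula in the code above, take four made-up beat intervals of 0.50, 0.52, 0.48, and 0.50 seconds (illustrative numbers, not measurements from the plugin). The variance divides by the number of intervals, matching the code:

```latex
\begin{aligned}
\mu &= \tfrac{1}{4}\,(0.50 + 0.52 + 0.48 + 0.50) = 0.50 \\
\sigma &= \sqrt{\tfrac{1}{4}\left(0^2 + 0.02^2 + (-0.02)^2 + 0^2\right)} = \sqrt{0.0002} \approx 0.01414 \\
\mathrm{CV} &= \sigma / \mu \approx 0.02828 \\
\text{confidence} &= \operatorname{round}\bigl(100\,(1 - 4\,\mathrm{CV})\bigr) = \operatorname{round}(88.69) = 89
\end{aligned}
```

A spread of about 20 ms around a 500 ms beat therefore still scores 89 out of 100.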
The resulting score drives a color-coded indicator (green, yellow, or red) in the plugin UI.
High confidence (>85%) means the detected beats are probably right. Low confidence (<60%) means the track has complex rhythm or variable tempo, or the audio quality is poor, so the editor should verify the markers manually.
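The thresholds above could map to indicator colors roughly as follows. This is a sketch under assumptions: the function name is hypothetical, and treating everything between 60 and 85 as yellow is inferred from the green/yellow/red scheme rather than stated outright.

```javascript
// Hypothetical mapping from a 0-100 confidence score to an indicator color.
// Assumption: scores from 60 through 85 fall into the middle (yellow) band.
function confidenceColor(score) {
  if (score > 85) return 'green';
  if (score < 60) return 'red';
  return 'yellow';
}
```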
Resampling: When the Audio Isn't 44.1 kHz
Music recorded at 48 kHz, 96 kHz, or 22.05 kHz all need to be resampled to 44.1 kHz before analysis. The beat library expects standard audio rates, and mismatched rates cause the BPM estimate to be wrong by a ratio (e.g., 48/44.1 ≈ 1.088 — beats appear 8.8% too fast or slow).
Claude Code implemented linear interpolation resampling — fast enough for plugin use, accurate enough that the BPM result doesn't shift perceptibly:
function resample(samples, fromRate, toRate) {
  if (fromRate === toRate) return samples;
  const ratio = fromRate / toRate;
  const outputLength = Math.round(samples.length / ratio);
  const output = new Float32Array(outputLength);
  for (let i = 0; i < outputLength; i++) {
    const srcPos = i * ratio;
    const srcIndex = Math.floor(srcPos);
    const fraction = srcPos - srcIndex;
    const a = samples[srcIndex] ?? 0;
    const b = samples[srcIndex + 1] ?? 0;
    output[i] = a + (b - a) * fraction; // linear interpolation
  }
  return output;
}
The Complete Pipeline
Putting it all together, the full audio processing pipeline for one click of the "Analyze" button:
- Get file path — read from the selected clip's media path via the Premiere Pro API
- Read file — fs.readFile() returns a UXP proxy ArrayBuffer; copy to native ArrayBuffer
- Detect format — check magic bytes (RIFF = WAV, 0xFF 0xFB = MP3); don't trust the file extension
- Decode audio — custom WAV decoder or js-mp3 with delay stripping
- Mix to mono — average channels; beat detection is mono
- Resample — linear interpolation to 44,100 Hz
- Analyze — music-tempo returns beat timestamps and BPM
- Score confidence — CV formula → 0–100 → green/yellow/red
- Place markers — transaction batching in groups of 50 (Premiere limit)
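Two of the steps above, format detection and marker batching, can be sketched without any host API. Both function names are hypothetical. The MP3 check is deliberately broader than the single 0xFF 0xFB pattern: it accepts any 11-bit frame sync and also treats a leading ID3v2 tag as MP3, which is an assumption that suits this pipeline but not files in general. The actual Premiere Pro marker calls are omitted because they are not shown in the article.

```javascript
// Hypothetical format sniffing from the first bytes of the file.
function detectFormat(arrayBuffer) {
  const b = new Uint8Array(arrayBuffer, 0, Math.min(arrayBuffer.byteLength, 12));
  const ascii = (start, len) => String.fromCharCode(...b.subarray(start, start + len));
  if (b.length >= 12 && ascii(0, 4) === 'RIFF' && ascii(8, 4) === 'WAVE') return 'wav';
  if (b.length >= 3 && ascii(0, 3) === 'ID3') return 'mp3';   // ID3v2 tag before the audio
  if (b.length >= 2 && b[0] === 0xff && (b[1] & 0xe0) === 0xe0) return 'mp3'; // frame sync
  return 'unknown';
}

// Split beat timestamps into groups so each transaction stays within the
// 50-marker limit mentioned above.
function inBatches(items, size = 50) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

With 120 detected beats, `inBatches` yields groups of 50, 50, and 20, each of which would be written in its own transaction.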
Total time for a 4-minute WAV track: under 500ms. For a 4-minute MP3: 1.5–2 seconds (js-mp3 decode dominates). Both are fast enough to feel instant to a video editor whose alternative was placing markers by hand.
What This Means for You
The full audio pipeline — WAV decoder, MP3 decoder with timing correction, resampler, beat detector, confidence scorer — is bundled into a single JS file using esbuild. It runs inside any Adobe UXP plugin or CEP panel with no external dependencies, no WASM, no server calls.
The plugins that use this: BeatMarker for Premiere Pro and BeatMarker for After Effects — both open source, both free.