.sound
DotSound
Behind a simple player sits a distributed system: its own GPU worker for lyrics, a closed core for business rules.
the essence
Upload your own tracks or pull music from VK, SoundCloud, Spotify, YouTube and Yandex Music, all in one ad-free player. The entry point is Telegram: open, search, listen. Lyrics surface line by line on their own, mixes adapt to your taste, and everything heavy (ASR, vocal separation, embeddings) runs on a separate GPU worker.
Under the hood
A breakdown of the core mechanics.
Track upload
A track enters the system once and for good.
On upload the file is scanned for viruses and gets a content fingerprint (a SHA-256 hash). If that track is already in the database, it isn't duplicated: the reference count goes up instead. There is one file in storage no matter how many people upload it.
scale
size and timeframe: what stands behind the product.
architecture
Thin aiogram client: Mini App entry, inline search, likes
- Backend Hub HTTPS
Platform web player: track upload, HLS streaming, WebSocket
- Backend Hub WS · HLS
GPU pull worker: Demucs + faster-whisper, audio embeddings
- Backend Hub HMAC pull
FastAPI: API, Taskiq queue, SHA-256 dedup, HLS, Mini App delivery
- PrivateCore (import): business rules, ranking policy, security constants
- PostgreSQL 16: metadata, users, playlists; ~70 models, 241 migrations
- Redis 7: WS pub/sub, rate limit, file_id cache, play counters
- Elasticsearch 8: full-text over tracks, artists and recognized lyrics
- MinIO · S3: content-addressed audio by SHA-256, HLS segments
stack
Language
- Python 3.12
- TypeScript
Framework
- FastAPI 0.136
- SQLAlchemy 2
- aiogram 3.28
- React 18
- Vite 5
Data
- PostgreSQL 16
- Elasticsearch 8
- Redis 7
- MinIO / S3
Infrastructure
- Docker
- Taskiq
- HMAC pull protocol
- Prometheus + OpenTelemetry
Client
- Telegram Mini App
- WebSocket
- hls.js
AI / ML
- faster-whisper 1.2
- Demucs 4.0
- PyTorch 2.7
what it does
the product's key capabilities right now.
Upload right in the player
A track uploads straight from the web player: resilient chunked upload, a duplicate check before you even send, your own cover and video. No separate form, no app to install.
The upload runs in chunks (S3 multipart) and survives a dropped connection or a tab reload, because the queue lives in IndexedDB and resumes on its own. The server hashes the content with SHA-256: re-uploading the same track adds a reference (ref_count++) instead of a copy. One file for everyone.
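The dedup rule above can be sketched in a few lines. This is a minimal in-memory illustration, not the project's code: `BlobStore` and `put` are hypothetical names, and a real backend would keep the bytes in object storage and the counter in the database.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class BlobStore:
    """Hypothetical in-memory stand-in for object storage plus a ref-count table."""

    blobs: dict[str, bytes] = field(default_factory=dict)
    ref_count: dict[str, int] = field(default_factory=dict)

    def put(self, content: bytes) -> tuple[str, bool]:
        """Store content at most once; return (sha256 hex digest, was_new)."""
        digest = hashlib.sha256(content).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = content
        # Every upload adds a reference, whether or not the bytes were new.
        self.ref_count[digest] = self.ref_count.get(digest, 0) + 1
        return digest, is_new


store = BlobStore()
first, new_first = store.put(b"same audio bytes")
second, new_second = store.put(b"same audio bytes")
```

After the second `put`, the store still holds one blob and its reference count is 2.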
Auto-lyrics (ASR)
Demucs lifts the vocals, faster-whisper transcribes and syncs the text line by line with timecodes. The words appear on their own.
Worker pipeline: Demucs (--two-stems=vocals) lifts the vocals, faster-whisper transcribes, alignment timestamps each line, so lyrics surface in sync with the music.
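The last step, turning timestamped lines into something a player can show in sync, might look like the sketch below. The `Segment` class is a hypothetical stand-in that mirrors the `start` and `text` fields of a transcription segment, and the LRC-style output is an assumption made for illustration; the format the worker actually emits is not stated here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for one transcribed line."""

    start: float  # seconds from the beginning of the track
    text: str


def to_lrc(segments: list[Segment]) -> list[str]:
    """Format segments as LRC-style lines: [mm:ss.cc] text."""
    lines = []
    for seg in sorted(segments, key=lambda s: s.start):
        text = seg.text.strip()
        if not text:
            continue  # skip empty or whitespace-only segments
        total_cs = round(seg.start * 100)  # centiseconds
        minutes, rem = divmod(total_cs, 6000)
        seconds, cs = divmod(rem, 100)
        lines.append(f"[{minutes:02d}:{seconds:02d}.{cs:02d}] {text}")
    return lines


demo = to_lrc([
    Segment(65.5, " second line "),
    Segment(0.0, "first line"),
    Segment(3.0, "   "),
])
```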
HLS + content dedup
Adaptive HLS over content-addressed storage: the same track is stored once for everyone, segments cached as immutable.
A segment's address is the content hash itself: blobs/a1/a1f9...ts. An identical track is never transcoded twice, and segments are cached as immutable for a year.
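A content-addressed key of this shape can be derived as in the sketch below. The layout (a two-character prefix directory followed by the full SHA-256 digest) is inferred from the example path and is an assumption, as are the `segment_key` helper and the exact `Cache-Control` value.

```python
import hashlib

# Assumed header value for "immutable for a year" (365 days in seconds).
IMMUTABLE_CACHE_CONTROL = "public, max-age=31536000, immutable"


def segment_key(data: bytes, ext: str = "ts") -> str:
    """Build a storage key from the content hash (hypothetical layout)."""
    digest = hashlib.sha256(data).hexdigest()
    return f"blobs/{digest[:2]}/{digest}.{ext}"


key = segment_key(b"abc")
```

Because the key depends only on the bytes, identical segments always map to the same object, which is what makes long-lived immutable caching safe.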
Search over lyrics and metadata
A full-text Elasticsearch index over metadata and recognized lyrics, ranked by popularity.
Because both metadata and recognized lyrics are indexed, a track can be found by a single line from the song. Results are ranked by popularity.
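One way to express "text match plus popularity" in Elasticsearch is a `function_score` query. The sketch below only builds the request body; the field names (`title`, `artist`, `lyrics`, `play_count`) and the boosts are hypothetical, and the real ranking policy is not described here.

```python
def build_search_body(text: str, size: int = 20) -> dict:
    """Build an Elasticsearch request body (field names are assumptions)."""
    return {
        "size": size,
        "query": {
            "function_score": {
                # Full-text match over metadata and recognized lyrics.
                "query": {
                    "multi_match": {
                        "query": text,
                        "fields": ["title^3", "artist^2", "lyrics"],
                    }
                },
                # Popularity signal: adds log1p(play_count) to the text score.
                "field_value_factor": {
                    "field": "play_count",
                    "modifier": "log1p",
                    "missing": 0,
                },
                "boost_mode": "sum",
            }
        },
    }


body = build_search_body("velvet fog")
```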
Personal mixes and radio
Daily Mix, Weekly Mix, genre sets and endless radio from any seed track: the feed adapts to what you actually play.
Similar tracks and artists come from audio embeddings on the worker, while the ranking rules and delivery policy live in the closed core. Radio starts from a seed track and pulls its nearest neighbors by vector.
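The nearest-neighbor step can be illustrated with plain cosine similarity. This is a toy sketch: the two-dimensional vectors and track ids are made up, and a production system would more likely use a vector index than a linear scan.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(seed_id: str, embeddings: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k tracks most similar to the seed track."""
    seed = embeddings[seed_id]
    scored = [
        (cosine(seed, vec), track_id)
        for track_id, vec in embeddings.items()
        if track_id != seed_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [track_id for _, track_id in scored[:k]]


# Toy embeddings, purely illustrative.
toy = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
radio_queue = nearest("a", toy, k=2)
```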
Collaborative playlists
Playlists you can edit together, with a collage cover that assembles itself from the tracks inside.
Several collaborators keep one playlist; the cover rebuilds from the tracks' artwork. CRUD and shared access live on the backend, with no separate app.
Telegram as the way in
A thin bot: one-tap entry to the player, inline track search in any chat, likes and mixes right inside Telegram. Sign-in and job-ready alerts land there too.
The bot decides nothing on its own: it reads the message and calls a single Backend client, and all logic stays in the core. Inline mode drops tracks with play and like buttons straight into the chat.
Import from sources
Upload your own tracks right from the player or import from SoundCloud, YouTube, VK, Yandex Music and Spotify: the catalog fills without extra steps.
Personal tracks upload from the web player in chunks (S3 multipart). External sources import like this: SoundCloud, YouTube and VK via yt-dlp; Spotify and Yandex Music connect by linking an account over OAuth. The catalog grows without manual uploads.
HMAC GPU offload
The worker pulls jobs over HMAC and runs lyrics and recommendations on GPU. No exposed ports anywhere.
Worker loop: heartbeat → claim with a lease and deadline → fetch audio over a one-time link (OTT, 5 min, IP-pinned) → result. Not a single exposed port.
Moderation and compliance
Track complaints, auto-hide past a threshold, and admin moderation. Some social features (chats, comments) are deliberately switched off to meet Russia's 149-FZ information-intermediary rules.
Complaints accumulate on a track and auto-hide it past a threshold until a moderator rules. The disabled features stay in the code, flagged and unrouted, ready to switch on when regulation allows.
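The auto-hide rule can be sketched as below. The threshold value, the choice to count distinct reporters rather than raw complaints, and all names are assumptions for illustration.

```python
from dataclasses import dataclass, field

HIDE_THRESHOLD = 3  # hypothetical value; the real threshold is not stated


@dataclass
class Track:
    id: str
    reporters: set[str] = field(default_factory=set)
    hidden: bool = False


def report(track: Track, user_id: str) -> bool:
    """Record a complaint; hide the track once enough distinct users report it."""
    track.reporters.add(user_id)  # a repeat report by the same user adds nothing
    if len(track.reporters) >= HIDE_THRESHOLD:
        track.hidden = True
    return track.hidden


def moderator_ruling(track: Track, keep_hidden: bool) -> None:
    """A moderator's decision overrides the automatic state and resets complaints."""
    track.hidden = keep_hidden
    track.reporters.clear()


t = Track("t1")
results = [report(t, u) for u in ["u1", "u1", "u2", "u3"]]
```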
timeline
how the product grew from its first version.
- 27 Mar 2026
Project start: the first commit
27 March 2026, 12:13: the first commit. A first version stood up in a day across two repositories (Backend + Bot): FastAPI, PostgreSQL 16, React 18 Mini App, Telegram auth; SHA-256 dedup ensures a track is never stored twice.
- 31 Mar 2026
Adaptive HLS streaming
Dual-bitrate HLS (128k/64k) over content-addressed storage: segments cached as immutable.
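A two-variant HLS master playlist for these bitrates could be generated as in the sketch below. The variant URIs are hypothetical, and using the nominal audio bitrate as `BANDWIDTH` is a simplification (a real value would also account for container overhead).

```python
def master_playlist(variants: list[tuple[int, str]]) -> str:
    """Render an HLS master playlist; variants are (bits per second, URI) pairs."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for bandwidth, uri in sorted(variants, reverse=True):  # highest bitrate first
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth}")
        lines.append(uri)
    return "\n".join(lines) + "\n"


# Hypothetical variant paths for the 128k and 64k renditions.
playlist = master_playlist([(64000, "64k/index.m3u8"), (128000, "128k/index.m3u8")])
```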
- 12 Apr 2026
PrivateCore: the closed core
Business rules, ranking policy and security constants move into a separate closed repository, PrivateCore, the fourth in the ecosystem. The open client talks to it via an internal-token scoped JWT; passwordless sign-in (Magic Link) and TOTP 2FA arrive alongside.
- 22 Apr 2026
ComputeWorker: the GPU worker
The third open repository: a GPU worker on Demucs + faster-whisper. Vocals separated, lyrics transcribed and synced line by line.
- 24 Apr 2026
Search, import and content storage
Full-text Elasticsearch over tracks and lyrics; import from SoundCloud, Spotify, YouTube and VK; audio stored by SHA-256 with ref-count, one file for everyone.
- 26 Apr 2026
Recommendations and radio
Audio embeddings on the worker, Daily/Weekly Mix, endless radio from a seed track, and collaborative playlists; ranking rules live in the closed PrivateCore.
- 7 May 2026
iOS interface redesign
Motion primitives, Dynamic Island and an Apple Music aesthetic; gapless crossfade between tracks and karaoke-synced lyrics.
- 10 May 2026
Law and moderation
Alignment with Russia's federal laws 152-FZ, 149-FZ, 242-FZ and 436-FZ, text censorship of lyrics and descriptions, auto-hide of reported tracks, and anti-abuse.
- 17 May 2026
Offline and production hardening
Full offline mode with auto-cache and prefetch, a dedicated streaming egress pool with Tor circuits, ClamAV scanning on upload, a background-job dispatcher, Prometheus + OpenTelemetry and SSH deploy.
- Jun 2026
Now: source-available and stabilization
The three open repositories ship as a source-available showcase with licenses, gitleaks in CI and DotCore-standard docs; PrivateCore stays closed. No new modules, just targeted tuning of lyrics and recommendation quality.
repositories
- Backend hub
The heart of the system: API, database, task queue, React Mini App delivery, SHA-256 dedup, HLS streaming, and full-text search.
github.com/network-user/DotSoundBackend
- Bot telegram-ui
A thin Telegram client: Mini App entry, inline search, likes and mixes, sign-in alerts. All logic stays in the core.
github.com/network-user/DotSoundBot
- ComputeWorker asr-worker
A pull-based worker on faster-whisper + Demucs over an HMAC protocol: lyrics and recommendations are computed on GPU with no exposed ports.
github.com/network-user/DotSoundComputeWorker