AI-Enriched Music Discovery & Playlist Curation Platform
TagTune is a full-stack web application and automated data engineering pipeline designed to bridge the gap between black-box streaming recommendations and controlled music curation. Built on a curated YouTube Music database, it addresses a core limitation of standard music services by giving users explicit, tag-based control over recommendations and syncing the resulting tracks directly to their YouTube account as a private playlist.
Interactive walkthrough of the core user interface and features
Authentication uses NextAuth v5 to request the specific Google and YouTube OAuth scopes needed for playlist synchronization.
Platform engineering benchmarks and implementation details
Extracted and normalized 5 key audio metrics (BPM, Energy, Valence, Danceability, Acousticness) from audio files using pre-trained Essentia TensorFlow neural networks.
Combined Essentia CNN audio vectors with 768-dimensional semantic text embeddings from the Gemini Embedding API to represent comprehensive track profiles.
Developed a prompt-engineered pipeline using Gemini 3.1 Flash to classify tracks into a 3-tier genre taxonomy (Primary, Sub, and Micro-genres) across 200+ distinct genres.
Engineered regex-based description parsers and search fallbacks to identify cover songs and trace original tracks, linking them via self-referential PostgreSQL keys.
The system is split into two primary components: an automated Python ingestion and DSP audio feature extraction pipeline, and a Next.js 15 web application dashboard that enables real-time recommendation scoring and playlist syncing.
graph TD
subgraph DP["1. Python DSP and Ingestion Pipeline"]
A["YT Playlist"] -->|ytmusicapi| B["prepare_metadata.py"]
B -->|"Batch Raw Meta"| C["Gemini 3.5 Flash Tagger"]
C -->|"Genre and Cover Metadata"| B
B -->|"Description Regex and Search"| D["Original Song Resolver"]
B -->|"Metadata Output"| E["songs_to_review.json"]
E -->|"audio_pipeline.py"| F["yt-dlp Downloader"]
F -->|"Essentia TensorFlow pb"| G["DSP Feature Extractor"]
E -->|"Gemini Embedding API"| H["Semantic Text Embeddings"]
G -->|"Save JSON"| I["audio_features_output.json"]
H -->|"Save JSON"| I
E -->|"import_to_db.py"| J[("Supabase PostgreSQL")]
I -->|"audio_db_insert.py"| J
end
subgraph WA["2. Next.js 15 Web Application"]
K["React Client UI"] -->|"1. Submit Seed URL"| L["GET /api/songs"]
L -->|"Query Track and Genre RPC"| J
K -->|"2. Generate Recommendations"| M["POST /api/recommend"]
M -->|"Apply Hard Filters"| J
M -->|"Compute Weighted Similarity"| N["Scoring Engine"]
N -->|"Candidate JSON List"| K
K -->|"3. Sync Playlist"| O["POST /api/playlist/push"]
O -->|"OAuth Access Token"| P["YouTube Data API v3"]
P -->|"Create and Populate"| Q["User YouTube Account"]
end
The PostgreSQL database hosted on Supabase manages song relationships, hierarchical genres, raw audio DSP values, and vector embeddings.
erDiagram
songs {
int song_index PK
varchar title
int original_song_id FK
int artist_id FK
int album_id FK
int group_id FK
int release_year
varchar url
varchar language
}
artists {
int artist_id PK
varchar name
}
groups {
int group_id PK
varchar name
}
albums {
int album_id PK
varchar title
int artist_id FK
}
genres {
int genre_id PK
varchar name
int level
}
song_genres {
int song_id FK
int primary_genre_id FK
int sub_genre_id FK
int micro_genre_id FK
}
song_audio_features {
int song_id FK
float tempo
float energy
float valence
float danceability
float acousticness
}
song_vectors {
int song_id FK
vector artist_vector
vector audio_vector
}
songs ||--o{ song_genres : has
songs ||--o| song_audio_features : has_features
songs ||--o| song_vectors : has_vectors
songs }o--|| artists : main_artist
songs }o--|| groups : "group"
songs }o--|| albums : album
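The `original_song_id` column in `songs` points back at `songs.song_index`, which is how a cover is linked to its original. The sketch below illustrates that self-referential key. It is a minimal example under stated assumptions: it uses an in-memory SQLite database as a stand-in for Supabase PostgreSQL, keeps only three of the columns, and the helper name `build_demo_db` is hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a cut-down songs table with a self-referential foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute(
        """
        CREATE TABLE songs (
            song_index       INTEGER PRIMARY KEY,
            title            TEXT NOT NULL,
            original_song_id INTEGER REFERENCES songs(song_index)
        )
        """
    )
    conn.execute("INSERT INTO songs VALUES (1, 'Original', NULL)")
    conn.execute("INSERT INTO songs VALUES (2, 'Cover', 1)")  # cover -> original
    return conn


def covers_with_originals(conn: sqlite3.Connection):
    """Join the table to itself to pair each cover with its original."""
    return conn.execute(
        """
        SELECT c.title, o.title
        FROM songs AS c
        JOIN songs AS o ON c.original_song_id = o.song_index
        ORDER BY c.song_index
        """
    ).fetchall()
```

Originals simply carry a NULL `original_song_id`, so no separate table is needed to model the cover relationship.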
The ingestion pipeline extracts track lists, sends metadata to Gemini to clean credit roles and genres, resolves cover-to-original mappings, downloads audio, and runs neural-network feature extraction.
Scrapes source playlist metadata via ytmusicapi and runs structured JSON prompts on Gemini 3.5 Flash to automatically categorise each track (Primary/Sub/Micro genres, original song artist credits, featured-artist credits).
If Gemini marks a song as a cover, the pipeline parses the description for original song YouTube URLs or triggers fallback search logic combining original titles and artists to link the original_song_id reference key.
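The description-parsing step can be sketched as a regular expression that pulls YouTube video IDs out of free text. This is an illustrative sketch, not the project's actual parser: the pattern, the function names `find_original_video_id` and `build_fallback_query`, and the rule of skipping the cover's own ID are all assumptions.

```python
import re
from typing import Optional

# Matches the 11-character video ID in common YouTube URL shapes
# (youtube.com/watch?v=..., music.youtube.com/watch?...&v=..., youtu.be/...).
YOUTUBE_ID_RE = re.compile(
    r"(?:youtube\.com/watch\?(?:[^\s#]*&)?v=|youtu\.be/)([A-Za-z0-9_-]{11})"
)


def find_original_video_id(description: str, own_video_id: str) -> Optional[str]:
    """Return the first linked video ID that is not the cover itself."""
    for match in YOUTUBE_ID_RE.finditer(description):
        video_id = match.group(1)
        if video_id != own_video_id:
            return video_id
    return None  # no usable link: the caller would fall back to a search


def build_fallback_query(original_title: str, original_artist: str) -> str:
    """Hypothetical search string combining the original title and artist."""
    return f"{original_title} {original_artist}".strip()
```

When `find_original_video_id` returns `None`, the fallback query would be passed to whatever search call the pipeline uses, and the matched song's key stored in `original_song_id`.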
Downloads audio streams synchronously via yt-dlp. The pipeline uses Essentia TensorFlow neural models to extract core acoustic metrics, outputting normalized scores for Valence, Energy, Danceability, Acousticness, and BPM.
import numpy as np
import essentia.standard as es

# audio_44k and audio_16k hold the same track at 44.1 kHz and 16 kHz;
# effnet, musicnn, dance_model, acoustic_model, and av_model are
# pre-loaded Essentia TensorFlow predictors.

# Extract BPM with the rhythm extractor on 44.1 kHz audio
bpm, _, _, _, _ = es.RhythmExtractor2013()(audio_44k)
# Extract danceability and acousticness from EffNet-Discogs embeddings
emb_effnet = effnet(audio_16k)
danceability = float(np.mean(dance_model(emb_effnet)[:, 0]))
acousticness = float(np.mean(acoustic_model(emb_effnet)[:, 0]))
# Extract valence and energy (arousal) from the MusiCNN DEAM model
emb_musicnn = musicnn(audio_16k)
av_preds = av_model(emb_musicnn)
# Rescale the frame-averaged predictions from the 1-9 range to [0, 1]
valence = round((float(np.mean(av_preds[:, 0])) - 1) / 8, 3)
energy = round((float(np.mean(av_preds[:, 1])) - 1) / 8, 3)
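The rescaling in the last two lines can be isolated into a small helper. The sketch below assumes the model outputs sit on a 1 to 9 scale, which the `- 1` and `/ 8` above imply. The clamping step is an addition for illustration (raw predictions can land slightly outside the nominal range) and the name `normalize_rating` is hypothetical.

```python
def normalize_rating(score: float, lo: float = 1.0, hi: float = 9.0) -> float:
    """Map a rating on [lo, hi] to [0, 1], clamping out-of-range predictions."""
    scaled = (score - lo) / (hi - lo)
    # Clamp so that a prediction such as 9.4 cannot produce a value above 1.0.
    return round(min(1.0, max(0.0, scaled)), 3)
```

For example, `normalize_rating(5.0)` gives `0.5`, the midpoint of the scale.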
The recommendation engine resides inside a Next.js API Route (/api/recommend). It processes user inputs, performs strict SQL filtering, and scores candidate songs using a weighted affinity matrix.
Filters out songs not matching the user's primary selection criteria (e.g. vocal language, excluding/including cover songs, or target seed genres) before scoring metrics are evaluated.
let query = supabase
.from('songs')
.select(`
song_index, title, url, language, release_year, original_song_id, group_id, album_id, artist_id,
artists!songs_artist_id_fkey ( artist_id, name ),
song_featuring ( artist_id ),
song_producers ( artist_id )
`);
Candidates passing the hard filters are run through a weighted scoring matrix comparing the candidate song against the seed song and user-selected secondary tags:
| Match Parameter | Weight Value | Description / Reasoning |
|---|---|---|
| Same Micro-Genre | +40 | Most specific acoustic vibe match (e.g. City Pop, Shibuya-kei) |
| Same User-Target Tag | +30 | Matches custom requested artist, micro-genre, or group search tags |
| Same Sub-Genre | +25 | Matches structural style taxonomy (e.g. Dance Pop, Indie Rock) |
| Same User-Target Sub-genre | +25 | High-relevance feature credit alignments |
| Same Language | +20 | Shares identical vocal language constraints (e.g. Korean, Japanese) |
| Same Artist Group | +15 | Tracks from members of the same group/band (e.g. solo releases) |
| Same Primary Genre | +10 | Shares broad genre alignment (e.g. J-Pop, K-Pop, Rock) |
| Same Decade | +5 | Released during the same musical era |
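The table can be read as an additive score, sketched below. This is an assumption-laden illustration rather than the code in `api/recommend/route.js`: the `Track` fields, the `score` function, and the choice to count each rule at most once are hypothetical, and only the weight values come from the table.

```python
from dataclasses import dataclass
from typing import Optional

# Weight values copied from the table above.
WEIGHTS = {
    "micro_genre": 40,
    "user_tag": 30,
    "sub_genre": 25,
    "user_sub_genre": 25,
    "language": 20,
    "group": 15,
    "primary_genre": 10,
    "decade": 5,
}


@dataclass
class Track:
    micro_genre: str
    sub_genre: str
    primary_genre: str
    language: str
    release_year: int
    group: Optional[str] = None
    tags: frozenset = frozenset()  # artist, micro-genre, or group tags


def score(seed: Track, cand: Track,
          target_tags: frozenset = frozenset(),
          target_sub_genres: frozenset = frozenset()) -> int:
    """Sum the weight of every rule the candidate satisfies."""
    checks = {
        "micro_genre": cand.micro_genre == seed.micro_genre,
        "user_tag": bool(cand.tags & target_tags),
        "sub_genre": cand.sub_genre == seed.sub_genre,
        "user_sub_genre": cand.sub_genre in target_sub_genres,
        "language": cand.language == seed.language,
        "group": cand.group is not None and cand.group == seed.group,
        "primary_genre": cand.primary_genre == seed.primary_genre,
        "decade": cand.release_year // 10 == seed.release_year // 10,
    }
    return sum(WEIGHTS[name] for name, hit in checks.items() if hit)
```

Under these assumptions the maximum possible score is 170, and a candidate that shares only sub-genre, language, and primary genre with the seed scores 55.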
The engine caps output at 5 tracks per artist and 5 tracks per album. When the user requests a "Regenerate" reshuffle, a Knuth (Fisher-Yates) shuffle is applied to candidates with tied scores, which adds catalog variety without changing the score ordering.
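The tie shuffle and the per-artist and per-album caps might look like the following. It is a sketch under assumptions: candidates are represented as `(score, song)` pairs already sorted by descending score, songs are plain dicts keyed like the `songs` table, and both function names are hypothetical.

```python
import random
from collections import Counter
from itertools import groupby


def shuffle_ties(ranked, rng=random):
    """Shuffle only within runs of equal score, preserving score order."""
    out = []
    for _, run in groupby(ranked, key=lambda pair: pair[0]):
        block = list(run)
        # Knuth (Fisher-Yates) shuffle: swap each slot with a random earlier one.
        for i in range(len(block) - 1, 0, -1):
            j = rng.randint(0, i)  # inclusive on both ends
            block[i], block[j] = block[j], block[i]
        out.extend(block)
    return out


def apply_caps(ranked, max_per_artist=5, max_per_album=5):
    """Walk the ranked list and drop songs once an artist or album is full."""
    artist_counts, album_counts = Counter(), Counter()
    kept = []
    for score, song in ranked:
        artist, album = song["artist_id"], song.get("album_id")
        if artist_counts[artist] >= max_per_artist:
            continue
        if album is not None and album_counts[album] >= max_per_album:
            continue
        artist_counts[artist] += 1
        if album is not None:
            album_counts[album] += 1
        kept.append((score, song))
    return kept
```

Running `shuffle_ties` before `apply_caps` means a reshuffle can change which of an artist's tied tracks survive the cap, not just their order.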
Compiling user playlists requires access to official Google APIs. We manage the OAuth authorization and token rotation lifecycle inside auth.js powered by NextAuth v5.
sequenceDiagram
autonumber
Client UI->>NextAuth: Trigger Login via Google
NextAuth->>Google OAuth: Request offline access with YouTube scopes
Google OAuth-->>NextAuth: Return Access Token and Refresh Token
NextAuth->>Supabase: Save tokens in users table
NextAuth-->>Client UI: Establish Session via JWT Cookie
Note over Client UI, NextAuth: On API request check JWT expiry
alt Token Expiry is less than 10 mins
NextAuth->>Supabase: Fetch Refresh Token if missing in JWT
NextAuth->>Google OAuth: Request Token Refresh via POST oauth2 token
Google OAuth-->>NextAuth: Return New Access Token and Expiry
NextAuth->>Supabase: Update tokens in users table
NextAuth-->>Client UI: Return Refreshed Session
end
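The refresh decision in the diagram reduces to a timestamp comparison plus a form-encoded POST to Google's token endpoint. The sketch below shows that logic in Python for illustration only; the real flow lives in `auth.js`, and the function names and the shape of the `token` dict are assumptions. The endpoint URL and the `grant_type=refresh_token` parameters follow Google's documented OAuth 2.0 refresh flow.

```python
import time
from typing import Optional

REFRESH_WINDOW_SECONDS = 10 * 60  # refresh when under 10 minutes remain
GOOGLE_TOKEN_URL = "https://oauth2.googleapis.com/token"


def needs_refresh(expires_at: float, now: Optional[float] = None) -> bool:
    """True when the access token expires within the refresh window."""
    now = time.time() if now is None else now
    return expires_at - now < REFRESH_WINDOW_SECONDS


def build_refresh_request(client_id: str, client_secret: str,
                          refresh_token: str) -> dict:
    """Describe the POST that exchanges a refresh token for a new access token."""
    return {
        "url": GOOGLE_TOKEN_URL,
        "data": {
            "client_id": client_id,
            "client_secret": client_secret,
            "refresh_token": refresh_token,
            "grant_type": "refresh_token",
        },
    }


def apply_refresh_response(token: dict, response: dict, now: float) -> dict:
    """Merge the endpoint's response into the stored token record."""
    return {
        **token,
        "access_token": response["access_token"],
        "expires_at": now + response["expires_in"],
        # Keep the old refresh token when the response does not include a new one.
        "refresh_token": response.get("refresh_token", token["refresh_token"]),
    }
```

Keeping the previous refresh token when none is returned matters because a refresh response typically carries only a new access token and expiry.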
The YouTube Data API allots each API project a default quota of 10,000 units per day. To avoid exhausting it early (each detail query costs 1 unit), TagTune stores track details locally in Supabase. The YouTube API is called only for the initial ingestion and for the final playlist sync, which reduces quota usage by roughly 90%.
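A rough budget for one playlist sync can be worked out as below. The per-call costs are assumptions taken from the published YouTube Data API quota table as commonly cited (50 units for `playlists.insert` and for each `playlistItems.insert`) and should be checked against the current documentation; the function names are hypothetical.

```python
DAILY_QUOTA = 10_000            # default units per project per day
COST_PLAYLIST_INSERT = 50       # assumed cost of creating the playlist
COST_PLAYLIST_ITEM_INSERT = 50  # assumed cost of adding one track


def sync_cost(track_count: int) -> int:
    """Units spent creating one playlist and adding track_count songs."""
    return COST_PLAYLIST_INSERT + COST_PLAYLIST_ITEM_INSERT * track_count


def max_tracks_per_day(daily_quota: int = DAILY_QUOTA) -> int:
    """Largest single playlist that fits in one day's quota."""
    remaining = daily_quota - COST_PLAYLIST_INSERT
    return max(0, remaining // COST_PLAYLIST_ITEM_INSERT)
```

Under these assumed costs, a 25-track sync uses 1,300 units, which is why serving metadata from Supabase rather than the API leaves more of the budget for writes.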
The codebase spans Python data-processing scripts and the Next.js full-stack application.
prepare_metadata.py - Scrapes playlists and runs Gemini AI tagging.
audio_pipeline.py - Runs Essentia TF models and Gemini Embeddings.
import_to_db.py - Maps relational records and imports into PostgreSQL.
audio_db_insert.py - Inserts extracted audio vectors and DSP metrics.
schema.sql - Configures PostgreSQL tables on Supabase.
auth.js - Manages NextAuth and token rotation workflows.
api/recommend/route.js - Real-time weighted recommendations.
api/songs/route.js - Metadata fetchers and RPC call handlers.
api/playlist/push/route.js - Creates and populates YouTube playlists.
page.jsx - Main React user dashboard page.