TagTune

AI-Enriched Music Discovery & Playlist Curation Platform

Fullstack Developer | Data Engineer | Next.js 15 | Python DSP Pipeline

TagTune is a fullstack web application and automated data engineering pipeline that bridges the gap between black-box streaming recommendations and hands-on music curation. Built on a curated YouTube Music database, it gives users explicit, tag-based control over recommendations and syncs the resulting list directly to their YouTube account as a private playlist.

Role

Lead Developer & Data Engineer

Primary Stack

Next.js, Python, PostgreSQL

APIs & Models

Gemini Flash, Gemini Embedding, YouTube Data API v3

Application Showcase

Interactive walkthrough of the core user interface and features

Secure OAuth Login

Uses NextAuth v5 to request the specific Google and YouTube OAuth scopes needed for playlist synchronization.

1. Authentication: Secure YouTube OAuth sign-in

2. Tag Customization: Tune genre and attribute weights

3. Playlist Creation: Sync the recommendation list

Key Technical Achievements

Platform engineering benchmarks and implementation details

DSP Feature Extraction

Extracted and normalized 5 key audio metrics (BPM, Energy, Valence, Danceability, Acousticness) from audio files, using Essentia's rhythm extractor for BPM and pre-trained Essentia TensorFlow neural networks for the other four.

Dense Multimodal Vectors

Combined Essentia CNN audio vectors with 768-dimensional semantic text embeddings from the Gemini Embedding API to represent comprehensive track profiles.
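
One way to picture the combination step is a weighted concatenation of the two unit-normalized vectors. This is a minimal sketch, not the project's confirmed method: the concatenation scheme, the function names, and the `audio_weight` parameter are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def fuse_track_vector(audio_vec: Sequence[float],
                      text_vec: Sequence[float],
                      audio_weight: float = 0.5) -> List[float]:
    """Concatenate normalized audio and text vectors into one track profile.

    audio_weight is a hypothetical knob. Scaling by its square root means the
    cosine similarity of two fused vectors equals
    audio_weight * cos(audio) + (1 - audio_weight) * cos(text).
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be between 0 and 1")
    a_scale = math.sqrt(audio_weight)
    t_scale = math.sqrt(1.0 - audio_weight)
    return ([a_scale * x for x in l2_normalize(audio_vec)]
            + [t_scale * x for x in l2_normalize(text_vec)])


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 0.0 if norm_a == 0 or norm_b == 0 else dot / (norm_a * norm_b)
```

Normalizing each part first keeps the longer vector from dominating the similarity simply because it has more dimensions.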

3-Tier Genre Taxonomy

Developed a prompt-engineered pipeline using Gemini Flash to classify tracks into a 3-tier genre taxonomy (Primary, Sub, and Micro-genres) covering 200+ distinct genres.

Cover Song Linkage

Engineered regex-based description parsers and search fallbacks to identify cover songs and trace their original tracks, linking each cover to its original through a self-referential foreign key (original_song_id) in PostgreSQL.

System Architecture & Deep Dives


Overall System Architecture

The system is split into two primary components: an automated Python ingestion and DSP audio feature extraction pipeline, and a Next.js 15 web application dashboard that enables real-time recommendation scoring and playlist syncing.

graph TD
    subgraph DP["1. Python DSP and Ingestion Pipeline"]
        A["YT Playlist"] -->|ytmusicapi| B["prepare_metadata.py"]
        B -->|"Batch Raw Meta"| C["Gemini 3.5 Flash Tagger"]
        C -->|"Genre and Cover Metadata"| B
        B -->|"Description Regex and Search"| D["Original Song Resolver"]
        B -->|"Metadata Output"| E["songs_to_review.json"]
        E -->|"audio_pipeline.py"| F["yt-dlp Downloader"]
        F -->|"Essentia TensorFlow pb"| G["DSP Feature Extractor"]
        E -->|"Gemini Embedding API"| H["Semantic Text Embeddings"]
        G -->|"Save JSON"| I["audio_features_output.json"]
        H -->|"Save JSON"| I
        E -->|"import_to_db.py"| J[("Supabase PostgreSQL")]
        I -->|"audio_db_insert.py"| J
    end

    subgraph WA["2. Next.js 15 Web Application"]
        K["React Client UI"] -->|"1. Submit Seed URL"| L["GET /api/songs"]
        L -->|"Query Track and Genre RPC"| J
        K -->|"2. Generate Recommendations"| M["POST /api/recommend"]
        M -->|"Apply Hard Filters"| J
        M -->|"Compute Weighted Similarity"| N["Scoring Engine"]
        N -->|"Candidate JSON List"| K
        K -->|"3. Sync Playlist"| O["POST /api/playlist/push"]
        O -->|"OAuth Access Token"| P["YouTube Data API v3"]
        P -->|"Create and Populate"| Q["User YouTube Account"]
    end

Database Schema Design

The PostgreSQL database hosted on Supabase manages song relationships, hierarchical genres, raw audio DSP values, and vector embeddings.

erDiagram
    songs {
        int song_index PK
        varchar title
        int original_song_id FK
        int artist_id FK
        int album_id FK
        int group_id FK
        int release_year
        varchar url
        varchar language
    }
    artists {
        int artist_id PK
        varchar name
    }
    groups {
        int group_id PK
        varchar name
    }
    albums {
        int album_id PK
        varchar title
        int artist_id FK
    }
    genres {
        int genre_id PK
        varchar name
        int level
    }
    song_genres {
        int song_id FK
        int primary_genre_id FK
        int sub_genre_id FK
        int micro_genre_id FK
    }
    song_audio_features {
        int song_id FK
        float tempo
        float energy
        float valence
        float danceability
        float acousticness
    }
    song_vectors {
        int song_id FK
        vector artist_vector
        vector audio_vector
    }

    songs ||--o{ song_genres : has
    songs ||--o| song_audio_features : has_features
    songs ||--o| song_vectors : has_vectors
    songs }o--|| artists : main_artist
    songs }o--|| groups : "group"
    songs }o--|| albums : album

Multi-Stage Data Pipeline

The ingestion pipeline extracts track lists, sends metadata to Gemini to clean up credit roles and genres, resolves cover-to-original mappings, downloads audio, and runs neural-network feature extraction.

Phase 1: Metadata Preparation & LLM Tagging

Scrapes source playlist metadata via ytmusicapi and sends structured JSON prompts to Gemini 3.5 Flash to categorise each track automatically (Primary/Sub/Micro genres, original-artist credits for covers, and featuring credits).
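
Structured model output still needs checking before it reaches the database. The sketch below shows one way to parse and validate a single tagging reply; the field names (`primary_genre`, `is_cover`, `original_artist`, and so on) and the `TrackTags` class are hypothetical and not taken from prepare_metadata.py.

```python
import json
from dataclasses import dataclass
from typing import Optional

FENCE = "`" * 3  # Markdown code-fence marker that models sometimes wrap JSON in
REQUIRED_KEYS = ("primary_genre", "sub_genre", "micro_genre", "is_cover")


@dataclass
class TrackTags:
    primary_genre: str
    sub_genre: str
    micro_genre: str
    is_cover: bool
    original_artist: Optional[str]


def parse_tag_response(raw: str) -> TrackTags:
    """Parse one model reply, tolerating a code fence around the JSON body."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (it may carry a language tag) and the closing fence.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if not isinstance(data["is_cover"], bool):
        raise ValueError("is_cover must be a boolean")
    return TrackTags(
        primary_genre=str(data["primary_genre"]).strip(),
        sub_genre=str(data["sub_genre"]).strip(),
        micro_genre=str(data["micro_genre"]).strip(),
        is_cover=data["is_cover"],
        original_artist=data.get("original_artist") or None,
    )
```

Rejecting a malformed reply at this point lets the batch step retry or flag the track for manual review instead of importing partial tags.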

Phase 2: Cover Song Resolution Logic

If Gemini marks a song as a cover, the pipeline parses the video description for a YouTube URL pointing to the original song. If none is found, it falls back to a search that combines the original title and artist, then sets the original_song_id foreign key to link the cover to its original.
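
The description-parsing step could look roughly like the sketch below. The URL pattern, the cue phrases, and both function names are illustrative assumptions rather than the pipeline's actual rules, and this version only inspects links that share a line with a cue phrase.

```python
import re
from typing import Optional

# Matches watch URLs and youtu.be short links; captures the 11-character video ID.
YOUTUBE_URL = re.compile(
    r"(?:https?://)?(?:www\.|music\.|m\.)?"
    r"(?:youtube\.com/watch\?(?:[^\s#]*&)?v=|youtu\.be/)"
    r"([A-Za-z0-9_-]{11})"
)

# Hypothetical cue phrases that often sit next to the original's link.
ORIGINAL_CUE = re.compile(r"original|本家|원곡", re.IGNORECASE)


def find_original_video_id(description: str, own_video_id: str) -> Optional[str]:
    """Return the first linked video ID on a line that mentions the original."""
    for line in description.splitlines():
        if not ORIGINAL_CUE.search(line):
            continue
        for video_id in YOUTUBE_URL.findall(line):
            if video_id != own_video_id:  # ignore links back to the cover itself
                return video_id
    return None


def build_fallback_query(original_title: str, original_artist: str) -> str:
    """Search string used when the description contains no usable link."""
    return f"{original_artist} {original_title}".strip()
```

For example, a description line such as `Original: https://www.youtube.com/watch?v=dQw4w9WgXcQ` yields the ID `dQw4w9WgXcQ`, while a description with no matching line returns `None` and hands over to the search fallback.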

Phase 3: DSP Feature Extraction

Downloads audio streams synchronously via yt-dlp. The pipeline then estimates BPM with Essentia's RhythmExtractor2013 and uses Essentia TensorFlow models to produce scores in [0, 1] for Valence, Energy, Danceability, and Acousticness.

Python (audio_pipeline.py)

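import numpy as np
import essentia.standard as es

# audio_44k / audio_16k: the same track as mono samples at 44.1 kHz and 16 kHz.
# effnet, musicnn, dance_model, acoustic_model, av_model: Essentia TensorFlow
# predictors, assumed to be loaded earlier in the script from their .pb graphs.

# Estimate BPM with the rhythm extractor on 44.1 kHz audio
bpm, _, _, _, _ = es.RhythmExtractor2013()(audio_44k)

# Danceability and acousticness from EffNet-Discogs embeddings
# (column 0 of each classifier output is read as the positive class)
emb_effnet = effnet(audio_16k)
danceability = float(np.mean(dance_model(emb_effnet)[:, 0]))
acousticness = float(np.mean(acoustic_model(emb_effnet)[:, 0]))

# Valence and energy (arousal) from the MusiCNN DEAM regression model
emb_musicnn = musicnn(audio_16k)
av_preds = av_model(emb_musicnn)
# DEAM predictions lie on a 1-9 scale, so (x - 1) / 8 maps them to [0, 1]
valence = round((float(np.mean(av_preds[:, 0])) - 1) / 8, 3)
energy  = round((float(np.mean(av_preds[:, 1])) - 1) / 8, 3)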

Real-Time Scoring & Recommendation Engine

The recommendation engine resides inside a Next.js API Route (/api/recommend). It processes user inputs, performs strict SQL filtering, and scores candidate songs using a weighted affinity matrix.

1. SQL Hard Filters

Filters out songs not matching the user's primary selection criteria (e.g. vocal language, excluding/including cover songs, or target seed genres) before scoring metrics are evaluated.

JavaScript (recommend/route.js)

let query = supabase
  .from('songs')
  .select(`
    song_index, title, url, language, release_year, original_song_id, group_id, album_id, artist_id,
    artists!songs_artist_id_fkey ( artist_id, name ),
    song_featuring ( artist_id ),
    song_producers ( artist_id )
  `);

2. Similarity Scoring Matrix

Candidates passing the hard filters are run through a weighted scoring matrix comparing the candidate song against the seed song and user-selected secondary tags:

Match Parameter	Weight Value	Description / Reasoning
Same Micro-Genre	+40	Most specific acoustic vibe match (e.g. City Pop, Shibuya-kei)
Same User-Target Tag	+30	Matches custom requested artist, micro-genre, or group search tags
Same Sub-Genre	+25	Matches structural style taxonomy (e.g. Dance Pop, Indie Rock)
Same User-Target Sub-genre	+25	High-relevance feature credit alignments
Same Language	+20	Shares identical vocal language constraints (e.g. Korean, Japanese)
Same Artist Group	+15	Tracks from members of the same group/band (e.g. solo releases)
Same Primary Genre	+10	Shares broad genre alignment (e.g. J-Pop, K-Pop, Rock)
Same Decade	+5	Released during the same musical era
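
The table reads as an additive score: each matching attribute contributes its weight. The production route is JavaScript; the Python sketch below only mirrors the arithmetic. The `Song` fields, the `score_candidate` function, and the way user-target tags are represented are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set

# Weights copied from the table above.
WEIGHTS = {
    "micro_genre": 40,
    "target_tag": 30,
    "sub_genre": 25,
    "target_sub_genre": 25,
    "language": 20,
    "group": 15,
    "primary_genre": 10,
    "decade": 5,
}


@dataclass
class Song:
    title: str
    primary_genre: str
    sub_genre: str
    micro_genre: str
    language: str
    release_year: int
    group_id: Optional[int] = None
    tags: Set[str] = field(default_factory=set)  # hypothetical searchable tags


def score_candidate(seed: Song, cand: Song,
                    target_tags: Iterable[str] = (),
                    target_sub_genres: Iterable[str] = ()) -> int:
    """Sum the weights of every attribute the candidate shares with the seed or targets."""
    score = 0
    if cand.micro_genre == seed.micro_genre:
        score += WEIGHTS["micro_genre"]
    if cand.tags & set(target_tags):
        score += WEIGHTS["target_tag"]
    if cand.sub_genre == seed.sub_genre:
        score += WEIGHTS["sub_genre"]
    if cand.sub_genre in set(target_sub_genres):
        score += WEIGHTS["target_sub_genre"]
    if cand.language == seed.language:
        score += WEIGHTS["language"]
    if seed.group_id is not None and cand.group_id == seed.group_id:
        score += WEIGHTS["group"]
    if cand.primary_genre == seed.primary_genre:
        score += WEIGHTS["primary_genre"]
    if cand.release_year // 10 == seed.release_year // 10:
        score += WEIGHTS["decade"]
    return score
```

Under these assumptions, a candidate that shares the seed's micro-genre, sub-genre, language, primary genre, and decade scores 40 + 25 + 20 + 10 + 5 = 100 before any user-target bonuses.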

3. Deduplication and Variety Shuffling

Caps the output at a maximum of 5 tracks per artist and 5 tracks per album. When the user requests a "Regenerate" reshuffle, a Knuth (Fisher-Yates) shuffle is applied to tracks with tied scores, which varies the results without changing the score order.
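
A rough Python sketch of this step follows. The dictionary keys, the default `limit`, and the function names are hypothetical; the caps of 5 and the shuffle restricted to tied scores come from the description above.

```python
import random
from collections import Counter
from itertools import groupby
from typing import Dict, List, Optional


def knuth_shuffle(items: List[dict], rng: random.Random) -> None:
    """In-place Fisher-Yates (Knuth) shuffle."""
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        items[i], items[j] = items[j], items[i]


def rank_with_caps(scored: List[Dict], limit: int = 20,
                   per_artist: int = 5, per_album: int = 5,
                   reshuffle: bool = False,
                   rng: Optional[random.Random] = None) -> List[Dict]:
    """Order by score, optionally shuffle ties, then enforce per-artist and per-album caps."""
    rng = rng or random.Random()
    ordered = sorted(scored, key=lambda s: s["score"], reverse=True)  # stable sort
    if reshuffle:
        reshuffled: List[Dict] = []
        # Equal scores are adjacent after sorting, so each group is one block of ties.
        for _, tie_group in groupby(ordered, key=lambda s: s["score"]):
            block = list(tie_group)
            knuth_shuffle(block, rng)
            reshuffled.extend(block)
        ordered = reshuffled

    artist_counts: Counter = Counter()
    album_counts: Counter = Counter()
    picked: List[Dict] = []
    for song in ordered:
        if len(picked) >= limit:
            break
        album = song.get("album_id")
        if artist_counts[song["artist_id"]] >= per_artist:
            continue
        if album is not None and album_counts[album] >= per_album:
            continue
        artist_counts[song["artist_id"]] += 1
        if album is not None:
            album_counts[album] += 1
        picked.append(song)
    return picked
```

Shuffling only within tie blocks keeps higher-scoring tracks ahead of lower-scoring ones while still changing which of the equally ranked tracks survive the caps.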

OAuth Token Lifecycle & Authentication

Creating playlists in a user's account requires access to Google's APIs. The OAuth authorization and token-rotation lifecycle is managed in auth.js, powered by NextAuth v5.

sequenceDiagram
    autonumber
    Client UI->>NextAuth: Trigger Login via Google
    NextAuth->>Google OAuth: Request offline access with YouTube scopes
    Google OAuth-->>NextAuth: Return Access Token and Refresh Token
    NextAuth->>Supabase: Save tokens in users table
    NextAuth-->>Client UI: Establish Session via JWT Cookie
    Note over Client UI, NextAuth: On API request check JWT expiry
    alt Token Expiry is less than 10 mins
        NextAuth->>Supabase: Fetch Refresh Token if missing in JWT
        NextAuth->>Google OAuth: Request Token Refresh via POST oauth2 token
        Google OAuth-->>NextAuth: Return New Access Token and Expiry
        NextAuth->>Supabase: Update tokens in users table
        NextAuth-->>Client UI: Return Refreshed Session
    end
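
The refresh decision in the diagram reduces to a small amount of logic. The real implementation lives in auth.js; the Python sketch below only illustrates the flow, and the helper names and token dictionary layout are assumptions. The endpoint and the `grant_type=refresh_token` form field follow Google's standard OAuth 2.0 refresh flow.

```python
import time
from typing import Dict, Optional

GOOGLE_TOKEN_URL = "https://oauth2.googleapis.com/token"
REFRESH_WINDOW_SECONDS = 10 * 60  # the 10-minute threshold from the diagram


def needs_refresh(expires_at: float, now: Optional[float] = None) -> bool:
    """True when the access token expires in less than 10 minutes."""
    now = time.time() if now is None else now
    return expires_at - now < REFRESH_WINDOW_SECONDS


def build_refresh_request(client_id: str, client_secret: str,
                          refresh_token: str) -> Dict[str, str]:
    """Form fields to POST to GOOGLE_TOKEN_URL."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "grant_type": "refresh_token",
    }


def apply_refresh_response(token: Dict, response: Dict,
                           now: Optional[float] = None) -> Dict:
    """Merge a refresh response into the stored token record."""
    now = time.time() if now is None else now
    return {
        **token,
        "access_token": response["access_token"],
        "expires_at": now + response["expires_in"],
        # A refresh response usually omits refresh_token, so keep the stored one.
        "refresh_token": response.get("refresh_token", token.get("refresh_token")),
    }
```

Keeping the stored refresh token when the response omits it is what makes the "Fetch Refresh Token if missing in JWT" step in the diagram necessary.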

Quota & Rate Limit Mitigation

The YouTube Data API allots each API project a default quota of 10,000 units/day. To prevent early exhaustion (where each detail query costs 1 unit), TagTune stores track details locally on Supabase. The YouTube API is called only for the initial ingestion and for the final playlist sync, reducing the quota footprint by ~90%.
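
As a rough budget check for the sync step, the arithmetic below assumes the per-call costs from Google's published quota table (50 units for playlists.insert and for each playlistItems.insert, 1 unit for videos.list). Treat those figures as assumptions to verify against the current documentation; the function names are illustrative.

```python
DAILY_QUOTA = 10_000  # default units per day, as stated above

# Assumed per-call costs; verify against the current YouTube Data API quota table.
COST = {
    "playlists.insert": 50,
    "playlistItems.insert": 50,
    "videos.list": 1,
}


def sync_cost(track_count: int) -> int:
    """Units spent creating one playlist and adding track_count items to it."""
    return COST["playlists.insert"] + track_count * COST["playlistItems.insert"]


def max_tracks_per_day(quota: int = DAILY_QUOTA) -> int:
    """Largest single playlist that fits in one day's quota."""
    remaining = quota - COST["playlists.insert"]
    return max(0, remaining // COST["playlistItems.insert"])


def syncs_per_day(track_count: int, quota: int = DAILY_QUOTA) -> int:
    """How many playlists of a given size fit in one day's quota."""
    return quota // sync_cost(track_count)
```

Under these assumed costs, a 25-track sync spends 50 + 25 × 50 = 1,300 units, so about 7 such syncs fit in a day. Because write calls take up most of the budget, serving reads from Supabase leaves more of it for playlist creation.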

Source Files Manifest

The system spans Python data-processing scripts and the Next.js fullstack application.

Python Pipeline Subsystem

prepare_metadata.py - Scrapes playlists and runs Gemini AI tagging.
audio_pipeline.py - Runs Essentia TF models and Gemini Embeddings.
import_to_db.py - Maps relational records and imports into PostgreSQL.
audio_db_insert.py - Inserts extracted audio vectors and DSP metrics.
schema.sql - Configures PostgreSQL tables on Supabase.

Next.js 15 Web Application

auth.js - Manages NextAuth and token rotation workflows.
api/recommend/route.js - Real-time weighted recommendations.
api/songs/route.js - Metadata fetchers and RPC call handlers.
api/playlist/push/route.js - Creates and populates YouTube playlists.
page.jsx - Main React user dashboard page.