Kamus Hokkien Nusantara
Status: Active
A digital lexicon and cultural archive for the Chinese-Indonesian diaspora, engineered with an in-memory fuzzy search engine.

Overview
Most existing Hokkien dictionaries focus exclusively on standard Taiwanese (Taigi) or Amoy dialects. However, these resources often fail to reflect the linguistic reality of the Chinese-Indonesian diaspora, whose language has uniquely evolved over generations.
I built Kamus Hokkien Nusantara to serve as a definitive digital lexicon and cultural archive for this community. The platform explicitly highlights the linguistic split between the two major migration routes:
- The Northern Route (Zhangzhou variant, spoken in Medan/Penang)
- The Southern Route (Quanzhou variant, spoken in Riau/Singapore).
Beyond its cultural mission, the platform was engineered as an exercise in extreme performance and cost optimization. Static Site Generation (SSG) ships the lexicon with the page, so search runs in the browser with no network round trip and zero active database reads in production.
Features
The platform is designed to be instantly accessible, heavily SEO-optimized, and resilient to typos.
- Zero-Latency Fuzzy Search: By utilizing an in-memory client-side search engine (Fuse.js), users get instant, typo-tolerant translations without ever waiting for a network request or database query.
- Linguistic Routing: Every vocabulary entry is strictly categorized by its migration route, highlighting differences in pronunciation (e.g., the Northern -ui sound vs. the Southern -ng sound).
- Content Collections: Features long-form cultural articles and historical documentation written in .mdx and compiled statically for maximum SEO performance and readability.
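The typo-tolerant lookup described in the feature list can be pictured with a small, dependency-free sketch. It does not use Fuse.js: plain Levenshtein edit distance stands in for the fuzzy matcher, and that is a simplification rather than the algorithm Fuse.js uses internally. The LexiconEntry shape, the fuzzySearch helper, and the sample entries (simplified romanization) are all assumptions made for illustration.

```typescript
// Hypothetical entry shape; the real schema is not shown in this document.
type Route = "northern" | "southern";

interface LexiconEntry {
  word: string; // simplified romanization, illustrative only
  meaning: string;
  route: Route;
}

// Classic dynamic-programming Levenshtein edit distance (row by row).
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return entries whose word or meaning is within `maxTypos` edits of the
// query, closest matches first. Everything runs in memory, with no I/O.
function fuzzySearch(
  entries: LexiconEntry[],
  query: string,
  maxTypos = 1,
): LexiconEntry[] {
  const q = query.trim().toLowerCase();
  return entries
    .map((entry) => ({
      entry,
      score: Math.min(
        editDistance(q, entry.word.toLowerCase()),
        editDistance(q, entry.meaning.toLowerCase()),
      ),
    }))
    .filter((r) => r.score <= maxTypos)
    .sort((x, y) => x.score - y.score)
    .map((r) => r.entry);
}

// Illustrative sample data, not taken from the real lexicon.
const sampleEntries: LexiconEntry[] = [
  { word: "mui", meaning: "door", route: "northern" },
  { word: "mng", meaning: "door", route: "southern" },
  { word: "pui", meaning: "rice", route: "northern" },
  { word: "png", meaning: "rice", route: "southern" },
];

// "dor" is one edit away from "door", so both route variants are returned.
const typoResults = fuzzySearch(sampleEntries, "dor");
```

Because each result carries its route, a single misspelled query can surface the Northern and Southern forms side by side, which is the behaviour the Linguistic Routing feature describes.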
Architecture
To achieve 100% free edge hosting while maintaining a rich database, I implemented an Islands Architecture. The site ships 0 KB of JavaScript for standard content pages, hydrating only the React components strictly necessary for the interactive search bar.
The Stack Breakdown
- Core Framework: Astro 6 (SSG) utilizing React 19 for interactive client-side Islands (client:load).
- Database & ORM: Supabase (PostgreSQL) managed with Drizzle ORM for strict schema typing.
- The Search Engine: At build time (pnpm build), Astro uses Drizzle to fetch the entire lexicon from Supabase, injecting it as a static JSON payload directly into a Fuse.js instance.
- UI & Accessibility: Tailwind CSS v4 paired with headless Radix UI primitives (shadcn/ui), rigorously tested via Playwright and Axe-core.
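The build-time hand-off described under The Search Engine can be sketched in two halves: a serialization step that would run during the build, and a loading step that would run in the browser. This is a minimal sketch under stated assumptions, not the project's actual code. The Drizzle query is replaced by rows passed in as an argument, a plain Map stands in for the Fuse.js instance, and the names LexiconRow, serializeLexicon, and loadLexicon are hypothetical.

```typescript
// Hypothetical row shape. In the real build these rows would come from a
// Drizzle query against Supabase, which is not reproduced here.
interface LexiconRow {
  id: number;
  word: string;
  meaning: string;
  route: "northern" | "southern";
}

// Build time: turn the fetched rows into a static JSON string that can be
// embedded in the generated page. Sorting by id keeps builds deterministic.
function serializeLexicon(rows: LexiconRow[]): string {
  const sorted = [...rows].sort((a, b) => a.id - b.id);
  return JSON.stringify(sorted);
}

// Client side: parse the embedded payload once and keep it in memory.
// A Map keyed by meaning is a stand-in for the Fuse.js index.
function loadLexicon(payload: string): Map<string, LexiconRow[]> {
  const rows = JSON.parse(payload) as LexiconRow[];
  const index = new Map<string, LexiconRow[]>();
  for (const row of rows) {
    const key = row.meaning.toLowerCase();
    const bucket = index.get(key) ?? [];
    bucket.push(row);
    index.set(key, bucket);
  }
  return index;
}

// Illustrative rows, not taken from the real lexicon.
const payload = serializeLexicon([
  { id: 2, word: "mng", meaning: "door", route: "southern" },
  { id: 1, word: "mui", meaning: "door", route: "northern" },
]);

// After this point every lookup is a plain in-memory read.
const lexiconIndex = loadLexicon(payload);
const doorVariants = lexiconIndex.get("door") ?? [];
```

The design trade-off is that the whole lexicon travels with the page, so payload size grows with the dictionary. In exchange, no request reaches the database once the site is built.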
Milestones & Impact
Because the goal of this project is cultural preservation, the primary architectural drivers were maximizing uptime and preventing cloud infrastructure costs from scaling out of control.
2026: Foundation & Launch
- Architected and deployed the initial version to the Vercel Edge Network.
- Designed the database schema in Supabase to effectively map the many-to-many relationships between root words, migration routes, and regional phonetic variations.
- Achieved the primary infrastructure goal: decoupling the PostgreSQL database from production traffic, so search queries run in memory on the client with zero database reads during active user sessions.
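The many-to-many mapping mentioned in the schema milestone can be illustrated with plain TypeScript types. This is not the actual Supabase or Drizzle schema, which is not shown here. The table names, column names, and the variantsByRoute helper are assumptions, with a junction-style PhoneticVariant row linking one root word to one migration route.

```typescript
// Hypothetical tables; names and columns are assumptions for illustration.
interface RootWord {
  id: number;
  gloss: string; // the shared concept, e.g. "door"
}

interface MigrationRoute {
  id: number;
  name: string; // e.g. "Northern" or "Southern"
}

// Junction rows: one pronunciation of one root word on one route.
interface PhoneticVariant {
  rootWordId: number;
  routeId: number;
  pronunciation: string;
}

// Group every pronunciation of a root word under its route name.
function variantsByRoute(
  rootWordId: number,
  routes: MigrationRoute[],
  variants: PhoneticVariant[],
): Record<string, string[]> {
  const routeNames = new Map<number, string>(
    routes.map((r): [number, string] => [r.id, r.name]),
  );
  const result: Record<string, string[]> = {};
  for (const v of variants) {
    if (v.rootWordId !== rootWordId) continue;
    const name = routeNames.get(v.routeId);
    if (name === undefined) continue; // skip rows pointing at unknown routes
    if (result[name] === undefined) result[name] = [];
    result[name].push(v.pronunciation);
  }
  return result;
}

// Illustrative data in simplified romanization, not the real lexicon.
const door: RootWord = { id: 1, gloss: "door" };
const routes: MigrationRoute[] = [
  { id: 1, name: "Northern" },
  { id: 2, name: "Southern" },
];
const variants: PhoneticVariant[] = [
  { rootWordId: 1, routeId: 1, pronunciation: "mui" },
  { rootWordId: 1, routeId: 2, pronunciation: "mng" },
  { rootWordId: 2, routeId: 1, pronunciation: "pui" },
];

const doorByRoute = variantsByRoute(door.id, routes, variants);
```

Keeping pronunciations in their own junction rows lets one root word carry any number of regional forms without duplicating the word itself.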