Okmain: Perceptual Color Extraction Using Oklab and K-Means

Okmain is a high-performance library designed to extract representative colors from images using K-means clustering in the Oklab color space. By prioritizing central pixels and high-chroma values, it avoids the muddy results typical of simple pixel averaging. The project, implemented in Rust with Python support, demonstrates how perceptual color science can significantly improve UI-driven color extraction.
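To make the "muddy average" problem concrete, here is a minimal sketch, not Okmain's code, that converts sRGB to Oklab with the published Oklab matrices and measures chroma. The helper names (srgb_to_linear, srgb_to_oklab, chroma) are chosen for this illustration only.

```python
import math

def srgb_to_linear(c):
    """Undo the sRGB transfer curve for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def srgb_to_oklab(r, g, b):
    """Convert an sRGB color (channels 0..255) to Oklab (L, a, b)."""
    r, g, b = (srgb_to_linear(v / 255.0) for v in (r, g, b))
    # Linear sRGB -> approximate cone responses.
    l = 0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b
    m = 0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b
    s = 0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b
    # Cube-root non-linearity (inputs are non-negative for in-gamut colors).
    l_, m_, s_ = (x ** (1.0 / 3.0) for x in (l, m, s))
    return (
        0.2104542553 * l_ + 0.7936177850 * m_ - 0.0040720468 * s_,
        1.9779984951 * l_ - 2.4285922050 * m_ + 0.4505937099 * s_,
        0.0259040371 * l_ + 0.7827717662 * m_ - 0.8086757660 * s_,
    )

def chroma(lab):
    """Colorfulness in Oklab: distance from the neutral (gray) axis."""
    return math.hypot(lab[1], lab[2])

# Two vivid pixels whose channel-wise sRGB mean is a plain mid-gray.
red, cyan = (255, 0, 0), (0, 255, 255)
naive_mean = tuple((p + q) / 2 for p, q in zip(red, cyan))

print("chroma of red:       ", chroma(srgb_to_oklab(*red)))
print("chroma of naive mean:", chroma(srgb_to_oklab(*naive_mean)))
```

The averaged pixel has essentially zero chroma even though both inputs are strongly colored, which is the failure mode that clustering is meant to avoid.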

Key Points

Resizing an image to 1x1 pixel, a common shortcut for color extraction, often produces dull, muddy results due to sRGB's non-linear nature.
Okmain utilizes K-means clustering in the Oklab color space to find distinct, perceptually accurate color groups.
Dominant colors are selected using a combination of pixel frequency, spatial centrality via a distance mask, and color saturation.
High performance comes from downsampling images to 250,000 pixels and from using data structures optimized for SIMD auto-vectorization.
LLM agents proved useful for generating auxiliary 'debug' code but were less effective at high-level architectural design and manual optimization.
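The clustering step in the key points can be sketched as a small weighted K-means (Lloyd's algorithm) over points that are already in Oklab. This is an illustrative sketch, not Okmain's implementation: the function names, the farthest-point seeding, and the fixed iteration count are all assumptions made here.

```python
def dist2(p, q):
    """Squared Euclidean distance between two 3-tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def weighted_kmeans(points, weights, k, iterations=20):
    """Cluster Oklab points; returns (centers, total weight per cluster)."""
    # Assumed seeding for this sketch: start from the first point, then
    # repeatedly add the point farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centers))
        )
    totals = [0.0] * k
    for _ in range(iterations):
        sums = [[0.0, 0.0, 0.0] for _ in range(k)]
        totals = [0.0] * k
        # Assignment step: each point joins its nearest center.
        for p, w in zip(points, weights):
            j = min(range(k), key=lambda i: dist2(p, centers[i]))
            totals[j] += w
            for axis in range(3):
                sums[j][axis] += w * p[axis]
        # Update step: move each center to the weighted mean of its points.
        centers = [
            tuple(v / t for v in s) if t > 0 else c
            for s, t, c in zip(sums, totals, centers)
        ]
    return centers, totals

# Three reddish and two bluish points, given as approximate Oklab values.
points = [
    (0.63, 0.22, 0.13), (0.62, 0.23, 0.12), (0.64, 0.21, 0.14),
    (0.45, -0.03, -0.31), (0.46, -0.04, -0.30),
]
centers, totals = weighted_kmeans(points, [1.0] * len(points), k=2)
print(centers, totals)
```

Because distances are measured in Oklab, points that look similar end up in the same cluster, and each center stays a vivid color instead of a blend of unrelated hues. The per-point weights are where a spatial mask could enter.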

Sentiment

The community is broadly positive and technically engaged. Commenters appreciate the principled use of Oklab color space and find the library genuinely useful, while offering constructive suggestions around performance, safety, and edge cases. The author's active participation strengthens the discussion. Minor friction arises around open-source contribution expectations but resolves amicably.

In Agreement

Using a perceptual color space like Oklab is the right approach, since naive sRGB averaging produces muddy, inaccurate results
The 1x1 pixel resize is a surprisingly common but poor baseline for extracting a representative color
K-means clustering is a solid, well-understood technique for this problem and the library fills a real practical need
The author's heuristics for prominence (distance mask, chroma weighting) address real shortcomings of simpler approaches
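The prominence heuristics mentioned above (a distance mask plus chroma weighting) might look roughly like the following. The formulas are hypothetical: the linear falloff, the floor value, and the chroma_boost factor are assumptions for illustration, and the source does not state the exact weighting Okmain uses.

```python
import math

def center_weight(x, y, width, height, floor=0.1):
    """Hypothetical distance mask: 1.0 at the image center, falling
    linearly to `floor` at the corners."""
    cx, cy = (width - 1) / 2, (height - 1) / 2
    d = math.hypot(x - cx, y - cy)
    d_max = math.hypot(cx, cy) or 1.0  # avoid dividing by zero for 1x1 images
    return 1.0 - (1.0 - floor) * d / d_max

def rank_clusters(centers, totals, chroma_boost=2.0):
    """Order Oklab cluster centers by mask-weighted share, boosted by chroma.

    `totals` holds the summed per-pixel mask weights of each cluster.
    """
    grand = sum(totals) or 1.0

    def score(item):
        (_, a, b), total = item
        return (total / grand) * (1.0 + chroma_boost * math.hypot(a, b))

    ranked = sorted(zip(centers, totals), key=score, reverse=True)
    return [center for center, _ in ranked]

# A slightly larger gray cluster loses to a smaller but vivid red one.
gray, red = (0.80, 0.0, 0.0), (0.63, 0.22, 0.13)
print(rank_clusters([gray, red], [0.55, 0.45]))
```

With these assumed numbers the red cluster ranks first even though gray covers more weighted area, which is the behavior the commenters credit: a large neutral background does not automatically win.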

Opposed

100 ms per image is too slow for production workloads, and the library's memory usage could be problematic with large images
The library doesn't address image decoding safety (PNG bombs, malicious inputs); it should at minimum document these risks
No heuristic can solve all edge cases: white/transparent backgrounds in product photos, adversarial compositions, and images where the most frequent color isn't the most visually meaningful one
Random or patch-based sampling might be more efficient alternatives to full downsampling, though the author explains why random sampling produces poor results in practice
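For reference, full downsampling to a fixed pixel budget can be sketched as block averaging. This is one possible approach under stated assumptions, not a description of how Okmain downsamples; the function name and the integer block factor are choices made for this example.

```python
import math

def box_downsample(pixels, width, height, max_pixels=250_000):
    """Average non-overlapping f x f blocks until at most max_pixels remain.

    pixels: row-major list of 3-tuples (channel values in any color space).
    Returns (new_pixels, new_width, new_height).
    """
    # Smallest integer block size whose output fits the pixel budget.
    f = 1
    while math.ceil(width / f) * math.ceil(height / f) > max_pixels:
        f += 1
    new_w, new_h = math.ceil(width / f), math.ceil(height / f)
    out = []
    for by in range(new_h):
        for bx in range(new_w):
            # Edge blocks are clipped to the image bounds.
            block = [
                pixels[y * width + x]
                for y in range(by * f, min((by + 1) * f, height))
                for x in range(bx * f, min((bx + 1) * f, width))
            ]
            out.append(tuple(sum(ch) / len(block) for ch in zip(*block)))
    return out, new_w, new_h

# A 4x2 image reduced to a budget of 2 pixels (block size 2).
image = [
    (0, 0, 0), (2, 2, 2), (10, 10, 10), (10, 10, 10),
    (2, 2, 2), (4, 4, 4), (10, 10, 10), (10, 10, 10),
]
small, w, h = box_downsample(image, 4, 2, max_pixels=2)
print(small, w, h)
```

Unlike random sampling, every source pixel contributes to the result, at the cost of reading the whole image once.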