RakanEmbed — BRIGHT Benchmark Results

Full evaluation on the BRIGHT benchmark, which is designed to test reasoning-intensive retrieval rather than simple keyword matching.

12 domains evaluated · Average NDCG@10: 0.524 (unweighted mean across domains)

Domain	Queries	Documents	NDCG@10	MAP@10	Recall@10	MRR
Biology	103	57,359	0.659	0.549	0.737	0.753
TheoremQA (Theorems)	76	23,839	0.645	0.566	0.791	0.644
Psychology	101	52,835	0.607	0.481	0.620	0.674
Economics	103	50,220	0.590	0.450	0.625	0.644
Earth Science	116	121,249	0.569	0.451	0.594	0.713
Pony (Programming)	112	7,894	0.554	0.200	0.281	0.680
TheoremQA (Questions)	194	188,002	0.539	0.496	0.580	0.589
Sustainable Living	108	60,792	0.533	0.424	0.565	0.628
Stack Overflow	117	107,081	0.528	0.429	0.619	0.590
Robotics	101	61,961	0.490	0.381	0.535	0.568
LeetCode	142	413,932	0.356	0.285	0.456	0.413
AoPS (Math)	111	188,002	0.220	0.149	0.223	0.380
Average	—	—	0.524	0.405	0.552	0.606
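The Average row is consistent with an unweighted (macro) mean over the 12 domains, where every domain counts equally regardless of how many queries it has. The page does not say how the averages were computed, so treat this as an inference from the numbers. The sketch below recomputes the row from the per-domain scores above; the names `SCORES`, `METRICS`, and `macro_average` are illustrative and not part of any RakanEmbed tooling.

```python
# Per-domain scores copied from the table above, in the order
# (NDCG@10, MAP@10, Recall@10, MRR).
SCORES = {
    "Biology": (0.659, 0.549, 0.737, 0.753),
    "TheoremQA (Theorems)": (0.645, 0.566, 0.791, 0.644),
    "Psychology": (0.607, 0.481, 0.620, 0.674),
    "Economics": (0.590, 0.450, 0.625, 0.644),
    "Earth Science": (0.569, 0.451, 0.594, 0.713),
    "Pony (Programming)": (0.554, 0.200, 0.281, 0.680),
    "TheoremQA (Questions)": (0.539, 0.496, 0.580, 0.589),
    "Sustainable Living": (0.533, 0.424, 0.565, 0.628),
    "Stack Overflow": (0.528, 0.429, 0.619, 0.590),
    "Robotics": (0.490, 0.381, 0.535, 0.568),
    "LeetCode": (0.356, 0.285, 0.456, 0.413),
    "AoPS (Math)": (0.220, 0.149, 0.223, 0.380),
}

METRICS = ("NDCG@10", "MAP@10", "Recall@10", "MRR")


def macro_average(scores):
    """Return the unweighted mean of each metric across domains, rounded to 3 places."""
    n = len(scores)
    return {
        metric: round(sum(row[i] for row in scores.values()) / n, 3)
        for i, metric in enumerate(METRICS)
    }


averages = macro_average(SCORES)
print(averages)
```

With the scores as listed, this reproduces the published Average row (0.524, 0.405, 0.552, 0.606). A query-weighted mean would give different values, because domains such as TheoremQA (Questions) and LeetCode contribute more queries than the others.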

Want to use RakanEmbed?

RakanEmbed is available as a hosted API. Get in touch for endpoint access, credentials, and integration support.

We train and deploy custom models tailored to your domain and data.