soruly

trace.moe is 10x-100x faster and more accurate now

Added 2025-08-30 15:40:53 +0000 UTC

I've rewritten the search algorithm and changed the database from Solr to Milvus, an in-memory vector database. This brings the search time down from about 1-10 seconds to under 0.1 seconds. This is the biggest improvement I have made since I created this search engine 10 years ago.

Search time used to fluctuate between 1 and 10s depending on traffic. Now it has dropped to a flat line around 0.1s.

I have also updated the way it calculates similarity. Previously, even images that don't look similar at all were reported as about 60% similar, which was confusing. Now, not-so-similar images get a much lower similarity, and the least similar ones are close to 0%. The threshold for a match remains at about 90% similarity.

The similarity range is more spread out, making it easier to judge whether a result is a match or just similar.

The new search algorithm also shows more results that look similar. The old algorithm with LireSolr searched only part of the database as a way to speed up the search. As shown below, only 31 million frames out of 1.5 billion were being searched. While both return the same correct scene as the top result, Milvus searches through the whole database without sacrificing accuracy for speed, so it can find more similar results.
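To illustrate why searching only part of the database can still find the correct scene but miss other similar frames, here is a toy sketch. It is not the actual LireSolr or Milvus algorithm: the random 8-dimensional vectors, the Euclidean distance, the `top_k` helper, and the fixed 2% slice standing in for the partial search are all assumptions made for illustration.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query, frames, k):
    """Return the k closest (distance, frame_id) pairs, nearest first."""
    scored = [(distance(query, vec), frame_id) for frame_id, vec in frames]
    return sorted(scored)[:k]


random.seed(0)

# Hypothetical database: 10,000 frames, each with a made-up 8-dim vector.
database = [(i, [random.random() for _ in range(8)]) for i in range(10_000)]

# Query with a frame that exists in the database, so the correct answer is known.
query = database[1234][1]

# Stand-in for a partial search: only 2% of the frames (roughly the
# 31 million / 1.5 billion ratio), chosen so it contains the correct frame.
partial = database[1100:1300]

full_results = top_k(query, database, 5)
partial_results = top_k(query, partial, 5)

print("full:   ", [(fid, round(d, 3)) for d, fid in full_results])
print("partial:", [(fid, round(d, 3)) for d, fid in partial_results])
```

Both searches return frame 1234 first, but the remaining results from the full search are at least as close to the query as those from the partial search, because the partial search never looks at the other 98% of the frames.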

Left: old results from LireSolr; Right: new results from Milvus

With this improvement, not only is it faster and more accurate, it is more stable too. Since the server takes only one tenth of the time to search, it can handle more traffic without being overloaded. Previously, it rejected hundreds or even thousands of requests per day to prevent overload; now it serves every single search request without breaking a sweat.

Lastly, I'd like to thank two friends who recommended Milvus to me: Leslie, who wrote a paper on arXiv (https://arxiv.org/html/2404.12169v1) that analyzes the trace.moe system pretty well and shares some methods for improvement; and Vankerkom, who rewrote the whole system in Rust with Milvus and showed me that it is feasible to make such a huge improvement on the same server that I upgraded earlier this year.

Milvus is an in-memory database: if there is not enough RAM, it will not run at all. Currently it takes 160GB of RAM to fit 1.5 billion frames in memory. As more anime is added to the index, the memory requirement will eventually hit the 192GB limit on this server and I'll need to upgrade again. Thanks for your support all these years, and I hope you will continue to support trace.moe.
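For a rough sense of the headroom, the memory figures above can be turned into a back-of-the-envelope estimate. This sketch assumes decimal gigabytes and that memory grows linearly with the number of frames, ignoring any fixed overhead; neither assumption is confirmed, so treat the result as an order-of-magnitude guide only.

```python
GB = 10**9  # assumption: decimal gigabytes, not GiB

current_ram = 160 * GB   # RAM currently used by the index
ram_limit = 192 * GB     # RAM limit of the server
frames = 1_500_000_000   # frames currently indexed

# Average memory per frame under the linear-scaling assumption.
bytes_per_frame = current_ram / frames

# Frames that would fit at the limit; integer math avoids rounding error.
capacity = ram_limit * frames // current_ram
headroom = capacity - frames

print(f"about {bytes_per_frame:.0f} bytes per frame")
print(f"about {capacity:,} frames fit in 192GB")
print(f"about {headroom:,} more frames before an upgrade is needed")
```

Under these assumptions, each frame costs a little over 100 bytes of RAM, and the server has room for roughly 20% more frames than it holds today.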

If you're interested in the technical details, feel free to read the source code on GitHub. If you have your own set of anime, you can run it locally in Docker too.