Like Bitly, Twitter has a great real-time data set and very smart data scientists and engineers. But instead of relying on a primarily computational solution, Twitter treats real-time search more like a CAPTCHA problem. With this kind of messy data, lots of human brains can find meaning much faster and more accurately than lots of lines of code. So Twitter uses a real-time computation system called Storm to identify search spikes, then Mechanical Turk (Amazon’s crowdsourcing online platform for small jobs) to farm out annotating that data to human beings all over the world. The annotations basically take the spiking search term and tag it for relevance and intent. A human annotator (Twitter calls them “judges”) can tell Twitter’s systems whether searches for “Stanford” refer to a university or to its football team, or that searches for “Big Bird” aren’t primarily referencing a children’s show, but a political debate. This helps Twitter make trending topics smarter and more coherent.

Source: Twitter just told us how cool its real-time search is… and how it makes its money | The Verge
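
The pipeline described above (spot a spiking search term, then hand it to human judges for tagging) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Twitter's implementation: it does not use Storm or the Mechanical Turk API, and the names `SpikeDetector`, `build_judging_task`, the thresholds, and the question wording are all hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class SpikeDetector:
    """Flags a query whose count in the current window far exceeds its recent average.

    All thresholds below are assumed values chosen for illustration.
    """
    history_windows: int = 6       # how many past windows form the baseline
    ratio_threshold: float = 3.0   # current/baseline ratio that counts as a spike
    min_count: int = 50            # ignore queries too rare to matter
    _history: dict = field(default_factory=lambda: defaultdict(deque), init=False)
    _windows_seen: int = field(default=0, init=False)

    def observe_window(self, counts):
        """Take {query: count} for one time window and return the spiking queries."""
        spikes = []
        for query, count in counts.items():
            past = self._history[query]
            baseline = sum(past) / len(past) if past else 0.0
            # Skip the very first window: with no history, everything would look new.
            is_spike = (
                self._windows_seen > 0
                and count >= self.min_count
                and count > self.ratio_threshold * max(baseline, 1.0)
            )
            if is_spike:
                spikes.append(query)
            # Simplification: a query absent from a window is not recorded as zero.
            past.append(count)
            if len(past) > self.history_windows:
                past.popleft()
        self._windows_seen += 1
        return spikes


def build_judging_task(query, sample_tweets):
    """Package a spiking query as a task for a human judge (hypothetical format)."""
    return {
        "query": query,
        "questions": [
            "Which category best fits this search (e.g. sports, politics, education)?",
            "What is the searcher most likely looking for?",
        ],
        "context": sample_tweets[:3],  # a few recent tweets to help the judge decide
    }


# Made-up counts for three consecutive windows.
detector = SpikeDetector()
detector.observe_window({"stanford": 100, "big bird": 2})
detector.observe_window({"stanford": 110, "big bird": 3})
spiking = detector.observe_window({"stanford": 120, "big bird": 900})
tasks = [build_judging_task(q, ["placeholder tweet text"]) for q in spiking]
print(spiking)
```

In this toy run only "big bird" is flagged, because its count jumps far above its own recent average while "stanford" stays roughly flat. The code only decides which terms are worth a human's attention; the relevance and intent judgment itself is left to the judges, as in the account above.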