Social Media Analytics



Client challenges

Project Scope

Our client, the seventh-largest stock exchange in the world by market capitalization, has been manually monitoring print and social media to identify suspicious news and rumors that may affect the price and/or volume of securities listed on its platform. Manual scanning at this scale is labor-intensive and fraught with challenges.


Statistical model, identifying rumors

Datametica built a statistical model by training it on 6.5 years of historical web articles, manually tagged or untagged by data scientists and business users. The model uses NLP-based artificial intelligence to identify rumors. It is enriched with 5K+ company names matched using the n-gram technique, filtering on 80 suspected and 20 rejection keywords, and integrations with 25+ websites and 300+ web sources. Enrichment is a continuous process: the company list, keywords and web links are extended incrementally to ensure maximum accuracy.
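The keyword filtration and n-gram company matching described above could look roughly like the following Python sketch. It is illustrative only and is not Datametica's implementation: the company names, the keywords and the function names (tokenize, ngrams, shortlist) are hypothetical, and the statistical model itself is not shown.

```python
import re

# Hypothetical sample data. The real lists (5K+ company names, 80 suspected
# and 20 rejection keywords) are not given in the case study.
COMPANIES = {"acme steel", "globex", "initech power"}
SUSPECT_KEYWORDS = {"rumor", "rumour", "takeover", "merger"}
REJECT_KEYWORDS = {"advertisement", "horoscope"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, max_n=3):
    """Return the set of all word n-grams of length 1..max_n, joined by spaces."""
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def shortlist(article):
    """Return the company names found in a suspicious article, else an empty set."""
    tokens = tokenize(article)
    words = set(tokens)
    # Assumption: any rejection keyword discards the article outright.
    if words & REJECT_KEYWORDS:
        return set()
    # Assumption: at least one suspected keyword is required to keep it.
    if not (words & SUSPECT_KEYWORDS):
        return set()
    # Multi-word company names are matched through the n-gram set.
    return ngrams(tokens) & COMPANIES
```

In this sketch, shortlist("Rumor: Acme Steel in takeover talks with Globex") returns both "acme steel" and "globex". An article containing a rejection keyword, or no suspected keyword, returns an empty set and would not be passed on for scoring.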

The end-to-end solution was developed on Hadoop to provide a scalable solution at lower cost.


Business impact

85% accuracy in identifying articles and 92% accuracy in avoiding false positives.

Increased investor confidence

Maintains the integrity of the stock market and helps increase brand value and investor confidence.


Processes 1,200 web articles and 200,000 tweets every day.

Time factor

A near-real-time framework that processes and shortlists suspected articles significantly reduced human intervention.

Media metrics

Improved surveillance of news articles, social media and blogs, with enhanced tagging accuracy and an alert system in place.