This article was originally featured on the United Nations Data Blog
Accurate measurement of the United Nations Sustainable Development Goals (SDGs) is critical to achieving their implementation by 2030. The logic behind this is simple – to make evidence-based decisions that will enable countries to meet the SDG targets, governments and policymakers need to have a clear view of existing conditions and be able to track any progress – or regress – that is being made.
Accurate measurement, however, requires accurate data. At present, SDG progress on a per-country basis is primarily determined by traditional surveys. This is a capital-, resource- and time-intensive exercise which can delay the delivery of findings and strategic intervention. There is also the issue of sample size in relation to population, meaning that the findings may not be statistically representative across individual targets.
Social media offers a vast untapped data pool which can be structured and analysed in real time, delivering immediate feedback to stakeholders around key themes driving negative and positive interactions with initiatives and policies. Given the ubiquity of these platforms, they offer an attractive alternative to traditional market research and polling.
This raises the question – why isn’t social media sentiment analysis more commonly used as a tool to track SDG progress? To answer this, we need to unpack the concept and current capabilities of sentiment analysis before considering the ways in which this methodology could aid in the monitoring and benchmarking of SDG progress.
What is sentiment analysis?
Simply put, sentiment analysis is the science of understanding how people really feel. It relies on artificial intelligence (AI) and analytical techniques to extract data around emotion and opinion from large volumes of text.
Both private and public sector entities are interested in sentiment analysis for a variety of reasons. A multinational might want to collect sentiment data for the purposes of market research or to better understand customers’ experiences. Governments and political parties might want to get a read on the public’s feelings towards a proposed policy or election candidate.
While social media dialogue often forms the basis of sentiment analysis, its unstructured nature means that social media conversations typically contain a lot of irrelevant data and local vernacular. The value of this unstructured data lies in the ability to convert it into decipherable and valuable insights at scale.
How accurate is AI when it comes to sentiment analysis?
As sophisticated as AI has become, the complexity of human communication still confounds machines. The result is that sentiment analysis driven purely by machine learning can often deliver inaccurate results.
Irony, sarcasm, figurative language, slang and localised idiosyncrasies pose interpretive problems for machines. So too can complex or meandering sentences, and typing or grammatical errors, which are ever-present in social media conversations.
One solution is to layer an element of human insight over the analytical work performed by machines. In other words, get real people – a crowd – to refine the work done by AI. Human beings are not only better at dealing with sarcasm and local references; they are also better at performing the granular topic-level analysis required for datasets as diverse as the SDGs.
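One way to picture this layering is a simple majority vote that overrides the machine's label only when enough crowd members agree. The sketch below is illustrative only: the function names and the 60% agreement threshold are assumptions, not details of any particular system described in this article.

```python
from collections import Counter


def crowd_label(votes, min_agreement=0.6):
    """Return the majority label if enough raters agree, otherwise None.

    `min_agreement` is an assumed threshold chosen for illustration.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else None


def final_sentiment(machine_label, votes):
    """Prefer the crowd's consensus; fall back to the machine's label."""
    consensus = crowd_label(votes)
    return consensus if consensus is not None else machine_label
```

For example, if the machine reads a sarcastic post as positive but two of three raters mark it negative, `final_sentiment("positive", ["negative", "negative", "positive"])` returns the crowd's label. A real deployment would likely also weight raters by their track record, which this sketch does not attempt.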
Can sentiment analysis be employed to track SDG progress?
By using different language crowds, relevant country-specific social media data can be structured to showcase SDG progress and highlight any impediments for intervention. A measurement framework can be created based on this methodology which places real-time social media data at the heart of understanding performance by governments and corporations against the 17 SDGs.
To understand how the raw social media data would be structured, it is easiest to follow a social media post on its journey through a sentiment analysis system. The example below uses a tweet about the recent riots in Durban, South Africa.
Step One: Determine sentiment
When a tweet is posted that matches the data collection criteria, AI will check that it contains sentiment and is relevant to the SDGs. From there, the mention will be processed by an English-language crowd, which rates its sentiment as positive, negative or neutral.
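The machine check in this step can be thought of as a gate that decides which mentions are worth sending to the crowd. The sketch below is a deliberately crude stand-in that uses keyword matching; the keyword lists, the function names and the sample text in the usage note are all invented for illustration, and a production system would presumably rely on trained models instead.

```python
import re

# Hypothetical keyword lists, keyed by SDG number (assumptions for this sketch).
SDG_KEYWORDS = {
    1: {"poverty", "poor"},
    8: {"unemployment", "jobs"},
    16: {"riots", "violence", "looting"},
}
# Hypothetical words taken as a sign that a post expresses an opinion.
OPINION_WORDS = {"terrible", "great", "afraid", "proud", "angry", "hope"}


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def matched_sdgs(text):
    """Return the sorted SDG numbers whose keywords appear in the text."""
    tokens = tokenize(text)
    return sorted(sdg for sdg, words in SDG_KEYWORDS.items() if tokens & words)


def needs_crowd_review(text):
    """A mention goes to the crowd only if it is SDG-relevant and opinionated."""
    return bool(matched_sdgs(text)) and bool(tokenize(text) & OPINION_WORDS)
```

With these assumed lists, an invented post such as "Angry about the looting and unemployment in Durban" would match SDGs 8 and 16 and be queued for crowd rating, while a purely factual or off-topic post would be dropped before it reaches the crowd.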
Step Two: Topic analysis
Step Three: SDG benchmarking
After sentiment and topic verification is complete for an individual mention, it is returned to the broader data pool of mentions related to the specific SDG. Using customisable dashboards, this verified dataset can be filtered to identify trends based on targets, locations and priority areas for intervention and analysis.
In this process, all data verification happens in real time, with the highly accurate, fully structured data being presented in live dashboards within five minutes. The ability to use crowds to do this work at scale with high degrees of accuracy and low latency means that sentiment analysis can be used as a valuable tool in the monitoring and benchmarking of SDG progress.
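The kind of filtering a dashboard performs over the verified dataset in Step Three can be sketched as a small aggregation. The record fields (`sdg`, `location`, `sentiment`) and the net sentiment score (percentage of positive mentions minus percentage of negative mentions) are assumptions made for this example rather than a description of any specific product.

```python
from collections import defaultdict


def net_sentiment(mentions, sdg=None, location=None):
    """Net sentiment (% positive minus % negative) per (sdg, location) group.

    `mentions` is a list of dicts with the assumed keys
    "sdg", "location" and "sentiment". Optional filters narrow the result.
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
    for m in mentions:
        if sdg is not None and m["sdg"] != sdg:
            continue
        if location is not None and m["location"] != location:
            continue
        counts[(m["sdg"], m["location"])][m["sentiment"]] += 1

    result = {}
    for key, c in counts.items():
        total = sum(c.values())
        result[key] = round(100 * (c["positive"] - c["negative"]) / total, 1)
    return result
```

Calling `net_sentiment(mentions, sdg=16)` would return one score per location for that goal, so a strongly negative score for a given city could flag it as a priority area for intervention. Neutral mentions count towards the total but not the score, which is one of several reasonable design choices.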
Sentiment analysis use cases
Sentiment analysis can be employed by policymakers, responsible departments, monitoring agencies and local delivery units to:
- Understand location-specific sentiment towards overall SDGs, individual targets and initiatives, organisations and policies;
- Identify key areas of improvement based on past performance to develop KPIs and measure SDG progress;
- Identify new and existing projects which have high levels of negative sentiment in order to mitigate risk or address issues;
- Identify key entities that are receiving or driving high levels of engagement to develop communications best practice or strategies to effect behavioural change;
- Identify key organisations achieving success around specific targets and initiatives to build cross-industry partnerships to increase awareness and uptake;
- Benchmark SDG progress of specific locations against desired cities or countries;
- Gather unsolicited feedback to inform survey design for other research methodologies to increase data coverage across the population; and
- Report back to stakeholders and boost accountability on the achievement of specific targets.