A researcher at The University of Texas at Dallas has developed a mathematical model that uses events described in news articles to more accurately predict the prices of natural resources and raw materials.
The prices of commodities, which include agricultural products, energy products and metals, are driven by events such as tax increases, strikes, natural disasters and wars. These prices fluctuate significantly, often without any discernible pattern.
“Essentially, any kind of shock to the system can impact players across the supply chain, starting from the farmers all the way to the end consumer,” said Dr. Ashwin Venkataraman, assistant professor of operations management in the Naveen Jindal School of Management and corresponding author of the study published in a recent issue of Manufacturing & Service Operations Management.
“That makes the people and businesses working with these commodities especially susceptible to risk,” he said. “Given commodities prices’ extreme volatility, a new methodology for forecasting was necessary to make better informed decisions.”
The standard way of predicting a commodity’s price is to use historical data to forecast what’s going to happen, Venkataraman said.
“But commodities are special because they are heavily impacted by external factors, not just how they’ve behaved in the past,” he said. “These factors change so rapidly, and different commodities are impacted by different events.”
To address this issue, Venkataraman and his colleagues proposed a method for forecasting prices that automatically extracts events from news articles and combines this information with historical price data using a predictive model.
In their study, the researchers used the framework to forecast prices of four essential staple crops in India — onions, potatoes, rice and wheat — using events reported in 1.6 million news articles published in a major Indian newspaper between 2006 and 2020.
Venkataraman said news articles follow a specific structure — headlines report main events, lead paragraphs summarize important details, and the remainder of the text provides more depth.
Their model captures this structure by identifying “event triggers,” or important words and phrases signaling the type and occurrence of events driving price fluctuations. The researchers found that words such as “hoarding,” “festivals,” “protest” and “hike,” as in price hike, commonly signaled the underlying events.
Venkataraman said that although some existing forecast techniques incorporate information from news articles, they often fail to distinguish which words in an article’s text are important because they process the article as a generic document. Those forecasts are less accurate because the non-trigger words do not contain relevant signals; they introduce “noise” and skew the significance of the trigger words.
“It’s very hard to leverage this information effectively because there’s a lot of noise,” Venkataraman said. “So many words don’t add value, and it’s important to be able to extract them in the right way.”
By leveraging the news articles’ structure, their new model outperformed several benchmark models by as much as 13%. Venkataraman said the model also can be used to identify the key events driving price fluctuations, lending interpretability to the forecasts. This information can be invaluable in mitigating the complex and often long-term effects of price shocks.
Venkataraman said one shortcoming of other predictive pricing models is their use of predetermined events, or structured data.
“If you decide on preexisting factors, there’s no guarantee the actual underlying event driving the price has been captured,” he said. “Structured data approaches miss the subtleties of multiple important factors playing different roles at different times for the same commodity.”
Venkataraman said their method can help address a lack of reliable and actionable data available to people and businesses across various industries’ supply chains. For example, investors managing stock portfolios, farmers choosing when to sell a crop, and policymakers designing subsidies to protect citizens from inflation can all benefit from the event information and forecasts generated by their method.
“Ideally, these forecasts offer insights enough in advance that you have time to take the appropriate precautionary actions and minimize the impact of these shocks,” he said.
Although the researchers focused on extracting data from news articles from a single source, Venkataraman said other sources, such as blogs or posts on the social media platform X (formerly Twitter), could be used to inform predictions.
Other contributors to the study include senior authors Dr. Sunandan Chakraborty of Indiana University Indianapolis, and Dr. Srikanth Jagabathula and Dr. Lakshminarayanan Subramanian of New York University.