Data Analysis

LLMs can help locate data sources, format data, extract data from text, classify and score text, create figures, extract sentiment, and even simulate human test subjects. Most of these capabilities can be accessed not only through a web interface, as shown in the demonstrations below, but also via an API (Application Programming Interface) that allows large amounts of data to be formatted, extracted, classified, and so on. The operations can also be performed in batches to remain within the token limit for each request. Moreover, building on the section on coding, LLMs can write the computer code necessary to access their own APIs; for example, try "Write python code to ask GPT-4 to do [any data extraction or manipulation task]".
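To make the batching idea concrete, the following is a minimal sketch of calling the OpenAI API on a list of records in batches (using the pre-1.0 openai-python interface; the model name, batch size, and task instruction are illustrative assumptions, not code from the paper):

```python
# Minimal batching sketch: send records to the model a few at a time so that
# each request stays within the token limit. All specifics are illustrative.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

TASK_INSTRUCTION = "Extract the company names mentioned in the following records:"

def run_task(text: str) -> str:
    """Send one extraction/manipulation request to the model."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": TASK_INSTRUCTION + "\n" + text}],
    )
    return response["choices"][0]["message"]["content"]

records = ["record 1 ...", "record 2 ...", "record 3 ..."]  # toy data
batch_size = 2  # tune so each batch stays within the token limit
results = []
for i in range(0, len(records), batch_size):
    batch = "\n".join(records[i:i + batch_size])
    results.append(run_task(batch))
```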

When performing data analysis tasks in bulk, cost is an important consideration. Although a single prompt to a cutting-edge LLM costs just fractions of a cent, the cost of performing thousands or millions of queries quickly adds up. For many of the tasks described below, smaller and cheaper models are available. In those cases, it is not advisable to use the most cutting-edge LLM.

Creating figures

One of the most useful functions of ChatGPT for economists is Advanced Data Analysis, which employs the coding capabilities of GPT-4 to create versatile figures and graphs.
In the following example, I uploaded a file that contained stock market prices for three large technology companies and instructed ChatGPT Advanced Data Analysis to create one graph displaying stock performance labeled with the corresponding betas and another graph displaying portfolio weights.*

*To compile the underlying stock market data, I asked ChatGPT to write a script to download the data, as described in the Online Appendix of the paper.
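A minimal sketch of such a script, assuming the third-party yfinance and matplotlib packages; the tickers, date range, and use of SPY as the market proxy are illustrative assumptions, not the paper's choices:

```python
# Sketch: download daily prices for three large technology stocks, estimate
# each stock's beta against a market index, and plot cumulative performance
# labeled with the betas. All specifics are illustrative.
import yfinance as yf
import matplotlib.pyplot as plt

tickers = ["AAPL", "MSFT", "GOOG"]  # illustrative choices
prices = yf.download(tickers + ["SPY"], start="2020-01-01",
                     end="2023-01-01", auto_adjust=False)["Adj Close"]
returns = prices.pct_change().dropna()

# beta = Cov(stock returns, market returns) / Var(market returns)
betas = {t: returns[t].cov(returns["SPY"]) / returns["SPY"].var()
         for t in tickers}

cumulative = (1 + returns[tickers]).cumprod()  # growth of $1 invested
for t in tickers:
    plt.plot(cumulative.index, cumulative[t],
             label=f"{t} (beta = {betas[t]:.2f})")
plt.ylabel("Growth of $1 invested")
plt.title("Stock performance with estimated betas")
plt.legend()
plt.show()
```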




Extracting data from text




Other applications of extracting data from text include numerical data, e.g., stock prices from news articles or dosage information from drug databases. When I prompted the LLM with "Can you provide examples of what kinds of numbers you can extract from text?" it listed the following ten types: phone numbers, zip codes, social security numbers, credit card numbers, bank account numbers, dates, times, prices, percentages, and measurements (length, weight, etc.). The process can be automated for large quantities of data using API access and can typically be performed with smaller and cheaper models than GPT-4. Dunn et al. (2022)* show how to use LLMs for structured information extraction tasks from scientific texts. This can also be used in economics, for example, for entity recognition in economic history research.

*Dunn, A., Dagdelen, J., Walker, N., Lee, S., Rosen, A. S., Ceder, G., Persson, K., and Jain, A. (2022). Structured information extraction from complex scientific text with fine-tuned large language models. arXiv:2212.05238.
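As a concrete illustration, a hedged sketch of automating price extraction through the API with a smaller model (the model name, prompt wording, and sample text are illustrative assumptions):

```python
# Sketch: extract all prices from a piece of text via the API (pre-1.0
# openai-python interface). A smaller, cheaper model than GPT-4 typically
# suffices for this kind of task. Specifics are illustrative.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def extract_prices(text: str) -> str:
    """Ask the model to return all prices in the text as a list."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # cheaper-model assumption, not the paper's choice
        messages=[{"role": "user",
                   "content": "Extract all prices mentioned in the following "
                              "text as a comma-separated list:\n\n" + text}],
        temperature=0,  # deterministic output is preferable for extraction
    )
    return response["choices"][0]["message"]["content"]

print(extract_prices("Shares of AcmeCorp rose 3% to $142.50 after the firm "
                     "cut the price of its flagship device from $999 to $899."))
```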

Reformatting data





Classifying and scoring text

Social science research frequently employs statistical techniques to represent text as data (Gentzkow et al., 2019).* Modern LLMs can go beyond traditional techniques for this because they are increasingly capable of processing the meaning of the sentences that they are fed.

*Gentzkow, M., Kelly, B. T., and Taddy, M. (2019). Text as data. Journal of Economic Literature, 57(3):535-74.

The following example asks GPT-4 to classify whether a given task listed in the US Department of Labor's Occupational Information Network (O*NET) database is easy or hard to automate and to justify its classification.* Following the principle of chain-of-thought prompting suggested by Wei et al. (2022b),** the prompt asks first for the justification in order to induce the LLM to reason about its response before performing the actual classification. This is akin to asking a student to think before they respond to a question.

*Eloundou et al. (2023) employ GPT-4 in this manner to systematically estimate the labor market impact of LLMs. Eloundou, T., Manning, S., Mishkin, P., and Rock, D. (2023). GPTs are GPTs: An early look at the labor market impact potential of large language models. arXiv:2303.10130.

**Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., and Zhou, D. (2022b). Chain-of-thought prompting elicits reasoning in large language models. arXiv:2201.11903.
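A hedged sketch of what such a justify-then-classify prompt might look like (the O*NET-style task, prompt wording, and interface details are illustrative assumptions, not the paper's exact prompt):

```python
# Sketch of chain-of-thought classification: ask for the reasoning first,
# then a final EASY/HARD label, so the justification precedes the answer.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

task = ("Schedule appointments, maintain calendars, and arrange travel "
        "for executives.")  # illustrative O*NET-style task
prompt = (
    "Consider the following occupational task:\n"
    f"'{task}'\n\n"
    "First, explain step by step which aspects of this task could or could "
    "not be automated with current technology. Then, on a final line, "
    "classify the task as EASY or HARD to automate."
)
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response["choices"][0]["message"]["content"])
```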



Extracting sentiment




I also explored whether the LLM could identify which of the December 2022 and February 2023 FOMC statements was more hawkish, but its ability to assess Fed-speak was not quite nuanced enough: it focused mainly on the level of interest rates in February 2023 being higher rather than on the small but telling changes in the text of the statement that indicated a potential change in direction. It did so even when I explicitly instructed it to report its assessment while "disregarding the target level for the federal funds rate." Only when I manually replaced the numbers for the target level with "[range]" did the system correctly replicate the assessment that the February 2023 statement was slightly more dovish, as was widely reported in the financial press at the time.* Ardekani et al. (2023)** develop an economic sentiment prediction model along similar lines and employ it to analyze US economic news and the ECB's monetary policy announcements.

*See, for example, https://www.cnbc.com/2023/02/01/live-updates-fed-rate-hike-february.html

**Ardekani, A. M., Bertz, J., Dowling, M. M., and Long, S. (2023). EconSentGPT: a universal economic sentiment engine? SSRN Working Paper.
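A sketch of automating the masking step that made the comparison work; the regex, sample sentence, and suggested prompt wording are illustrative assumptions:

```python
# Sketch: mask the federal funds target range so the model compares the
# language of two FOMC statements rather than interest-rate levels.
import re

def mask_target_range(statement: str) -> str:
    """Replace ranges like '4-1/2 to 4-3/4 percent' with '[range]'."""
    return re.sub(r"\d[\d\-/ ]*to \d[\d\-/ ]*percent", "[range]", statement)

example = ("The Committee decided to raise the target range for the federal "
           "funds rate to 4-1/2 to 4-3/4 percent.")
print(mask_target_range(example))
# -> ... raise the target range for the federal funds rate to [range].

# The masked December 2022 and February 2023 statements can then be sent to
# the LLM with a question such as: "Which of these two statements is more
# hawkish? Focus on changes in wording, not on interest-rate levels."
```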

Simulating human subjects

Argyle et al. (2022)* propose the use of LLMs to simulate human subjects, based on the observation that the training data of LLMs contains a large amount of information about humanity. They condition GPT-3 on the socio-demographic backstories of real humans and demonstrate that subsequent answers to survey questions are highly correlated with the actual responses of humans with the described backgrounds, in a nuanced and multifaceted manner. Horton (2022)** showcases applications to economics, using simulated test subjects to replicate and extend several behavioral experiments.

*Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J., Rytting, C., and Wingate, D. (2022). Out of one, many: Using language models to simulate human samples. arXiv:2209.06899.

**Horton, J. J. (2022). Large language models as simulated economic agents: What can we learn from homo silicus? NBER Working Paper 31122.

The following example illustrates the concept:
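A minimal sketch of how such a simulation might be set up (the backstory, survey question, model choice, and use of the chat interface are invented for illustration and are not taken from the cited papers):

```python
# Sketch: condition the model on a socio-demographic backstory, then pose a
# survey question. Backstory, question, and model choice are illustrative.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

backstory = ("I am a 45-year-old high school teacher from Ohio. I am married, "
             "have two children, and describe myself as a political moderate.")
question = "On a scale from 1 to 5, how concerned are you about inflation, and why?"

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "Answer the survey question in the first person, staying "
                    "consistent with the following backstory: " + backstory},
        {"role": "user", "content": question},
    ],
)
print(response["choices"][0]["message"]["content"])
```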

There is a significant risk that the simulated results simply propagate false stereotypes, so they must be used with great care. However, they also contain valuable information. If used correctly, they can provide useful insights about our society, from which all the data used to train the LLMs ultimately originate. For experimental economists who prefer to keep to human subjects, Charness et al. (2023)* describe how LLMs can help improve the design and implementation of experiments.

*Charness, G., Jabarian, B., and List, J. A. (2023). Generation next: Experimentation with AI. Working Paper, University of Chicago.

From: Generative AI for Economic Research: Use Cases and Implications for Economists
by Anton Korinek, Journal of Economic Literature, Vol. 61, No. 4, December 2023.
Copyright (c) by American Economic Association. Reproduced with permission.