The ability of general-purpose LLMs to perform mathematical
derivations is growing but still limited at this point.
The reasons for the relatively poor performance include that the training
data for current LLMs is mostly text-based and includes comparatively
little math, and that math requires high-level abstractions that are
still difficult for current systems. Frieder et al. (2023,
arXiv:2301.13867) develop a dataset of graduate-level mathematical
questions and show that GPT-3.5's mathematical abilities are
significantly below those of an average mathematics graduate student,
perhaps providing some solace for the short term. However, Noorbakhsh
et al. (2021, arXiv:2110.03501) show that LLMs can also be fine-tuned
for mathematical tasks. Moreover, there have been noticeable
performance gains in math going from GPT-3.5 to GPT-4, as documented,
e.g., by Bubeck et al. (2023, arXiv:2303.12712). In addition, datasets
like the one created by Frieder et al. (2023) will certainly be useful
for making future LLMs better at math. This is an area in which further
progress would be very valuable for researchers.

The following prompt is an example of how LLMs can be useful
for setting up economic models. I prompted the LLM to generate LaTeX
code that I could directly paste into my editor, generating the results
shown below.

The optimization problem of a consumer with CES (Constant Elasticity
of Substitution) preferences over two goods can be set up as follows
in LaTeX code:
In this problem, the consumer is trying to maximize their utility
function, which is represented by a CES function. The variables $x_1$
and $x_2$ represent the quantities of the two goods that the consumer
chooses. The parameters $\alpha$ and $\rho$ are the CES preference
parameters, $p_1$ and $p_2$ are the prices of the two goods,
and $M$ is the consumer's income. The constraint represents the consumer's
budget constraint.
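
For reference, a standard formulation consistent with this description is sketched below. It assumes the common two-good CES parameterization with share parameter $\alpha$ and substitution parameter $\rho$; the LLM's actual output may have used a different but equivalent form. The `align*` environment requires the `amsmath` package.

```latex
% Assumed parameterization (illustrative, not the LLM's verbatim output):
% share parameter \alpha in (0,1), substitution parameter \rho < 1, \rho \neq 0.
\begin{align*}
\max_{x_1,\,x_2}\quad & \left(\alpha x_1^{\rho} + (1-\alpha)\, x_2^{\rho}\right)^{1/\rho} \\
\text{s.t.}\quad & p_1 x_1 + p_2 x_2 = M
\end{align*}
```

Under this parameterization, the elasticity of substitution between the two goods is $1/(1-\rho)$.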

At the time of writing, cutting-edge LLMs were capable of
generating the type of model setups that are commonly used in, for
example, undergraduate problem sets. This is useful because the results
appear in seconds and save time typing.

The solution will depend on the specific values of the parameters and the income and prices.

So far, the first-order conditions obtained are correct.

**(Author's note: the remainder of the solution is correct.)**

As the example illustrates, LLMs still routinely make
mistakes in mathematical derivations. I entered "When you divided
the two equations, you forgot the alpha terms." The LLM proceeded
to re-derive the results, corrected the mistake, but made a mistake
in the following line. This underlines that current LLMs do not possess
the high-level abstract representations necessary to perform math
at a consistent level. They can serve, at best, as a writing assistant
that quickly types lots of math but needs very careful oversight and
potential editing.
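
To see where the alpha terms enter, the relevant step can be sketched as follows. The sketch assumes the utility function $(\alpha x_1^{\rho} + (1-\alpha) x_2^{\rho})^{1/\rho}$, a budget constraint $p_1 x_1 + p_2 x_2 = M$, and a Lagrange multiplier $\mu$; these are illustrative assumptions rather than the LLM's exact output.

```latex
% Shorthand (assumption for brevity): A \equiv \alpha x_1^{\rho} + (1-\alpha) x_2^{\rho}
\begin{align*}
\mathcal{L} &= A^{1/\rho} + \mu\left(M - p_1 x_1 - p_2 x_2\right) \\
% First-order conditions with respect to x_1 and x_2
\frac{\partial \mathcal{L}}{\partial x_1} = 0
  &\;\Longrightarrow\; \alpha\, x_1^{\rho-1} A^{1/\rho-1} = \mu\, p_1 \\
\frac{\partial \mathcal{L}}{\partial x_2} = 0
  &\;\Longrightarrow\; (1-\alpha)\, x_2^{\rho-1} A^{1/\rho-1} = \mu\, p_2 \\
% Dividing the first condition by the second: A and \mu cancel, \alpha does not
\frac{\alpha}{1-\alpha}\left(\frac{x_1}{x_2}\right)^{\rho-1} &= \frac{p_1}{p_2}
  \;\Longrightarrow\;
  \frac{x_1}{x_2} = \left(\frac{\alpha\, p_2}{(1-\alpha)\, p_1}\right)^{\frac{1}{1-\rho}}
\end{align*}
```

Dropping the factor $\alpha/(1-\alpha)$ in the last line would give $x_1/x_2 = (p_2/p_1)^{1/(1-\rho)}$, which is correct only in the special case $\alpha = 1/2$.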

Current LLMs also have some ability to explain simple models.
This may be useful — but also risky — for students. In the following
example, I pasted LaTeX code into the LLM and asked it to explain
the underlying model and the steps in deriving a solution:

This is a model of optimal consumption and capital accumulation,
often referred to as the Ramsey-Cass-Koopmans model in economics.
It is a continuous-time version of the neoclassical growth model,
where a representative agent chooses consumption and capital accumulation
paths to maximize lifetime utility.

The agent's utility is discounted over time at rate $\rho$, and depends on consumption $C(t)$ according to the utility function $U(C(t))$. The agent's capital stock $K(t)$ evolves according to the law of motion given by the differential equation, where $F(K(t),L)$ is the production function, $\delta$ is the depreciation rate of capital, and $C(t)$ is consumption.

The Hamiltonian function, denoted by $H(t,K(t),C(t),\lambda(t))$, is a tool used in optimal control theory to solve this kind of dynamic optimization problem. It is a function of time $t$, state variable $K(t)$, control variable $C(t)$, and costate variable $\lambda(t)$, which is the shadow price of capital.

The first-order conditions for the optimal control problem are derived by taking derivatives of the Hamiltonian with respect to the control **and state variables, and setting them equal to zero**. These conditions characterize the optimal paths of consumption and capital over time.

The first condition $U'(C(t))=\lambda(t)$ implies that the marginal utility of consumption is equal to the shadow price of capital. This means that the agent is indifferent between consuming an additional unit of output and saving it to increase the capital stock.

The second condition, the costate equation, describes the evolution of the shadow price of capital over time. It says that the rate of change of the shadow price is equal to the negative of the shadow price times the marginal product of capital net of depreciation. This reflects the trade-off between consuming today and saving for future consumption.

The transversality condition, $\lambda(t)K(t)=0$ as $t$ goes to infinity, ensures that the agent does not accumulate an infinite amount of capital, which would not be optimal because of the assumption of diminishing marginal utility of consumption. This condition rules out paths where the agent accumulates capital indefinitely and never consumes it.

The LLM correctly identified the model that was described
in the math and explained the derivations mostly correctly, except
for one small point, which I highlighted in bold: the LLM incorrectly
suggested that the derivative of the Hamiltonian with respect to state
variables be set to zero. Since most of the explanation is useful,
it is likely that many students will resort to LLMs to tutor them
on examples like the one presented. Yet this comes with the risk that
part of the information will be incorrect — the LLM is not an infallible
tutor.
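
For reference, the conditions in question can be sketched as follows. The sketch assumes a current-value Hamiltonian and the law of motion $\dot{K}(t) = F(K(t),L) - \delta K(t) - C(t)$; the LaTeX code supplied to the LLM is not reproduced here, so the exact formulation is an assumption.

```latex
% Assumed current-value formulation (illustrative; the original input may differ)
\begin{align*}
H &= U(C(t)) + \lambda(t)\left[F(K(t),L) - \delta K(t) - C(t)\right] \\
% Control variable: the derivative IS set to zero
\frac{\partial H}{\partial C} &= 0
  \;\Longrightarrow\; U'(C(t)) = \lambda(t) \\
% State variable: the derivative is NOT set to zero; it drives the costate
\dot{\lambda}(t) &= \rho\,\lambda(t) - \frac{\partial H}{\partial K}
  = \lambda(t)\left[\rho + \delta - F_K(K(t),L)\right]
\end{align*}
```

Only the derivative with respect to the control $C(t)$ is set to zero; the derivative with respect to the state $K(t)$ instead determines how the costate $\lambda(t)$ evolves. In a present-value formulation, the discount factor $e^{-\rho t}$ appears inside $H$ and the costate equation reads $\dot{\lambda}(t) = -\partial H/\partial K$.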

From: Generative AI for Economic Research: Use Cases and Implications for Economists

by Anton Korinek, Journal of Economic Literature, Vol. 61, No. 4, December 2023.

Copyright (c) by American Economic Association. Reproduced with permission.