Food for Regression: Using Sales Data to Identify Price Elasticity

Daniel Lüttgau Blog, Data Science

A few hundred meters from our office, there is a little lunch place. It is part of a small chain that specializes in assemble-yourself, ready-to-eat salads. When we moved into our new office a few years ago, this salad vendor quickly became a daily fixture. Over time, however, this changed. We still eat there regularly, but I am certain that if one were to look at their STATWORX-related turnover, the trend would not delight management. The question is: why?
The answer has a lot to do with the arrival of new competitors, improved cooking skills, elaborate promotions and certainly also pricing. It is the last of these, pricing, that will be at the center of this series.

When analyzing pricing-related issues, it is essential to have a measure of how a change in price affects demand. The measure generally agreed upon by economists to describe this relationship is the price elasticity of demand, \epsilon. As a relative measure, it is unit independent, which makes it easy to compare across products and markets. Elasticity is defined as the percentage change in quantity divided by the percentage change in price:

    \[\epsilon = \frac{\Delta qty/qty}{\Delta price/price}\]

Conceptually, three conditions are commonly distinguished. Elasticities below -1 indicate 'elastic demand': if one increases the price by one percent, the quantity demanded decreases by more than one percent. Elasticities between -1 and 0 indicate 'inelastic demand': the quantity demanded decreases by less than one percent. An elasticity of exactly -1 is called 'unit elastic demand.'
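To make the definition and the three cases concrete, here is a minimal Python sketch. The function names and the example numbers (a price increase from 8.00€ to 8.80€ that lowers daily sales from 100 to 85 salads) are hypothetical and only serve as an illustration.

```python
import math


def price_elasticity(qty_old, qty_new, price_old, price_new):
    """Percentage change in quantity divided by percentage change in price."""
    pct_qty = (qty_new - qty_old) / qty_old
    pct_price = (price_new - price_old) / price_old
    return pct_qty / pct_price


def classify(elasticity):
    """Map an elasticity value to one of the three textbook cases."""
    if math.isclose(elasticity, -1.0):
        return "unit elastic"
    if elasticity < -1.0:
        return "elastic"
    return "inelastic"


# Hypothetical example: price +10%, quantity -15%
e = price_elasticity(100, 85, 8.00, 8.80)
print(round(e, 2), classify(e))
```

A 10% price increase that costs 15% of sales corresponds to an elasticity of about -1.5, which falls into the 'elastic' case.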

Being able to deduce the actual price elasticity of demand for our salad bar would be of great help. With a reliable elasticity estimate at hand, we can answer questions like: How many salads can we expect to sell at a given price? How does a price change of 10% affect demand? With this knowledge, it would be possible to use pricing as a tool to target different salad-business KPIs. Eventually, the salad bar can adjust its price in order to maximize profit or to increase sales, depending on its strategic objectives.

It is the intricacies of deriving these price elasticities of demand with regressions that will be the subject of this short blog post. The situation of our salad vendor makes for an illustrative case of how we can calculate price elasticities and how they can be used to adjust one's pricing strategy.

Setup

To be upfront: although this salad vendor exists, and is in fact an integral part of the STATWORX food chain, all the data we work with is made up. It describes how I imagine this little salad vendor's market situation to be. With each new post, as we examine more complex issues, we will delve deeper into the intricacies of the salad vendor's world.

The question of this blog post is simple: How can we use linear regression to derive price elasticities? To explore this, we need historic prices and sales information. To begin, there will be no consideration of competition, no in-store alternatives, no new promotional activities, no seasonal effects, or anything else.

Daily sales prices for the past two years were simulated for our little salad bar by randomly selecting prices between 5.59€ and 9.99€. This is clearly not a great pricing strategy, but it suffices for this post's purposes. A multiplicative demand function was used to derive sales, with some randomness added. And with that we are done simulating the data. For more details, check out the code on our GitHub page.
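The following Python sketch shows what such a simulation could look like. It is not the code from the GitHub page: the function name, the scale factor of 2500, the lognormal noise with a standard deviation of 0.05 and the seed are all assumptions chosen for illustration. Only the price range and the generating elasticity of -1.5 come from this post.

```python
import random


def simulate_sales(n_days=730, elasticity=-1.5, scale=2500.0,
                   noise_sd=0.05, seed=42):
    """Simulate (price, quantity) pairs from a multiplicative demand function.

    quantity = scale * price ** elasticity * noise
    scale, noise_sd and seed are assumed values, not taken from the post.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_days):
        # Random daily price between 5.59 and 9.99, rounded to cents
        price = round(rng.uniform(5.59, 9.99), 2)
        # Multiplicative noise centered around 1
        noise = rng.lognormvariate(0.0, noise_sd)
        quantity = scale * price ** elasticity * noise
        data.append((price, quantity))
    return data


data = simulate_sales()
print(len(data), data[0])
```

Because the noise enters multiplicatively, taking logs of both sides turns this demand function into a linear relationship between log(quantity) and log(price).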

Calculating Elasticity of Demand

We want to know how a linear regression function relates to elasticity. It turns out that this depends on how the variables have been transformed. It is possible to deduce elasticity, a measure of relative change, in almost any situation. Here are the four most common transformations.

| Transformation | Function | Elasticity |
|---|---|---|
| Level-Level | Y = a+bX | \epsilon=b*\frac{X}{Y} |
| Log-Level | log(Y) = a+bX | \epsilon=b*X |
| Level-Log | Y = a+b*log(X) | \epsilon=\frac{b}{Y} |
| Log-Log | log(Y) = a+b*log(X) | \epsilon=b |

Depending on the pre-regression variable transformation, different post-regression transformations are necessary in order to derive the elasticity. The table above shows that in the case of a log-log model, the elasticity is a constant value across the entire demand curve, while in all other cases it depends on the specific current price and/or demand. This means that the choice of model implies an assumed demand curve. Choosing wrongly results in a misspecified model.
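The four post-regression transformations translate directly into code. The Python sketch below is illustrative only: the function names are hypothetical, the coefficients in the usage example are the rounded estimates reported in the results table of this post, and the evaluation price of 7.30€ is an arbitrary choice.

```python
import math


def elasticity_level_level(b, x, y):
    """Y = a + b*X  ->  elasticity = b * X / Y"""
    return b * x / y


def elasticity_log_level(b, x):
    """log(Y) = a + b*X  ->  elasticity = b * X"""
    return b * x


def elasticity_level_log(b, y):
    """Y = a + b*log(X)  ->  elasticity = b / Y"""
    return b / y


def elasticity_log_log(b):
    """log(Y) = a + b*log(X)  ->  elasticity = b (constant)"""
    return b


price = 7.30  # arbitrary evaluation point
# Predicted demand at that price for the two models with Y in levels
y_level_level = 439.58 - 38.57 * price
y_level_log = 671.22 - 265.66 * math.log(price)

point_elasticities = {
    "Level-Level": elasticity_level_level(-38.57, price, y_level_level),
    "Log-Level": elasticity_log_level(-0.23, price),
    "Level-Log": elasticity_level_log(-265.66, y_level_log),
    "Log-Log": elasticity_log_log(-1.52),
}
for model, e in point_elasticities.items():
    print(f"{model}: {e:.2f}")
```

Note that these are point elasticities at a single price, not the average elasticities over all observations, so apart from the log-log case they will not match averaged values.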

This is great to know, but which model should one use? To evaluate this, I simply ran each of these four models. You can find the results in the table below, but they are nothing like what you will ever find in the real world: all effects are highly significant and the R^2 is ridiculously high for any social or economic analysis. This is by design, as hardly any randomness was added. In addition, the data was set up in such a way that the log-log model was predestined to win.

| Model | Intercept | Price Variable | Avg. Elasticity | R^2 |
|---|---|---|---|---|
| Level-Level | 439.58 (3.2) | -38.57 (0.42) | -2.50 | 0.84 |
| Log-Level | 6.59 (0.01) | -0.23 (0.01) | -1.63 | 0.95 |
| Level-Log | 671.22 (3.53) | -265.66 (1.83) | -2.11 | 0.93 |
| Log-Log | 7.86 (0.01) | -1.52 (0.01) | -1.51 | 0.97 |
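As a rough illustration of how a log-log estimate like the one in the last row comes about, the Python sketch below fits a simple linear regression of log(quantity) on log(price) using only the standard library. The data is re-simulated here with assumed parameters (scale 2500, noise standard deviation 0.05, seed 1), so the numbers will be close to, but not identical with, the table.

```python
import math
import random


def ols(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Assumed simulation: multiplicative demand with true elasticity -1.5
rng = random.Random(1)
prices = [rng.uniform(5.59, 9.99) for _ in range(730)]
sales = [2500.0 * p ** -1.5 * rng.lognormvariate(0.0, 0.05) for p in prices]

# Log-log model: the slope is the elasticity estimate
a, b = ols([math.log(p) for p in prices], [math.log(q) for q in sales])
print(f"intercept={a:.2f}, elasticity={b:.2f}")
```

The other three models can be fitted with the same `ols` helper by transforming only one of the two variables, or neither.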

The argument is not that a log-log model is always the best model for deriving elasticities, although there are strong microeconomic arguments for why it is the most reasonable model to describe demand. Its underlying demand curve behaves the way economists typically assume demand to behave: demand cannot sink below zero as the price increases, and it grows ever more steeply as the price decreases. Yet the fact that it yields a constant elasticity, as mentioned above, is its most desirable feature. This makes it much easier to apply elasticity to optimize pricing.

Still, empirical analysis might lead us to assume other price-demand relationships. The graphic below illustrates this. The legend of the graph lists the models in increasing order of fit. Looking at each graph, it becomes clear why the level-log model fares better than the level-level model, why the log-level model outperforms the level-log model, and so on. The non-linear relationship between price and demand that we introduced by relying on a multiplicative demand curve is best described by the log-log model. Had I used an additive demand curve, the ranking would have been the other way around. Thus, the argument is that under certain circumstances the model choice can have a significant impact.

[Figure: elasticity regression comparison]

In practice, we have to be very aware of the functional form implied by the regression we choose, because a wrong choice can have severe consequences. The elasticity with which the data was generated was -1.5. To illustrate the effect that model choice can have on the estimated elasticity, I calculated average elasticities for the level-level, log-level and level-log models and compared them with the price coefficient of the log-log model. This is a bit of an oversimplification, but the point still stands: the results are substantially different, which has consequences when one tries to use the derived elasticities. Yet, based on the graphical analysis and the model information, we would come to the conclusion that the log-log model fares best, so we can proceed as theory would want us to.

Price Optimization

Before we finish, let's quickly look at how we can use elasticity to improve the little salad vendor's erratic pricing strategy (my random daily price changes). For this we need to know the salad bar's cost function. Luckily, we do: it has fixed costs of about 300€ for every day it is open, and preparing a single salad costs about 2.50€. The cost function is thus:

    \[TotalCost = 300€ + SaladsSold * 2.50€\]

Microeconomic theory teaches us that fixed costs do not matter when calculating elasticity-based, margin-optimized prices. I'll spare you the details, but the function to calculate the optimized price is:

    \[OptimalPrice = \frac{Elasticity*CostPerSalad}{1+Elasticity}\]
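
For readers who do want the details, the formula can be derived in a few lines. This is a sketch under the assumptions of a constant cost per salad and a differentiable demand curve. The symbols p (price), q(p) (quantity sold at that price), c (CostPerSalad), F (daily fixed costs) and \pi (daily profit) are shorthand introduced here. Profit is revenue minus total cost:

    \[\pi(p) = (p - c) * q(p) - F\]

Setting the derivative with respect to price to zero makes F drop out, which is why fixed costs do not matter:

    \[\frac{d\pi}{dp} = q(p) + (p - c) * q'(p) = 0\]

Dividing by q(p) and using the point elasticity \epsilon = q'(p) * \frac{p}{q(p)} gives:

    \[1 + \frac{p - c}{p} * \epsilon = 0\]

Solving for p yields the formula above:

    \[p = \frac{\epsilon * c}{1 + \epsilon}\]

The result is only a meaningful (positive, finite) price when demand is elastic, that is, when \epsilon < -1.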

Applying the elasticity derived from the log-log model, this results in a proposed optimized price that lies somewhere between 7.17€ and 7.43€, with a point estimate of 7.30€. Instead of changing its price randomly every day, the vendor is best advised to stick to prices in this range. It can then expect to sell around 125 salads each day, for a daily profit of between 285€ and 319€.
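The calculation can be sketched in Python as follows. The function names are hypothetical. The quantity is predicted from the rounded log-log coefficients reported earlier in this post (intercept 7.86, slope -1.52), so the output will be close to, but not exactly equal to, the figures quoted here, which were presumably computed from unrounded estimates.

```python
import math


def optimal_price(elasticity, unit_cost):
    """Margin-optimizing price: elasticity * unit_cost / (1 + elasticity)."""
    if elasticity >= -1:
        # For inelastic or unit elastic demand the formula has no valid optimum
        raise ValueError("formula requires elastic demand (elasticity < -1)")
    return elasticity * unit_cost / (1 + elasticity)


def daily_profit(price, quantity, unit_cost=2.50, fixed_cost=300.0):
    """Profit = (price - cost per salad) * salads sold - daily fixed costs."""
    return (price - unit_cost) * quantity - fixed_cost


price = optimal_price(-1.52, 2.50)
# Predicted demand from the log-log model: log(Y) = 7.86 - 1.52 * log(X)
quantity = math.exp(7.86 - 1.52 * math.log(price))
profit = daily_profit(price, quantity)
print(f"price={price:.2f}, quantity={quantity:.0f}, profit={profit:.0f}")
```

The explicit check for `elasticity >= -1` is a design choice: without it, the formula would silently return a negative or undefined price for inelastic demand.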

| KPIs | Lower-bound Elasticity | Exp. Elasticity | Upper-bound Elasticity |
|---|---|---|---|
| Elasticity | -1.51 | -1.52 | -1.54 |
| Opt. Price | 7.43€ | 7.30€ | 7.17€ |
| Quantity | 126 | 125 | 125 |
| Profit | 319€ | 302€ | 285€ |

This is a daring statement. With the actual example data the conclusion would be fine. But in practice such perfect regression results, with so little uncertainty, are unrealistic. And this is where it tends to get tricky. To illustrate this point, I adjusted the standard error from almost nonexistent to 0.15. The results should still be highly significant, but looking at the table below one would be surprised about the consequences of such small changes.

| KPIs | Lower-bound Elasticity | Exp. Elasticity | Upper-bound Elasticity |
|---|---|---|---|
| Elasticity | -1.23 | -1.52 | -1.82 |
| Opt. Price | 13.51€ | 7.30€ | 5.57€ |
| Quantity | 106 | 125 | 114 |
| Profit | 864€ | 302€ | 51€ |

The certainty with which we proposed the optimal price was very much unfounded. In this example, the range for elasticity is still relatively small despite the increased uncertainty. Yet the resulting range for the ideal price runs from 5.57€ to 13.51€, which is not a very precise proposal. Its upper end actually exceeds the highest price that the little salad vendor ever dared to set. The consequences are severe: the resulting profit varies roughly seventeen-fold between the highest and lowest prices.
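Why such a modest change in elasticity moves the price so much becomes clearer when looking at the derivative of the optimal price with respect to elasticity, which is CostPerSalad / (1 + Elasticity)^2 and grows without bound as the elasticity approaches -1. The Python sketch below evaluates both at the three rounded elasticities discussed here. The function names are hypothetical, and because rounded elasticities are used, the prices differ slightly from the quoted ones.

```python
def optimal_price(elasticity, unit_cost=2.50):
    """Margin-optimizing price; only valid for elasticity < -1."""
    if elasticity >= -1:
        raise ValueError("formula requires elastic demand (elasticity < -1)")
    return elasticity * unit_cost / (1 + elasticity)


def price_sensitivity(elasticity, unit_cost=2.50):
    """Derivative of the optimal price with respect to elasticity."""
    return unit_cost / (1 + elasticity) ** 2


results = {}
for e in (-1.23, -1.52, -1.82):
    results[e] = (optimal_price(e), price_sensitivity(e))
    print(f"elasticity={e:.2f}  price={results[e][0]:.2f}  "
          f"d(price)/d(elasticity)={results[e][1]:.1f}")
```

Near the lower bound of -1.23, a change of 0.01 in the elasticity shifts the proposed price by roughly half a euro, while near -1.82 the same change shifts it by only a few cents.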

To state the obvious, the illustrated approach to elasticity calculations is just the tip of the iceberg, and we need to invest time in improving it. The next posts will focus on intervening factors like promotional activities and similar products.

About the Author
Daniel Lüttgau

Daniel is a member of our Data Science team and organizes our knowledge transfers. In his free time he is interested in politics and is a dog lover.