How to Test New Pricing & Packaging
The 4th issue in our P&P masterclass series: a practical guide on the details of how to test different pricing, whether that’s globally, by country, or in an A/B test
In my last few roles, I’ve led a lot of testing on pricing:
At thredUP, we built a system to change pricing by user type and inventory
In Fortnite, we constantly tested out new bundles and packages
At Affirm, we changed our APR caps and credit models
At Apollo, we tested multiple new pricing variants
The battle wounds from these pricing initiatives have given me perspective on all the ways a new pricing & packaging rollout can go wrong.
Today, we’ll help you avoid those mistakes.
Too Many People Test Too Little
Way too many companies under-utilize this lever.
Why do I say that?
Pricing tests are really high ROI:
They generally require relatively little engineering work
They can have a much larger impact than other product levers
The only catch: there’s a bit more work required in product discovery & stakeholder alignment. So we’ll cover that.
But as far as product growth levers, it’s one of the most reliable. It’s a tool that every product growth leader should have in their toolkit.
Words: 5,038 | Est. Reading Time: 23 minutes
We’re going to give you the tools to increase confidence in your next pricing read:
3 destructive myths
Your portfolio of options
The importance of your pricing equation
How to execute on just changing the price
How to execute on the by-country test
How to execute on multiple variants
Sunsetting old pricing
Migrating users for good
The ideal pricing squad
Most common mistakes
Along the way, we’ll cover the lessons from my real-life experience testing these things.
3 destructive myths in the way of pricing testing
There are three myths that drive this hesitancy with pricing. All hold grains of truth, but are hindering your growth.
1. The Myth of Fairness
This is the idea that it might be “unfair” for someone to be able to pay less than another for your product.
The reasons it’s wrong:
Just by virtue of purchasing power differences, every consumer already has a different ability to pay for your product.
The vast majority of other companies and products also do it.
Even if it is unfair, the potential gain outweighs the harm.
2. The Myth of Dissuasion
This is the idea that users might be dissuaded from purchasing your product if they see different prices.
The reasons it’s wrong:
In reality, most people just try to get the lower price.
There are multiple other explanations users reach for first, like a bug, a currency difference, or a cost difference.
The number of people who might be dissuaded is smaller than the potential gain you might get from learning more about your pricing.
3. The Myth of Impossible Migration
This is the idea that migrating existing users to new pricing is impossible, or too painful to be worth it.
The reasons it’s wrong:
While people do churn on new, higher prices, most customers are used to increases nowadays, especially when timed to their contract renewal.
A price decrease is almost always well-received by a customer. Just handle the messaging.
Migration is mostly a pain to manage, not impossible.
Your portfolio of options
Even if you’re not deliberately testing your pricing, you are implicitly testing it. At every moment, potential buyers are deciding whether your product is worth its price.
This is why it’s better to be intentional about your testing than unintentional. There are three main types of pricing testing:
Changing the entire plan
Testing by-country
Testing variants
Here’s when to pursue each.
Option 1 - Changing the entire plan
There are three main reasons you might just make a global change for everyone:
If you don’t have sophisticated A/B testing infrastructure, like this newsletter.
You have conviction, from leadership on down, that global pricing changes are your strategy, like Fortnite or Apple.
You won’t hit statistical significance in any other scheme anyway, like an enterprise SaaS such as Snowflake.
All three are totally valid.
Option 2 - Testing by-country
Geographical testing is interesting. A lot of people have a lot of faith in it.
But it violates core tenets of statistics, since the populations differ in many confounding ways, so there’s no point in applying statistical significance to the data.
Nevertheless, you can get much more specific data.
Its best use is not for ARPA and conversion rate, since willingness to pay varies so greatly from country to country.
Its best use is for retention and expected LTV. Single country testing is widely used in gaming for that very reason. Testing in New Zealand, for instance, returns very representative D30 retention numbers to what you will get in a global roll-out.
Option 3 - Testing variants
Finally, you have the holy grail of pricing testing: shipping different variants that are randomly assigned to a user.
You tend to see this most with products where the price is not a widely discussed topic - because otherwise people would be confused at seeing different prices - but volume is high enough to get data relatively quickly.
For instance, consider how long it takes to detect a 1%, 10%, or 20% difference in conversion rate at 95% statistical significance and a 5% baseline conversion rate:
If you have 100,000 visitors, you can detect a 10% change in about 1 week with 95% confidence. But if you only have 1,000 visitors, that test becomes untenable: it will take years.
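To make that arithmetic concrete, here’s a rough duration estimate using the standard two-proportion sample-size formula at 95% confidence and 80% power. The traffic figures are the illustrative ones above, not data from any real test:

```python
from math import sqrt

def visitors_needed_per_variant(base_rate, relative_lift,
                                z_alpha=1.96, z_beta=0.8416):
    """Sample size per variant to detect a relative lift in conversion
    rate at 95% confidence (z_alpha) and 80% power (z_beta)."""
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return numerator / (p1 - p2) ** 2

# Detecting a 10% lift on a 5% baseline needs ~31k visitors per variant,
# so ~62k total: under a week at 100,000 visitors/week, years at 1,000.
n = visitors_needed_per_variant(0.05, 0.10)
weeks_at = lambda weekly_traffic: 2 * n / weekly_traffic
```

Exact numbers shift with your power and significance choices, but the traffic-driven orders of magnitude are what decide whether variant testing is viable.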
Your Pricing Equation
In the generalized case, for pricing changes, the three metrics to look at are:
Conversion Rate x ARPA x expected LTV
Here’s what I mean by each of those:
Conversion Rate: % of people who see your pricing page who convert
Average Revenue per Account: the monthly $ amount each converted account pays you
Expected LTV: how many months you expect users to stay, or how many times you expect them to repeat purchase
This is what I call “your pricing equation.”
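As a minimal sketch, the equation is just three multiplied terms; the inputs below are made-up illustrative numbers, not benchmarks:

```python
def revenue_per_visitor(conversion_rate, arpa, expected_ltv_months):
    """Pricing equation: expected lifetime revenue per pricing-page visitor."""
    return conversion_rate * arpa * expected_ltv_months

# Hypothetical comparison: a higher price can win even with lower conversion.
current = revenue_per_visitor(0.05, 50, 12)   # 5% convert, $50/mo, 12 months
variant = revenue_per_visitor(0.04, 70, 12)   # 4% convert, $70/mo, 12 months
# current = $30.00 per visitor, variant = $33.60 per visitor
```

The point of collapsing a test to this single number is that it forces you to trade conversion, price, and retention against each other instead of celebrating any one metric in isolation.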
What about high ticket businesses?
This can even be adapted for sales-led businesses.
Depending on your business model (eg, if you don’t show pricing), you may want to swap the pricing-page conversion rate for something like the rate from sales-qualified lead to closed-won. It looks like this:
Closed-won rate * ARPA * Exp LTV
On the margin, it doesn’t move such a high-level output metric much. But if you cut your prices by 10%, you end up winning way more deals.
What about non-zero margin businesses?
This can also be adapted to businesses with relevant Cost of Good Sold (COGS).
You just want to adjust your revenue variable to something more like contribution margin (revenue with a relevant margin adjustment). It looks like this:
Conversion rate * Average contribution margin per user * expected LTV
It’s mainly relevant if your margin varies significantly by plan or visitor type.
How do you measure Expected LTV?
There are two common approaches to expected LTV:
For most testing, people just look at the actuals
For hypotheticals, people apply an existing retention multiplier
For the most advanced:
You can derive the retention multiplier via multiple regression that includes usage data
Here’s how the multiplier works: you take your historical ratio of lifetime revenue to early revenue and apply it to a new cohort’s early actuals
The reason this works is that most retention curves stack; they don’t cross.
How to execute on changing the entire plan
So now that we understand our metrics, let’s talk through the intricacies of reading them when you can’t run a proper test.