You didn't conduct an A/B test. You can still simulate one retrospectively.
Synthetic control group experiment for a new pricing strategy in a B2B business (source: own production in Python)


Modeling a synthetic (but high quality) control group as a baseline to infer whether the change you've made in your business was worth it or not.

We’ve all been there.

Skipping an A/B test even though we knew it was the best way to see if a change had an impact. 

Fortunately, you can still conduct a so-called quasi-experiment by creating a synthetic control group as a baseline for your already-implemented treatment.

In 2015, researchers from Harvard and Stanford successfully used the synthetic control method to estimate the economic impact of the 1990 German reunification on West Germany.

This got me thinking: could a similar approach work in a business context? Yes, it can.

Google Inc. researchers have used a similar technique by applying Bayesian models to create a synthetic control group, predicting what market responses would have been WITHOUT their advertising campaigns. 

So let's evaluate how the synthetic control group method performs in another interesting business scenario—pricing.

After reading this article, you'll know:

  • What a synthetic control method is.
  • What the requirements for the dataset are.
  • How to model a high-quality synthetic control group.
  • How to visualize the results of your synthetic experiment.
  • How to infer whether there is a “lift” from your intervention.
  • How to stress-test your results with placebos.


What is a synthetic control method?

A synthetic control method is a statistical technique used to evaluate the effect of an intervention or treatment in cases where A/B tests aren’t possible (or you simply didn't run one before implementing the change).

We do so by estimating what would have happened if the intervention HAD NOT occurred. We essentially create a fake (but high-quality) control group for our quasi-experiment and check whether the treatment lifted the main variable of interest compared to this made-up control group. The algorithm builds the synthetic control group by finding a weighted average of control units that closely matches the treated unit on the other variables (columns) and the outcome BEFORE the treatment.
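To make the weighting idea concrete, here is a minimal sketch (not the library's optimizer, just toy numbers I made up) of how a synthetic control is assembled as a weighted average of control units:

import numpy as np

# Toy pre-treatment outcome series for three control units (hypothetical values)
control_outcomes = np.array([
    [7.9, 8.1, 8.0],   # control unit A
    [6.5, 6.7, 6.6],   # control unit B
    [9.2, 9.0, 9.1],   # control unit C
])

# Hypothetical weights found by the optimizer: non-negative and summing to 1
weights = np.array([0.6, 0.0, 0.4])

# The synthetic control is the weighted average of the control units, period by period,
# i.e. an estimate of what the treated unit would look like WITHOUT the treatment
synthetic_outcome = weights @ control_outcomes
print(synthetic_outcome)   # [8.42 8.46 8.44]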

This method is useful for understanding what effect a specific action (like a new marketing campaign or new pricing) has had on a particular outcome (like your customer base or sales).


Synthetic control group example with a dataset.

In this article, I will show you how to use synthetic groups for a hypothetical B2B business.

B2B case study example.

The chief commercial officer (CCO) of a B2B manufacturing business recently changed the pricing and adopted a new pricing strategy for just one of their customers, Company_1, starting from January 2024.

This business sells materials to other companies, and the monthly revenue from each customer varies based on their needs.

Now, more than 4 months later, the CCO wants to know if the new pricing strategy for Company_1 should be gradually implemented for all other customers to increase the overall monthly revenue.

However, at the time, the CCO couldn’t perform an A/B test for several reasons: 

  • lack of funds, as the test was not planned and accounted for in the budget,
  • absence of the necessary software to conduct the test on-site,
  • insufficient knowledge on how to properly carry out an A/B test.

To bring this case study about our B2B manufacturing business to life, we simulate a dataset example:

A dataset example for synthetic control group experiment, B2B manufacturing business (source: own production in Python)

And here is a better look at how the company, date, and month_num columns are sorted:

Close look at the sorted dataset example for synthetic control group experiment (source: own production in Python)

The structure of the dataset is as follows:

  • there are 5 different variables (columns)
  • there are 320 observations (rows)
  • company = the name of the customer
  • date = the date (month) when the company was a customer
  • sales_revenue = the total revenue generated from the customer in that specific month
  • company_size = the number of employees
  • month_num = the numerical value of the month

The algorithm requires a numerical time variable. In this example, the customers pay monthly for the raw materials they purchase. However, customers could also pay yearly; then you would only need one date column, called “year”. Since years are numerical (integers), they work with the synthetic control algorithm out of the box. Here, I show you the slightly more complicated scenario where the customers pay monthly.
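If your own dataset only has a date column, one simple way to derive such a numerical time variable for monthly data is a per-unit counter (the simulation code below does the same thing). A small sketch with hypothetical values:

import pandas as pd

# Hypothetical monthly data for two customers
df = pd.DataFrame({
    "company": ["Company_A", "Company_A", "Company_B", "Company_B"],
    "date": pd.to_datetime(["2023-01-01", "2023-02-01", "2023-01-01", "2023-02-01"]),
})

# Sort by unit and time, then number the months 1, 2, ... within each company
df = df.sort_values(["company", "date"])
df["month_num"] = df.groupby("company").cumcount() + 1
print(df)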

You can simulate the same made-up dataset with the code below:

import pandas as pd
import numpy as np

np.random.seed(3)

# 20 customers observed monthly from January 2023 to April 2024 (16 months)
num_companies = 20
months = pd.date_range(start='2023-01-01', end='2024-04-01', freq='MS')
num_months = len(months)

companies = np.repeat([f"Company_{i}" for i in range(1, num_companies + 1)], num_months)
months = np.tile(months, num_companies)

# One fixed company size per customer, repeated for every month
company_sizes = np.random.randint(50, 1000, num_companies)
company_sizes = np.repeat(company_sizes, num_months)

# Baseline monthly revenue, drawn from a normal distribution
sales_revenue = np.random.normal(8000, 2000, num_companies * num_months)

df = pd.DataFrame({
    'company': companies,
    'date': months,
    'sales_revenue': sales_revenue,
    'company_size': company_sizes
})

# New pricing strategy for Company_1 from January 2024 onwards (+5000 per month)
treatment_start = pd.to_datetime('2024-01-01')
df.loc[(df['company'] == 'Company_1') & (df['date'] >= treatment_start), 'sales_revenue'] += 5000  # Treatment effect

# Numerical time variable: consecutive month index (1, 2, ...) per company
df['month_num'] = df.groupby('company').cumcount() + 1

# First month affected by the treatment (13 in this dataset)
treatment_period = df[df['date'] == treatment_start]['month_num'].values[0]

df

There are some requirements for the dataset when using the SyntheticControlMethods library in Python:

  1. All variables should be numerical, except for the unit identifier, which can be categorical. In this case, the unit identifier is company.
  2. The dataset must be sorted by the unit identifier (company) and then by time (month_num) — as mentioned before, this can also simply be a year column, since a year is already an integer.
  3. Only one unit identifier should be used, preferably in the form of a string with the unit name. For this analysis, use the company column as the unit identifier. Avoid using multiple unit identifiers. If necessary, exclude additional identifiers by dropping the column from the dataframe or using the exclude_columns argument when calling Synth().
  4. The dataset must contain one treated unit and multiple control units (in our case, multiple companies). The algorithm will attempt to find a weighted average of these control units (companies) that most closely resembles the treated unit in terms of other variables (columns) and outcome in the pre-treatment period.
  5. Make sure there are no missing values for the outcome variable (sales_revenue). The synthetic control method uses the entire outcome time series and therefore does not accept missing values. TIP: Handle missing outcome data by imputation (averages/medians) or by dropping the affected units (see the sketch after this list).
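A minimal pre-flight sketch for requirements 2 and 5, assuming the dataframe and column names used in this article:

# Requirement 2: sort by the unit identifier and then by time
df = df.sort_values(["company", "month_num"]).reset_index(drop=True)

# Requirement 5: no missing values in the outcome column.
# Option A: impute per company (here with the company's median revenue)...
df["sales_revenue"] = df.groupby("company")["sales_revenue"].transform(
    lambda s: s.fillna(s.median())
)

# ...Option B: drop any company that still has missing outcome values
incomplete = df.loc[df["sales_revenue"].isna(), "company"].unique()
df = df[~df["company"].isin(incomplete)]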


Modeling your synthetic group.

The CCO has changed the pricing for one of the customers (Company_1).

To understand the impact of this change, we need to simulate a synthetic control group that mimics what would have happened if the pricing change HAD NOT been implemented for Company_1.

We can do that by using the SyntheticControlMethods library, importing the Synth class/function, and running the following code:

from SyntheticControlMethods import Synth

sc = Synth(dataset = df,
           outcome_var = "sales_revenue",
           id_var = "company",
           time_var = "month_num",
           treatment_period = 13, #the first month of the treatment
           treated_unit = "Company_1",
           n_optim=30,
           pen=0,
           exclude_columns = "date")

        

Where:

  • dataset = pandas data frame containing the dataset. 
  • outcome_var = name of outcome column in data.
  • id_var = name of the unit indicator column in data, i.e. the column that identifies each customer.
  • time_var = name of time column in data, e.g. “year”
  • treatment_period = time of first observation after the treatment took place, i.e. first observation affected by the treatment effect for the specific customer.
  • treated_unit = name of the unit that received treatment, e.g. the specific customer “Company_1”.
  • n_optim = defaults to 10, number of different initialization values for which the optimization is run (higher number means longer runtime, but a higher chance of a globally optimal solution).
  • pen = defaults to 0, penalization coefficient that controls the importance of minimizing differences between individual control units and the treated unit (higher values give more weight to these pairwise differences).
  • exclude_columns = columns in the df that are not needed for the analysis (here we exclude date, since it is not an integer and was only used to derive the integer time variable month_num).

The ML algorithm finds the optimal weights (summing to 1) for each control customer, their features, and their historical data to model a synthetic control group that we can retrospectively use as the baseline comparison in our quasi-A/B experiment.

Visualizing the results of your synthetic control method.

To visualize the results, we will show how the sales revenue from the synthetic company we have just modeled compares to the sales revenue from the real company over time.

We can do that by running the following code:

sc.plot(["original", "pointwise", "cumulative"], 
        treated_label="Company_1", 
        synth_label="Synthetic Company_1", 
        treatment_label="New Pricing Strategy")        

Where:

  • [“original”, “pointwise”, “cumulative”] = the list of three charts we want to create when we use the synthetic control method.
  • treated_label = the label on the chart you want the treated variable to be named as.
  • synth_label = the label on the chart you want the synthetic (modeled) variable to be named as.
  • treatment_label = the label on the chart with the arrow pointing at when the treatment happened.

After running the code, we will get one plot that includes three subplots showing the results of the synthetic control experiment:

Synthetic control group quasi-experiment results with the original, pointwise, and cumulative effect charts (source: own production in Python)
Just a note: you'll see that the chart marks the new pricing strategy from month 12, even though we specified treatment_period = 13 in the model. This is because the treatment_period parameter is the time of the FIRST full observation after the treatment took place, i.e. the first observation affected by the treatment effect for the specific customer. That is why the marker on the chart is shifted back by one unit: the treatment was set to have its first full impact in the 13th month.

Three types of charts for visualizing the results of the synthetic control analysis.

Let's explain very simply what these three charts show.

a) Original chart.

This chart shows the actual sales revenue for Company_1 (blue line) and what the machine learning algorithm predicted the sales revenue would have been WITHOUT the new pricing strategy (red dashed line). 
“Original” chart comparing the modeled synthetic control group with the treated Company_1 (source: own production in Python)

If after the treatment, the actual revenue is consistently higher than the synthetic one, the new pricing strategy shows a positive improvement in revenue. 

Notice how the synthetic group (red dashed line) remains consistent after the treatment happens in the 12th month — this is what you would want to see.

b) Pointwise chart.

This chart shows the difference in sales revenue at each moment in time for Company_1 (blue line) RELATIVE to the synthetic control group (red dashed line). The red synthetic line is a constant 0 because, relative to itself, Y_predicted - Y_predicted = 0.
“Pointwise” chart comparing the modeled synthetic control group with the treated Company_1 (source: own production in Python)

If the difference is above 0, it shows a positive improvement in revenue after the treatment.

c) Cumulative chart.

This chart adds up all the differences in sales revenue over time for Company_1 (blue line) RELATIVE to the synthetic control group (red dashed line). The red synthetic line is a constant 0 because, relative to itself, ∑ (Y_predicted - Y_predicted) = 0.
“Cumulative” chart comparing the modeled synthetic control group with the treated Company_1 (source: own production in Python)

If the difference is positive and it accumulates towards a higher positive value, it shows a consistent positive improvement after the treatment.
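To make the relationship between these three charts concrete, here is a tiny sketch with hypothetical values showing what the pointwise and cumulative series are computed from:

import numpy as np

# Hypothetical monthly revenue (in thousands) for the treated unit and its synthetic control
actual    = np.array([8.0, 8.1, 13.2, 13.5])   # "original" chart: blue line
synthetic = np.array([8.0, 8.0,  8.1,  8.2])   # "original" chart: red dashed line

pointwise  = actual - synthetic      # "pointwise" chart: gap at each period
cumulative = np.cumsum(pointwise)    # "cumulative" chart: running total of the gaps

print(pointwise)    # [0.  0.1 5.1 5.3]
print(cumulative)   # [ 0.   0.1  5.2 10.5]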


Printing out weights and RMSPE.

The machine learning algorithm modeled the synthetic group successfully by using different weights of the combined factors for each company.

If you want to see which companies the algorithm used to model your synthetic control group, you can print out the weights with the following code:

print(sc.original_data.weight_df)        

You will then be able to see the exact weights that were used:

Weights used for the synthetic control group; pen = 0 (source: own production in Python)

The results show you the names of the companies (which could also be stores, suppliers, competitors, partners, etc.) and how much weight each of them, including their other columns and historical data, received when creating your synthetic group.

In this case, we can see that company_3, company_5, and company_13 were used because the machine learning algorithm found the biggest similarities between these companies and your main company, company_1, which you are evaluating. Remember that if you sum up all the weights that the ML algorithm used, you always get 1.
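As a quick sanity check (assuming weight_df stores one numeric weight column per control company), you can confirm this directly:

w = sc.original_data.weight_df
print(w)                             # weight per control company
print(w.sum(numeric_only=True))      # should be (approximately) 1.0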

However, if we change the argument called “pen” to penalize larger deviations in the matching process, we will see how the machine learning algorithm adapts and now chooses different weights. 

This “pen” argument is used to control the penalty for discrepancies between the treated unit and the synthetic control. By increasing this penalty, the algorithm is "forced" to select companies whose data more closely matches the historical data of your main company, company_1. This adjustment may result in either a) selecting different companies or b) altering the contribution of the existing ones to better fit the desired criteria. Tuning this “pen” parameter is how you create a high-quality synthetic control group. Alternatively, you can set pen = “auto”, and the algorithm will find the optimal penalization for your model.
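For example, refitting the model with automatic penalization (and a few more optimizer restarts) only requires changing pen and n_optim in the same Synth call used earlier:

sc_pen = Synth(dataset = df,
               outcome_var = "sales_revenue",
               id_var = "company",
               time_var = "month_num",
               treatment_period = 13,
               treated_unit = "Company_1",
               n_optim = 30,
               pen = "auto",            # let the algorithm pick the penalization
               exclude_columns = "date")

print(sc_pen.original_data.weight_df)   # inspect how the weights changed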
New weights when the penalization parameter is not 0; pen = “auto” (source: own production in Python)

After running the synthetic model once again, but now with pen = “auto”, we can see that the machine learning algorithm chooses only 2 companies out of 20 to model the synthetic group.

In general, you should always include at least some penalization in your control group model because it helps prevent overfitting. It simply ensures that the synthetic control group is not too closely matched to the treated unit by minimizing large deviations, which makes the results more robust and reliable.

Along with the weights, we can also print out the Root Mean Squared Prediction Error (RMSPE) for both the pre- and post-treatment periods and calculate the post/pre RMSPE ratio. 

We can do that with the following code:

print(sc.original_data.rmspe_df)        

We then get these results:

pre_RMSPE, post_RMSPE, post/pre RMSPE ratio (source: own production in Python)
The RMSPE measures how well the synthetic control (red dashed line) matches the actual sales data (blue line) before AND after the intervention. 

In our example, Company_1 has a pre-intervention RMSPE of 21854.67 and a post-intervention RMSPE of 63198.92. 

Dividing these two values gives a post/pre ratio of 2.89, which indicates that the prediction error increased after the intervention (which is, perhaps surprisingly, exactly what we want to see in this case). This increase in error suggests that the new pricing strategy had a significant impact on revenue. 
We are looking for a post/pre RMSPE ratio greater than 1, as this suggests the error is larger after the treatment. Simply put, we want the two lines (blue and red) to be as far apart from each other as possible after the treatment.
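The ratio reported above is simply the division of the two values printed by rmspe_df:

pre_rmspe  = 21854.67
post_rmspe = 63198.92
print(round(post_rmspe / pre_rmspe, 2))   # 2.89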

Stress-testing your results with a placebo.

There are two placebo methods to test the reliability of your model.

a) In-time placebo.

We can find out whether our results are reliable by stress-testing them with a so-called “in-time placebo”.

This runs the algorithm again while simply “pretending” that the treatment happened EARLIER than it actually did. We then observe whether the actual sales_revenue (blue line) and the synthetic group (red dashed line) stay close to each other UNTIL the ACTUAL treatment happened, at which point we would expect them to start diverging as seen before.

We can do that by running the following code:

sc.in_time_placebo(4)

sc.plot(['in-time placebo'], 
        treated_label="Company_1", 
        synth_label="Synthetic Company_1")

Where:

  • 4 = the period when we pretend the treatment happened (in our case, the 4th month)
  • n_optim is not specified here, so it defaults to 10; as mentioned before, it is the number of repeated initializations of the algorithm.
  • ['in-time placebo'] = the method used to test the results
  • treated_label = a label name for the unit that received treatment
  • synth_label = a label name for the synthetic group

After running the algorithm with the placebo treatment, we get these results:

Placebo treatment tests in Synthetic Control Group method; in-time placebo test (source: own production in Python)

The results show how the placebo treatment is applied at a different time point where no actual intervention occurred (indicated by the vertical dotted line = 4th month).

The blue line again represents Company_1, and the red dashed line represents the recalculated synthetic control.

The placebo part comes in when the algorithm pretends there’s a treatment earlier to see if the results stay the same. In this case, when the placebo treatment was applied before the actual treatment, the red line (synthetic control) stayed roughly close to the blue line (actual sales_revenue). The fact that the actual and synthetic lines stayed approximately close (from the “pretended” placebo treatment in the 4th month until the “real” treatment in the 12th month) means the placebo had no effect (as expected in real life), so the changes we saw in the first analysis are most likely real and not just random.

b) In-space placebo.

Using the in-space placebo test, we can obtain synthetic control estimates for companies that did not experience the new pricing strategy (treatment).

We essentially create a synthetic control group for each company, which lets us calculate the post/pre RMSPE ratio for every company individually, visualize the results, and find out whether Company_1 is an outlier with the highest RMSPE ratio. 

To do so we run this code:

sc.in_space_placebo(5)        

Where:

  • 5 = placebo replications (how many control units will be used to create placebo treatments for the companies)

To visualize the post/pre RMSPE ratio results for each company, we can run this code:

sc.plot(['rmspe ratio'])        

And we get this bar chart showing the distribution of the RMSPE ratio for each company:

Post-period / pre-period RMSPE ratio for each company (source: own production in Python)

From the results, we can observe that Company_1 has the highest post/pre ratio.

The highest RMSPE ratio shows that our results are reliable. As explained before, the RMSPE measures how well the synthetic control groups (now one for each company) match the actual sales data before and after the intervention. With this in mind, the fact that Company_1 shows the highest post-treatment error suggests that the new pricing strategy had a significant impact on revenue compared to the other companies, where the treatment WAS NOT performed. Keep in mind that this is a randomly generated dataset; with real data you would expect an even bigger gap between the treated unit and the non-treated units. Still, we can conclude that Company_1 has the highest RMSPE ratio.

We can also visualize all of our in-space placebos using the following code:

sc.plot(['in-space placebo'], 
        treated_label="Company_1",
        synth_label="Synthetic Company_1")        

The output will be displayed as a line chart, which will show the comparison of the treated unit against the in-space placebo units:

In-space placebos VS the actual treated unit (source: own production in Python)
The chart shows the remaining 19 companies (grey lines) compared to the treated company (blue line). We can easily see that in the 12th month something happened for Company_1, as its sales revenue increased significantly, while the other companies show no change in trend after the 12th month (as expected, since placebos should have no effect in real life).

FAQ:

Here are answers to a few questions I've been asked.

Can you use the synthetic control method on examples other than “customers”? 

Yes. You can use synthetic control analysis in many different business scenarios. In this example, we implemented a change for one customer in a B2B business, but you can perform a similar analysis on customers, suppliers, stores, countries, regions, competitors, partners, and even your own employees.

How do I model a high-quality synthetic control group?

To model a high-quality synthetic control group, don’t forget to manually set the pen parameter to pen = “auto” when you fit your model (it defaults to 0), as in the refit example shown earlier. You should also let the optimization run multiple times by increasing n_optim, e.g. n_optim = 30 (it defaults to 10).

What's in it for you?

You now know how to simulate an A/B test retrospectively by creating a synthetic (made-up) control group for your quasi-experiment, to see whether the change you implemented in your business had a causal effect or not.

This can help you:

  • conduct quick "quasi" A/B tests without needing additional funds or budget adjustments,
  • perform evaluations without the need for specific on-site software,
  • assess if changes in policy, product, or marketing are effective,
  • test changes safely on one unit (customer, supplier, store, country, region, competitor, partner) before wide implementation, minimizing potential negative impacts,
  • and quickly compare business performance on your main KPIs and metrics before and after interventions.
