Practical Approaches to Optimizing Budget in Marketing Mix Modeling | by Slava Kisilevich | Feb, 2023

How to optimize the media mix using saturation curves and statistical models

Marketing Mix Modeling (MMM) is a data-driven approach used to identify and analyze the key drivers of a business outcome, such as sales or revenue, by examining the impact of the various factors that may influence the response. The goal of MMM is to provide insights into how marketing activities, including advertising, pricing, and promotions, can be optimized to improve business performance. Among all the factors influencing the business outcome, marketing contribution, such as advertising spend in various media channels, is considered to have a direct and measurable impact on the response. By analyzing the effectiveness of advertising spend in different media channels, MMM can provide valuable insights into which channels are the most effective for increasing sales or revenue, and which channels may need to be optimized or eliminated to maximize marketing ROI.

Marketing Mix Modeling (MMM) is a multi-step process involving a series of distinct steps that are driven by the marketing effects being analyzed. First, the coefficients of the media channels are constrained to be positive to account for the positive effect of advertising activity.

Second, an adstock transformation is applied to capture the lagged and decayed impact of advertising on consumer behavior.
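
To make the adstock step concrete, here is a minimal sketch of one common variant, geometric adstock, in which each week carries over a fixed fraction of the previous week's adstocked value. The function name geometric_adstock and the decay value are illustrative assumptions, not necessarily the exact transformation used in the pipeline described in this article.

```python
import numpy as np


def geometric_adstock(spend, decay):
    """Carry over a fraction `decay` of the previous adstocked value (assumed variant)."""
    spend = np.asarray(spend, dtype=float)
    adstocked = np.zeros_like(spend)
    carryover = 0.0
    for week, value in enumerate(spend):
        # this week's effect = this week's spend + decayed effect of earlier weeks
        carryover = value + decay * carryover
        adstocked[week] = carryover
    return adstocked


weekly_spend = [100.0, 0.0, 0.0, 50.0]  # made-up spend values
print(geometric_adstock(weekly_spend, decay=0.5))
```

With a decay of 0.5, the spend of the first week keeps contributing in the following weeks, at half the strength each time.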

Third, the relationship between advertising spend and the corresponding business outcome is not linear and follows the law of diminishing returns. In most MMM solutions, the modeler typically employs linear regression to train the model, which presents two key challenges. First, the modeler must apply a saturation transformation step to establish the non-linear relationship between the media activity variables and the response variable. Second, the modeler must develop hypotheses about the possible transformation functions that are applicable to each media channel. However, more complex machine learning models may capture non-linear relationships without applying the saturation transformation.

The last step is to build a marketing mix model by estimating the coefficients and the parameters of the adstock and saturation functions.

Both saturation curves and a trained model can be used in marketing mix modeling to optimize budget spend. The advantages of using saturation curves are:

Simplicity in visualizing the influence of spend on the outcome
The underlying model is no longer required, so the budget optimization procedure is simplified and requires only the parameters of the saturation transformation

One of the disadvantages is that saturation curves are based on historical data and may not always accurately predict the response to future spend.

The advantage of using the trained model for budget optimization is that the model uses the complex relationship between media activities and other variables, including trend and seasonality, and can better capture the diminishing returns over time.

Data

I continue using the dataset made available by Robyn under the MIT License, as in my previous articles, for practical examples, and follow the same data preparation steps by applying Prophet to decompose trends, seasonality, and holidays.

The dataset consists of 208 weeks of revenue (from 2015-11-23 to 2019-11-11) having:

5 media spend channels: tv_S, ooh_S, print_S, facebook_S, search_S
2 media channels that also have exposure information (impressions, clicks): facebook_I, search_clicks_P (not used in this article)
Organic media without spend: newsletter
Control variables: events, holidays, competitor sales (competitor_sales_B)

Modeling

I built a complete working MMM pipeline that can be applied in a real-life scenario for analyzing the effect of media spend on the response variable, consisting of the following components:

A note on coefficients

In scikit-learn, Ridge regression does not offer the option to constrain a subset of coefficients to be positive. However, a possible workaround is to reject the optuna solution if some of the media coefficients turn out to be negative. This can be achieved by returning a very large value, indicating that the negative coefficients are unacceptable and must be excluded.
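
The rejection idea can be sketched independently of optuna. In the snippet below, penalized_score, LARGE_PENALTY, and the example coefficients are hypothetical names and values chosen for illustration; inside a real optuna objective, the returned value would be what the trial reports to the minimizer.

```python
import numpy as np

LARGE_PENALTY = 1e10  # assumed value: anything far worse than a realistic error works


def penalized_score(validation_error, coefficients, media_positions):
    """Return the validation error, or a huge value if any media coefficient is negative."""
    media_coefficients = np.asarray(coefficients, dtype=float)[media_positions]
    if np.any(media_coefficients < 0):
        # reject this set of hyperparameters by making it look very bad
        return LARGE_PENALTY
    return validation_error


# coefficients for [tv, search, trend]; only the first two are media channels
print(penalized_score(0.12, [0.8, 0.3, -1.5], media_positions=[0, 1]))
print(penalized_score(0.12, [0.8, -0.3, -1.5], media_positions=[0, 1]))
```

Only the media positions are checked, so control variables such as trend remain free to take negative coefficients.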

A note on saturation transformation

The Hill saturation function assumes that the input variable falls within a range of 0 to 1, which means that the input variable must be normalized before applying the transformation. This is important because the Hill function assumes that the input variable has a maximum value of 1.

However, it is possible to apply the Hill transformation to non-normalized data by scaling the half saturation parameter to the spend range using the following equation:

half_saturation_unscaled = half_saturation * (spend_max - spend_min) + spend_min

where half_saturation is the original half saturation parameter in the range between 0 and 1, and spend_min and spend_max represent the minimum and maximum spend values, respectively.
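
As a quick worked example of this scaling, the sketch below applies the equation to made-up spend limits. The helper name unscale_half_saturation and the numbers are assumptions for illustration only.

```python
def unscale_half_saturation(half_saturation, spend_min, spend_max):
    """Map a half saturation parameter from [0, 1] onto the spend range."""
    return half_saturation * (spend_max - spend_min) + spend_min


# a half saturation of 0.5 lands in the middle of the spend range
print(unscale_half_saturation(0.5, spend_min=1000.0, spend_max=9000.0))  # 5000.0
```

A value of 0 maps to the minimum spend and a value of 1 maps to the maximum spend, so the parameter keeps its meaning as a relative position within the observed spend range.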

The complete transformation function is provided below:

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class HillSaturation(BaseEstimator, TransformerMixin):
    def __init__(self, slope_s, half_saturation_k):
        self.slope_s = slope_s
        self.half_saturation_k = half_saturation_k

    def fit(self, X, y=None):
        return self

    def transform(self, X: np.ndarray, x_point=None):
        # rescale the half saturation parameter from [0, 1] to the spend range of X
        self.half_saturation_k_transformed = self.half_saturation_k * (np.max(X) - np.min(X)) + np.min(X)
        if x_point is None:
            return (1 + self.half_saturation_k_transformed**self.slope_s / X**self.slope_s)**-1
        # calculate y at x_point
        return (1 + self.half_saturation_k_transformed**self.slope_s / x_point**self.slope_s)**-1

Budget Optimization using Saturation Curves

Once the model is trained, we can visualize the impact of media spend on the response variable using response curves that have been generated through Hill saturation transformations for each media channel. The plot below illustrates the response curves for five media channels, depicting the relationship between the spend of each channel (on a weekly basis) and the response over a period of 208 weeks.
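
For intuition, a single response curve can be sketched as the channel coefficient multiplied by the Hill-saturated spend. The standalone functions hill and response_curve, and all parameter values below, are illustrative assumptions; the half saturation point is taken to be already expressed on the spend scale.

```python
import numpy as np


def hill(spend, slope, half_saturation_spend):
    """Hill saturation, same form as above: (1 + k**s / x**s) ** -1, for spend > 0."""
    spend = np.asarray(spend, dtype=float)
    return 1.0 / (1.0 + (half_saturation_spend / spend) ** slope)


def response_curve(spend_grid, coefficient, slope, half_saturation_spend):
    """Estimated response for each spend level on the grid."""
    return coefficient * hill(spend_grid, slope, half_saturation_spend)


# made-up channel parameters; spend starts above zero to avoid division by zero
spend_grid = np.linspace(1_000.0, 9_000.0, 5)
curve = response_curve(spend_grid, coefficient=200.0, slope=2.0, half_saturation_spend=5_000.0)
print(curve)
```

At the half saturation spend the curve reaches exactly half of the coefficient, and it flattens as spend grows, which is the diminishing-returns behavior that the optimizer exploits.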

Optimizing the budget using saturation curves involves identifying the optimal spend for each media channel that will result in the highest overall response while keeping the total budget fixed for a specific time period.

To initiate the optimization, the average spend for a specific time period is generally used as a baseline. The optimizer then uses the budget per channel, which can vary within predetermined minimum and maximum limits (boundaries), for constrained optimization.

The following code snippet demonstrates how budget optimization can be achieved using the minimize function from the scipy.optimize package. However, it is worth noting that other optimization packages, such as nlopt or nevergrad, can also be used for this purpose.

from functools import partial

import numpy as np
from scipy import optimize

# result, media_channels, media_coefficients, media_hill_slopes,
# media_hill_half_saturations and media_min_max come from the trained pipeline
optimization_percentage = 0.2

media_channel_average_spend = result["model_data"][media_channels].mean(axis=0).values
lower_bound = media_channel_average_spend * np.ones(len(media_channels)) * (1 - optimization_percentage)
upper_bound = media_channel_average_spend * np.ones(len(media_channels)) * (1 + optimization_percentage)
boundaries = optimize.Bounds(lb=lower_bound, ub=upper_bound)

def budget_constraint(media_spend, budget):
    # equals zero when the allocated spend matches the fixed budget
    return np.sum(media_spend) - budget

def saturation_objective_function(coefficients,
                                  hill_slopes,
                                  hill_half_saturations,
                                  media_min_max_dictionary,
                                  media_inputs):
    responses = []
    for i in range(len(coefficients)):
        coef = coefficients[i]
        hill_slope = hill_slopes[i]
        hill_half_saturation = hill_half_saturations[i]
        min_max = np.array(media_min_max_dictionary[i])
        media_input = media_inputs[i]
        hill_saturation = HillSaturation(slope_s=hill_slope, half_saturation_k=hill_half_saturation).transform(X=min_max, x_point=media_input)
        response = coef * hill_saturation
        responses.append(response)
    responses = np.array(responses)
    responses_total = np.sum(responses)
    # negate because the optimizer minimizes
    return -responses_total

partial_saturation_objective_function = partial(saturation_objective_function,
                                                media_coefficients,
                                                media_hill_slopes,
                                                media_hill_half_saturations,
                                                media_min_max)

max_iterations = 100
solver_func_tolerance = 1.0e-10

solution = optimize.minimize(
    fun=partial_saturation_objective_function,
    x0=media_channel_average_spend,
    bounds=boundaries,
    method="SLSQP",
    jac="3-point",
    options={
        "maxiter": max_iterations,
        "disp": True,
        "ftol": solver_func_tolerance,
    },
    constraints={
        "type": "eq",
        "fun": budget_constraint,
        "args": (np.sum(media_channel_average_spend), )
    })

Some important points:

fun: the objective function to be minimized. In this case, it takes the following parameters:
media coefficients: Ridge regression coefficients for each media channel, which are multiplied by the corresponding saturation level to estimate the response level for each media channel.
slopes and half saturations: the two parameters of the Hill transformation.
spend min-max values for each media channel, needed to correctly estimate the response level for a given media spend.
The objective function iterates over all media channels and calculates the total response as the sum of the individual response levels per media channel. To maximize the response with an optimizer that minimizes, we need to convert the task into a minimization problem. Therefore, we take the negative value of the total response and use it as the objective.
method = SLSQP: the Sequential Least Squares Programming (SLSQP) algorithm is a popular method for constrained optimization problems, and it is often used for optimizing budget allocation in marketing mix modeling.
x0: the initial guess. Array of real elements of size (n,), where n is the number of independent variables. In this case, x0 corresponds to the media channel average spend, i.e., an array of average spends per channel.
bounds: the bounds of media spend per channel.
constraints: constraints for SLSQP are defined as a dictionary or a list of dictionaries, where budget_constraint is a function that ensures that the sum of media spends is equal to the fixed budget: np.sum(media_channel_average_spend)
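
To see these pieces work together without the trained pipeline, here is a small self-contained sketch with two hypothetical channels. All coefficients, slopes, half saturation points, and spends are made-up numbers, and the half saturation points are assumed to be already on the spend scale.

```python
import numpy as np
from scipy import optimize

# hypothetical parameters for two media channels
coefficients = np.array([300.0, 200.0])
slopes = np.array([1.5, 1.0])
half_saturations = np.array([4000.0, 2000.0])  # assumed to be on the spend scale
average_spend = np.array([3000.0, 3000.0])
total_budget = average_spend.sum()


def negative_total_response(spend):
    """Negative sum of coefficient * Hill saturation, so that minimizing maximizes response."""
    saturation = 1.0 / (1.0 + (half_saturations / spend) ** slopes)
    return -np.sum(coefficients * saturation)


solution = optimize.minimize(
    fun=negative_total_response,
    x0=average_spend,
    method="SLSQP",
    # each channel may move at most 20% away from its average spend
    bounds=optimize.Bounds(lb=0.8 * average_spend, ub=1.2 * average_spend),
    # equality constraint: the allocated spend must add up to the fixed budget
    constraints={"type": "eq", "fun": lambda spend: np.sum(spend) - total_budget},
    options={"maxiter": 100, "ftol": 1.0e-10},
)
print("optimized spend:", solution.x)
print("optimized response:", -solution.fun)
```

The equality constraint keeps the total spend fixed, so any gain in response comes purely from shifting budget between channels within the allowed bounds.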

After the optimization process is complete, we can generate response curves for each media channel and compare the spend allocation before and after optimization to assess the impact of the optimization process.
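
One simple way to summarize the comparison, besides plotting, is the percentage change in spend per channel. The helper spend_change_summary and the numbers below are hypothetical and only illustrate the calculation.

```python
import numpy as np


def spend_change_summary(channels, spend_before, spend_after):
    """Percentage change in spend per channel, rounded to one decimal place."""
    before = np.asarray(spend_before, dtype=float)
    after = np.asarray(spend_after, dtype=float)
    percent_change = 100.0 * (after - before) / before
    return {channel: round(float(change), 1) for channel, change in zip(channels, percent_change)}


# made-up allocation before and after optimization
print(spend_change_summary(["tv_S", "search_S"], [3000.0, 3000.0], [3600.0, 2400.0]))
```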

Budget Optimization using the Trained Model

The process of optimizing the budget using the trained model is quite similar to the previous approach, and can be applied both to models that have the saturation transformation and to those that do not. This approach offers greater flexibility for optimizing the marketing mix, allowing for optimization across various time periods, including future ones.

The following code highlights the differences between the current and the previous approach:

The average spend per channel is multiplied by the desired optimization period

optimization_period = result["model_data"].shape[0]
print(f"optimization period: {optimization_period}")

optimization_percentage = 0.2
media_channel_average_spend = optimization_period * result["model_data"][media_channels].mean(axis=0).values
lower_bound = media_channel_average_spend * np.ones(len(media_channels)) * (1 - optimization_percentage)
upper_bound = media_channel_average_spend * np.ones(len(media_channels)) * (1 + optimization_percentage)
boundaries = optimize.Bounds(lb=lower_bound, ub=upper_bound)

We can interpret the results of the optimization as “what is the appropriate amount of spending per channel during a specific time period”

The objective function expects two additional parameters: optimization_period and additional_inputs, i.e., all other variables like trend, seasonality, and control variables used for model training and available for the selected time period:

import pandas as pd

def model_based_objective_function(model,
                                   optimization_period,
                                   model_features,
                                   additional_inputs,
                                   hill_slopes,
                                   hill_half_saturations,
                                   media_min_max_ranges,
                                   media_channels,
                                   media_inputs):
    # total spend per channel -> average spend per week
    media_channel_period_average_spend = media_inputs / optimization_period

    # transform original spend into hill transformed
    transformed_media_spends = []
    for index, media_channel in enumerate(media_channels):
        hill_slope = hill_slopes[media_channel]
        hill_half_saturation = hill_half_saturations[media_channel]
        min_max_spend = media_min_max_ranges[index]
        media_period_spend_average = media_channel_period_average_spend[index]
        transformed_spend = HillSaturation(slope_s=hill_slope, half_saturation_k=hill_half_saturation).transform(np.array(min_max_spend), x_point=media_period_spend_average)
        transformed_media_spends.append(transformed_spend)
    transformed_media_spends = np.array(transformed_media_spends)

    # replicate average period spends into all optimization period
    replicated_media_spends = np.tile(transformed_media_spends, optimization_period).reshape((-1, len(transformed_media_spends)))

    # add _hill to the media channels
    media_channels_input = [media_channel + "_hill" for media_channel in media_channels]
    media_channels_df = pd.DataFrame(replicated_media_spends, columns=media_channels_input)

    # prepare data for predictions
    new_data = pd.concat([additional_inputs, media_channels_df], axis=1)[model_features]
    predictions = model.predict(X=new_data)
    total_sum = predictions.sum()
    return -total_sum

The objective function takes in media spends that are bounded by our constraints during the time period through the media_inputs parameter. We assume that these media spends are equally distributed across all weeks of the time period. Therefore, we first divide media_inputs by the time period to obtain the average spend and then replicate it using np.tile. After that, we concatenate the non-media variables with the media spends and use them to predict the response with model.predict(X=new_data) for each week within the time period. Finally, we calculate the total response as the sum of the weekly responses and return the negative value of the total response for minimization.
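
The divide-then-replicate step is easy to check in isolation. The sketch below uses made-up totals for three channels over four weeks and skips the Hill transformation, so it only illustrates how np.tile and reshape produce one identical row per week.

```python
import numpy as np

optimization_period = 4  # weeks (made-up value)
media_inputs = np.array([400.0, 800.0, 1200.0])  # total spend per channel over the period

# equal distribution: the same average weekly spend in every week
weekly_average_spend = media_inputs / optimization_period

# repeat the per-channel vector once per week, then shape it as (weeks, channels)
replicated = np.tile(weekly_average_spend, optimization_period).reshape(
    (-1, len(weekly_average_spend))
)
print(replicated.shape)  # (4, 3)
print(replicated[0])     # [100. 200. 300.]
```

Each row corresponds to one week and each column to one channel, which matches the layout the model expects once the columns are named and joined with the non-media variables.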

Optimizing budget spend in marketing mix modeling is important because it allows marketers to allocate their resources in the most effective way possible, maximizing the impact of their marketing efforts and achieving their business objectives.

I showed two practical approaches to optimizing the marketing mix using saturation curves and trained models.

For a detailed implementation, please refer to the complete code available for download on my Github repo.

Thanks for reading!
