OK. You’ve written your killer ads and included distinct messages in them for testing purposes. Now it’s time to evaluate their performance using the data you’ve collected and judge which messages are the strongest and which are the weakest.
What do we need to know in order to evaluate these ads?
- Has each ad collected enough data to support an informed decision?
- What data points can the ads be evaluated on?
- Which KPIs do we want to keep in mind while evaluating the data?
I would strongly advise setting a minimum impression threshold that generates enough opportunity for clicks. In Google, it can be difficult to serve ads equally so that each receives its fair share of the available impressions, even when ads are set to rotate evenly; in recent years Google has tucked the even-rotation option ever further away in favor of showing “the best” ad. This leads to situations where your impressions are weighted heavily toward one ad, making objective comparison difficult. An impression threshold levels the playing field.
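As a minimal sketch of that gating step, the snippet below only admits ads to evaluation once they clear the threshold. The ad names, impression counts, and the 1,000-impression minimum are illustrative assumptions, not real account data.

```python
# Hypothetical example: gate evaluation on a minimum impression threshold.
# Ads below the threshold haven't had enough opportunity to earn clicks,
# so comparing them to mature ads would be misleading.
MIN_IMPRESSIONS = 1000  # assumed threshold; pick one suited to your volume

ads = [
    {"name": "Ad A", "impressions": 1430},
    {"name": "Ad B", "impressions": 1120},
    {"name": "Ad C", "impressions": 310},  # not enough data yet
]

# Only ads at or above the threshold are ready to be judged.
ready = [ad["name"] for ad in ads if ad["impressions"] >= MIN_IMPRESSIONS]
print(ready)  # Ad C is excluded until it collects more impressions
```

The exact threshold is a judgment call; the point is that every ad in the comparison has had a comparable amount of opportunity.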
Ad A would have 1000 impressions – 3.12% CTR – 31 clicks – 6.8% conv. rate – 2 conversions
Ad B would have 1000 impressions – 2.86% CTR – 28 clicks – 11.16% conv. rate – 3 conversions
In all likelihood, both ads won’t have the same number of impressions, but with a set threshold you can plug in their CTR and conversion rate to see how their performance metrics would compare if they were held equal. In this case, the ad with the lower CTR would produce more conversions even though the competing ad would get more clicks overall.
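That normalization can be sketched in a few lines. The function name is hypothetical, and the rates are the example figures from above; the idea is simply to project each ad's funnel at the same impression count.

```python
# Sketch: project clicks and conversions at a shared impression threshold
# so two ads with unequal impression counts can be compared head to head.
def project_at_threshold(ctr, conv_rate, impressions=1000):
    """Project clicks and conversions for a fixed number of impressions."""
    clicks = impressions * ctr          # expected clicks at the threshold
    conversions = clicks * conv_rate    # expected conversions from those clicks
    return int(clicks), round(conversions, 1)

# Example figures from the comparison above (hypothetical ads).
ad_a = project_at_threshold(ctr=0.0312, conv_rate=0.068)   # higher CTR
ad_b = project_at_threshold(ctr=0.0286, conv_rate=0.1116)  # higher conv. rate

print("Ad A:", ad_a)  # more clicks
print("Ad B:", ad_b)  # fewer clicks, but more conversions
```

Held to the same 1,000 impressions, Ad A wins on clicks while Ad B wins on conversions, which is exactly the trade-off the KPI discussion below turns on.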
Now that we’ve evaluated their performance, it’s time to judge them against our KPIs. If you know your organization’s close rate works best when the sales pipeline is fed as much quantity as possible, you might opt for the ad with the higher number of clicks and CTR, and that’s fine: it works for your business. Someone else, who needs a smaller number of clicks to filter through but solid leads, would find driving more conversions advantageous. Whichever scenario fits your business, having run the competing messages gives you a clearer understanding of which is driving the best opportunities for you. Taking it a step further from the previous blog post, you can review the messages themselves to learn which phrasing best achieves your goal, then look at how that fits into your overall marketing and whether the same tones and “pain points” align.
Admittedly, Responsive Search Ads create an opportunity to do more testing more efficiently, given the sheer number of headlines and descriptions that can be loaded into one ad at a time. The issue I take with them is that you can’t compare them from an unbiased, data-based standpoint, because the performance metrics we used in the example above aren’t available line by line. The best we can do is control the variables to position ourselves to gather data for evaluation: whether that’s two RSAs with 4 headlines and 2 descriptions where only one headline differs, to test for tone, or one RSA with 10 headlines and 4 descriptions against another with 3 headlines and 4 descriptions in the same tone, to test how Google’s algorithm serves the ads.
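The controlled-variable idea can be made concrete with a sketch of the first setup: two RSAs identical except for one headline. All of the ad copy here is invented for illustration; the point is that the diff between the two asset lists is exactly one swapped pair.

```python
# Hypothetical sketch of the single-variable RSA tone test described above.
# The two ads share every asset except one headline, so any performance gap
# can be attributed to that one change.
control = {
    "headlines": ["Fast Local Plumbing", "Licensed & Insured",
                  "Same-Day Service", "Call for a Free Quote"],
    "descriptions": ["Upfront pricing on every job.",
                     "Trusted by your neighbors for 20 years."],
}

# Variant: copy the control, then swap only the fourth headline.
variant = {
    "headlines": control["headlines"][:3] + ["Stop Overpaying for Repairs"],
    "descriptions": control["descriptions"],
}

# Symmetric difference surfaces the swapped pair and nothing else.
changed = set(control["headlines"]) ^ set(variant["headlines"])
print(changed)  # the one headline removed and the one headline added
```

Keeping the diff to a single asset is what makes the eventual (asset-level impression) data interpretable at all, given how little per-line performance reporting RSAs expose.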
The problem Google has created is in serving ads based on the ad score. Fewer headlines and descriptions, while a viable option for testing, rate the ad lower, so it is shown less often. This difficult-to-quantify variable looms over not only planning an account’s initial setup but also the eventual, continuous evaluation of that plan. Thankfully, a good campaign strategist like myself enjoys the challenge and will deftly find a way to work around it using the mountain of data available.