July 1, 2021

Which Package Design Metrics are Most Predictive of Success in Market?

We’ve measured dozens of redesigns to yield this groundbreaking new data. Here are the design metrics that actually drive sales.

When a brand embarks on a design initiative, the creative brief may prioritize any number of objectives, ranging from tactical considerations (e.g., “improve standout at shelf”) to branding and aesthetic improvements (e.g., “look more modern”). In many cases, consumer research will reveal that a redesign concept gains ground in one area, but loses in another. For example, new packaging may communicate “natural” more effectively than the existing design, but take longer for consumers to find within a competitive set—or it may grab attention more quickly, but be less aesthetically appealing to consumers. Are these worthwhile trade-offs? Historically, there hasn’t been a robust, data-driven way to answer questions like these.

As the only syndicated design data provider, Designalytics detects when redesigns occur across hundreds of consumer-packaged-goods (CPG) categories, then evaluates the old versus the new packaging across several design performance areas: “capture” (standout, findability), “convert” (purchase preference, communication), and “diagnostics and supporting factors” (design element resonance, mental availability). We’ve been conducting an ongoing meta-analysis of our redesign database, correlating changes in specific metrics (when comparing old and new designs) with year-over-year sales trends following the design changes.

The result? The first-ever data of its kind, designed to help brands and creative agencies answer the question, “How much does this metric really matter?”

This data is groundbreaking, both for the way it contradicts long-held industry assumptions about the value of specific design metrics and, more significantly, for the way it demonstrates the vital importance of design in driving business outcomes. The fact that any design metric is so strongly correlated with sales performance should be eye-opening for every senior executive in the industry—especially when one considers that current marketing mix models attribute little or no value to design.

Summary: the predictive power of select design performance metrics¹

Designalytics metrics and their correlation with sales outcomes*:

  • Purchase preference: 96%
    Do consumers prefer either design for purchase over the brand they currently buy?

  • Communication: 88%
    How effectively does each design convey the top purchase-driving attributes in the product category?

  • Find time: 50%
    How quickly can consumers locate the brand when actively searching for it?

  • Likability: 46%
    How much do consumers like or dislike the designs?

*Meaningful increases or decreases in these metrics align with sales trends following the redesign, compared to the same period during the prior year. Note that predictions are binary, not volumetric (i.e., they predict whether sales will increase or decrease, but not by how much).
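
For readers curious about the arithmetic behind these figures, below is a minimal sketch (in Python) of how a directional agreement rate could be computed. The records and field names are hypothetical stand-ins, pairing each redesign’s metric shift with its post-launch year-over-year sales trend; this illustrates the approach, not Designalytics’ actual pipeline.

    # Minimal sketch: directional agreement between a design-metric shift
    # and the post-redesign sales trend. Data and names are hypothetical.
    cases = [
        {"metric_delta": 0.12, "sales_delta": 0.05},    # both up: agree
        {"metric_delta": -0.08, "sales_delta": -0.02},  # both down: agree
        {"metric_delta": 0.03, "sales_delta": -0.01},   # split: disagree
    ]

    def agreement_rate(cases):
        # A "hit" is a case where the metric moved in the same direction
        # as year-over-year sales (binary, not volumetric).
        hits = sum((c["metric_delta"] > 0) == (c["sales_delta"] > 0) for c in cases)
        return hits / len(cases)

    # 50% is chance; the table above reports 96% for purchase preference.
    print(f"Agreement: {agreement_rate(cases):.0%}")  # -> Agreement: 67%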

Metrics that matter a lot

  • Purchase preference. Measures of purchase intent should reliably predict business outcomes, but to the best of our knowledge, no other research provider has successfully validated its metrics against actual sales. Designalytics’ unique measure of purchase preference is 96% predictive, and our unprecedentedly high data quality is a primary contributor.²

  • Communication. Saying important things better contributes mightily to sales performance—in fact, this measure is nearly as important as purchase intent, with an 88% correlation to directional in-market outcomes. Designalytics objectively determines the top purchase-driving attributes in a given product category through research with hundreds of consumers, then utilizes a forced-choice exercise to ascertain which design communicates each attribute most effectively (a simplified sketch of this tallying follows below).
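
    As a rough illustration of the forced-choice tallying described above, the hypothetical sketch below counts how often respondents pick the new design over the old one for each purchase-driving attribute. The attributes, responses, and names are invented for illustration; they don’t reflect Designalytics’ actual instrument.

        # Hypothetical forced-choice data: for each attribute, each respondent
        # chooses which design ("new" or "old") conveys it more effectively.
        from collections import Counter

        responses = {
            "natural": ["new", "new", "old", "new", "new"],
            "great taste": ["old", "new", "old", "old", "new"],
        }

        for attribute, picks in responses.items():
            share_new = Counter(picks)["new"] / len(picks)
            print(f"{attribute}: new design chosen {share_new:.0%} of the time")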

Metrics that don’t directly correlate with sales performance

  • Find time. Many brands worry that dramatic changes to an existing package could render the brand harder to find, and that consumers, not inclined to spend much time scanning store shelves, might simply give up and pick up a competitor’s product instead. Often, this worry leads brands to implement action standards that require redesigns to perform as well as their predecessors on measures of findability. Our analysis suggests this criterion could contribute to misguided design decisions. In reality, redesigns that take longer for consumers to locate don’t lead to diminished sales; changes in the time required to find a product within a competitive set show no meaningful correlation with in-market outcomes (a 50% directional agreement rate, no better than chance). Considering how important communication shifts are to driving outcomes, and that these shifts may dramatically alter a package’s content and hierarchy of communication, it makes sense that redesigns focused on improved messaging could result in longer measured find times. Moreover, as consumers acclimate to the new look over time, deficiencies in findability are likely to level out.

  • Likability. Surprisingly, overall design appeal has little predictive power; redesigns that improved on likability correlated with sales gains only 46% of the time. In other words, likability measures are less predictive than a coin toss. This reality underscores the importance of objective, quantitative design testing. According to a Nielsen survey, 53% of CPG professionals indicated that “senior executive favorite” is a primary influencer of design selection at their organizations.³ Designalytics’ data suggests something even more profound, though: not only does it not matter whether the brand owner finds a design aesthetically appealing, it doesn’t seem to matter much whether consumers en masse do, at least where purchasing behavior is concerned. This isn’t to suggest that masterful creative talent isn’t essential; it just means that packaging can’t only be pretty. It needs to be strategic too.

    Some readers may wonder, “Why measure likability at all, then?” It’s a good question, and the answer lies in likability testing’s ability to provide specific diagnostic feedback. Many research providers will ask consumers for their overall impression of a package: “On a scale of 1 to 5, how much does this package design appeal to you?” When the question is framed this way, the resultant data is likely to be useless.

    At Designalytics, we take a fundamentally different approach to assessing likability. We ask consumers which specific elements they like or dislike (quantitatively) and why they like or dislike them (qualitatively). This helps designers understand whether the execution of specific elements is confusing or polarizing to consumers, and provides vital insights for creative refinement: “That’s a lot of text and it’s too small to read.” “The fruit imagery looks unappetizing.” “Why does a children’s character look so sinister?” “Is that a raisin or a bug?”
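
    To make this element-level approach concrete, here is a small sketch that nets out likes and dislikes per design element while preserving the verbatim reasons for designers to review. The records and structure are hypothetical, meant only to show the shape of this kind of diagnostic output.

        # Hypothetical element-level feedback: which element, whether the
        # respondent liked it, and their verbatim reason why.
        from collections import defaultdict

        feedback = [
            {"element": "fruit imagery", "liked": False, "why": "Looks unappetizing."},
            {"element": "fruit imagery", "liked": True, "why": "Feels fresh and vibrant."},
            {"element": "body copy", "liked": False, "why": "Too much text, too small to read."},
        ]

        tallies = defaultdict(lambda: {"likes": 0, "dislikes": 0, "reasons": []})
        for record in feedback:
            entry = tallies[record["element"]]
            entry["likes" if record["liked"] else "dislikes"] += 1
            entry["reasons"].append(record["why"])

        for element, t in tallies.items():
            print(f"{element}: +{t['likes']} / -{t['dislikes']} | " + "; ".join(t["reasons"]))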

Metrics that may matter

Readers may be wondering about one popular design metric that hasn’t yet been mentioned: standout. Because our innovative eye-tracking methodology was not initially included in syndicated redesign reporting, we are still awaiting a critical mass of validation cases to determine the predictive value of standout. Given that designs must first be noticed in order to be considered for purchase, we expect to see a strong correlation between shifts in standout and sales outcomes—but the jury is still out.

Caveats

Some metrics will matter more or less to particular brands, depending on brand size, the strengths and weaknesses of the current design (when applicable), and strategic objectives. For example, a challenger brand with low awareness may put significantly more weight on strong standout performance than a category leader with a dominant shelf position and share. Likewise, a diaper brand may seek to increase consumers’ emotional response to its design because this less tangible attribute is essential to success in that particular category. These are worthy strategic objectives that depend, to some extent, on a brand’s specific circumstances. That said, the two metrics we’ve determined to be highly correlated with sales outcomes—purchase preference and communication—are critical for all brands.


¹ Based on an analysis of 52 case studies across CPG categories, cross-referencing Designalytics’ consumer data with IRI/Nielsen retail scanner data. (The retail data analysis compares the six months following the new design’s launch to the same period during the prior year to eliminate seasonal effects.)

² Designalytics is the only research provider to offer “first-view data quality,” where each online activity utilizes a new set of consumers. This completely eliminates the pre-exposure bias that plagues all other design research, resulting in much more reliable data. Additionally, Designalytics uses innovative behavioral exercises (not surveys), discrete choice methodologies (not monadic), and higher sample sizes than the industry “gold standard”—all of which contribute to our uniquely high data quality.