As an SEO who is often concerned with large keyword research projects, I frequently find that keyword grouping is the most labour-intensive of all the tasks that I do. It’s labour-intensive because it requires a solid understanding of a business offering, enough to determine its significant constituent parts and support a healthy SEO campaign. You can’t properly chop up what you don’t yet understand. I’m reminded of The Simpsons episode where Homer visits a sushi restaurant and eats hastily and improperly prepared fugu, a fish which must be chopped expertly before serving to avoid the poisonous sections.
It’s easy to see the challenge when it comes to setting up campaigns for complex B2B clients selling all manner of exotic services such as recruitment process outsourcing, HR analytics and – my favourite – single customer view.
Surprisingly, I found that things were a little different when it came to ostensibly simple verticals such as fashion. I won’t list in detail the number of items of basic women’s clothing I did not know well enough to properly place in a keyword group, but suffice to say it was plenty.
It would be bad enough if all that was needed was to place keywords in their correct topical groups. But as is quickly becoming industry standard, keyword grouping has become hierarchical keyword clustering. Doing this hierarchical clustering is essential for proper actionable insight into landing page recommendations and structure but it adds a layer of complexity to the grouping process itself. Now, not only do, say, obscure financial software products need to be placed in the topically correct bucket, they also need to be placed under the correct parent and linked hierarchically with their correct child attributes. Depending on the product, doing this properly may deserve a product immersion session and full explainer from the client, and this is something which can’t always happen.
So what’s a keyword-grouping SEO to do?
First, get your search term classification approach in order.
There’s no way around it. You have to understand the product, and you haven’t got much time. Don’t just try to wing it. If you don’t plan the approach out, you reach a point where you already have too many groups and are trying to stuff two quite different things into the same group just to make them fit somewhere. You think “that didn’t quite fit, but just leaving that one in there won’t hurt”. But soon the integrity of your groups has begun to break down and the value of your keyword research document as a guide to a successful SEO campaign is beginning to degrade.
So you need a classification approach to make sure that doesn’t happen.
The classification approach I use is based on clear predefined taxonomies using these 4 levels of analysis which structure and streamline the process of understanding the product, its context and how to chop it up.
How to use Taxonomies to Classify your Search Terms
This approach is borrowed from biologists and it’s the big man on campus when it comes to classification of things that are difficult to classify such as, say, biological organisms, or the different services your new fintech client provides. In this approach, search terms are classified according to taxonomies (naming systems) specific to each level of resolution (broad and top-level, or granular and low level). A taxon is the name of a tag applied to a search term, like ‘red’. Taxa is the plural of taxon. Multiple taxonomies may be used as required at each level of resolution, such as colour and size.
I’ll illustrate how the taxonomies are nested using trees like the one shown below, giving examples across 3 diverse verticals:
- Fashion – Items of clothing, footwear or accessories (Pink).
- Jobs – Job openings (Yellow).
- Customer marketing – Marketing software & consultancy (Blue).
Use this key to understand the examples below:
There are 2 distinct types of classification we’ll cover: product (keywords about the product itself) & contextual (keywords about the larger context in which the product exists).
1. Product based classification
The product itself, viewed at 4 levels of analysis, is the ribbon that structures this term classification approach.
The way in which the product fits into each of the 4 levels of analysis (going from very broad to very detailed) must be carefully established before the main keyword list is gathered.
The taxa at each level of resolution inherit the properties of their parent in the level above.
Product level 0
- Complete product concept
- Lowest possible level of resolution
This is the simplest possible one-phrase summary of “what they do”. It’s often not clearly defined when researching, but we should ensure that it is: it sits at the core of the other levels and is the central part of the taxonomy structure. Find it by completing the sentence “X’s site is a ____ site”.
Product level 1
- Broad product type
- Low resolution
- Sometimes called ‘parent’
Broad taxonomies are characterised by few taxa. They should be established before the main keyword research starts. What are the simplest ways in which the product concept is divided up? This is often, but not always, seen in the topics of the top-level navigation landing pages. For example, with a recruitment site we worked on, the top-level landing pages didn’t distinguish between product types – it was one further level down, under the main jobs page, that the product was divided into its broad types. If the site structure isn’t helpful, formulate the broad types using knowledge of the business. After all, we are modelling search demand, not the way the client has structured their website.
Product level 2
- Specific product – Not broad abstractions, but real things.
- Mid resolution – the level of common experience of the product; is often the product as spoken about in the common parlance of the user of the product – for example, ‘shirts’ or ‘site manager jobs’.
This makes up the main body of the term gathering. Find the taxa using exploratory research methods driven by the level 1 roots. Product taxonomies do not have a taxa limit. There can be more taxa at this level than the levels below. Something like ‘clothes’ couldn’t go here because clothes is an abstraction encompassing multiple types of clothes, and ‘red’ couldn’t go here because red isn’t a real thing either, it’s an attribute of a thing.
It’s a little more difficult to grasp with services and software, but it’s easier if you ask ‘at what level does the user specifically engage with intent?’ – it isn’t ‘customer marketing’ (very broad) or ‘customer marketing software’ (broad), it’s ‘customer satisfaction measurement software’ – a specific use within the category of customer marketing. The user won’t think ‘I’m going to do some customer marketing with this software’ – instead they’ll think ‘I’m going to do some customer satisfaction measurement with this software’.
The coloured taxa are a mix of our 3 verticals, showing how taxa from each vertical would fit into the structure.
Product level 3
- Product attributes & characteristics
- High resolution, granular
Attributes of (characteristics of) the specific level 2 products. Multiple taxonomies are probable at level 3. We wouldn’t put shirts here, as an attribute of clothes, because that isn’t their relationship.
Obvious physical examples are colours and sizes. Perhaps surprisingly, gender for fashion belongs here and not at level 1, because a shirt being for men or women is a characteristic of the shirt.
Again, here is a mix of taxonomies from our 3 example verticals – Colour for fashion, Location for jobs and Software integration for customer marketing.
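To make the nesting concrete, here is a minimal Python sketch of one branch of the customer marketing vertical, assuming each taxon simply stores a link to its parent in the level above. The `Taxon` class, its `lineage` method and the ‘CRM integration’ attribute are illustrative assumptions, not part of any real tool or client taxonomy.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Taxon:
    """One tag in a taxonomy, linked to its parent in the level above."""
    name: str
    level: int                       # 0 = product concept ... 3 = attribute
    parent: Optional[Taxon] = None

    def lineage(self) -> list[str]:
        """Names from level 0 down to this taxon, i.e. everything it inherits."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return list(reversed(chain))


# Levels 0 to 2 use the customer marketing examples from the text.
concept = Taxon("customer marketing", 0)
broad = Taxon("customer marketing software", 1, parent=concept)
specific = Taxon("customer satisfaction measurement software", 2, parent=broad)

# Level 3: a taxon from the Software integration taxonomy.
# "CRM integration" is a hypothetical example, not taken from the article.
attribute = Taxon("CRM integration", 3, parent=specific)

print(attribute.lineage())
```

Walking the parent links is one simple way to express that a level 3 attribute inherits everything above it; a spreadsheet with one column per level would capture the same structure.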
Informational Term Classification for Products
While product-informational taxa apply at multiple levels of resolution, they tend to be similar enough across the levels (e.g. ‘what is’) to allow us to use a single taxonomy across all the levels, rather than needing to have informational taxonomies at each of the levels. So we’d have all these in one column or perhaps three: knowledge, procedural, evaluation.
AUTOMATICALLY ASSIGNING SEARCH INTENT:
Some quick tips for advanced analysts:
(a) Hierarchical – Presence of a search term at or beyond the third level of resolution predicts higher commerciality, which increases with the number of level 3 taxonomies applicable to the cluster.
(b) Semantic – Added to this, a term’s membership of an informational taxonomy provides another indicator of the level of commerciality.
Both (a) & (b) above can be analysed in a matrix to give a final ‘hierarchical and semantic’ automated estimation of commerciality for search terms which can, in the cluster analysis phase, be summarised at cluster level. This can then be compared to SERP analyses for high value clusters during the cluster analysis phase to give a final measure.
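As a rough illustration of the matrix idea, the Python sketch below crosses the two signals: the number of level 3 taxonomies a term matches (hierarchical) and whether it belongs to an informational taxonomy (semantic). The band labels, the cut-off at two taxonomies and the assumption that informational membership lowers commerciality are all placeholders chosen for the example; calibrate them against your own SERP analyses.

```python
def commerciality(level3_count: int, informational: bool) -> str:
    """Look up an illustrative commerciality band for one search term.

    level3_count  -- how many level 3 taxonomies the term matched
    informational -- True if the term matched an informational taxon
                     (e.g. 'what is')
    """
    # Hierarchical axis: bucket the count into 0, 1, or 2-and-above.
    depth = max(0, min(level3_count, 2))

    # Rows: depth bucket. Columns: [not informational, informational].
    # These bands are assumptions for the sketch, not measured values.
    matrix = [
        ["medium", "low"],        # no level 3 attributes
        ["high", "low"],          # one level 3 taxonomy
        ["very high", "medium"],  # two or more level 3 taxonomies
    ]
    return matrix[depth][int(informational)]


# 'red shirts size 12' would match two level 3 taxonomies (colour, size).
print(commerciality(2, informational=False))
# 'what is a shirt' matches no attributes and is informational.
print(commerciality(0, informational=True))
```

Once each term has a band, the bands can be summarised per cluster, for example by taking the most common band among the cluster’s terms.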
2. Context based term classification
The same system is used as with product search terms, but the structure is a tree growing up and away from the product rather than roots going down into it. Instead of being product-depth focused, it grows out from the product concept in the other direction, encompassing semantically-near concepts.
This example looks solely at our Jobs vertical. The product levels we just looked at grow downwards from ‘Jobs (0)’, the core product.
Contextual keyword research is normally used to support content marketing and outreach, and is best performed with a specific content marketing or outreach goal in mind.
How to group your keywords the fast way
Now you’ve got your taxonomies laid out, and you’ve gathered your list of search terms, it’s time to tag the search terms with the appropriate taxa. If you’ve got a lot of keywords, the thought of doing this manually can be intimidating.
Don’t do it manually. Use this formula instead. It will automatically search through your list of search terms and assign each one the most appropriate taxon:
To work with the formula, you need to add a column to the left of your taxa on each taxonomy which will contain a string identifier, e.g. ‘shorts’ for Shorts, or ‘compan’ to match both ‘company’ and ‘companies’ to the same taxon, Company. The formula looks for each identifier in a search term, and if it finds the identifier anywhere within the search term, it assigns that identifier’s taxon (label) to the search term.
It looks roughly like this:
Here’s the formula broken down into steps:
- Determines what to output when your search term does not match any of the identifier strings.
- Sets the range in which your string identifiers live, i.e. the strings that the formula will look for in your search term.
- Sets the location of your search term.
- Sets the range in which your taxa can be found.
Tip: The formula searches your string identifier range from the bottom up and assigns the first matching tag it finds. So make sure you put more specific identifiers (like ‘light blue’) lower down and broader identifiers (like ‘blue’) at the top, so that for the search term ‘light blue tops’, the tag returned from your taxonomy is ‘light blue’ rather than ‘blue’. One quick way to do this is to sort the identifiers by length, shortest first, using the LEN function. The longer a string is, the more likely that it’s more specific and should therefore be considered first.
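If you would rather tag outside a spreadsheet, the same logic can be sketched in a few lines of Python. This is an approximation of the behaviour described above, not a translation of the formula itself: the function names `tag_term` and `order_by_specificity`, the ‘No match’ default and the case-insensitive matching are all assumptions made for the example.

```python
def tag_term(term, taxonomy, default="No match"):
    """Return the taxon for the lowest-placed identifier found in `term`.

    `taxonomy` is a list of (identifier, taxon) pairs in sheet order,
    top to bottom. Scanning it in reverse mimics the bottom-up search
    described in the tip, so the lowest matching row wins.
    """
    term = term.lower()  # assumption: matching ignores case
    for identifier, taxon in reversed(taxonomy):
        if identifier.lower() in term:
            return taxon
    return default


def order_by_specificity(taxonomy):
    """Sort shortest identifier first, so longer (more specific) ones sit lower."""
    return sorted(taxonomy, key=lambda pair: len(pair[0]))


colours = order_by_specificity([
    ("light blue", "Light blue"),
    ("blue", "Blue"),
    ("red", "Red"),
])

print(tag_term("light blue tops", colours))   # Light blue
print(tag_term("blue jeans", colours))        # Blue
print(tag_term("green scarf", colours))       # No match
```

Because matching is by plain substring, a short identifier can fire inside an unrelated word (‘red’ inside ‘tired’, say), so it is worth spot-checking the output for short identifiers.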