Graph theory-based Internal Linking: 50% to 1% Orphan Pages

Graph Theory based Internal Linking

How a LinkPie-based internal linking solution brought orphan pages down from 50% to 1% for a travel brand

Introduction

Internal linking is one of the proven tech SEO weapons, especially for websites with a huge number of landing pages.

This case study is about the travel brand Omio, where we planned to increase the number of SEO landing pages from 200 thousand to 1 million across 28 domains. I was fully convinced of the opportunity and of the approach we had chosen for this massive inventory scale-up, but I was also a bit skeptical, because 50% of our existing landing pages were orphan pages.

“Orphan pages are pages that can’t be reached through the website’s navigation. They are pages that no other page on the website links to.”

So, the plan was to fix the problem of orphan pages first, and then talk about inventory expansion.

The magic solution was born – graph theory based internal linking

That discussion with the Engineering team (the architects behind the solution) was the best one I’ve ever had. This time we drew a world map and started plotting destinations on it to cover the common cases and some edge cases.

[Figure: graph theory internal linking plan]

Then we listed the end goals, which were:

  • Keep everything relevant. The user is king; search engine bots come second.
  • Cover ~100% of the pages, i.e. leave ~0% orphan pages.
  • Make pages more accessible through the navigation flow: we aimed to have at least 90% of the pages within depth level 5.
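
The coverage and depth goals in this list can be measured directly on crawl data. The following is a minimal sketch, not the tooling we used at Omio: it assumes the crawl is available as a dictionary mapping each page to the pages it links to, and the page names, the "home" start page, and the function names are all hypothetical.

```python
from collections import deque

def orphan_pages(links, pages):
    """Return pages that no OTHER page links to (self-links do not count)."""
    linked = {dst for src, dsts in links.items() for dst in dsts if dst != src}
    return set(pages) - linked

def click_depths(links, start):
    """Breadth-first search: minimum number of clicks from start to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt not in depths:
                depths[nxt] = depths[page] + 1
                queue.append(nxt)
    return depths

def share_within_depth(links, pages, start, max_depth=5):
    """Fraction of pages reachable from start in at most max_depth clicks."""
    depths = click_depths(links, start)
    within = [p for p in pages if p in depths and depths[p] <= max_depth]
    return len(within) / len(pages)

# Hypothetical toy crawl: page -> pages it links to.
links = {
    "home": ["london-paris", "berlin-munich"],
    "london-paris": ["paris-lyon", "home"],
    "berlin-munich": ["home"],
    "paris-lyon": [],
    "berlin-brandenburg": ["home"],  # links out, but nothing links to it
}
pages = list(links)

print(orphan_pages(links, pages))                # the single orphan page
print(share_within_depth(links, pages, "home"))  # share of pages within 5 clicks
```

In this toy crawl, "berlin-brandenburg" is the only orphan, and it is also the only page that cannot be reached from "home", so it counts against both the coverage goal and the depth goal.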

The plan was to have a brainstorming session, but we ended up sitting in different corners, scratching our heads. Suddenly, a nerd in the room turned around and mentioned “multi-directed graph”. That was something we had studied back in high school, but we had never thought of applying it to a problem like this. He explained the concept of triangle count, and that’s when this graph theory based internal linking solution was born.
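
For readers who haven't met triangle counting before, here is a small illustration of the concept itself. It is not the production algorithm: it assumes connections between destinations are treated as undirected, and the destination names and the function name are made up for the example. A node's triangle count is the number of pairs of its neighbors that are also connected to each other.

```python
from itertools import combinations

def triangle_counts(edges):
    """Count, for every node, the triangles it participates in (undirected view)."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    counts = {}
    for node, nbrs in neighbors.items():
        # A triangle exists for each pair of neighbors that are themselves connected.
        counts[node] = sum(
            1 for u, v in combinations(sorted(nbrs), 2) if v in neighbors[u]
        )
    return counts

# Hypothetical connections between destinations.
edges = [
    ("London", "Paris"),
    ("Paris", "Brussels"),
    ("Brussels", "London"),
    ("Paris", "Lyon"),
]

counts = triangle_counts(edges)
print(counts)
# Every triangle is counted once per corner, so divide the sum by 3.
print(sum(counts.values()) // 3)
```

London, Paris and Brussels form one triangle here, while Lyon hangs off Paris without closing one. Intuitively, nodes that share triangles sit in a tightly connected cluster, which is one way to reason about which pages are relevant to each other.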

We added some tweaks to ensure that our important pages (based on demand and supply parameters) got the importance they deserve. For example, London to Paris was one of our most popular routes, whereas Berlin to Brandenburg was hardly attracting any traffic, so we didn’t want the algorithm to treat both pages as equally important. So we added a custom rule: the more important a page is, the more links it gets, while still respecting the relevance rule.
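
One way to picture such a rule is an inbound link quota that grows with a page's importance, filled only from pages that are relevant to it. The sketch below is an assumption-laden illustration, not the rule we actually shipped: the scores, the relevance lists, the base and extra parameters, and all function names are hypothetical.

```python
def inbound_quota(score, max_score, base=1, extra=3):
    """Every page gets `base` inbound links, plus up to `extra` more by importance."""
    return base + round(extra * score / max_score)

def plan_links(relevant, scores, base=1, extra=3):
    """relevant: target page -> candidate source pages, most relevant first.

    Returns source page -> list of target pages it should link to.
    Links are only ever placed on relevant pages, so a quota can go unfilled.
    """
    max_score = max(scores.values())
    plan = {}
    for target, candidates in relevant.items():
        quota = inbound_quota(scores[target], max_score, base, extra)
        for source in candidates[:quota]:
            plan.setdefault(source, []).append(target)
    return plan

def inbound_counts(plan):
    """Number of planned inbound links per target page."""
    counts = {}
    for targets in plan.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts

# Hypothetical importance scores (e.g. derived from demand and supply).
scores = {
    "london-paris": 1000, "london-brussels": 400, "paris-lyon": 300,
    "berlin-brandenburg": 10, "berlin-potsdam": 20, "berlin-leipzig": 30,
}
# Hypothetical relevance lists, most relevant source first.
relevant = {
    "london-paris": ["london-brussels", "paris-lyon"],
    "london-brussels": ["london-paris", "paris-lyon"],
    "paris-lyon": ["london-paris", "london-brussels"],
    "berlin-brandenburg": ["berlin-potsdam", "berlin-leipzig"],
    "berlin-potsdam": ["berlin-brandenburg", "berlin-leipzig"],
    "berlin-leipzig": ["berlin-potsdam", "berlin-brandenburg"],
}

plan = plan_links(relevant, scores)
print(inbound_counts(plan))
```

With these numbers, London to Paris takes a link from every page relevant to it, while Berlin to Brandenburg gets a single inbound link. The base quota of one is what keeps a low-traffic page from becoming an orphan.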

We took one of our domains for a deep analysis to see how this great-in-theory solution would work in reality. The results were great, so we implemented the solution on a couple of test domains. Right after the release, we initiated a fresh crawl to gather the new stats. As soon as the crawl finished, we all had smiles on our faces, seeing 98% coverage. This was beyond imagination, to be honest.

Celebrations? Too early maybe!

The team started to celebrate, but wait! “What are those 2% of pages that still aren’t covered?” I asked, and spoiled the party. We looked into the data: some of the pages were okay to ignore, so we moved them out of the list. For the remaining ones, we found a slight tweak that fixed an edge case we had missed. Wohaaa! The result was 99% coverage, and we were left with only ~1% orphan pages.

Interested in the technical nitty-gritty of the solution? You can find it here.

Results

We started watching our access logs to see how search engines responded to the change, and it all looked promising. We witnessed a gradual increase in rankings and impressions, especially for the pages that hadn’t been performing before. We then rolled it out everywhere and witnessed similar (great) results.

The project was marked as one of the successful ones, and I’m so proud of solving the coverage problem through a graph theory based solution. Massive credit goes to the brilliant brains in the engineering team for seeding this amazing idea. Here’s a quick summary of the end results for you:

Nitin Manchanda

Nitin Manchanda is a developer turned SEO. In the past, Nitin has helped some really popular brands like Omio, trivago, and Flipkart grow organically. He is also the founder of Botpresso, a boutique SEO consultancy. When he's not busy with his SEO stuff, you'll find him planning his next trip, watching a cricket match, or painting, crafting, or breaking things with his two daughters!