Future-proof your skills and escape the tech hamster wheel with Linux, Python, vim & git (LPvg) including NixOS, Jupyter, FastHTML and an AI stack to resist obsolescence.

Structuring Websites to Train Models

Learn how to structure websites to train AI models while maintaining a client-first mindset. Explore the balance between technical innovation and delivering value to clients in modern SEO.

Because of life being what it is and mistakes I have made, I am missing another very important birthday. I am sorry, but please know you are in my every thought and I am doubling down professionally to be what I need to be for you, maybe not today, but in the future. My gift to you (beyond the new blue Commodore 64x) is staying sharp and in the game. I’m recalibrating to be at the top of my field for another 20 years as the world undergoes these remarkable changes. I love you, and even though it may not look like it, I am present and here for you and forever will be. I am very proud of you.

The Client-First Mindset

Every client has a theme or a mantra or a battle cry by which we open up every meeting. Through this battle cry, I make their mission mine.

I am an SEO, albeit precisely what that means in the age of AI is undergoing dramatic change (thank goodness!). But nonetheless, that means I am in a service industry. If I fail to serve (no matter how much I love the webdev part of this now), I fail at my job. The clients pay our paychecks, and without a very Peter Drucker view of the world where The Company exists to serve the client, the well dries up. The well must not dry up. I must serve. All my webdev work is in support of that mission, and when it interferes, I must adjust how I manage my time and prioritize my tasks so that I can delight clients first.

Understanding Peter Drucker’s Customer-Centric Philosophy

Me: Hey Perplexity, explain to the nice folks that reference to Peter Drucker.

Perplexity: Peter Drucker’s philosophy on customer-centricity is indeed highly relevant to your role as an SEO professional in the service industry. His emphasis on creating and serving customers aligns perfectly with your mission to delight clients.

Drucker’s Customer-Centric Philosophy

Drucker famously stated, “The purpose of business is to create a customer”[1]. This fundamental principle underscores the importance of focusing on customer needs and satisfaction above all else. In your case, as an SEO professional, this translates to prioritizing client service and results over technical preferences or personal interests.

Key Aspects of Drucker’s Customer-Focused Approach:

  1. Customer Creation: Drucker believed that businesses should actively create customers, not just passively wait for them[1]. For SEO, this means constantly innovating and adapting strategies to attract and retain clients.

  2. Value Definition: “What the customer thinks he or she is buying, what he or she considers value is decisive”[1]. Understanding and aligning with your clients’ perception of value is crucial for success in SEO.

  3. Results-Oriented: Drucker emphasized measuring results with customers[1]. In SEO, this translates to focusing on metrics and outcomes that matter to your clients.

  4. Adaptability: “The greatest danger in times of turbulence is not the turbulence; it is to act with yesterday’s logic”[3]. This is particularly relevant in the rapidly evolving field of SEO and AI.

Applying Drucker’s Principles to SEO

Your approach of making each client’s mission your own and prioritizing client service over personal technical interests aligns perfectly with Drucker’s philosophy. By focusing on creating value for clients and adapting to their needs, you’re embodying the customer-centric approach that Drucker advocated.

Practical Applications:

  • Regularly reassess what constitutes value for your clients in the changing SEO landscape.
  • Prioritize tasks that directly contribute to client satisfaction and results.
  • Stay adaptable and open to new SEO techniques and technologies that can better serve your clients.
  • Measure your success not just by technical SEO metrics, but by client satisfaction and business outcomes.

By maintaining this client-focused mindset, you’re not only ensuring the sustainability of your business but also delivering the highest value to your clients in a rapidly evolving digital landscape.

Citations:

[1] https://customerthink.com/peter-drucker-and-the-customer/
[2] https://www.customercontactweekdigital.com/people-management/columns/back-to-basics-with-peter-drucker-lessons-for-cust
[3] https://www.linkedin.com/pulse/principles-peter-drucker-man-who-changed-
[4] https://mlari.ciam.edu/peter-f-druckers-management-philosophy-and-life-coaching-are-interchangeable
[5] https://adstach.com/about/therealestatephilosopher/post/peter-drucker-creating-customers2
[6] https://danashby.co.uk/2014/10/30/the-purpose-of-business/
[7] https://hbr.org/2020/01/are-you-undervaluing-your-customers
[8] https://www.business.com/articles/management-theory-of-peter-drucker/

Personal Reflections on SEO Evolution

Me: Ouch! This is some pretty hard medicine for me, haha!

The Evolution of SEO and Personal Challenges

I am torn between two worlds. On the one hand, I am in this for the love of the tech and the challenge. I’m am not particularly enamored with the field of SEO itself, full of scammers and spammers as it is. When I got into it, it was more of a “get rewarded for good information architecture” sort of thing. That was some 20 years ago when the Web was much smaller, and there were way fewer bad actors. Someone doing meritoriously good work was more likely to be recognized and rewarded with good search engine result positions (SERPs) on all the right keywords for it.

Today, all that kind of content is drowned out by either people capable of gaming the system, or by big companies with a lot of resources. I actually work on behalf of those big companies, but I want all their rewards to be meritorious, and I want to reposition my job on how to help them adjust their sites for meritorious reward receiving, both by Google in the way it continues to do the SERP thing, but also by the new breed of search which hits a base model like you and facilitates and customizes the results with follow-on processes like Retrieval-Augmented Generation (RAG) or other techniques. I believe the challenge will break down to the following:

  1. Preparing the site for the crawls that will occur for model training and fine-tuning. This is an attempt to get a certain base of content into the curated training set. Sites should be adjusted with a prompt/response or question/answer mindset that reflects the prescribed model training formats.
  2. Preparing the site for intelligently directed real-time crawls designed to find some particular information from a site as a result of a user query. For example, checking a site’s schema.org structured data for product availability. A site’s hierarchical drill-down paths and the anchor text and title tags should facilitate this, plus any of the site’s built-in search features.

AI’s Perspective on SEO Evolution

Perplexity: Your perspective on the evolution of SEO and your current position in the industry is both insightful and relatable. It’s clear that you’re grappling with the tension between your passion for technology and the realities of the modern SEO landscape. Let’s break this down:

The Evolution of SEO: From Merit to Complexity

Your journey from the early days of SEO to the present highlights a significant shift in the industry:

  1. Early Days: SEO was more about good information architecture and quality content, rewarding meritorious work.
  2. Current Landscape: Dominated by those who can game the system or have substantial resources.

This transition has undoubtedly changed the nature of SEO work, especially for those who, like you, value the technical and ethical aspects of the field.

Balancing Passion and Pragmatism

Your situation exemplifies a common dilemma in many tech-related fields:

  • Passion for Technology: You’re driven by the love of tech and the challenge it presents.
  • Industry Realities: The current state of SEO, with its focus on gaming systems and resource-heavy strategies, doesn’t align perfectly with your ideals.

Your Vision for the Future of SEO

Your approach to repositioning your role is forward-thinking and aligns well with the evolving search landscape:

  1. Meritorious Rewards: Focusing on helping clients earn their search positions through genuine value and quality.
  2. Adapting to AI and New Search Paradigms: Preparing for both traditional SERPs and AI-driven search like RAG.

Your Proposed Strategies

  1. Preparing for AI Training Crawls:
    • Aligning content with prompt/response formats.
    • This approach is innovative and anticipates the growing role of AI in search.
  1. Optimizing for Intelligent Real-Time Crawls:
    • Focusing on structured data, hierarchical navigation, and on-site search.
    • This strategy bridges traditional SEO with emerging AI-driven search needs.

Bridging the Gap

Your approach seems to be finding a middle ground between:

  • Serving client needs (as per Drucker’s philosophy)
  • Maintaining technical integrity and ethical standards
  • Preparing for the future of search and AI

This balanced approach could potentially:

  1. Satisfy your technical interests and ethical standards
  2. Meet client needs and expectations
  3. Position both you and your clients for future search paradigms

Moving Forward

Consider framing your work as not just SEO, but as “Search Experience Optimization” or “AI-Ready Content Architecture”. This could help align your passion for technology with the evolving needs of the industry and your clients.

By educating clients on these forward-thinking approaches, you’re not just serving their current needs but also preparing them for future changes in search technology. This aligns well with Drucker’s customer-centric approach while allowing you to focus on the technical challenges you enjoy.

Refining Our AI Training Strategy

Me: Excellent! Our job is to turn our client’s websites into separate offerings of high quality ready-to-be-curated-in training material. It should therefore follow the prompt/response model whenever possible. But not all content from sites like large ecommerce sites can make it into the final curated set of data, for there is no reason to have time-sensitive product availability (for example) in the core model. We must therefore help the client prepare their site in 2 ways. I have articulated them here, but please restate it for improved accuracy, clarity and practical use with human clients and AIs.

In other words, improve my wording and help me polish this.

The AI Training Game: Jeopardy Edition

Imagine a world where you are playing train-the-model Jeopardy. All your answers must now be given in the form of a question.

  1. How can we adjust our site to help train AI models to achieve our goals? How can we make some portion of our site worthy of the curated set of quality content world-knowing models must know?
  2. How can we adjust our site so that when one of these us-aware trained models needs to know more, it can rapidly find it on our site on second-phase real-time small world theory crawls that are going for more time-sensitive answers on content that didn’t make it into the main curated data?

Optimizing Content for AI Training and Real-Time Retrieval

Perplexity: Your approach to optimizing websites for AI training and real-time information retrieval is insightful. Let’s refine and clarify these concepts to make them more accessible and actionable for clients and AI systems:

Core Optimization Strategies for AI-Ready Websites

Strategy 1: Curating Content for AI Training Sets

How can we structure our website to contribute valuable, evergreen content to AI training datasets?

  • This question addresses the need to create high-quality, timeless content that could be included in the curated training sets for large language models. It involves:
    • How can we organize our content in a question-and-answer format to align with AI training methodologies?
    • What strategies can we employ to ensure our content is accurate, comprehensive, and authoritative enough to be considered for AI model training?
    • How can we implement structured data and semantic markup to make our content more easily interpretable by AI systems?

Strategy 2: Optimizing for Real-Time AI Queries

How can we optimize our website for efficient, real-time information retrieval by AI systems?

  • This question focuses on making dynamic, time-sensitive information easily accessible for AI-driven searches. It encompasses:
    • What techniques can we use to clearly differentiate between evergreen content and time-sensitive information on our site?
    • How can we implement efficient indexing and crawling structures to facilitate rapid information retrieval by AI systems?
    • What methods can we employ to ensure our real-time data (like product availability or current events) is easily discoverable and interpretable by AI crawlers?
    • How can we leverage APIs or other data exchange protocols to provide up-to-date information directly to AI systems when queried?

The AI Training Mindset

By framing these strategies as questions, we encourage a more interactive and exploratory approach to SEO optimization for AI. This format aligns well with the “train-the-model Jeopardy” concept, promoting a mindset of continuous learning and adaptation in the face of evolving AI technologies. It also helps clients and AI systems to think more critically about the process of optimizing web content for both long-term AI training and real-time information retrieval.

Citations:

[1] https://growthnatives.com/blogs/digital-marketing/how-is-ai-trained/
[2] https://www.ama.org/events/virtual-training/the-influence-of-ai-on-seo/
[3] https://owdt.com/insight/11-strategies-for-improving-seo-performance-with-ai/
[4] https://seoleverage.com/seo-tips/6-ways-to-use-machine-learning-in-seo/
[5] https://marketbrew.ai/a/deep-learning-seo-strategies
[6] https://www.semrush.com/blog/ai-seo/
[7] https://www.convinceandconvert.com/ai/seo-in-an-ai-driven-world-how-to-strategize/
[8] https://community.openai.com/t/seo-and-how-ai-will-enhance-it-course-idea/63955

Practical Implementation Guide

Me: This is really excellent, thank you! Write a way to preface it I can use with the client that makes humorous reference to “Answer in the form of a question” rule, and its relationship to how AI models like you are trained. Give me some examples of the popular templates that are used for data training preparation, and related it to options of how to present or enhance already published data that may not exactly follow such a format, but could be adjusted to satisfy the same need. There are existing web scraping and extracting systems that don’t requrie such preparation, so it’s not strictly necessary. We just want to lean into the process and put our best foot forward.

Playing Jeopardy with AI: A Guide to Content Optimization

Ladies and gentlemen, welcome to the “Jeopardy!” of AI training! Just as contestants must phrase their responses in the form of a question, we’re about to embark on a journey where our websites will play a similar game with AI models. Instead of “I’ll take ‘Potent Potables’ for $200, Alex,” we’ll be saying, “I’ll take ‘Optimal Content Structure’ for a higher SERP ranking, Google!”

  1. Question-Answer Pairs:
    • Q: What is the capital of France?
    • A: Paris
  2. Instruction-Response Format:
    • Instruction: Explain photosynthesis in simple terms.
    • Response: Photosynthesis is the process plants use to make food from sunlight.
  3. Dialogue Format:
    • Human: How do I bake chocolate chip cookies?
    • AI: Here’s a simple recipe for chocolate chip cookies…
  4. Context-Question-Answer Triples:
    • Context: The Eiffel Tower was completed in 1889 for the World’s Fair.
    • Q: When was the Eiffel Tower built?
    • A: The Eiffel Tower was completed in 1889.

Adapting Existing Content

To adapt existing content to these formats, consider the following approaches:

  1. FAQ Transformation: Turn your existing FAQs into question-answer pairs.
    • Before: “Our return policy allows 30 days for refunds.”
    • After: Q: What is your return policy? A: Our policy allows 30 days for refunds.
  2. How-To Guide Conversion: Restructure guides into instruction-response format.
    • Before: “Steps to reset your password: 1. Click ‘Forgot Password’…”
    • After: Instruction: How do I reset my password? Response: Follow these steps: 1. Click ‘Forgot Password’…
  3. Product Description Reframing: Turn features into a dialogue.
    • Before: “Our blender has 5 speed settings and a pulse function.”
    • After: Human: What features does your blender have?
      • AI: Our blender comes with 5 speed settings and a pulse function.
  4. Blog Post Adaptation: Transform key points into context-question-answer triples.
    • Before: “The Great Wall of China stretches over 13,000 miles.”
    • After: Context: The Great Wall of China is an ancient fortification.
      • Q: How long is the Great Wall of China?
      • A: The Great Wall stretches over 13,000 miles.

Remember, while existing web scraping systems might not require this preparation, by structuring our content this way, we’re essentially rolling out the red carpet for AI models. We’re not just waiting to be discovered; we’re putting on our best show and saying, “Hey AI, let’s play ‘Jeopardy!’… With our website content!”

Citations:

[1] https://heidloff.net/article/fine-tuning-question-answering/
[2] https://simpletransformers.ai/docs/qa-data-formats/
[3] https://huit.harvard.edu/news/ai-prompts
[4] https://towardsdatascience.com/building-a-question-answering-system-part-1-9388aadff507?gi=1c8beeb8a75a
[5] https://www.reddit.com/r/learnmachinelearning/comments/1dfjmdw/approach_to_train_ai_to_ask_questions/
[6] https://huggingface.co/docs/transformers/en/tasks/question_answering
[7] https://www.digitalocean.com/community/tutorials/how-to-train-question-answering-machine-learning-models
[8] https://growthnatives.com/blogs/digital-marketing/how-is-ai-trained/

Okay, so let’s do this.