AI-Native Content Workflows via Astro 6

A highly technical guide on building zero-maintenance, automated AI content pipelines using Astro 6 content collections, GitHub Actions, and LLMs.

TL;DR: Manual content creation is an unscalable bottleneck for engineering teams. By architecting an AI-native content pipeline combining GitHub Actions orchestration, LLM (Gemini/GPT) text generation, and Astro 6’s strictly typed Content Collections, you can run a low-maintenance blog engine that drafts its own posts and leaves only the final review to a human. This guide breaks down the architectural requirements, script infrastructure, and continuous integration pipeline needed to automate technical writing.

The Bottleneck of Manual Content Generation

As an engineer, your highest-leverage activity is shipping code, not writing SEO copy.

Yet, discoverability remains a critical requirement for indie hackers and boutique agencies.

The traditional approach—context switching from IDE to CMS, wrestling with markdown, and manually optimizing meta tags—destroys velocity.

The solution is an AI-Native Content Pipeline. This isn’t about spamming generated content; it’s about treating technical writing as an automated CI/CD task.

By treating prompts as code and executing them via scheduled jobs, we can transform Git commit histories and PR descriptions into high-quality, technically accurate Dev Logs and SEO articles.

Architectural Overview

A robust AI content pipeline relies on three distinct layers:

  1. The Orchestrator: GitHub Actions scheduled cron jobs.

  2. The Generator: A Node.js executable utilizing LLM APIs (like Google’s Gemini).

  3. The Consumer: Astro 6 Content Collections for rigorous type validation and static rendering.

graph TD;
    A[GitHub Actions] -->|Executes| B[Generator Script];
    B -->|Fetches Data| C[GitHub REST API];
    B -->|Prompts| D[LLM API];
    D -->|Returns MDX| B;
    B -->|Writes File| E[Astro Content Collection];
    E -->|Triggers Build| F[Astro CLI];
    F -->|Outputs| G[Static HTML];

The Orchestration Layer: GitHub Actions

We trigger the generation pipeline using GitHub Actions. To maintain quality control, the workflow shouldn’t push directly to main.

Instead, it creates a draft Pull Request. This allows for human review (“human-in-the-loop”) to verify technical accuracy and tone before publishing.

Here is the .github/workflows/generate-blog.yml configuration:

name: Generate Blog Post

on:
  schedule:
    - cron: '0 8 * * 1' # Every Monday at 8 AM UTC
  workflow_dispatch: # Allow manual triggering

jobs:
  generate:
    runs-on: ubuntu-latest
    permissions:
      contents: write
      pull-requests: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '22'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Execute Generator Script
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
        run: npm run generate:blog

      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v6
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          commit-message: 'feat(blog): automated weekly blog post generation'
          title: 'Automated Blog Post Draft'
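          body: 'Automated draft generated by AI. Please review and merge.'
          branch: 'automated-blog-draft'
          add-paths: 'src/data/blog/*.mdx'
          draft: true # Open as a draft PR so a human reviews before publishing
          # Note: PRs opened with GITHUB_TOKEN do not trigger other workflows.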

The Generator Layer: Prompt Engineering as Code

The generator is a standalone TypeScript script executed via tsx. Its primary responsibility is fetching context (recent commits, issues) and feeding it into a strict prompt template.
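
The context-fetching step could look something like the sketch below. It assumes the context is the last week of commits, pulled from GitHub's "list commits" REST endpoint with the built-in fetch of Node.js 18+. The names fetchRecentCommits, formatCommitsForPrompt, and CommitSummary are hypothetical and not part of the original script.

```typescript
// Hypothetical helpers; only the REST endpoint and its response fields come from GitHub's API.
interface CommitSummary {
  sha: string;
  message: string;
  date: string;
}

// The subset of fields this sketch reads from GET /repos/{owner}/{repo}/commits.
interface GitHubCommit {
  sha: string;
  commit: { message: string; author: { date: string } | null };
}

async function fetchRecentCommits(
  owner: string,
  repo: string,
  token: string,
  days = 7,
): Promise<CommitSummary[]> {
  const since = new Date(Date.now() - days * 24 * 60 * 60 * 1000).toISOString();
  const url =
    `https://api.github.com/repos/${owner}/${repo}/commits` +
    `?since=${encodeURIComponent(since)}&per_page=50`;
  const res = await fetch(url, {
    headers: {
      Accept: 'application/vnd.github+json',
      Authorization: `Bearer ${token}`,
    },
  });
  if (!res.ok) throw new Error(`GitHub API responded with ${res.status}`);
  const commits = (await res.json()) as GitHubCommit[];
  return commits.map((c) => {
    const [subject = ''] = c.commit.message.split('\n'); // keep the subject line only
    return { sha: c.sha.slice(0, 7), message: subject, date: c.commit.author?.date ?? '' };
  });
}

// Turn the summaries into a compact bullet list that can be embedded in the prompt.
function formatCommitsForPrompt(commits: CommitSummary[]): string {
  if (commits.length === 0) return 'No commits in the selected window.';
  return commits
    .map((c) => `- ${c.sha} (${c.date.slice(0, 10)}): ${c.message}`)
    .join('\n');
}
```

Keeping the formatting in a pure function makes it easy to unit-test what the model will see without calling the network.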

Crucial Architecture Rule: Prompts are business logic. Store them in dedicated files (like JULES_SCHEDULED_PROMPTS.md) and version control them meticulously.
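
One way to treat a prompt as versioned code is to keep it in a template file and fill in placeholders at run time. The sketch below assumes a {{name}} placeholder syntax, which is an illustrative choice rather than anything the original pipeline prescribes. renderPrompt and loadPrompt are hypothetical helper names.

```typescript
import { readFile } from 'node:fs/promises';

// Replace every {{name}} placeholder and fail loudly if a variable is missing,
// so a renamed placeholder breaks the job instead of silently reaching the LLM.
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match, name: string) => {
    const value = vars[name];
    if (value === undefined) throw new Error(`Missing prompt variable: ${name}`);
    return value;
  });
}

// Read the version-controlled template from disk, then render it.
async function loadPrompt(file: string, vars: Record<string, string>): Promise<string> {
  const template = await readFile(file, 'utf-8');
  return renderPrompt(template, vars);
}
```

A call such as loadPrompt('prompts/weekly-post.md', { commits }) would then produce the final prompt; the file path here is only an example.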

Here is a simplified example of the Node.js generator script:

import fs from 'node:fs/promises';
import path from 'node:path';
import { GoogleGenerativeAI } from '@google/generative-ai';

// Initialize the API
const apiKey = process.env.GEMINI_API_KEY;
if (!apiKey) throw new Error('GEMINI_API_KEY is missing.');
const genAI = new GoogleGenerativeAI(apiKey);

async function generatePost() {
  const model = genAI.getGenerativeModel({ model: 'gemini-1.5-pro' });

  // Inline prompt for illustration; in practice, load it from a versioned prompt file
  const prompt = `
    You are an expert technical SEO copywriter and senior software architect.
    Write an 800-word highly technical, SEO-optimized blog post for an Astro 6 portfolio.
    Return strictly valid MDX with a frontmatter block.
    Ensure frontmatter includes: title, description, pubDate, category, tags, readTime, draft, image.
  `;

  const result = await model.generateContent(prompt);
  const mdxContent = result.response.text();

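  // Build a date-stamped filename (the date is in UTC)
  const dateStr = new Date().toISOString().split('T')[0];
  const filename = `automated-post-${dateStr}.mdx`;
  const filepath = path.join(process.cwd(), 'src', 'data', 'blog', filename);

  // Make sure the target directory exists before writing
  await fs.mkdir(path.dirname(filepath), { recursive: true });
  await fs.writeFile(filepath, mdxContent, 'utf-8');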
  console.log(`Successfully generated ${filename}`);
}

generatePost().catch((error) => {
  console.error(error);
  process.exitCode = 1; // fail the CI step instead of exiting with status 0
});
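
The script above names files by date only. If you would rather derive the filename from the generated title, that title is untrusted LLM output and should be reduced to a safe slug first. The following is a minimal sketch under that assumption; slugify and buildFilename are hypothetical helpers, not part of the original script.

```typescript
// Reduce arbitrary text to lowercase ASCII letters, digits, and hyphens.
function slugify(title: string): string {
  return title
    .normalize('NFKD') // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, '') // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse every other run of characters into one hyphen
    .slice(0, 80) // keep filenames reasonably short
    .replace(/^-+|-+$/g, ''); // trim leading and trailing hyphens
}

// Combine the UTC date with the slug; fall back to a fixed name if nothing survives.
function buildFilename(title: string, date: Date): string {
  const dateStr = date.toISOString().slice(0, 10);
  const slug = slugify(title) || 'untitled';
  return `${dateStr}-${slug}.mdx`;
}
```

Because slashes and dots never survive slugify, a title such as "../../etc" cannot steer the write outside the blog directory.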

Securing Output: Overcoming LLM Hallucinations

LLMs are prone to hallucinating invalid markdown or omitting required frontmatter fields.

To combat this, your prompt must be highly constrained. Explicitly define the frontmatter schema within the prompt, and specify that the output must only contain the file contents, omitting introductory conversational text.
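
Even a tightly constrained prompt is not a guarantee, so it can help to check the response before writing it to disk. The sketch below assumes two common failure modes: the model wraps the file in a code fence or adds chatter before it, and it drops frontmatter keys. extractDocument and missingFrontmatterFields are hypothetical helpers, and the key check is a deliberately shallow line scan rather than a YAML parser.

```typescript
const REQUIRED_FIELDS = ['title', 'description', 'pubDate', 'category', 'tags', 'readTime', 'draft'];

// Drop anything before the first frontmatter marker (chatter, an opening code fence)
// and a trailing triple-backtick fence, if the model added one.
function extractDocument(raw: string): string {
  const text = raw.replace(/\r\n/g, '\n');
  const start = text.search(/^---\s*$/m);
  if (start === -1) throw new Error('No frontmatter block found in LLM output');
  return text.slice(start).replace(/\n`{3}\s*$/, '\n').trimEnd() + '\n';
}

// Return the required keys that do not appear as top-level frontmatter entries.
function missingFrontmatterFields(doc: string): string[] {
  const match = doc.match(/^---\n([\s\S]*?)\n---\s*(?:\n|$)/);
  if (!match) throw new Error('Frontmatter block is not closed');
  const keys = new Set(
    match[1].split('\n').map((line) => line.match(/^([A-Za-z_]\w*):/)?.[1]),
  );
  return REQUIRED_FIELDS.filter((field) => !keys.has(field));
}
```

If missingFrontmatterFields returns a non-empty list, the generator can retry the request or exit with an error rather than committing a file that is known to be incomplete.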

The Consumer Layer: Astro 6 Content Collections

This is where Astro 6 excels. By pairing Content Collections with a Zod schema, we enforce type safety on the raw MDX files generated by the LLM.

If the LLM hallucinates a category or forgets a required field, the Astro build process fails immediately in the CI pipeline, preventing malformed content from reaching production.

// src/content.config.ts
import { defineCollection } from 'astro:content';
import { z } from 'astro/zod';
import { glob } from 'astro/loaders';

const blog = defineCollection({
  loader: glob({ pattern: '**/*.{md,mdx}', base: './src/data/blog' }),
  schema: z.object({
    title: z.string().max(60, 'Title must be 60 characters or fewer for SEO'),
    description: z.string().max(160, 'Description must be 160 characters or fewer'),
    pubDate: z.coerce.date(), // accepts both YAML dates and quoted date strings
    category: z.enum(['AI', 'Web Development', 'Systems', 'Design', 'Productivity']),
    tags: z.array(z.string()),
    readTime: z.string(),
    draft: z.boolean().default(true),
    image: z.string().optional(),
  }),
});

export const collections = { blog };

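Notice the strict validations: max(60) for titles and z.enum for categories. Schema violations surface when Astro loads the collection, so running npx astro check or a full npx astro build in CI turns the Astro CLI into an automated QA engineer verifying the LLM’s output. The workflow shown earlier does not include such a step, so add one before the pull request is created or as a required check on the pull request.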

High-Performance Parallel Content Fetching

When rendering the blog index or RSS feeds, it’s crucial to optimize data fetching.

A common mistake is sequentially resolving collections or applying intensive parsing synchronously.

Performance Standard: Always parallelize independent asynchronous operations. When dealing with Astro collections, utilize Promise.all() to prevent waterfall delays.

---
// src/pages/blog/index.astro
import { getCollection } from 'astro:content';

// BAD: Sequential Waterfall
// const blogPosts = await getCollection('blog');
// const devLogs = await getCollection('devlog');

// GOOD: Parallelized Fetching
const [blogPosts, devLogs] = await Promise.all([
  getCollection('blog', ({ data }) => !data.draft),
  getCollection('devlog')
]);

const combinedFeed = [...blogPosts, ...devLogs].sort(
  (a, b) => b.data.pubDate.valueOf() - a.data.pubDate.valueOf()
);
---

Security Hardening: XSS in Content Rendering

Finally, when building content-heavy sites, especially those utilizing JSON-LD for semantic SEO, ensure that dynamic data injected into <script> tags is strictly sanitized.

Directly using JSON.stringify on LLM-generated content within an inline script tag opens an attack vector for Cross-Site Scripting (XSS).

If the LLM generates a string containing </script>, it can prematurely close the tag and execute arbitrary injected code.

Always utilize a sanitization utility:

// src/utils/index.ts
export function escapeJsonForScript(obj: unknown): string {
  return JSON.stringify(obj)
    .replace(/</g, '\\u003c')
    .replace(/>/g, '\\u003e')
    .replace(/&/g, '\\u0026')
    // Match the actual line/paragraph separator characters, not the literal text "\u2028"
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}
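
To see the difference in practice, the short sketch below runs a hostile headline through plain JSON.stringify and through an escaping helper. It is self-contained, so it defines its own copy of the helper under the hypothetical name escapeForInlineScript, and the JSON-LD object is made-up sample data.

```typescript
// Same escaping idea as above, repeated here so the sketch runs on its own.
function escapeForInlineScript(value: unknown): string {
  return JSON.stringify(value)
    .replace(/</g, '\\u003c')
    .replace(/>/g, '\\u003e')
    .replace(/&/g, '\\u0026')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

// Sample JSON-LD whose headline tries to break out of the surrounding script tag.
const jsonLd = {
  '@context': 'https://schema.org',
  '@type': 'BlogPosting',
  headline: 'Totally normal title</script><script>alert(1)</script>',
};

const naive = JSON.stringify(jsonLd);
const safe = escapeForInlineScript(jsonLd);

console.log(naive.includes('</script>')); // true: the payload would close the tag
console.log(safe.includes('</script>')); // false: every "<" is now \u003c
console.log(JSON.parse(safe).headline === jsonLd.headline); // true: the data round-trips
```

The escaped output is still valid JSON, so search engines and any script that parses the block read the original values unchanged.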

Conclusion: Ship Faster with AI Orchestration

Architecting an AI-native content workflow transforms technical writing from a chore into a scalable system.

By orchestrating LLM generation through GitHub Actions and enforcing strict validation via Astro 6 Content Collections, developers can maintain a consistent, high-quality digital presence with little manual overhead beyond the final review.

Ship the code, let the robots handle the prose.

Written by Jordan Thirkle

Stay-at-home dad building AI-accelerated products. I write code during naps and after bedtime — every post comes from real work, not theory.
