GTM Vault

Company Intel Scraper (RAG Pipeline)

Enterprise-grade company research pipeline. It crawls a company's sitemaps, chunks the page content for RAG, generates OpenAI embeddings, and stores them in a Supabase vector DB, where semantic search supplies context for AI email copy generation.

workflow, Enrichment & Research, AI-Powered, Sales Intelligence, Outbound
Original by Roheel Jain
Nodes: 48
Triggers: 2
Platform: n8n

Tech Stack
  • OpenAI Embeddings
  • Supabase Vector DB
  • Sitemap Crawling
  • RAG
  • HTTP API
Workflow

  • Manual Trigger (manualTrigger)
  • Set - Config (set)
  • Parse Email Doc (code)
  • IF - Skip Summary (if)
  • Supabase - Get Step Stats (httpRequest)
  • Enrich with Performance (code)
  • OpenAI - Generate Embedding (httpRequest)
  • Prepare Supabase Upsert (code)
  • Supabase - Upsert Email Copy (httpRequest)
  • HTTP Request (httpRequest)
  • Merge (merge)
  • Supabase - Get Step Stats1 (httpRequest)
  • Set - Unique ID (set)
  • Fetch Campaign Request (httpRequest)
  • Map Fields (code)
  • OpenAI - Embed Request (httpRequest)
  • Supabase - Match Email Copies (httpRequest)
  • Build AI Prompt (code)
  • OpenAI - Generate Copy (httpRequest)
  • Format Output (code)
  • Fetch Clients (httpRequest)
  • Parse Form + Match Client (code)
  • Supabase - Upsert Campaign Request (httpRequest)
  • Webhook1 (webhook)
  • Webhook (webhook)
  • Normalize Input (code)
  • Scrape - Key Pages (httpRequest)
  • Aggregate Scraped Content (code)
  • OpenAI - Extract Profile (httpRequest)
  • Format Profile (code)
  • Supabase - Upsert Profile (httpRequest)
  • Set - Config1 (set)
  • Fetch robots.txt1 (httpRequest)
  • Extract Sitemap URL1 (code)
  • Fetch Sitemap1 (httpRequest)
  • Filter Key Pages1 (code)
  • Delete Old Scraped Assets (httpRequest)
  • Chunk Pages for RAG (code)
  • OpenAI - Embed Chunk (httpRequest)
  • Format Asset Row (code)
  • Supabase - Insert Assets (httpRequest)
  • Code in JavaScript (code)
  • If (if)
  • Fetch Sub-Sitemaps (httpRequest)
  • Code in JavaScript1 (code)
  • Fetch Page Free (httpRequest)
  • Has Content? (if)
  • Check Content (code)

How It Works

This automation turns a company's website into a searchable research database and uses it to draft personalized email copy. Enter a company URL and the workflow returns AI-generated email content grounded in an analysis of that company's own pages.

1. Submit company website for analysis

Enter the target company's website URL into the system. The automation begins by identifying all the important pages on their site.
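
One plausible way to identify a site's pages, suggested by the "Fetch robots.txt1" and "Extract Sitemap URL1" node names, is to read the `Sitemap:` lines from robots.txt. The listing does not show the node's actual code, so the following is only a minimal Python sketch of that idea; the function name `extract_sitemap_urls` and the `/sitemap.xml` fallback are assumptions made for illustration.

```python
from urllib.parse import urljoin


def extract_sitemap_urls(robots_txt: str, base_url: str) -> list[str]:
    """Return sitemap URLs declared in robots.txt, or a conventional fallback."""
    sitemaps = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split "Field: value" on the first colon
        # only, so the colon inside "https://" is preserved in the value.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    # Assumption: fall back to /sitemap.xml when robots.txt declares nothing.
    return sitemaps or [urljoin(base_url, "/sitemap.xml")]


robots = """User-agent: *
Disallow: /admin
Sitemap: https://example.com/sitemap_index.xml
"""
print(extract_sitemap_urls(robots, "https://example.com"))
```

Fetching robots.txt itself is left out so the sketch stays runnable offline; in the workflow that step is an HTTP request node.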

2. Scan and collect company content

The system automatically crawls through key pages like About Us, Services, and Product pages to gather comprehensive information about the company.
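
Selecting key pages from a sitemap could look roughly like the sketch below. It assumes a standard sitemap XML document and a hypothetical keyword list (`KEY_PAGE_HINTS`); the workflow's real "Filter Key Pages1" logic is not shown in the listing and may differ.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical keyword list, chosen for illustration only.
KEY_PAGE_HINTS = ("about", "service", "product", "solution", "pricing")


def filter_key_pages(sitemap_xml: str, limit: int = 10) -> list[str]:
    """Pick the homepage plus URLs whose path contains a key-page hint."""
    root = ET.fromstring(sitemap_xml)
    # Sitemap elements are namespaced, so compare the local tag name only.
    urls = [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    ]
    selected = []
    for url in urls:
        path = urlparse(url).path.lower()
        if path in ("", "/") or any(hint in path for hint in KEY_PAGE_HINTS):
            selected.append(url)
    return selected[:limit]


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about-us</loc></url>
  <url><loc>https://example.com/blog/2021/holiday-party</loc></url>
  <url><loc>https://example.com/services/consulting</loc></url>
</urlset>"""
for page in filter_key_pages(sample):
    print(page)
```

Capping the result with `limit` keeps the number of pages scraped (and later embedded) bounded, which matters because cost grows with content volume.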

3. Build intelligent company profile

AI analyzes all collected content to create a detailed profile including the company's services, value propositions, industry focus, and key messaging.

4. Store information for smart search

All company data is organized and stored in a searchable database that understands context and meaning, not just keywords.
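
Before page content can be embedded and stored for this kind of search, it is usually split into chunks (the workflow has a "Chunk Pages for RAG" node for this). The sketch below shows one common approach, word-based windows with overlap. The chunk size, overlap, and function name are assumptions for illustration, not the workflow's actual settings.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks that overlap by `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


page_text = " ".join(f"w{i}" for i in range(10))
for index, chunk in enumerate(chunk_text(page_text, chunk_size=4, overlap=1)):
    print(index, chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the price of embedding some words twice.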

5. Generate personalized email copy

When you request email content, the system searches through the company profile and generates relevant, personalized copy that speaks directly to their business needs.
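
The retrieval step described here can be pictured as ranking stored chunks by vector similarity and passing the best matches to the model as context. The sketch below uses tiny hand-written vectors in place of real OpenAI embeddings and ranks in plain Python rather than in Supabase; the function names, prompt layout, and sample data are all hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query_vec: list[float], rows: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k rows most similar to the query vector."""
    ranked = sorted(rows, key=lambda row: cosine_similarity(query_vec, row[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(request: str, context_chunks: list[str]) -> str:
    """Assemble retrieved chunks and the user's request into one prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Company context:\n{context}\n\nTask: {request}"


# Toy 3-dimensional vectors standing in for real embeddings.
rows = [
    ("We build payroll software for clinics.", [0.9, 0.1, 0.0]),
    ("Our office dog is named Biscuit.", [0.0, 0.2, 0.9]),
    ("Pricing starts per employee per month.", [0.7, 0.6, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]
matches = top_matches(query_vec, rows, k=2)
print(build_prompt("Write a short intro email.", matches))
```

In the workflow itself, the similarity search runs in the database (the "Supabase - Match Email Copies" node) and the prompt is assembled by the "Build AI Prompt" node, so this only illustrates the shape of the idea.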

What You'll Need

OpenAI, Supabase, Web Scraping Service
  • OpenAI API account with available credits
  • Supabase database account for storing company profiles
  • Target company website URLs that are publicly accessible
  • Basic understanding of your ideal customer profile
  • Email templates or style guidelines you want to follow

Estimated Cost per Run

USD 0.15 – 2.50 (Cost varies based on website size and content volume. Includes OpenAI API usage for content analysis and email generation.)
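
Since the cost depends on how much text is processed, a rough estimate can be made from token counts. The sketch below only shows the arithmetic: every rate and token count in it is a made-up placeholder, not an actual OpenAI price or a measured figure from this workflow, so substitute current pricing and your own volumes.

```python
def estimate_run_cost(
    embedding_tokens: int,
    prompt_tokens: int,
    completion_tokens: int,
    *,
    embed_rate: float,
    prompt_rate: float,
    completion_rate: float,
) -> float:
    """Estimate USD cost for one run; all rates are USD per 1 million tokens."""
    total = (
        embedding_tokens * embed_rate
        + prompt_tokens * prompt_rate
        + completion_tokens * completion_rate
    )
    return total / 1_000_000


# Placeholder numbers for illustration only.
cost = estimate_run_cost(
    embedding_tokens=200_000,
    prompt_tokens=50_000,
    completion_tokens=5_000,
    embed_rate=0.10,
    prompt_rate=2.00,
    completion_rate=8.00,
)
print(f"Estimated cost: USD {cost:.2f}")
```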

This automation has 4 configurable settings you'll customize during setup.