
User Guide

IATO - Website Crawler & Content Governance Platform

Terminology

  • Workspace: A container for organizing related projects and team collaboration
  • Project: A single crawl job targeting a specific website or URL pattern
  • Content Inventory: Catalog of all pages with metadata and classification

Free Trial Limits

  • 1 workspace, 500 pages per crawl
  • 100MB storage, 7-day data retention
  • No exports, sharing, or scheduled crawls

Key Features

  • Visual Sitemap Editor: Drag-and-drop sitemap planning with intelligent auto-layout
  • AI Sitemap Assistant: Conversational AI for restructuring and content creation
  • Content Inventory: Full catalog of pages with type classification
  • Technical SEO Audit: Automated detection of SEO issues, broken links, and redirects
  • Navigation Analysis: Analyze and optimize site navigation structure
  • Taxonomy Builder: AI-assisted content taxonomy creation
  • Screenshot Capture: Thumbnail generation stored directly in the database
  • Minimal Footprint Defaults: Storage-optimized settings for faster, leaner crawls
  • Role-Based Permissions: Owner, admin, member, viewer roles

1. Dashboard Overview

Welcome to IATO! This tool allows you to crawl websites, analyze their structure, find SEO issues, and generate comprehensive reports.

Navigation

The dashboard is organized into the following main sections:

  • Workspaces: Primary view - organize projects and collaborate with team members
  • Settings: Configure credentials, form auth, extraction rules, and admin options

Workspace Detail View

Click on any workspace to open its detail view:

  • Header: Workspace name, visibility badge, and settings gear icon
  • Action Buttons: "New Project" to start a crawl, "New Schedule" to create a recurring crawl
  • Search & Filter: Search by name or URL, filter by All / Scheduled / On-demand
  • Projects List: Combined list of all on-demand and scheduled crawl projects

User Menu

Click your name in the top-right corner to access:

  • My Account: Profile settings, theme & appearance, API key, and notification preferences
  • Subscription: View and manage your plan tier and billing
  • Team Members: Invite users and manage team access
  • User Guide: This documentation
  • API Documentation: REST API reference for all endpoints
  • Developer Portal: API key management with scoped permissions, quickstart guides
  • MCP Server Docs: Model Context Protocol integration for AI tools

System Status

Check the health status indicator in the top-right corner to ensure all services are running:

  • Dashboard (API server)
  • Database (MySQL)
  • Redis (caching/real-time)
  • Crawler Engine

2. Starting a New Project

Basic Settings

  • Website URL: Starting URL for the crawl. Default: required
  • Max Pages: Maximum internal pages to crawl (external links don't count). Default: 500
  • Max External Links: Maximum external links to check (only when "Check external links" is ON); 0 = unlimited. Default: 0
  • Max Depth: How many links deep to follow from the start URL. Default: 3
  • Max Redirects: Maximum redirect chain length. Default: 5
  • Timeout: Request timeout in seconds. Default: 30
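The settings above map naturally onto a crawl configuration object. A minimal sketch, assuming illustrative field names (check the API Documentation for the exact keys the API expects):

```typescript
// Crawl configuration mirroring the Basic Settings defaults above.
// Field names here are illustrative, not the confirmed API schema.
interface CrawlConfig {
  url: string;              // Website URL (required)
  maxPages: number;         // internal pages only
  maxExternalLinks: number; // 0 = unlimited
  maxDepth: number;
  maxRedirects: number;
  timeoutSeconds: number;
}

const defaults: Omit<CrawlConfig, "url"> = {
  maxPages: 500,
  maxExternalLinks: 0,
  maxDepth: 3,
  maxRedirects: 5,
  timeoutSeconds: 30,
};

function makeConfig(url: string, overrides: Partial<CrawlConfig> = {}): CrawlConfig {
  return { url, ...defaults, ...overrides };
}
```

Any setting you leave out falls back to the defaults listed in the table.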

Speed Controls

  • Concurrent Requests: Number of parallel requests (1-20). Higher values crawl faster but may overload servers.
  • Delay: Seconds to wait between requests. Recommended: 0.5-1.0 for production sites.
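The two controls above can be pictured as a small worker pool: `concurrency` workers pull URLs from a shared queue, and each worker pauses for the delay after its own request. A sketch with a simulated fetch function (not IATO's actual crawler code):

```typescript
// Sketch of how Concurrent Requests and Delay interact.
// `fetchPage` stands in for a real HTTP request.
async function crawlAll(
  urls: string[],
  fetchPage: (url: string) => Promise<string>,
  concurrency: number,
  delayMs: number,
): Promise<string[]> {
  const results: string[] = new Array(urls.length);
  let next = 0;
  const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

  async function worker() {
    while (next < urls.length) {
      const i = next++; // claim the next URL in the queue
      results[i] = await fetchPage(urls[i]);
      if (delayMs > 0) await sleep(delayMs); // politeness delay per request
    }
  }

  // Run `concurrency` workers in parallel.
  await Promise.all(Array.from({ length: Math.max(1, concurrency) }, worker));
  return results;
}
```

This also shows why high concurrency can overload a server: the effective request rate is roughly concurrency divided by the per-request delay.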

Crawl Scope

These settings control what gets crawled and stored. Default settings are optimized for minimal storage - enable only what you need.

  • Crawl subdomains: Treat subdomains (blog.example.com) as internal. Default: OFF
  • Store external links: Record links to other domains (URLs only, not checked). Default: ON
  • Check external links (detect broken): Make HEAD requests to external links to check status codes; enables broken external link detection. May slow crawls. Default: OFF
  • Crawl outside start folder: If starting at /blog/, also crawl /products/, /about/, etc. Default: ON
  • Respect robots.txt: Honor robots.txt directives. Disable only for sites you own. Default: ON
  • Store page content (HTML): Save full HTML for content analysis; required for word counts and duplicate detection. Default: OFF

💡 Storage Optimization

Default settings create the smallest database footprint (~5-10 MB per 5,000 pages). Enable additional options only as needed - a full crawl with all options can use 850 MB - 4 GB per project.

Resource Collection

Control which resource types to discover and verify. Each has two options:

  • Images: Track records image URLs found on pages; Check Size makes a HEAD request for status & file size. Default: OFF
  • CSS: Track records CSS file URLs; Check Size makes a HEAD request to verify & get size. Default: OFF
  • JavaScript: Track records JS file URLs; Check Size makes a HEAD request to verify & get size. Default: OFF
  • Fonts: Track records font URLs; Check Size makes a HEAD request to verify & get size. Default: OFF
  • Media: Track records video/audio URLs; Check Size makes a HEAD request to verify & get size. Default: OFF
  • Other: Track records PDFs, docs, etc.; Check Size makes a HEAD request to verify & get size. Default: OFF

Analysis Settings

Enable these to generate insights. When a setting is disabled, its metrics show 0 with a hint to enable it.

  • SEO Analysis: Issues tab, missing titles, meta descriptions, heading structure. Default: OFF
  • Performance Metrics: Response times, TTFB, response time distribution chart. Default: OFF
  • Detect Duplicates: Content similarity analysis, duplicate page detection. Default: OFF
  • Track Redirects: Redirect chain analysis, redirect loop detection. Default: OFF
  • Extract Hreflang: International SEO, language/region targeting analysis. Default: OFF
  • Extract Structured Data: Schema.org markup, JSON-LD, Microdata extraction. Default: OFF

URL Filtering (Advanced)

Use regex patterns to include/exclude URLs:

# Include only blog and product pages
/blog/.*
/products/.*

# Exclude admin and login pages
/admin/.*
/login.*
\?.*session.*
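The filtering logic above can be sketched as: a URL must match at least one include pattern (when any are set) and no exclude pattern. Whether IATO matches against the path or the full URL isn't specified here; this sketch matches the path plus query string:

```typescript
// Sketch of include/exclude regex filtering.
function shouldCrawl(url: string, includes: RegExp[], excludes: RegExp[]): boolean {
  const u = new URL(url);
  const path = u.pathname + u.search;
  // No include patterns means "include everything".
  const included = includes.length === 0 || includes.some((re) => re.test(path));
  const excluded = excludes.some((re) => re.test(path));
  return included && !excluded;
}

// The example patterns from above:
const includes = [/\/blog\/.*/, /\/products\/.*/];
const excludes = [/\/admin\/.*/, /\/login.*/, /\?.*session.*/];
```

With these patterns, /blog/post-1 is crawled, while /about (not included) and /blog/x?session=abc (excluded) are skipped.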

JavaScript Rendering

Enable JavaScript rendering to crawl single-page applications (SPAs) and sites with dynamic content. Uses headless browsers.

Browser Engine

  • Chromium: Default, most compatible with modern web
  • Firefox: Alternative rendering engine
  • WebKit: Safari's engine, useful for iOS testing

Device Presets

  • Desktop 1080p: 1920×1080 (Desktop)
  • Desktop 1440p: 2560×1440 (Desktop)
  • iPhone 14: 390×844 (Mobile)
  • iPhone 14 Pro Max: 430×932 (Mobile)
  • iPad Pro: 1024×1366 (Tablet)
  • Pixel 7: 412×915 (Mobile)
  • Samsung Galaxy S23: 360×780 (Mobile)
  • Googlebot Mobile: 412×823 (Bot)
  • Googlebot Desktop: 1920×1080 (Bot)
  • Custom: user-defined (Custom)

Wait Conditions

  • Network Idle: Wait until no network requests for 500ms (recommended for SPAs)
  • Page Load: Wait for the load event
  • DOM Content Loaded: Wait for the DOMContentLoaded event
  • First Response: Continue after the first server response
  • Wait for Selector: Wait for a specific CSS selector to appear (e.g., #main-content)
  • Extra Wait: Additional delay in milliseconds after the page loads

Resource Blocking

Block resources during rendering to speed up crawls:

  • Images: Skip loading images
  • CSS: Skip loading stylesheets
  • Fonts: Skip loading web fonts
  • Media: Skip loading video/audio

Screenshots

Capture screenshots of each page during the crawl (requires JavaScript rendering).

  • Full Page: Capture the entire scrollable page (vs. viewport only)
  • Format: PNG (lossless), JPEG (smaller), WebP (modern)
  • Quality: Compression quality for JPEG/WebP (10-100%)

3. Managing Projects

Project Statuses

  • pending - Project queued, waiting to start
  • running - Crawl in progress
  • completed - Crawl finished successfully
  • failed - Crawl encountered errors

Project Details View

Click on any project to open the full analysis view. The left sidebar contains collapsible sections:

Crawl Data

  • Overview: Summary dashboard with key metrics and charts
  • Pages: All crawled pages with status codes, titles, and metadata
  • Content: Page content analysis and word counts
  • Links: Internal and external link analysis
  • Technical SEO: Canonicals, robots, response headers
  • On-Page SEO: Titles, meta descriptions, headings
  • Structured Data: Schema.org, JSON-LD, Microdata
  • International: Hreflang and language targeting analysis
  • Resources: Images, CSS, JS, fonts, and media files
  • Screenshots: Page thumbnails (only if captured during crawl)
  • Extracted Data: Custom extraction results (only if rules were used)
  • Issues: Crawl errors, redirects, and warnings
  • Config: Crawl settings and parameters used

Inventory

  • All Assets: Complete content inventory with search and filtering
  • Subdomains: Pages grouped by subdomain
  • Orphan Pages: Pages not linked from any other page

Audit

  • Overview: Audit summary dashboard
  • All Pages: Paginated audit results for every page
  • Issues: Grouped SEO issues with severity

Navigation, Sitemaps, Taxonomy

See dedicated sections below for details on these sidebar areas.

Export & Versions

  • All Exports: Generate and download exports (Pro/Team tier)
  • Version History: Compare recrawl versions and track changes over time

Project Actions

From the workspace list:

  • Click project: Open full project analysis
  • Quick Delete: Hover over a project to reveal the trash icon
  • Bulk Delete: Select multiple projects and delete them at once

From the project detail header:

  • Recrawl: Start a new version with same or modified settings
  • Compare: Compare this crawl with another version
  • Version Dropdown: Switch between different crawl versions

4. Workspaces

Workspaces help you organize projects and collaborate with team members.

Workspace Roles

  • Owner: Full access; can delete the workspace and manage members
  • Admin: Manage projects and members; cannot delete the workspace
  • Member: Create and manage own projects; view all projects
  • Viewer: View projects only; no editing

Creating a Workspace

  1. Click the workspace selector dropdown
  2. Select "Create Workspace"
  3. Enter a name and description
  4. Choose visibility (private or shared)

Inviting Team Members

  1. Open workspace settings (gear icon)
  2. Go to the Members tab
  3. Enter email address and select role
  4. Click Invite

10. Scheduling Crawls

Set up recurring crawls that run automatically. Create a schedule from the workspace detail view by clicking "New Schedule".

Schedule Settings

  • URL: The website to crawl
  • Frequency: Hourly, Daily, Weekly, or Monthly
  • Time: When to run (HH:MM format)
  • Day: Day of week (for weekly) or day of month (for monthly)
  • Timezone: Schedule timezone
  • Notify on changes: Get notified when content changes are detected
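To make the Frequency/Time/Day settings concrete, here is a simplified sketch of computing a weekly schedule's next run. It ignores timezone conversion (IATO applies the schedule's timezone) and uses JavaScript's day numbering (0 = Sunday):

```typescript
// Next run of a weekly schedule: dayOfWeek 0-6, time as hh:mm.
function nextWeeklyRun(from: Date, dayOfWeek: number, hh: number, mm: number): Date {
  const next = new Date(from);
  next.setHours(hh, mm, 0, 0);
  let days = (dayOfWeek - from.getDay() + 7) % 7;
  // Scheduled for today but the time already passed: run next week.
  if (days === 0 && next <= from) days = 7;
  next.setDate(from.getDate() + days);
  return next;
}
```

For example, from a Monday at noon, a Wednesday 09:30 schedule runs two days later, while a Monday 09:00 schedule rolls over to the following week.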

Schedule Actions

  • Run Now: Trigger the schedule immediately
  • Pause/Resume: Temporarily disable/enable the schedule
  • Delete: Remove the schedule

11. Reports

Generate reports from the Reports view. The left panel has a report generation form (select job, type, and format), and the right panel shows recent reports.

Report Types

  • Summary: High-level overview with key metrics and charts
  • Detailed: Complete page-by-page analysis
  • SEO Audit: Focus on SEO issues and recommendations

Output Formats

  • HTML: Interactive web report
  • PDF: Printable document
  • CSV: Spreadsheet-compatible data
  • JSON: Machine-readable format
  • Excel: XLSX with multiple sheets

Note: Report generation creates database records. Full file generation and download is planned for Phase 9.

12. Comparing Crawls

Compare two crawls of the same website to find changes over time. Access via the Compare button in the project detail header, or from the Versions section in the sidebar. Select a baseline (older) job and a compare (newer) job, then click Compare.

Comparison Metrics

  • Added URLs: New pages found in the compare crawl
  • Removed URLs: Pages that disappeared
  • Changed URLs: Pages with different status codes or content
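The three metrics boil down to a diff keyed by URL. A sketch using status codes as the change signal (IATO also compares content, per the description above):

```typescript
// Diff two crawls keyed by URL.
type CrawlPages = Map<string, number>; // url -> status code

function diffCrawls(baseline: CrawlPages, compare: CrawlPages) {
  const added: string[] = [];
  const removed: string[] = [];
  const changed: string[] = [];
  for (const url of compare.keys()) {
    if (!baseline.has(url)) added.push(url);           // new in compare crawl
    else if (baseline.get(url) !== compare.get(url)) changed.push(url);
  }
  for (const url of baseline.keys()) {
    if (!compare.has(url)) removed.push(url);          // disappeared
  }
  return { added, removed, changed };
}
```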

Use Cases

  • Monitor website changes after deployments
  • Detect broken links after migrations
  • Track content additions/removals
  • Verify redirect implementations

13. Settings

The Settings page has 6 tabs for configuring your crawling environment:

Credentials

Configure HTTP Basic or Digest authentication for password-protected websites. Add domain patterns with username/password, then select them when starting a crawl.

Extraction Rules

Create custom rules to extract specific data from pages using CSS selectors, XPath, or Regular Expressions.

Form Auth

Configure login form credentials so the crawler can authenticate before crawling. Specify the login URL, form field names, and credentials.

AI

Configure your personal AI provider override (Bring Your Own Key). Select a provider (Anthropic or OpenAI), enter your API key, and choose a model. This overrides the system AI configuration for your account.

Admin

Platform administration (admin users only). Configure system-wide AI settings, email delivery (SendGrid/SMTP), and access the danger zone for system-level operations.

Integrations

Connect external services to enrich your crawl data:

  • Google Analytics: Connect your GA4 property to see traffic and engagement data alongside crawl results
  • Google Search Console: Connect to see search performance, keywords, and indexing status
  • WordPress: Add WordPress sites for content sync, menu management, and taxonomy push/pull

Team Management

User Roles

  • Owner: Full workspace access, team management, delete workspaces
  • Admin: Manage users, settings, and all projects in the workspace
  • Member: Create projects, manage own crawls, view all data
  • Viewer: Read-only access to projects and reports

Inviting Users

  1. Click user menu → Team Members
  2. Click "Invite User"
  3. Enter email and select role
  4. User receives invitation email

Seat Management (Team Tier)

Team plans include a set number of seats. The Team Members view shows a seat management panel where you can add or remove seats. Additional seats cost $20/month each.

5. Content Inventory

The Content Inventory provides a complete catalog of all crawled pages with metadata and classification.

Inventory Views

Access from the Inventory section in the project sidebar:

  • All Assets: Complete content inventory with search, filtering, and metadata
  • Subdomains: Pages grouped by subdomain for multi-domain analysis
  • Orphan Pages: Pages not linked from any other internal page

6. Navigation Analysis

Analyze and optimize your website's navigation structure for better user experience and SEO.

Navigation Views

Access from the Navigation section in the project sidebar:

Overview

Summary stats: total menus detected, total menu items, and unmapped pages not reachable via navigation.

Menus

View and manage detected navigation menus. Use AI Detection to automatically identify menus from page structure — review, approve/reject, and edit labels.

Page Mapping

See which pages appear in navigation menus. Identify orphan pages (not in any menu) and add pages to menus with one click.

Labels

Analyze navigation anchor text across all menus for consistency and SEO optimization.

Breadcrumbs

Detect and analyze breadcrumb navigation structures across your site.

Navigation Validation

Automatic issue detection:

  • Orphan Pages: Pages not reachable via navigation. Severity: Warning
  • Broken Links: Menu items pointing to 404 pages. Severity: Error
  • Duplicate Labels: Same label used multiple times. Severity: Warning
  • Deep Nesting: Menu items nested too deep (>3 levels). Severity: Info
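The Deep Nesting check is essentially a recursive depth walk over the menu tree. A sketch (the types and the 3-level threshold mirror the description above; the code itself is illustrative):

```typescript
interface MenuItem { label: string; children?: MenuItem[] }

// Return the labels of items nested deeper than `maxLevels`.
function deepItems(items: MenuItem[], maxLevels = 3, level = 1): string[] {
  const flagged: string[] = [];
  for (const item of items) {
    if (level > maxLevels) flagged.push(item.label);
    flagged.push(...deepItems(item.children ?? [], maxLevels, level + 1));
  }
  return flagged;
}
```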

7. Taxonomy

Build and manage taxonomies to organize your content. Access from the Taxonomy section in the project sidebar.

Taxonomy Views

  • Overview: Taxonomy summary dashboard with AI wizard launcher
  • Terms: Browse and manage all taxonomy terms
  • Hierarchy: View and edit parent-child relationships as a tree
  • Thesaurus: Manage synonyms and related term relationships
  • Governance: Taxonomy governance policies and approval workflows
  • History: Change log for taxonomy modifications
  • Tags: Manage custom tags separate from hierarchical categories
  • Validation: Quality checks (see validation table below)
  • Import / Export: Import SKOS/RDF files, export JSON/CSV/SKOS

AI Taxonomy Assistant (5-Phase Wizard)

Launch from Taxonomy Overview to access the guided workflow:

Phase 1: Import Standard Vocabularies

  • Import from Schema.org, Dublin Core, FOAF, and other standards
  • Search BARTOC registry for domain-specific vocabularies
  • Upload SKOS/RDF files directly
  • Use starter taxonomies for common industries

Phase 2: Extract Terms from Content

  • Auto-extract terms from page titles, headings, navigation, URLs
  • Filter by source type (Headings, Navigation, URL Paths)
  • Automatic hierarchy detection from URL structure
  • Select individual terms or use "Select Top 50"

Phase 3: Organize Hierarchy

  • View current taxonomy as a collapsible tree
  • AI suggests parent-child relationships based on term similarity
  • Apply or undo suggestions with one click
  • Confidence scores help prioritize suggestions

Phase 4: Classify Pages

  • Auto-classify pages by matching content against taxonomy terms
  • Human review required: Approve, reject, or delete each classification
  • Bulk actions: Approve All or Reject All pending
  • Stats show Pages, Pending, Approved, Rejected counts

Phase 5: Export Taxonomy

  • JSON: Full hierarchy, machine-readable
  • CSV: Spreadsheet format for Excel/Google Sheets
  • SKOS (RDF/XML): W3C standard for controlled vocabularies
  • Option to include or exclude draft terms

Taxonomy Validation

Automatic quality checks for your taxonomy:

  • Missing Definition: Terms without descriptions. Action: Edit Term
  • Duplicate Labels: Same label used for multiple terms. Action: Review Terms
  • Orphan Terms: Terms not connected in the hierarchy. Action: Fix Hierarchy
  • Draft Terms: Terms not yet approved. Action: Edit Term

💡 Tip: Human-in-the-Loop

The classification phase requires human approval for quality control. AI suggestions are starting points - always review before approving to ensure accuracy.

8. Visual Sitemap Editor

The Visual Sitemap Editor lets you plan and reorganize your website's information architecture with a drag-and-drop canvas interface.

Accessing the Editor

  • From a completed crawl, click "Open in Sitemap Editor"
  • Or go to a workspace and click "Visual Sitemaps" in the left sidebar
  • Create a new sitemap or open an existing one

Canvas Controls

  • Click + Drag: Pan the canvas
  • Scroll / Pinch: Zoom in/out
  • Click Node: Select and open the details panel
  • Drag Node: Reposition on the canvas
  • Shift + Click: Multi-select nodes
  • Delete / Backspace: Delete selected node(s)
  • Ctrl/Cmd + C/V: Copy/paste nodes

Node Properties

Click any node to open the right panel with editable properties:

  • Title: Page name (displayed on node)
  • URL/Path: Target URL for the page
  • Status: Draft, Review, Published, Archived
  • Action: Keep, Update, Review, Remove, Redirect, or Merge
  • Redirect To URL: Destination URL (appears when Action is set to Redirect)
  • Page Type: Landing, Blog, Service, etc.
  • Assigned To: Team member responsible
  • Meta Description: SEO description
  • Target Keywords: SEO keywords
  • Notes: Internal notes for team

Toolbar Features

  • Import: Import from CSV, XML sitemap, or crawl data
  • Download: Export as image (PNG/SVG) or data (CSV/XML)
  • Search: Find nodes by title, URL, or status
  • AI Button: Open AI Assistant drawer
  • Share: Share view-only link or embed code

Theme Settings

Access via the gear icon in the toolbar:

  • Theme Color: Blue, Purple, Green, Orange, etc.
  • Mode: Light, Dark, or System
  • Max Levels: Collapse hierarchy below certain depth
  • Show Screenshots: Toggle page previews

Auto-Layout

The editor uses automatic hierarchical layout:

  • Nodes are arranged top-to-bottom by hierarchy level
  • Edges route orthogonally between parents and children
  • Layout runs automatically on load and after structure changes
  • Manual positions are saved if you drag nodes after layout

💡 Tip: Use AI Assistant

Click the AI button (✨) in the toolbar to open the AI Sitemap Assistant. Ask it to reorganize pages, create new sections, or generate content for you.

9. AI Sitemap Assistant

The AI Sitemap Assistant is a conversational interface for restructuring your sitemap using natural language. Ask it to create pages, reorganize sections, or generate content.

Opening the Assistant

  • Click the AI button (✨) in the sitemap editor toolbar
  • A drawer slides in from the right side
  • Type your request in the chat input

What You Can Ask

  • Create pages: "Create a new FAQ section with 3 pages: General, Pricing, Support"
  • Reorganize: "Move all blog posts under a new Blog section"
  • Update metadata: "Add meta descriptions to all service pages"
  • Analyze: "What pages are missing meta descriptions?"
  • Suggest structure: "How should I organize the Services section?"
  • Generate content: "Write content for the About page"

Human-in-the-Loop Approval

When AI proposes changes, you review and approve before execution:

  1. AI proposes a plan - Shows list of operations (create, update, delete, move)
  2. Review the plan - Toggle between List and Tree view to see affected nodes
  3. Cherry-pick operations - Uncheck any operations you don't want
  4. Approve or Reject - Click Approve to execute, Reject to cancel
  5. Modify - Send feedback to get a revised plan

Plan Preview Features

  • List View: Flat list of all operations with checkboxes
  • Tree View: Grouped by operation type (Create, Update, Move, Delete)
  • Node References: Click page names to navigate to that node
  • Operation Count: Badge shows number of proposed changes

Conversation History

  • Click the history icon in the drawer header
  • View past conversations for this sitemap
  • Click any conversation to reload it
  • Click + to start a new conversation

Resizable Drawer

Drag the left edge of the drawer to resize it (320px - 700px). Your preferred width is saved automatically.

AI Context

The AI has access to:

  • Full sitemap tree with all nodes and hierarchy
  • Page metadata (titles, URLs, status, types)
  • Navigation menus and their items
  • Crawl data (if available)
  • Team members and assignments
  • Taxonomy (categories and tags)

⚠️ AI Configuration Required

The AI Assistant requires an AI provider (OpenAI or Anthropic) to be configured in Admin → AI Usage & Costs. Without this, the assistant will show an error message.

✨ Pro Tips

  • Be specific: "Create 3 pages under Services" works better than "add some pages"
  • Reference existing pages: "Move the Contact page under About"
  • Use the Tree view to understand hierarchical changes
  • You can always Modify if the first plan isn't quite right

Exporting Data

Export Formats

CSV Export

Download crawl data as comma-separated values. Scopes available:

  • All Pages: Complete page list with all metadata
  • Internal Only: Only pages within the crawled domain
  • External Only: Only outbound links
  • Errors Only: Pages with 4xx/5xx status codes

XML Sitemap

Generate a sitemap.xml file from crawl results, compatible with Google Search Console.

JSON Export

Full structured data export for programmatic processing.

Redirect Map Export

Export a complete redirect map CSV mapping old URLs to new destinations. Hand the file directly to your development team for server configuration.

Creating a Redirect Map

Redirect maps aggregate data from multiple sources automatically. Here's how to build one:

Step 1: Mark Pages for Redirect

There are two ways to set up redirects:

  • Sitemap Editor: Select a node → set Action to Redirect → enter the destination URL in the "Redirect To URL" field that appears → Save
  • Content Audit: On the audit page list, change a page's action to Redirect → enter the destination URL when prompted. Or open the page detail modal and fill in the Redirect To URL field.

Step 2: Export the Redirect Map

Navigate to your project's Export tab and click the Redirect Map card:

  • Choose the redirect status code (301 Permanent or 302 Temporary)
  • Click Download Redirect Map to get the CSV

What's Included

The redirect map CSV aggregates redirects from all sources:

  • Sitemap redirects: Pages you marked as "Redirect" in the sitemap editor with a destination URL
  • Audit decisions: Redirect actions set in the Content Audit
  • URL changes: Pages whose URLs changed during sitemap restructuring
  • Crawl-detected: 3xx redirects found automatically during crawling

If the same source URL appears in multiple sources, the most explicit user action takes priority.
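That priority rule can be sketched as a merge keyed by source URL. The exact priority order is an assumption based on the list above (explicit sitemap and audit decisions before inferred sources):

```typescript
// Assumed priority: most explicit user action first.
type RedirectSource = "sitemap" | "audit" | "url_change" | "crawl";
interface Redirect { from: string; to: string; source: RedirectSource }

const priority: RedirectSource[] = ["sitemap", "audit", "url_change", "crawl"];

function mergeRedirects(entries: Redirect[]): Map<string, Redirect> {
  const merged = new Map<string, Redirect>();
  for (const entry of entries) {
    const existing = merged.get(entry.from);
    // Keep the entry whose source ranks higher (lower index) in `priority`.
    if (!existing || priority.indexOf(entry.source) < priority.indexOf(existing.source)) {
      merged.set(entry.from, entry);
    }
  }
  return merged;
}
```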

CSV Columns

Source URL, Destination URL, Status Code, Redirect Type, Chain Length, Notes

Troubleshooting

Common Issues

Pages not being crawled

Check whether robots.txt is blocking the crawler. Try disabling "Respect robots.txt" for testing (only on sites you own).

Slow crawl performance

Increase concurrent requests (up to 10-15) and reduce the delay (e.g., 0.25s). Note that higher concurrency may overload smaller servers.

Getting Help

  • View API documentation for programmatic access

Using Credentials

Credentials allow IATO to access password-protected areas of websites using HTTP Basic or Digest authentication.

Creating Credentials

  1. Go to Settings > Credentials
  2. Click Add Credential
  3. Enter a name, username, and password
  4. Select the authentication type (Basic or Digest)
  5. Save the credential

Applying Credentials to a Crawl

When starting a new project, expand Advanced Options and select your credential from the dropdown. The crawler will use these credentials for all requests.

Security Note: Credentials are stored encrypted. Never share your IATO account if you have sensitive credentials stored.

Using Extraction Rules

Extraction rules let you pull specific structured data from crawled pages using CSS selectors, XPath, or Regular Expressions. Rules are created once and can be attached to any crawl or scheduled crawl.

Creating Rules

  1. Go to Settings > Extraction Rules
  2. Click Add Rule
  3. Enter a name and choose the extraction method (CSS, XPath, or Regex)
  4. Enter your selector or pattern
  5. Choose the target: what to extract from matched elements
  6. Click Test to verify the rule against a live URL before saving
  7. Click Save to store the rule

Extraction Methods

  • CSS: h1.title (page titles, specific elements)
  • XPath: //meta[@name='description']/@content (attributes, complex paths)
  • Regex: price:\s*\$(\d+\.\d{2}) (pattern matching in text)

Target Options

  • text: Extract the text content of matched elements (default)
  • html: Extract the full HTML of matched elements
  • attribute: Extract a specific HTML attribute (set target_attribute, e.g. href, src, content)
  • count: Return the number of matched elements

Advanced Options

  • Match All: Return all matches instead of just the first one.
  • Required: Mark a rule as required — if extraction fails, the page is flagged.
  • Default Value: Provide a fallback value when extraction fails.
  • Regex Group: For regex rules, specify which capture group to return (default: 0 = full match).
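To make the regex options concrete, here is a sketch of how a Regex rule with Match All, Regex Group, and Default Value might behave. The option names mirror the list above; the matching logic is illustrative, not IATO's actual implementation:

```typescript
interface RegexRule {
  pattern: RegExp;
  group: number;         // Regex Group (0 = full match)
  matchAll: boolean;     // Match All: every match vs. first only
  defaultValue?: string; // Default Value fallback when nothing matches
}

function applyRegexRule(rule: RegexRule, text: string): string[] {
  // matchAll() requires the global flag, so add it if missing.
  const flags = rule.pattern.flags.includes("g") ? rule.pattern.flags : rule.pattern.flags + "g";
  const matches = [...text.matchAll(new RegExp(rule.pattern.source, flags))]
    .map((m) => m[rule.group] ?? "");
  const results = rule.matchAll ? matches : matches.slice(0, 1);
  if (results.length === 0 && rule.defaultValue !== undefined) return [rule.defaultValue];
  return results;
}
```

With the price pattern from the table and group 1, "price: $9.99" yields the captured amount rather than the whole match.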

Applying Rules to a Crawl

When starting a new project or scheduled project, select extraction rules from the Extraction Rules section in the crawl configuration. Selected rules will be applied to every page during the crawl. View results in the Extracted Data tab after the crawl completes.

Tip: Always use the Test button to verify your selector works on a sample page before starting a large crawl.

Using Form Authentication

Form authentication allows IATO to log into websites that use login forms before crawling.

Setting Up Form Auth

  1. Go to Settings > Form Auth
  2. Click Add Configuration
  3. Enter the login page URL
  4. Specify the form field names (usually "username" and "password")
  5. Enter your login credentials
  6. Optionally specify a success indicator (text or URL that confirms login)

Tip: Use your browser's developer tools to find the exact form field names. Look for <input name="..."> attributes.

Settings Scope

All settings in IATO are global - they apply across all your workspaces and projects.

  • Credentials: Settings tab. Available to all projects
  • Extraction Rules: Settings tab. Applied based on URL patterns
  • Form Auth: Settings tab. Available to all projects
  • AI Configuration: Settings tab. Personal AI provider override (BYOK)
  • Integrations: Settings tab. Google Analytics, Search Console, WordPress
  • API Key: My Account. For REST API and SDK access

JavaScript Rendering

JavaScript rendering enables IATO to crawl modern single-page applications (SPAs), React/Vue/Angular sites, and any page with dynamically loaded content.

When to Use JS Rendering

  • Single-page applications (React, Vue, Angular)
  • Pages with lazy-loaded content
  • Sites using client-side routing
  • AJAX-heavy pages
  • When you need screenshots
  • Testing mobile/responsive layouts

Note: JavaScript rendering is slower than standard crawling (typically 2-5 seconds per page vs 0.5 seconds). Use it only when needed.

How It Works

IATO uses a headless browser engine to launch a real browser (Chromium, Firefox, or WebKit) that:

  1. Loads the page like a real user would
  2. Executes all JavaScript
  3. Waits for dynamic content to load
  4. Captures the fully-rendered HTML
  5. Optionally takes screenshots

Enabling JS Rendering

  1. Open the New Project modal
  2. Expand Advanced Options
  3. In the JavaScript Rendering section, toggle Enable
  4. Configure browser, device, and wait settings as needed

Configuration Options

  • Browser Engine: Chromium (default), Firefox, or WebKit (Safari)
  • Device Preset: Pre-configured viewport sizes for desktop, mobile, and tablet
  • Wait Until: When to consider the page loaded (Network Idle recommended for SPAs)
  • Wait for Selector: Wait for a specific element to appear (e.g., #main-content)
  • Extra Wait: Additional delay after page load for slow animations
  • Resource Blocking: Skip images, CSS, fonts, or media to speed up rendering

Device Presets

Choose from 9 pre-configured device profiles or define a custom viewport:

Desktop

  • Desktop 1080p (1920×1080)
  • Desktop 1440p (2560×1440)

Mobile

  • iPhone 14 (390×844)
  • iPhone 14 Pro Max (430×932)
  • Pixel 7 (412×915)
  • Samsung Galaxy S23 (360×780)

Tablet

  • iPad Pro (1024×1366)

Bots

  • Googlebot Mobile
  • Googlebot Desktop

Screenshots

When enabled, IATO captures a screenshot of each page:

  • Viewport: Captures only the visible area
  • Full Page: Captures the entire scrollable page
  • Formats: PNG (best quality), JPEG (smaller files), WebP (modern, efficient)

Tip: Use resource blocking (images, fonts) to speed up rendering when you don't need visual fidelity for SEO analysis.

Theme & Appearance

Customize the look and feel of IATO to suit your preferences. All theme settings are saved to your account and sync across devices.

Accessing Theme Settings

  1. Click your username in the top right
  2. Select My Account
  3. Theme Settings is at the top of the page

Appearance Mode

Choose between three display modes:

  • Light: Default white/gray appearance
  • Dark: Dark backgrounds, easier on the eyes in low light
  • System: Automatically matches your device's light/dark setting

Theme Color

Select from 12 accent colors that apply to buttons, links, and highlights throughout the app:

Blue (default), Purple, Green, Red, Orange, Teal, Indigo, Pink, Cyan, Amber, Lime, Slate

Text Size

Adjust the text size from 80% to 120% of the default:

  • Use the A- and A+ buttons for quick adjustments
  • Use the slider for precise control
  • Click Reset to default to return to 100%

Note: Theme settings are saved automatically and will persist across browser sessions and devices when logged in.

14. API & Automation

IATO provides a comprehensive REST API for automation and integration with other tools. The API is designed to be AI-orchestrator friendly.

TypeScript SDK

The official SDK is the easiest way to integrate with IATO programmatically:

npm install iato-sdk

import { IATO } from 'iato-sdk';

const iato = new IATO({ apiKey: 'iato_your_key_here' });

// Start a crawl
const job = await iato.crawls.start({
  url: 'https://example.com',
  workspace_id: 'ws_abc123',
});

// Wait for the crawl to finish, then fetch its SEO issues
const completed = await iato.crawls.waitForCompletion(job.id);
const issues = await iato.crawls.seoIssues(completed.id);

The SDK covers every API endpoint with full TypeScript types, automatic retries, and built-in error handling. See the API Documentation tab for the complete SDK reference and all available resources.

API Discovery

  • /api/manifest - Machine-readable capabilities and endpoints
  • /api - API index with all endpoint categories
  • /api/health/detailed - System status with latency info

API Keys

Create scoped API keys for automated access:

  1. Navigate to My Account → API Keys
  2. Click Create New Key
  3. Choose scopes: read, write, or admin
  4. Set optional expiration
  5. Important: Copy the key immediately - it won't be shown again!

Authentication

Include your API key in requests:

Authorization: Bearer iato_your_key_here
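As a minimal sketch, a small helper can attach that header to direct API calls (the host in the usage comment is a placeholder for your own IATO instance):

```typescript
// Minimal sketch: build the Authorization header for direct API calls.
function authHeaders(apiKey: string): Record<string, string> {
  return { Authorization: `Bearer ${apiKey}` };
}

// Usage with fetch (Node 18+); the host is a placeholder:
// fetch('https://your-iato-host/api/manifest', { headers: authHeaders('iato_your_key_here') });
```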

Rate Limiting

The default rate limit is 60 requests per minute (exact limits vary by tier; see the Developer Portal). Response headers report your current status:

  • X-RateLimit-Limit: Maximum requests allowed in the window
  • X-RateLimit-Remaining: Requests left in the current window
  • X-RateLimit-Reset: When the window resets
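One way to respect these headers in client code is to wait until the window resets once the remaining budget hits zero. This sketch assumes X-RateLimit-Reset is a Unix timestamp in seconds; verify the exact format in the API Documentation tab:

```typescript
// Sketch: compute how long to wait before the next request, based on the
// rate-limit headers above. Assumes X-RateLimit-Reset is a Unix timestamp
// in seconds (an assumption; check the API Documentation).
function backoffMs(headers: Record<string, string>, nowMs: number = Date.now()): number {
  const remaining = Number(headers['x-ratelimit-remaining'] ?? '1');
  if (remaining > 0) return 0; // budget left: no need to wait
  const resetSec = Number(headers['x-ratelimit-reset'] ?? '0');
  return Math.max(0, resetSec * 1000 - nowMs); // wait until the window resets
}
```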

Idempotency

For safe retries on POST/PUT/DELETE requests, include an idempotency key:

X-Idempotency-Key: unique-request-id-123

If you retry with the same key within 24 hours, you'll get the cached response.

Batch Operations

  • POST /api/crawl/jobs/batch-delete - Delete up to 100 jobs
  • POST /api/crawl/jobs/batch-export - Export up to 20 jobs
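A hedged sketch of building a batch-delete request body; the `job_ids` field name is an assumption, so confirm the exact schema in the API Documentation:

```typescript
// Sketch: validate and serialize a batch-delete body. The job_ids field name
// is an assumption; the 100-job cap comes from the endpoint description above.
function batchDeleteBody(jobIds: string[]): string {
  if (jobIds.length === 0 || jobIds.length > 100) {
    throw new Error('batch-delete accepts between 1 and 100 job IDs');
  }
  return JSON.stringify({ job_ids: jobIds });
}
```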

Tip: See the full API Documentation in the User Menu for detailed endpoint documentation.

Webhooks

Receive HTTP notifications when crawl events occur. Perfect for integrating IATO with your CI/CD pipeline, Slack, or other automation tools.

Creating a Webhook

  1. Navigate to My Account → Webhooks
  2. Click Create Webhook
  3. Enter your endpoint URL (must be HTTPS in production)
  4. Select events to subscribe to
  5. Optionally add a secret for signature verification

Available Events

  • crawl.started: Crawl job has begun
  • crawl.progress: Progress update (every 10%)
  • crawl.completed: Crawl finished successfully
  • crawl.failed: Crawl encountered an error
  • crawl.cancelled: Crawl was cancelled by the user

Webhook Payload

{
  "event": "crawl.completed",
  "timestamp": "2026-01-14T00:00:00Z",
  "data": {
    "job_id": "abc123",
    "url": "https://example.com",
    "pages_crawled": 150,
    "status": "completed"
  }
}
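In TypeScript, the payload above can be modeled with a type like this. Field optionality beyond the single sample shown is an assumption; check your actual payloads:

```typescript
// Sketch: a type for the webhook payload shown above. Which data fields
// appear for which events is an assumption based on the single sample.
type CrawlEventName =
  | 'crawl.started'
  | 'crawl.progress'
  | 'crawl.completed'
  | 'crawl.failed'
  | 'crawl.cancelled';

interface CrawlWebhookPayload {
  event: CrawlEventName;
  timestamp: string; // ISO 8601
  data: {
    job_id: string;
    url: string;
    pages_crawled?: number;
    status?: string;
  };
}

function parseWebhook(raw: string): CrawlWebhookPayload {
  return JSON.parse(raw) as CrawlWebhookPayload;
}
```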

Signature Verification

If you configured a secret, verify the webhook signature:

# Header: X-Webhook-Signature: sha256=abc123...
# Compute: HMAC-SHA256(secret, payload)
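In Node, that check can be sketched as follows, assuming the header value is "sha256=" plus the hex-encoded HMAC of the raw request body; confirm the exact encoding with a test webhook before relying on it:

```typescript
// Sketch: verify X-Webhook-Signature. Assumes the header value is
// "sha256=" + hex(HMAC-SHA256(secret, raw_body)); verify this assumption
// with the "Test Webhook" button before production use.
import { createHmac, timingSafeEqual } from 'node:crypto';

function verifyWebhook(secret: string, rawBody: string, header: string): boolean {
  const expected = 'sha256=' + createHmac('sha256', secret).update(rawBody).digest('hex');
  const a = Buffer.from(expected);
  const b = Buffer.from(header);
  // timingSafeEqual throws on length mismatch, so compare lengths first
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Always verify against the raw body bytes, before any JSON parsing or re-serialization, or the computed digest will not match.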

Real-time Streaming (WebSocket)

For real-time progress, connect via WebSocket:

const ws = new WebSocket(`wss://your-server/api/crawl/jobs/${jobId}/stream`);
ws.onmessage = (e) => {
  const data = JSON.parse(e.data);
  console.log(`Progress: ${data.data.percent_complete}%`);
};

Testing: Use the "Test Webhook" button to send a test event to your endpoint before relying on it in production.

Developer Portal

The Developer Portal at /developers is a self-service platform for managing API access, usage, and billing. Access it from the user dropdown menu.

Dashboard

Live overview of your current usage and costs:

  • Usage Cards: Crawl Pages, API Calls, AI Operations, and Storage with progress bars showing consumption against free limits
  • Tier Info: Current plan, rate limit, and concurrent crawl allowance
  • Estimated Cost: Projected cost for the current billing period
  • Credit Balance: Available prepaid credits (if any)

API Keys

Create and manage scoped API keys with the iato_ prefix:

  • Scopes: read, write, admin — choose which permissions each key has
  • Tiers: Free (30 req/min), Pro (60 req/min), API (120 req/min)
  • Actions: Rotate (new key, same config), Revoke (deactivate), Delete (permanent)

Important: API keys are shown only once at creation. Copy and save your key immediately — it cannot be retrieved later.

Billing & Costs

  • Cost Breakdown: Per-category costs for crawl pages, API calls, AI operations, and storage
  • Spending Caps: Set a monthly limit to prevent unexpected charges; an alert appears as you approach the cap
  • Prepaid Credits: Purchase credit bundles with volume bonuses (20-30%); credits are applied automatically before card charges
  • Payment Management: Add payment methods and view invoices via Stripe

Quickstart

Get from zero to your first API call in minutes:

npm install iato-sdk

import { IATO } from 'iato-sdk';
const client = new IATO({ apiKey: 'iato_YOUR_KEY_HERE' });

const job = await client.crawl.start({
  url: 'https://example.com',
  maxPages: 100
});
console.log('Job started:', job.id);

The quickstart page also covers cURL examples, error handling for 402 (spending cap reached) and 429 (rate limited), and cost information.
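The retry decision those two errors imply can be sketched as a small helper; how the SDK surfaces the numeric HTTP status is an assumption, so adapt it to the actual error shape:

```typescript
// Sketch: decide whether to retry based on HTTP status. The 402 and 429
// semantics come from the quickstart text above; the error shape exposed
// by the SDK is an assumption.
function shouldRetry(status: number): boolean {
  if (status === 402) return false; // spending cap reached: raise the cap or add credits
  if (status === 429) return true;  // rate limited: back off, then retry
  return status >= 500;             // transient server errors are retryable
}
```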

Pricing

The pricing page includes tier comparison, per-unit pricing with volume discounts, and an interactive cost estimator:

  • Free ($0): 500 pages/crawl, 30 req/min, 100 MB storage
  • Pro ($49/mo or $468/yr): 10k pages, 10k calls, 50 AI ops, 5 GB included
  • API (usage-based): 120 req/min, 5 concurrent crawls, spending caps, volume discounts
  • Enterprise (custom): 600 req/min, 20 concurrent crawls, dedicated support

Use the Cost Estimator on the pricing page to calculate projected costs with preset scenarios (Solo Dev, Small Agency, AI Agent).

Prepaid Credit Bundles

  • $50 purchase: $50 in credits (no bonus)
  • $100 purchase: $120 in credits (20% bonus)
  • $250 purchase: $312.50 in credits (25% bonus)
  • $500 purchase: $650 in credits (30% bonus)