Help

The General tab controls the overall behaviour of the web crawler: whether it runs, how it identifies itself, and how politely it treats target websites.


Enabled

Enable or disable this data source. When disabled, the web crawler will not run for this service. Use this to pause crawling without losing your other settings.


Politeness Delay (ms)

The number of milliseconds the crawler waits between page requests to the same domain. This helps prevent overloading servers and reduces the risk of being rate-limited or blocked.

  • The default is typically 1000 ms (1 second).
  • Increase the delay for very small or fragile sites; decrease it for large, robust sites if you want faster crawls.
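
To illustrate how a per-domain delay of this kind can work, here is a minimal Python sketch. It is not the crawler's actual implementation: the PolitenessGate class, its method names, and the injectable clock are all hypothetical, introduced only for this example.

```python
import time
from urllib.parse import urlsplit


class PolitenessGate:
    """Hypothetical helper: tracks the last request time per domain."""

    def __init__(self, delay_ms=1000, clock=time.monotonic):
        self.delay_s = delay_ms / 1000.0   # setting is in ms, clock is in seconds
        self.clock = clock                 # injectable so the logic can be tested
        self.last_request = {}             # hostname -> time of most recent request

    def wait_needed(self, url):
        """Return how many seconds to wait before requesting `url`."""
        host = urlsplit(url).hostname
        last = self.last_request.get(host)
        if last is None:
            return 0.0                     # first request to this domain: no wait
        remaining = self.delay_s - (self.clock() - last)
        return max(0.0, remaining)

    def record(self, url):
        """Note that a request to `url` has just been sent."""
        self.last_request[urlsplit(url).hostname] = self.clock()
```

A crawl loop built on this sketch would call time.sleep(gate.wait_needed(url)) before each request and gate.record(url) straight after it. Because the delay is tracked per hostname, requests to different domains do not hold each other up.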

User Agent

The User-Agent string the crawler sends with each HTTP request. Many servers log or restrict access by user agent.

  • Leave blank to use the default crawler user agent.
  • Set a custom value (e.g. Mozilla/5.0 (compatible; MyCrawler/1.0)) if the site or your CDN expects a specific identity (for example, an “Airgentic” user agent that you have allowlisted).
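
The blank-versus-custom behaviour described above can be sketched in Python with the standard library. The build_request function and the DEFAULT_USER_AGENT value are assumptions made for illustration; the crawler's real default user agent string is not documented here.

```python
from urllib.request import Request

# Hypothetical placeholder; the crawler's real default is not stated in this help page.
DEFAULT_USER_AGENT = "ExampleCrawler/1.0"


def build_request(url, user_agent=""):
    """Build a request that carries the configured User-Agent header.

    A blank (or whitespace-only) setting falls back to the default.
    """
    ua = user_agent.strip() or DEFAULT_USER_AGENT
    return Request(url, headers={"User-Agent": ua})
```

For example, build_request("https://example.com/", "Mozilla/5.0 (compatible; MyCrawler/1.0)") produces a request whose User-Agent header is exactly that string, which is the value a server or CDN allowlist would need to match.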

Advanced

Convert HTTP to HTTPS

When enabled, all http:// URLs are automatically converted to https:// before crawling. This ensures the crawler uses secure connections when the site supports them, and avoids duplicate content between HTTP and HTTPS versions.
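
As a rough illustration of this conversion, the Python sketch below rewrites the scheme of a URL before it is queued. The to_https function is hypothetical and only approximates the idea; in particular, it leaves any explicit port in the URL untouched, which the real crawler may handle differently.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(url):
    """Return `url` with an http scheme replaced by https; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url  # already https, or a scheme this setting does not apply to
    # Host, path, query string and fragment are all preserved.
    return urlunsplit(parts._replace(scheme="https"))
```

With this sketch, http://example.com/a and https://example.com/a both normalise to https://example.com/a, so the page is crawled once rather than twice.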

