OpenAssistantGPT - Documentation - Crawlers for data extraction for your website

Learn how to use OpenAssistantGPT Crawlers for efficient data extraction and content organization on your website.

The crawler scans a designated URL and extracts content that meets the criteria you specify, such as a particular string or a section of the webpage. The extracted content is organized into files that you can provide to your chatbot to extend its knowledge base. The more relevant content the crawler gathers, the more accurately your chatbot can respond to user queries.

To create a crawler on our platform, follow these steps:

Open: https://openassistantgpt.io/dashboard/new/crawler
Create New Crawler: Start by selecting the option to create a new crawler.
Display Name: Enter a name for your crawler. This name will appear on the dashboard for easy identification.
Crawling URL: Provide the URL where the crawler will begin its operation.
URL Match: Specify a string that the crawler must match on every page it crawls. This keeps the crawl limited to relevant pages.
Selector: Define a selector to extract content from a specific part of each page. To test a selector, open the browser's developer tools (F12) and run it in the console, for example document.querySelector("[id='root']").
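
The fields above can be pictured as a small configuration object. The sketch below is illustrative only: the object shape, the shouldCrawl helper, and the example.com URLs are hypothetical and not part of the OpenAssistantGPT platform. It also assumes that URL Match behaves as a simple substring test on each URL, which may not reflect how the platform actually matches.

```javascript
// Hypothetical model of the four form fields; these names are not a real API.
const crawlerConfig = {
  displayName: "Docs crawler",          // Display Name
  crawlUrl: "https://example.com/docs", // Crawling URL (placeholder site)
  urlMatch: "/docs",                    // URL Match
  selector: "[id='root']",              // Selector
};

// Assumption: a URL qualifies when it contains the URL Match string.
function shouldCrawl(url, urlMatch) {
  return url.includes(urlMatch);
}

// Example links a crawler might discover on the starting page.
const discovered = [
  "https://example.com/docs/getting-started",
  "https://example.com/blog/release-notes",
  "https://example.com/docs/api",
];

// Keep only the links that satisfy the URL Match rule.
const toCrawl = discovered.filter((url) =>
  shouldCrawl(url, crawlerConfig.urlMatch)
);
console.log(toCrawl);

// In a browser console the selector can be checked against the live page.
// This branch is skipped where no DOM is available (for example in Node.js).
if (typeof document !== "undefined") {
  console.log(document.querySelector(crawlerConfig.selector));
}
```

Under this assumption, a narrower URL Match value such as "/docs/api" would reduce the crawl to a single page. If the selector check prints null in the browser console, the selector matches nothing on that page and should be adjusted before saving the crawler.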

This tutorial provides a basic overview. For more detailed guidance, please refer to the specific instructions and examples provided on the platform.
