Data Training

The Data Training tab is where you upload and prepare content that the chatbot will use to generate accurate and helpful responses.

This tab supports three types of data ingestion:

  1. Upload Files
  2. Import via Sitemap with Regex
  3. Enter Direct URLs

Upload Files

You can upload text and document files in the following formats:

  • PDF, TXT, CSV, DOCX, and PPTX

Size limits:

  • Maximum 30MB in total
  • Each individual file must not exceed 15MB
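
The format and size limits above can be checked locally before uploading. The following is a minimal sketch, not part of the product: the function name `check_upload_batch` is hypothetical, and it assumes 1MB means 1024 × 1024 bytes, which the page does not specify.

```python
from pathlib import Path

# Limits taken from the list above; all names here are illustrative.
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".csv", ".docx", ".pptx"}
MAX_FILE_BYTES = 15 * 1024 * 1024    # assumption: 1MB = 1024 * 1024 bytes
MAX_TOTAL_BYTES = 30 * 1024 * 1024


def check_upload_batch(files):
    """Return a list of problems for a batch of (name, size_in_bytes) pairs."""
    problems = []
    total = 0
    for name, size in files:
        total += size
        if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported file type")
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: larger than 15MB")
    if total > MAX_TOTAL_BYTES:
        problems.append("batch: larger than 30MB in total")
    return problems


MB = 1024 * 1024
print(check_upload_batch([("guide.pdf", 10 * MB), ("notes.md", 1 * MB)]))
```

An empty list means the batch passes these local checks; the upload can still fail for other reasons on the server side.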

Steps to Upload

  1. Click Choose Files and select your document(s)
  2. Click Upload
  3. Once uploaded, files appear below the upload field
  4. You can:
    • Click the ❌ cross icon to remove files before submission
    • Click Submit to train the chatbot on them

Note: If your uploaded file doesn’t appear under Uploaded Data, the training failed or the upload was incomplete. Try uploading again.


Advanced Data Extraction via Sitemap

This section lets you import web content using a sitemap and an optional regular expression filter.

Fields

  • Sitemap: Enter your site’s sitemap URL (e.g., https://domain.com)
  • Regular Expression: (Optional) Filter specific URLs (e.g., /travel, release/2023)
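
To see which URLs a filter such as `/travel` or `release/2023` would keep, you can try the pattern locally first. This is a rough sketch under assumptions: the helper `filter_urls` and the sample URLs are hypothetical, and it assumes the pattern may match anywhere in the URL (Python's `re.search`). The page does not say how the product applies the expression.

```python
import re


def filter_urls(urls, pattern=None):
    """Keep URLs containing a match for pattern; keep all when no pattern is given."""
    if not pattern:
        return list(urls)
    compiled = re.compile(pattern)
    # Assumption: a URL is kept when the pattern matches anywhere in it.
    return [url for url in urls if compiled.search(url)]


# Hypothetical URLs standing in for the entries of a sitemap.
sitemap_urls = [
    "https://domain.com/travel/paris",
    "https://domain.com/release/2023/notes",
    "https://domain.com/release/2022/notes",
    "https://domain.com/about",
]

print(filter_urls(sitemap_urls, r"/travel"))
print(filter_urls(sitemap_urls, r"release/2023"))
```

Leaving the pattern empty keeps every URL, mirroring the optional nature of the field.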

Actions

  1. Click Preview to review fetched content
  2. A Preview Modal opens where you can:
    • Toggle Auto Crawl (on = periodic auto-refresh; off = static data)
    • Click Cancel to exit or Train to begin processing

Enter Comma-Separated URLs

Quickly train your bot using a list of specific page URLs.

Example format: https://abc.com, https://abc.com/travel, https://xyz.com/test

  1. Paste multiple URLs separated by commas

  2. Click Preview to review fetched content

  3. A Preview Modal opens where you can:

    • Toggle Auto Crawl (on = periodic auto-refresh; off = static data)
    • Click Cancel to exit or Train to begin processing
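
Before pasting a long list, it can help to confirm that every entry is a complete URL. The sketch below is illustrative only: `parse_url_list` is a hypothetical helper, and treating entries without an `http` or `https` scheme as invalid is an assumption rather than documented product behavior.

```python
from urllib.parse import urlparse


def parse_url_list(raw):
    """Split a comma-separated string into (valid_urls, rejected_entries)."""
    valid, rejected = [], []
    for entry in raw.split(","):
        candidate = entry.strip()
        if not candidate:
            continue  # ignore empty entries left by stray or trailing commas
        parts = urlparse(candidate)
        # Assumption: only absolute http(s) URLs with a host are acceptable.
        if parts.scheme in ("http", "https") and parts.netloc:
            valid.append(candidate)
        else:
            rejected.append(candidate)
    return valid, rejected


valid, rejected = parse_url_list("https://abc.com, https://abc.com/travel, xyz.com/test,")
print(valid)
print(rejected)
```

Here `xyz.com/test` lands in the rejected list because it lacks a scheme, which is a common cause of entries that fail to fetch.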


Training Flow Logic

  • After you Submit or Train, the data enters a Queued state
  • If the training is successful, the content moves to the Uploaded Data tab
  • If you do not see the content in the Uploaded Data tab, the training failed. Retry the upload or check the formatting

Key Concepts

  • Uploaded Data: These are the only documents and URLs the chatbot uses to answer questions.

  • 🕓 Queued Data: Content that is still being processed.

    Note: Queued content does not affect chatbot responses until it appears in Uploaded Data.

  • ⚙️ Auto Crawl Toggle:
    When enabled, your chatbot will periodically re-fetch and retrain on the URL’s content.
    When disabled, the content stays fixed and won’t update unless you retrain it manually.
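
The relationship between Queued Data and Uploaded Data described above can be pictured as a small state model. This is a conceptual sketch, not the product's implementation: the `Status` values and both functions are hypothetical, and `FAILED` is an illustrative label for content that never reaches Uploaded Data.

```python
from enum import Enum


class Status(Enum):
    QUEUED = "queued"      # submitted, still being processed
    UPLOADED = "uploaded"  # training succeeded; visible under Uploaded Data
    FAILED = "failed"      # hypothetical label: never reaches Uploaded Data


def finish_training(statuses, item, succeeded):
    """Move a queued item to UPLOADED on success, or to FAILED otherwise."""
    if statuses.get(item) is not Status.QUEUED:
        raise ValueError(f"{item} is not queued")
    statuses[item] = Status.UPLOADED if succeeded else Status.FAILED


def answerable_sources(statuses):
    """Only items in Uploaded Data are available to the chatbot."""
    return sorted(name for name, status in statuses.items() if status is Status.UPLOADED)


statuses = {"guide.pdf": Status.QUEUED, "https://abc.com/travel": Status.QUEUED}
print(answerable_sources(statuses))   # nothing is answerable while queued
finish_training(statuses, "guide.pdf", succeeded=True)
finish_training(statuses, "https://abc.com/travel", succeeded=False)
print(answerable_sources(statuses))
```

The point of the model is that queued or failed content contributes nothing to responses; only a successful transition into Uploaded Data does.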


Best Practices

  • Always verify file size and format before uploading.
  • Use clear and crawlable web pages for best URL training results.
  • Regularly monitor Queued Data and retry failed uploads if needed.
  • Keep Auto Crawl enabled only for content that changes frequently.
