Best Practices for Training Your AI
This page highlights the best approaches for training your AI.
Section A: Files (PDF, TXT, DOCX)
This section explains how to format and structure your content in PDF, Word (DOCX), and Text (TXT) formats so it can be properly trained into your AI bot for high-quality responses.
🧱 Structuring Your Content
1. One Topic per Paragraph
Stick to one idea per paragraph
Keep it under 800 characters (including spaces)
Always use complete sentences and correct punctuation
✅ Good:
“The pricing model includes three plans: Basic ($29), Pro ($99), and Enterprise ($199). Each offers different features depending on user needs.”
❌ Bad:
“Pricing has options. Go to website.”
2. ✂️ Splitting Long Topics
If a topic is too long, break it into two or more logical parts using the same heading or theme.
✅ Good:
Main Heading (Part1): First part (around 400 characters) with one context
Main Heading (Part2): Second part (around 400 characters) with a different but related context
❌ Bad:
Main Heading: Very long explanation covering multiple ideas in one block without splitting or clarity.
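If you prepare content with a script, the splitting rule above can be sketched roughly as follows. This is only an illustration: the function name split_topic and the simple sentence-boundary split are assumptions, not part of the platform, and the heading prefix is not counted toward the limit.

```python
import re

MAX_CHARS = 800  # paragraph limit stated in this guide (including spaces)


def split_topic(heading: str, text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text at sentence boundaries into parts of at most max_chars.

    A single sentence longer than max_chars is kept whole, so it should be
    rewritten by hand.
    """
    # Naive sentence split: break after ".", "!" or "?" followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    parts: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: start a new part.
            parts.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        parts.append(current)
    if len(parts) == 1:
        return [f"{heading}: {parts[0]}"]
    return [f"{heading} (Part{i}): {part}" for i, part in enumerate(parts, start=1)]
```

Each returned string keeps the same heading, so related parts stay grouped under one theme.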
📄 Format-Specific Guidelines
✅ PDF, DOCX, TXT – Formatting Rules
Use clear section headings to define topics
Avoid decorative styles (fonts/colors don’t help the AI)
Use simple formatting: paragraphs, bullet lists, numbered points
❌ Don’t Include:
Images
Tables with merged cells
Decorative graphics or background colors
Page numbers or headers/footers that repeat unnecessarily
📏 Writing Style Guidelines
Be Specific | Mention exact processes, values, names | “We offer 24/7 live chat support.”
Use Examples | Add clear examples wherever possible | “For example, the ‘Basic’ plan includes...”
Write Directly | Use short, clear sentences | “To reset your password, click ‘Forgot Password’.”
Make Things Clear | Never write “etc.” or “many things” | “Supports up to 100 users.” not “Supports many users.”
❌ What to Avoid
Mixing topics in one paragraph | Reduces precision during training
Listing unrelated items together | Affects context accuracy
Including images or charts | AI doesn’t process them
Embedding FAQs in the main file | FAQs must be added separately for best training
🧠 What to Do When You're Not Sure About Format
When unsure about structure, follow this fallback rule:
“One paragraph = One point = Under 800 characters”
If you have longer paragraphs, the system will break them into parts. Just make sure they’re well-written.
🪜 How to Make Answers Better
Keep in mind that each paragraph becomes a reference block. It’s better to have 30 well-written, focused paragraphs than 5 long paragraphs.
✅ Clear and focused content → Fast and accurate answers
❌ Bulky, unstructured content → Confused or generic bot responses
Section B: FAQ Training
This section explains how to prepare FAQs in a structured and clear format for training AI bots or building knowledge bases. It ensures your questions and answers are professional, consistent, and easy for the system to understand.
🧱 Structure & Format Guidelines:
Question Format:
Maximum length: 200 characters
Always write the full question clearly (not just a keyword or phrase)
Start with a capital letter, and write the full form of a term before its abbreviation
Answer Format:
Maximum length: 500 characters
Use complete, grammatically correct sentences
Keep it direct, factual, and useful
📊 Quick Summary Table:
Max Question Length | 200 characters
Max Answer Length | 500 characters
Writing Tone | Clear, professional, factual
Abbreviation Use | Full form first, abbreviation in parentheses
File Upload Limit | Max 200 FAQs per category (split if needed)
❌ What to Avoid:
Using only abbreviations (e.g., "CEO") without explanation
Mixing more than one question or topic in a single entry
Using vague or generic questions
Submitting over 200 FAQs in one CSV without splitting
Including emojis, incomplete sentences, or slang
✅ Examples of Good and Bad Practices:
❌ Who is ceo of sml isuzu?
❌ Who is the Chief Executive officer of sml isuzu?
✅ Who is the Chief Executive Officer (CEO) of SML Isuzu?
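Before uploading, the length and capitalization rules above could be checked with a small script. The sketch below is a hypothetical helper (check_faq is not a platform function) and covers only the rules that can be tested mechanically. It cannot judge whether a question is vague or whether an abbreviation is explained.

```python
MAX_QUESTION_CHARS = 200  # limits stated in this section
MAX_ANSWER_CHARS = 500


def check_faq(question: str, answer: str) -> list[str]:
    """Return a list of problems found in one FAQ entry (empty list = OK)."""
    problems = []
    q, a = question.strip(), answer.strip()
    if not q:
        problems.append("question is empty")
    elif q[0].islower():
        problems.append("question starts with a lowercase letter")
    if len(q) > MAX_QUESTION_CHARS:
        problems.append(f"question is {len(q)} characters (max {MAX_QUESTION_CHARS})")
    if not a:
        problems.append("answer is empty")
    if len(a) > MAX_ANSWER_CHARS:
        problems.append(f"answer is {len(a)} characters (max {MAX_ANSWER_CHARS})")
    return problems
```

An entry that returns an empty list passes these basic checks and can then be reviewed by hand for tone and clarity.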
📔 Two Ways to Train Using FAQ:
1. Type Manually in Dashboard
You can directly enter questions and answers into the dashboard interface.

2. Upload via CSV File
Use a .csv file to upload multiple FAQs at once. The CSV file should have the following format:
Question | Answer
What is BotPenguin? | BotPenguin is a....
What is the use of BotPenguin? | BotPenguin can be used for....
⚠️ Limitation on Number of FAQs per Category:
Upload a maximum of 200 FAQs per category per file.
For more than 200 FAQs, split them into multiple CSV files or categories.
This ensures faster processing and avoids system errors.
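If you have more than 200 FAQs, a script can do the splitting for you. The following is a minimal sketch, assuming your FAQs are already available as (question, answer) pairs. The function name write_faq_csvs, the file naming scheme, and the "Question,Answer" header row are illustrative assumptions, so confirm the expected header against the upload dialog.

```python
import csv
from pathlib import Path

MAX_FAQS_PER_FILE = 200  # limit per category per file stated above


def write_faq_csvs(faqs, out_dir, prefix="faqs"):
    """Write (question, answer) pairs into CSV files of at most 200 FAQs each."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(faqs), MAX_FAQS_PER_FILE):
        chunk = faqs[start:start + MAX_FAQS_PER_FILE]
        path = out / f"{prefix}_{start // MAX_FAQS_PER_FILE + 1}.csv"
        # newline="" lets the csv module control line endings itself.
        with path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["Question", "Answer"])  # assumed header row
            writer.writerows(chunk)
        paths.append(path)
    return paths
```

The csv module quotes any question or answer that contains a comma, so the two-column layout stays intact.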
Section C: Website Training
🧱 Structure & Format Guidelines:
Use Webpages With Good Content
Pages should have informative, structured, and complete content. Avoid empty, template-only, or irrelevant pages.
Avoid Duplicate or Repeated Pages
Each page must have unique value. No duplicates or near-duplicates.
robots.txt Must Allow Scraping
Confirm that the URL is not blocked by robots.txt. Scraping is only allowed if permitted.
Add sitemap.xml in robots.txt
Make sure your robots.txt includes a sitemap.xml link. This helps bots find your important pages faster.
Skip Unnecessary Categories
Avoid submitting categories that are not helpful to the chatbot’s goal.
🔧 Tip: During the training setup, you can exclude irrelevant categories manually.
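The two robots.txt checks above can be tried out locally with Python's standard urllib.robotparser module. This is a rough sketch: the helper name check_robots is hypothetical, and the user agent "*" is an assumption, since the crawler's actual user agent is not stated in this guide.

```python
from urllib.robotparser import RobotFileParser


def check_robots(robots_txt: str, page_url: str, user_agent: str = "*") -> dict:
    """Report whether page_url may be fetched and which sitemaps are listed."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        "allowed": parser.can_fetch(user_agent, page_url),
        # site_maps() returns None when no Sitemap line is present.
        "sitemaps": parser.site_maps() or [],
    }


# Example robots.txt content with placeholder URLs.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""

result = check_robots(EXAMPLE_ROBOTS, "https://example.com/pricing")
```

To check a live site instead of pasted text, RobotFileParser also provides set_url() and read(), which download the file before you call can_fetch().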
📊 Quick Summary Table:
Content quality | Use clear and structured web pages
Duplicate pages | Avoid duplicates or highly similar content
robots.txt permission | Must be allowed in robots.txt
sitemap.xml reference | Should be included in robots.txt
Irrelevant categories | Do not include; exclude during selection or upload
❌ What to Avoid:
Pages with little or poor-quality content
Pages blocked in robots.txt
Duplicate pages with only minor differences
Pages unrelated to the bot's function or scope
Don’t include extra pages or categories that are not useful for the chatbot.
Example: Exclude categories such as “Chatbot Templates”, “Chatbot Features”, and “Platform Features”, as they are not relevant to the current use case.

Section D: CSV and Google Sheets
🧱 Structure & Format Guidelines
✅ General Rules
Use Flat Tables: Each row should be one complete record.
No Merged Cells: Avoid merged headers or cells.
Label Every Cell: Do not leave cells empty; state every value explicitly.
Avoid Contradictions: All column data must logically align.
Be Consistent: Keep column types and formats uniform.
📋 File-Specific Formatting
CSV Files (Flat Table Format)
User ID,Name,Email,City,Registration Date
101,Alice Smith,[email protected],New York,2024-01-15
102,Bob Jones,[email protected],Los Angeles,2024-03-22
Comma-separated, no nested headers
Each column = one field/property
Avoid trailing commas or inconsistent row lengths
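A quick way to catch trailing commas, inconsistent row lengths, and empty cells before uploading is sketched below. The function name find_bad_rows is a hypothetical helper, and the sample values are placeholders.

```python
import csv
import io


def find_bad_rows(csv_text: str) -> list[tuple[int, str]]:
    """Return (row_number, problem) pairs; row 1 is the header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [(0, "file is empty")]
    width = len(rows[0])  # the header defines the expected number of fields
    problems = []
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            # A trailing comma shows up here as one extra (empty) field.
            problems.append((number, f"has {len(row)} fields, expected {width}"))
        elif any(cell.strip() == "" for cell in row):
            problems.append((number, "has an empty cell"))
    return problems


SAMPLE = (
    "User ID,Name,City\n"
    "101,Alice Smith,New York\n"
    "102,Bob Jones,Los Angeles,\n"  # trailing comma
    "103,,Chicago\n"                # missing name
)
problems = find_bad_rows(SAMPLE)
```

An empty result means every row has the same number of fields as the header and no blank cells.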
📊 XLSX & Google Sheets
✅ Ideal Format
❌ Bad Format (Merged/Grouped Headers)
Category | Subcategory | Item
Electronics | Phones | iPhone 13
(blank) | (blank) | Samsung S21
✅ Corrected Format
Category | Subcategory | Item
Electronics | Phones | iPhone 13
Electronics | Phones | Samsung S21
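If a sheet was exported with merged or blank parent cells, the correction shown above (repeating Category and Subcategory on every row) can be automated. The sketch below assumes each row has been read into a dictionary keyed by column header; fill_down is a hypothetical helper name.

```python
def fill_down(rows, parent_columns):
    """Copy the last seen value of each parent column into blank cells below it."""
    filled = []
    previous = {}
    for row in rows:
        fixed = dict(row)  # copy so the input rows are not modified
        for column in parent_columns:
            value = (fixed.get(column) or "").strip()
            if value:
                previous[column] = value
            elif column in previous:
                fixed[column] = previous[column]
        filled.append(fixed)
    return filled


merged_export = [
    {"Category": "Electronics", "Subcategory": "Phones", "Item": "iPhone 13"},
    {"Category": "", "Subcategory": "", "Item": "Samsung S21"},
]
flat = fill_down(merged_export, ["Category", "Subcategory"])
```

Review the result by hand when a new Category row has a blank Subcategory, because this simple approach would carry over the previous subcategory.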
✅ Good Table Format (Simple and Clean)
Use a clear, single-record-per-row format to keep data clean and ready for training:
Model A | 99999 | 100cc
Model B | 88999 | 125cc
❌ Bad Format Example – Confusing Data in Same Row
Avoid mixing multiple products or entities in one row:
Model A | 99999 | Model B | 125cc | 50
Model B | 88999 | Model C | 150cc | 40
✅ Fixed Format – One Row per Product
Every product now has its own row, with all corresponding data:
Model A | 999999 | 125cc | 50
Model B | 889999 | 150cc | 40
Model C | 788899 | 100cc | 60
✍️ Writing Style Guidelines for Tables
Use clear column headers (e.g., “User ID” not “UID”)
Fill all values—no blanks in categories/subcategories
Repeat parent values for child entries instead of leaving cells empty
Use consistent date formats (e.g., YYYY-MM-DD)
No contradictory data (e.g., conflicting models/prices in same row)
Avoid multiple entities per row
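For the date rule in particular, a short check can list values that do not follow YYYY-MM-DD. This is a sketch with hypothetical helper names (is_iso_date, inconsistent_dates); adjust DATE_FORMAT if your table uses a different, but still consistent, format.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"  # YYYY-MM-DD, the example format given above


def is_iso_date(value: str) -> bool:
    """True only for valid, zero-padded YYYY-MM-DD dates."""
    try:
        parsed = datetime.strptime(value, DATE_FORMAT)
    except ValueError:
        return False
    # strptime accepts "2024-3-22", so compare against the padded form.
    return parsed.strftime(DATE_FORMAT) == value


def inconsistent_dates(values):
    """Return the values that are not in YYYY-MM-DD format."""
    return [value for value in values if not is_iso_date(value)]
```

Running inconsistent_dates over a date column returns the cells that need to be rewritten before upload.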
⚡ Quick Summary Table
User Database | Use flat structure, one user per row | ✅ CSV/XLSX example
Product Categories | Repeat parent-child structure explicitly in each row | ✅ Category/Subcategory table
Model Specs | Avoid mixing models in a single row | ✅ Model/Price table
Bad Formats | Never use merged cells or leave categories blank | ❌ Bad nested row table
FAQs:
Here are some common FAQs asked when we train our AI:
If everything is correctly formatted but you're still having trouble, reach out to our support team: [email protected]