Knowledge Index Freshness Guide
How the knowledge index works
Your bot does not search the internet or your website in real time during a conversation. It searches a pre-built index — a snapshot of your content converted into vector embeddings. The bot is only as current as the last time the index was updated.
The index is built from multiple sources depending on your setup:
- Your website — pages crawled by the Velaro web scraper
- Platform catalog — products pulled from BigCommerce, Shopify, WooCommerce, Magento, or Square
- KB articles — your admin-authored knowledge base articles
- Uploaded documents — CSV/Excel price lists, PDFs, product sheets
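To make "searches a pre-built index of vector embeddings" concrete, here is a minimal, self-contained sketch of the idea: content is embedded once at build time, and at question time the bot compares the question's embedding against those snapshots. The `embed` function below is a toy stand-in, not Velaro's actual model or internals.

```python
import math

def embed(text: str) -> list[float]:
    # Toy stand-in for a real embedding model: fold characters into a
    # small unit-length vector. Real systems use learned embeddings.
    vec = [0.0] * 8
    for i, ch in enumerate(text.lower()):
        vec[i % 8] += ord(ch)
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# The "index": embeddings computed when the index was last built.
# The bot only ever sees these snapshots, not the live page.
index = {
    "shipping-policy": embed("We ship worldwide within 5 business days."),
    "returns": embed("Returns are accepted within 30 days of purchase."),
}

def search(question: str) -> str:
    # Cosine similarity; vectors are already unit-length, so a dot
    # product suffices. Returns the best-matching indexed document.
    q = embed(question)
    return max(index, key=lambda k: sum(a * b for a, b in zip(q, index[k])))
```

The key point the sketch illustrates: if the shipping policy page changes on your website, `index["shipping-policy"]` still holds the old snapshot until the next reindex.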
How often does the index update?
KB articles: Indexed immediately every time you publish or edit an article. The daily auto-reindex is a safety net that catches any articles that changed without triggering the live push. If nothing changed, this run costs nothing.
Platform products (BigCommerce, Shopify, etc.): Synced based on your plan's reindex frequency. Only products that changed since the last sync are re-embedded — unchanged products cost nothing to process.
Web scraper (multiple URL sets): Each URL set (a group of starting URLs + crawl settings) has its own independent schedule. You can have multiple URL sets — for example one for your product pages, one for your blog, one for your support documentation. Each crawls on the schedule set by your plan. Only pages that changed since the last crawl are re-indexed; unchanged pages cost nothing.
Can I have multiple website sections crawled separately?
Yes. You can create multiple URL sets under Knowledge → Website. Each URL set:
- Starts from its own seed URLs
- Crawls to a configurable depth
- Runs on its own auto-rebuild schedule
- Tracks its own "last run" and "pages changed" metrics
This means your products section, blog, and help docs can each have their own crawl frequency if needed.
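As a rough illustration of what each URL set tracks, here is a hypothetical configuration sketch. The field names and values are illustrative only, not Velaro's actual schema; the schedule values come from your plan.

```python
# Hypothetical shape of two independent URL sets. Field names are
# illustrative — check Knowledge → Website for the real settings.
url_sets = [
    {
        "name": "Products",
        "seed_urls": ["https://example.com/products"],
        "crawl_depth": 3,            # how many links deep to follow
        "schedule": "weekly",        # set by your plan
        "last_run": "2024-05-01",    # per-set "last run" metric
        "pages_changed": 42,         # per-set "pages changed" metric
    },
    {
        "name": "Blog",
        "seed_urls": ["https://example.com/blog"],
        "crawl_depth": 2,
        "schedule": "weekly",
        "last_run": "2024-05-01",
        "pages_changed": 3,
    },
]
```

Because each set carries its own seeds, depth, schedule, and metrics, a stale blog crawl never forces a re-crawl of your product pages.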
What does "re-index" actually mean?
Re-indexing does not re-process everything every time. Velaro uses SHA-256 content hashing:
1. Crawl the page or fetch the product/article
2. Compute a SHA-256 hash of the content
3. Compare against the stored hash from the last run
4. If the hash matches → skip (zero embedding cost)
5. If the hash changed → generate a new embedding and update the index
At a 1% daily change rate on a 10,000-product catalog, you'd re-embed roughly 100 products per day — about $0.01/day in embedding costs.
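The five steps above can be sketched in a few lines. This is a simplified model of hash-based delta detection, not Velaro's actual code; `pages` maps a URL to its fetched content, and `stored_hashes` plays the role of the hashes saved from the last run.

```python
import hashlib

def content_hash(content: str) -> str:
    # Step 2: SHA-256 fingerprint of the fetched content.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def reindex(pages: dict[str, str], stored_hashes: dict[str, str]) -> list[str]:
    """Return the pages that need a fresh embedding (steps 3-5)."""
    changed = []
    for url, content in pages.items():
        h = content_hash(content)
        if stored_hashes.get(url) == h:
            continue                # step 4: hash matches -> skip, zero cost
        stored_hashes[url] = h      # step 5: new or changed -> re-embed
        changed.append(url)
    return changed
```

With a 1% daily change rate, `reindex` over 10,000 products returns roughly 100 URLs per run — only those incur embedding cost.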
How often does re-indexing happen by plan?
| Plan | Web / Product Reindex | KB Reindex |
|------|-----------------------|------------|
| Starter | Monthly | Daily (safety net) |
| Growth | Every 2 weeks | Daily (safety net) |
| Professional | Weekly | Daily (safety net) |
| Scale | Daily | Daily (safety net) |
| Large Catalog add-on | Daily | Daily (safety net) |
The "Large Catalog" add-on also raises your indexed-product limit to 500,000.
Can I trigger a reindex manually?
Yes. In Knowledge → Index Freshness, click Reindex Now (KB articles) or Run Now (web scraper URL sets). Manual KB reindexes are always delta-safe — they only re-embed articles that changed. Manual scraper runs count against your monthly page crawl quota.
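Since manual scraper runs draw down your monthly page crawl quota, it can help to check headroom before kicking one off. A minimal sketch of that check, with illustrative names (the actual quota figures are plan-specific):

```python
def can_run_crawl(pages_estimated: int, pages_used: int, monthly_quota: int) -> bool:
    # Manual scraper runs count against the monthly page crawl quota,
    # so confirm there is headroom before triggering Run Now.
    # All parameter names here are illustrative, not a Velaro API.
    return pages_used + pages_estimated <= monthly_quota
```

KB reindexes need no such check: they are delta-safe and only re-embed changed articles.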
What happens to my index if I cancel?
Your knowledge index is retained for 30 days after cancellation, giving you a window to reactivate and pick up exactly where you left off. After 30 days, the index is cleared automatically. Your actual content (KB articles, conversations, contacts) follows the standard 90-day data retention policy.
What if the bot is giving outdated answers?
If the bot quotes a price that has changed, or doesn't know about a new product, your index is likely stale. Steps to fix:
1. Check Knowledge → Index Freshness — look at "Last Updated" per source
2. If it's been longer than your plan interval, click "Reindex Now" / "Run Now"
3. For platform products — confirm the integration is connected (Integrations → your platform → Test Connection)
4. For KB articles — confirm the article is published and "Include in bot responses" is enabled
5. For scraped pages — confirm the URL set includes the correct starting URLs and crawl depth covers the changed pages
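Step 2 above amounts to a simple date comparison. Here is a hedged sketch of that check, with the intervals mirroring the plan table earlier in this article (treating "monthly" as 30 days and "every 2 weeks" as 14):

```python
from datetime import date, timedelta

# Web/product reindex intervals per plan, in days (from the plan table).
PLAN_INTERVAL_DAYS = {"Starter": 30, "Growth": 14, "Professional": 7, "Scale": 1}

def is_stale(last_updated: date, plan: str, today: date) -> bool:
    # Stale if the source's "Last Updated" date is older than the
    # plan's reindex interval — time to click Reindex Now / Run Now.
    return today - last_updated > timedelta(days=PLAN_INTERVAL_DAYS[plan])
```

If `is_stale` is true for a source and a manual run doesn't help, move on to steps 3 to 5: the problem is usually an integration, publish flag, or crawl-scope issue rather than the schedule.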