
Create Ingestion


Last updated 1 year ago

Specify the ingestion type and the detailed spec for a data source. The service then ingests the data in that source automatically. This API returns an ingestion_id that you can use to track the ingestion status.

Confluence, Google Drive, and Notion are supported. After logging into your account, you can use OAuth in the UI to grant ChatBees access.

For tighter security, consider disconnecting Confluence immediately after the ingestion completes. This prevents ChatBees from accessing Confluence until you reconnect it.

For Confluence, you can also create a Confluence user and grant that user read-only permission on the Confluence space you want to ingest. Follow Atlassian API token management to create an API token, then pass the username and API token to the create_ingestion API. You can revoke the Atlassian API token at any time after ingestion completes. CQL support will be ready soon.

POST /docs/create_ingestion HTTP/1.1
Api-Key: my_api_key
Content-Type: application/json
Host: my_account_id.us-west-2.aws.chatbees.ai

# example for Confluence
{
  "namespace_name": "string",
  "collection_name": "string",
  "type": "CONFLUENCE",
  "spec": {
    // http url to Confluence 
    "url":"string",
    // Specify space to ingest all pages in the space, or cql to ingest the
    // selected pages. Please specify either space or cql, not both.
    // Optional Confluence Space
    "space": "string",
    // Optional Confluence CQL
    "cql": "string",
    // Optional Atlassian API token. If Confluence is connected via OAuth,
    // service will automatically get the access token.
    "token": "string",
    // Optional Confluence username
    "username": "string"
  }
}

# example for Google Drive
{
  "namespace_name": "string",
  "collection_name": "string",
  "type": "GDRIVE",
  "spec": {
    // Optional Google Drive token. If Drive is connected via OAuth,
    // service will automatically get the access token.
    "token": "string",
    // Optional folder name
    "folder_name":"string"
  }
}

# example for Notion
{
  "namespace_name": "string",
  "collection_name": "string",
  "type": "NOTION",
  "spec": {
    // Optional Notion token. If Notion is connected via OAuth,
    // service will automatically get the access token.
    "token": "string",
    // Optional page ids
    "page_ids": ["pageid1", "pageid2", ...]
  }
}

Response:
{
  "ingestion_id": "string"
}
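
The request above can also be assembled programmatically. The following is a minimal sketch, not part of the ChatBees SDK: the helper names build_payload and build_request are hypothetical, the https scheme is an assumption, and dropping unset optional fields assumes the service accepts omitted keys. It enforces the "either space or cql, not both" rule for Confluence and prepares the POST request without sending it.

```python
import json
import urllib.request

# Ingestion types named on this page.
SUPPORTED_TYPES = {"CONFLUENCE", "GDRIVE", "NOTION"}


def build_payload(namespace_name, collection_name, ingest_type, spec):
    """Return the JSON body for /docs/create_ingestion as a dict (hypothetical helper)."""
    if ingest_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported ingestion type: {ingest_type}")
    # Drop unset optional fields (assumption: the service accepts omitted keys).
    spec = {key: value for key, value in spec.items() if value is not None}
    # The page asks for either space or cql for Confluence, not both.
    if ingest_type == "CONFLUENCE" and "space" in spec and "cql" in spec:
        raise ValueError("specify either space or cql, not both")
    return {
        "namespace_name": namespace_name,
        "collection_name": collection_name,
        "type": ingest_type,
        "spec": spec,
    }


def build_request(account_id, api_key, payload):
    """Prepare, but do not send, the POST request (hypothetical helper)."""
    # Assumption: HTTPS and the us-west-2 host shown in the example above.
    url = f"https://{account_id}.us-west-2.aws.chatbees.ai/docs/create_ingestion"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

To send the request, pass the result of build_request to urllib.request.urlopen and parse the JSON response body to read ingestion_id.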

Python SDK:

import chatbees as cb

# Configure API key
cb.init(api_key="my_api_key", account_id="my_account_id")

col = cb.collection('llm_research')

# create Confluence ingestion spec
ingest_type = cb.IngestionType.CONFLUENCE
spec = cb.ConfluenceSpec(
    url='http url to Confluence',
    # Specify space to ingest all pages in the space, or cql to ingest the
    # selected pages. Please specify either space or cql, not both.
    space='Confluence space',
    # cql='cql',
    # Optional Atlassian API token. If Confluence is connected via OAuth,
    # service will automatically get the access token.
    token='Atlassian API token',
    # Optional Confluence username
    username='Atlassian username'
)

# create Google Drive ingestion spec
ingest_type = cb.IngestionType.GDRIVE
spec = cb.GDriveSpec(
    # Optional Google Drive token. If Drive is connected via OAuth,
    # service will automatically get the access token.
    token='token',
    # Optional folder name
    folder_name='folder1'
)

# create Notion ingestion spec
ingest_type = cb.IngestionType.NOTION
spec = cb.NotionSpec(
    # Optional Notion token. If Notion is connected via OAuth,
    # service will automatically get the access token.
    token='token',
    # Optional page ids
    page_ids=['pageid1', 'pageid2']
)

# create ingestion
# An ingestion_id is returned; you can use it to get the ingestion status.
ingestion_id = col.create_ingestion(ingest_type, spec)
Atlassian API token management