Create Ingestion

Specify the ingestion type and the detailed spec for a data source. The service automatically ingests the data in that source. This API returns an ingestion_id that you can use to track the ingestion status.

Confluence, Google Drive, and Notion are supported. After logging into your account, you can use OAuth in the UI to grant ChatBees access to a data source.

For added security, consider disconnecting Confluence immediately after ingestion completes. This prevents ChatBees from accessing Confluence until you reconnect it.

For Confluence, you can also create a Confluence user and grant that user read-only permission on the Confluence Space that you want to ingest. Follow Atlassian API token management to create an API token, then pass the username and API token to the create_ingestion API. You can revoke the Atlassian API token at any time after ingestion completes. CQL support will be ready soon.

POST /docs/create_ingestion HTTP/1.1
Api-Key: my_api_key
Content-Type: application/json
Host: my_account_id.us-west-2.aws.chatbees.ai

# example for Confluence
{
  "namespace_name": "string",
  "collection_name": "string",
  "type": "CONFLUENCE",
  "spec": {
    // HTTP URL of the Confluence site
    "url":"string",
    // Specify space to ingest all pages in the space, or cql to ingest the
    // selected pages. Please specify either space or cql, not both.
    // Optional Confluence Space
    "space": "string",
    // Optional Confluence CQL
    "cql": "string",
    // Optional Atlassian API token. If Confluence is connected via OAuth,
    // service will automatically get the access token.
    "token": "string",
    // Optional Confluence username
    "username": "string"
  }
}

# example for Google Drive
{
  "namespace_name": "string",
  "collection_name": "string",
  "type": "GDRIVE",
  "spec": {
    // Optional Google Drive token. If Drive is connected via OAuth,
    // service will automatically get the access token.
    "token": "string",
    // Optional folder name
    "folder_name":"string"
  }
}

# example for Notion
{
  "namespace_name": "string",
  "collection_name": "string",
  "type": "NOTION",
  "spec": {
    // Optional Notion token. If Notion is connected via OAuth,
    // service will automatically get the access token.
    "token": "string",
    // Optional page ids
    "page_ids": ["pageid1", "pageid2", ...]
  }
}

Response:
{
  "ingestion_id": "string"
}
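
The same request can also be assembled without the client library, using only Python's standard library. The following is a minimal sketch, not the official client: it assumes the endpoint is served over https at the host pattern shown above, and the helper name build_create_ingestion_request and the namespace value are hypothetical, introduced only for illustration.

```python
import json
import urllib.request


def build_create_ingestion_request(account_id, api_key, namespace_name,
                                   collection_name, ingestion_type, spec):
    """Build (but do not send) a POST request for /docs/create_ingestion."""
    # Host pattern copied from the HTTP example; the https scheme is an assumption.
    url = f"https://{account_id}.us-west-2.aws.chatbees.ai/docs/create_ingestion"
    body = {
        "namespace_name": namespace_name,
        "collection_name": collection_name,
        "type": ingestion_type,
        "spec": spec,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_create_ingestion_request(
    account_id="my_account_id",
    api_key="my_api_key",
    namespace_name="my_namespace",  # hypothetical placeholder
    collection_name="llm_research",
    ingestion_type="NOTION",
    spec={"page_ids": ["pageid1", "pageid2"]},
)

# Sending requires a valid account and API key, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     ingestion_id = json.load(resp)["ingestion_id"]
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and JSON body before any call reaches the service.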

import chatbees as cb

# Configure API key
cb.init(api_key="my_api_key", account_id="my_account_id")

col = cb.collection('llm_research')

# create Confluence ingestion spec
ingest_type = cb.IngestionType.CONFLUENCE
spec = cb.ConfluenceSpec(
    url='http url to Confluence',
    # Specify space to ingest all pages in the space, or cql to ingest the
    # selected pages. Please specify either space or cql, not both.
    space='Confluence space',
    # cql='cql',
    # Optional Atlassian API token. If Confluence is connected via OAuth,
    # service will automatically get the access token.
    token='Atlassian API token',
    # Optional Confluence username
    username='Atlassian username'
)

# create Google Drive ingestion spec
ingest_type = cb.IngestionType.GDRIVE
spec = cb.GDriveSpec(
    # Optional Google Drive token. If Drive is connected via OAuth,
    # service will automatically get the access token.
    token='token',
    # Optional folder name
    folder_name='folder1'
)

# create Notion ingestion spec
ingest_type = cb.IngestionType.NOTION
spec = cb.NotionSpec(
    # Optional Notion token. If Notion is connected via OAuth,
    # service will automatically get the access token.
    token='token',
    # Optional page ids
    page_ids=['pageid1', 'pageid2']
)

# create ingestion
# an ingestion_id is returned; use it to get the ingestion status.
ingestion_id = col.create_ingestion(ingest_type, spec)
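
Because the Confluence spec accepts either space or cql but not both, it can help to check the arguments before building the spec. The helper below is a hypothetical sketch (confluence_spec_kwargs is not part of the chatbees client, and the example URL is a placeholder). It only rejects the case where both are given; whether the service accepts a spec with neither is not stated here, so that case is passed through unchanged.

```python
def confluence_spec_kwargs(url, space=None, cql=None, token=None, username=None):
    """Return keyword arguments for a Confluence spec, rejecting space and cql together."""
    if space is not None and cql is not None:
        raise ValueError("specify either space or cql, not both")
    kwargs = {
        "url": url,
        "space": space,
        "cql": cql,
        "token": token,
        "username": username,
    }
    # Drop unset optional fields so only provided values are passed on.
    return {key: value for key, value in kwargs.items() if value is not None}


kwargs = confluence_spec_kwargs(
    url="https://example.atlassian.net/wiki",  # placeholder URL
    space="ENG",
)

# With the chatbees client installed, the result could be passed on as:
# spec = cb.ConfluenceSpec(**kwargs)
```

Failing early on the client side gives a clearer error than waiting for the ingestion request to be rejected.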

