Documentation for the ChatBees Snowflake Native App
Welcome to ChatBees
Run a state-of-the-art serverless RAG pipeline within your own Snowflake account with ChatBees. ChatBees provides enterprise users with a complete and flexible end-to-end solution.
With a set of simple APIs, you can ingest and index a variety of unstructured and semi-structured data, including PDF, CSV, DOCX, TXT, and MD.
Required privileges for installation
ChatBees requires the following global privileges:

- CREATE COMPUTE POOL: Required to run the ChatBees service inside Snowpark Container Services.
- CREATE WAREHOUSE: Required for ChatBees to execute queries.
- IMPORTED PRIVILEGES ON SNOWFLAKE DB: Required to access Snowflake Cortex embedding and completion functions.
Note: Granting the IMPORTED PRIVILEGES ON SNOWFLAKE DB privilege allows ChatBees to see information about usage and costs associated with the consumer account. ChatBees is committed to never accessing this data or any other unrelated user data.
Additionally, you can configure EGRESS network connections to:

- HuggingFace, if you configure ChatBees to host an embedding model within the app.
- Third-party API endpoints, such as Azure OpenAI, if you configure ChatBees to access models outside of Snowflake.

See the Configure models section for details.
Quickstart
First, configure the embedding and completion models you wish to use with the ChatBees app. ChatBees supports models from major LLM vendors, including Snowflake Cortex.
You can change and experiment with the completion model at any time. However, you must delete all existing collections before changing the embedding model.
For example, use the following command to configure ChatBees to use the voyage-multilingual-2 embedding model and the llama3.1-70b completion model from Snowflake Cortex.
NOTE: Replace chatbees_app with your actual installed app name.
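A sketch of that call, assuming a `cortex` vendor prefix and the positional argument order described in the admin.start_app reference (verify the exact model strings against your installed app):

```sql
-- Sketch: start ChatBees with Snowflake Cortex models.
-- The 'cortex' vendor prefix is an assumption; format is '<vendor>/<model>'.
CALL chatbees_app.admin.start_app(
    'cortex/voyage-multilingual-2',  -- embedding_model
    'cortex/llama3.1-70b',           -- completion_model
    'CPU_X64_S'                      -- instance_family for Cortex-only setups
);
```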
You can also configure the compute instance family of ChatBees. We recommend:

- CPU_X64_S if you use Snowflake Cortex models exclusively
- GPU_NV_S or GPU_NV_XS if you configure a HuggingFace embedding model

You can restart ChatBees with a different compute instance family at any time to scale up or down.
Next, create a collection. A Collection serves as the fundamental unit for data organization. You can put different data sets into different collections.
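For example, assuming the app is installed as chatbees_app:

```sql
-- Create a collection to hold one data set.
CALL chatbees_app.api.create_collection('my_docs');
```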
Next, ingest all files inside a stage via the chatbees_app.api.ingest_files function. ChatBees supports a variety of file formats, such as PDF, CSV, TXT, and DOCX.
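A minimal sketch, assuming a hypothetical stage my_db.my_schema.my_stage that the app has been granted access to (see the api.ingest_files reference for the required privileges; the `@...` path format is an assumption):

```sql
-- Ingest every file in the stage into the collection.
-- my_db.my_schema.my_stage is a hypothetical stage name.
CALL chatbees_app.api.ingest_files('my_docs', '@my_db.my_schema.my_stage');
```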
Finally, you can query your collection. ChatBees supports both RAG and semantic search.
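For example, with the hypothetical collection `my_docs` (argument order follows the APIs section below):

```sql
-- RAG: a conversational answer grounded in the collection's data.
CALL chatbees_app.api.ask('my_docs', 'What is our refund policy?');

-- Semantic search: return the 5 most relevant results.
CALL chatbees_app.api.search('my_docs', 'refund policy', 5);
```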
Configure models
You can configure ChatBees to use a variety of models from different vendors. ChatBees can also host a HuggingFace embedding model for you within the app. Select embedding and completion models from the supported vendors below; they can be from the same or different vendors.
Supported embedding model vendors and hosting options:

- Snowflake Cortex: All models
- Hosted HuggingFace: All public models. Contact us if you need access to gated models.
- OpenAI: All models
- Azure OpenAI: All models
Supported completion model vendors and hosting options:

- Snowflake Cortex: Llama 3.1 and 3.2 models. Support for other models is coming soon.
- OpenAI: All models
- Azure OpenAI: All models
- Anthropic: (coming soon)

More vendors and hosting options can be supported by request.
If you choose a third-party model, network access must be set up under the "Connections" tab.

- For HuggingFace models: ChatBees will download the relevant model files from one of HF's CDN endpoints (cdn-lfs.hf.co, cdn-lfs-eu-1.hf.co, cdn-lfs-us-1.hf.co).
- For OpenAI models: ChatBees will connect to api.openai.com.
- For Azure OpenAI models: ChatBees will connect to your Azure OpenAI deployment at *.openai.azure.com.
Next, configure access credentials if you use OpenAI or Azure OpenAI models.
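A sketch of the credential calls, with placeholder keys and a hypothetical Azure endpoint and API version (substitute your own values):

```sql
-- OpenAI: only an API key is needed.
CALL chatbees_app.admin.configure_openai('<your-openai-api-key>');

-- Azure OpenAI: key, endpoint, and API version.
CALL chatbees_app.admin.configure_azure_openai(
    '<your-azure-openai-key>',
    'https://my-resource.openai.azure.com',  -- hypothetical endpoint
    '2024-02-01'                             -- example API version
);
```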
Finally, specify the embedding and completion models you'd like to use. You can mix and match embedding and completion models from any supported vendor! For optimal performance, make sure to select a GPU instance if you're running an embedding model within ChatBees.
Update embedding and completion models
You can restart the ChatBees app with a different completion model at any time. You must first call admin.stop_app() to stop any existing ChatBees service.
However, the embedding model can only be changed when the app has no collections. If you need to change the embedding model, delete all existing collections first.
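A sketch of a completion-model change, assuming Cortex models and the positional admin.start_app arguments (model names are illustrative):

```sql
-- Change the completion model: stop the service, then restart with a new one.
CALL chatbees_app.admin.stop_app();
CALL chatbees_app.admin.start_app(
    'cortex/voyage-multilingual-2',  -- embedding model unchanged
    'cortex/llama3.2-3b',            -- hypothetical new completion model
    'CPU_X64_S'
);

-- To change the embedding model instead, delete every collection first:
CALL chatbees_app.api.delete_collection('my_docs');
```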
Performance considerations
The performance of the ChatBees app depends on both the ChatBees RAG engine and the performance of the underlying embedding and completion models.
For example, if ChatBees is configured with Snowflake Cortex, the latency you experience will depend directly on the performance of the Cortex models.
- Data ingestion: uses the embedding model
- Semantic search: uses the embedding model
- RAG: uses both the embedding and completion models
You can export access logs to view a detailed breakdown of RAG performance.
Tips to optimize the performance of your ChatBees app:

- Completion time too high: Try a smaller model or a different vendor.
- Embedding time too high: Try a smaller model or a different vendor.
- Embedding time too high (HuggingFace): Try a smaller embedding model, or use a GPU instance.
APIs
admin.start_app
Starts the ChatBees app. You can invoke this procedure again to change the embedding or completion model.

- embedding_model: The embedding model to use. Format is '<vendor>/<model>'.
- completion_model: The completion model to use. Format is '<vendor>/<model>'.
- instance_family: The instance family of ChatBees compute. We recommend CPU_X64_S if you are not running a HuggingFace embedding model.
admin.stop_app
Stops the ChatBees app and its compute pool.
admin.service_status
Returns the current status of the ChatBees app (e.g. PENDING, READY).
admin.get_external_access_config
Returns the required external access configuration.

- reference_name: Name of the external access config. Only model_vendor is supported.
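For example (replace chatbees_app with your actual installed app name):

```sql
-- 'model_vendor' is the only supported reference name.
CALL chatbees_app.admin.get_external_access_config('model_vendor');
```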
admin.configure_openai
Configures the connection to OpenAI.

- api_key: OpenAI API key
admin.configure_azure_openai
Configures the connection to Azure OpenAI.

- api_key: Azure OpenAI API key
- endpoint: Azure OpenAI endpoint
- api_version: Azure OpenAI API version
admin.configure_huggingface
(coming soon) Configures the connection to HuggingFace to use gated repositories.

- api_key: HuggingFace API key
api.create_collection
Creates a collection. A collection serves as the fundamental unit for data organization. You can put different data sets into different collections.

- collection_name: Name of the collection
api.list_collections
Lists all collections in ChatBees
api.delete_collection
Deletes a collection and its content.

- collection_name: Name of the collection
api.ingest_files
Ingests files from a stage into the ChatBees RAG pipeline. Make sure to grant the ChatBees app the READ privilege on the stage, as well as the USAGE privilege on the parent schema and database. This function supports incremental ingestion: previously ingested files will not be ingested again.

- collection_name: The name of the collection to ingest into
- stage_path: The fully qualified name of the stage

If you accidentally deleted a file and wish to ingest it again, re-upload the file into the stage, refresh the directory table, then invoke ingest_files. Note: You must enable the directory table on the stage and make sure it is refreshed before calling ingest_files.
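The grants and refresh described above can be sketched as follows, using a hypothetical stage my_db.my_schema.my_stage (the `@...` stage-path format is an assumption):

```sql
-- Grant the app access to the stage and its parent schema and database.
GRANT USAGE ON DATABASE my_db TO APPLICATION chatbees_app;
GRANT USAGE ON SCHEMA my_db.my_schema TO APPLICATION chatbees_app;
GRANT READ ON STAGE my_db.my_schema.my_stage TO APPLICATION chatbees_app;

-- Refresh the stage's directory table so newly uploaded files are visible.
ALTER STAGE my_db.my_schema.my_stage REFRESH;

-- Ingest the stage's files; previously ingested files are skipped.
CALL chatbees_app.api.ingest_files('my_docs', '@my_db.my_schema.my_stage');
```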
api.list_files
Lists all files inside a collection.

- collection_name: Name of the collection
api.delete_file
Deletes a file from a collection.

- collection_name: Name of the collection
- file_name: Name of the file to delete
api.ask
ChatBees RAG API. Gets a conversational answer to your question, based on the data inside your collection.

- collection_name: Name of the collection
- question: Question to ask
api.search
ChatBees semantic search API. Returns the top_k most relevant results from collection_name.

- collection_name: Name of the collection
- question: Question to ask
- top_k: How many search results to return
api.export_access_logs
Exports all access logs into a Snowflake table.
Cost considerations
The ChatBees app uses the following resources:

- 1x MEDIUM warehouse for running queries. This warehouse is auto-suspended if ChatBees is not actively processing requests.
- 1x compute pool with one node to run the ChatBees service. You can specify the compute pool instance family at app startup.
You can stop the ChatBees app at any time and resume later. Our cloud-native architecture ensures that your data is always persisted on durable storage.