Connecting to Databricks
Juicebox supports connections to Databricks across all major cloud providers: AWS, Google Cloud Platform (GCP) and Microsoft Azure.
Before connecting Juicebox to Databricks, ensure you have:
A Databricks workspace with:
An active SQL warehouse (a SQL-optimized compute resource for data warehousing)
Tables or views in Unity Catalog with the relevant data
A user or service principal with the following permissions, so it can query the relevant data sources (a sample of these grants is sketched after this list):
USE CATALOG on the catalog(s) containing the tables or views
USE SCHEMA on the schema(s) containing the tables or views
SELECT on the tables or views
Can use permission on the SQL warehouse compute resource
A Premium or Enterprise plan (recommended for optimized performance)
A Juicebox workspace with:
A Business or Unlimited subscription. Database connections are only available on these subscription levels. Reach out to help@myjuicebox.io to learn more about subscription options.
A Juicebox user account with Owner or Admin privileges.
The connection details (see below)
A secure method to share connection details (such as 1Password's one-time sharing feature)
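As a reference for the data access permissions listed above, the following is a minimal sketch of the Unity Catalog grants a Databricks admin might run from a notebook. The catalog (analytics), schema (sales), table (orders), and principal are placeholder names, and the spark session is assumed to be the one provided automatically in a Databricks notebook.

```python
# Sketch of the minimum Unity Catalog grants described above.
# Assumes this runs in a Databricks notebook, where `spark` is predefined;
# the catalog, schema, table, and principal names are placeholders.
principal = "`juicebox-reader@example.com`"  # user email or service principal application ID

spark.sql(f"GRANT USE CATALOG ON CATALOG analytics TO {principal}")
spark.sql(f"GRANT USE SCHEMA ON SCHEMA analytics.sales TO {principal}")
spark.sql(f"GRANT SELECT ON TABLE analytics.sales.orders TO {principal}")

# The "Can use" permission on the SQL warehouse is granted in the warehouse's
# Permissions dialog (or via the Permissions API), not with a SQL GRANT.
```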
There are two authentication methods available for connecting Juicebox to Databricks: Personal Access Token (PAT) and OAuth. PAT is suitable for testing and development but is not recommended for production; OAuth is the recommended method for production.
Connection details for either method need to be shared with Juice Analytics by a user with Admin or higher privileges in the Juicebox workspace. To start this process, reach out to help@myjuicebox.io.
A personal access token is generated for a Databricks user account and carries the same permissions as that account. PAT authentication is suitable for testing and development, but is not recommended for production.
Create a Personal Access Token in Databricks:
In your Databricks workspace, click on the user profile button in the upper-right and select Settings
In the Settings sidebar, select the Developer tab
Click the Manage button in the Access tokens section
Click the Generate new token button
Provide a description and set an expiration
Copy the generated personal access token (it will only be shown once)
Share the following information:
hostname: The Databricks server hostname
http_path: The HTTP path to the Databricks SQL endpoint
catalog: The catalog to use for the connection
schema: The schema to use for the connection (optional)
access_token: Your Databricks personal access token copied from Step 1 (this will be stored securely)
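If you want to sanity-check these values before sharing them, one option is a quick test with the Databricks SQL Connector for Python (pip install databricks-sql-connector). This is a minimal sketch with placeholder values, not something Juice Analytics requires you to run:

```python
# Sketch: test the PAT connection details with the Databricks SQL Connector
# for Python. All values below are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="dbc-a1b2c3d4-e5f6.cloud.databricks.com",  # hostname
    http_path="/sql/1.0/warehouses/1234567890abcdef",          # http_path
    access_token="dapiXXXXXXXXXXXXXXXX",                       # access_token from Step 1
    catalog="analytics",                                       # catalog
    schema="sales",                                            # schema (optional)
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT current_catalog(), current_schema()")
        print(cursor.fetchone())
```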
OAuth authentication requires a service principal with appropriate permissions. OAuth is suitable for all environments and recommended for production.
Create a Service Principal in Databricks (you must be an Admin to do this):
In your Databricks workspace, click on the user profile button in the upper-right and select Settings
In the Settings sidebar, select the Identity and access tab under Workspace admin
Click the Manage button in the Service principals section
Click the Add service principal button
Click the Add new button
Enter a service principal name and click Add
Generate an OAuth Secret for the Service Principal
On the service principal's details page, click the Secrets tab
Under OAuth secrets, click Generate secret
Set the secret's lifetime in days (maximum 730 days)
Copy the displayed Secret (it will only be shown once) and Client ID, then click Done
Share the following information:
hostname: The Databricks server hostname
http_path: The HTTP path to the Databricks SQL warehouse
catalog: The catalog to use for the connection
schema: The schema to use for the connection (optional)
account_id: The Databricks account ID
client_id: The OAuth client ID (also called the service principal application ID)
secret: Your OAuth client secret from Step 2 (this will be stored securely)
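As with the PAT method, you can optionally verify the service principal's access before sharing the details. The sketch below uses the databricks-sql-connector and databricks-sdk Python packages with placeholder values; the account_id is used by Juice Analytics when configuring the connection and is not needed for this local test.

```python
# Sketch: test the service principal (OAuth M2M) connection details.
# Requires `pip install databricks-sql-connector databricks-sdk`;
# all values below are placeholders.
from databricks import sql
from databricks.sdk.core import Config, oauth_service_principal

server_hostname = "dbc-a1b2c3d4-e5f6.cloud.databricks.com"      # hostname

config = Config(
    host=f"https://{server_hostname}",
    client_id="00000000-0000-0000-0000-000000000000",           # client_id
    client_secret="<oauth-secret-from-step-2>",                 # secret
)

def credentials_provider():
    # Returns OAuth credentials for the service principal.
    return oauth_service_principal(config)

with sql.connect(
    server_hostname=server_hostname,
    http_path="/sql/1.0/warehouses/1234567890abcdef",           # http_path
    credentials_provider=credentials_provider,
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
        print(cursor.fetchone())
```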
In your Databricks workspace, navigate to SQL Warehouses
Select the SQL warehouse you want to connect to
Click on Connection details
Copy the Server hostname and HTTP path values
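The two copied values typically have the following shape (placeholders shown); note that the server hostname is used without an https:// prefix.

```python
# Typical shape of the copied connection values (placeholders).
hostname = "dbc-a1b2c3d4-e5f6.cloud.databricks.com"   # no "https://" prefix; Azure hostnames end in azuredatabricks.net
http_path = "/sql/1.0/warehouses/1234567890abcdef"
```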
In your Databricks workspace, navigate to Catalog
The top-level container is the catalog. Each connection requires a single catalog.
The containers within the catalog are the schemas. Schemas contain tables and views. Specifying the schema is optional.
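If it is easier to confirm the catalog and schema names programmatically, a small sketch like the following (placeholder values, authenticated with either method shown earlier) lists the schemas in a catalog:

```python
# Sketch: list the schemas in a catalog to confirm the names to share.
# All values are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="dbc-a1b2c3d4-e5f6.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/1234567890abcdef",
    access_token="dapiXXXXXXXXXXXXXXXX",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SHOW SCHEMAS IN analytics")   # "analytics" is a placeholder catalog
        for row in cursor.fetchall():
            print(row[0])
```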
In your Databricks workspace, click on the workspace dropdown to the left of the user profile button
Click Manage account. This will open the account console.
Click the user profile button in the upper-right of the account console.
Copy the Account ID.
In your Databricks workspace, click on the user profile button in the upper-right and select Settings
In the Settings sidebar, select the Identity and access tab under Workspace admin
Click the Manage button in the Service principals section
Click the service principal name to access its details
Copy the Application ID (also called the Client ID)
After you send the connection details to the Juice Analytics team, the Databricks connection will be set up in your workspace. Once it is set up, a new Databricks connection button appears in the data drawer of each draft report in your Juicebox workspace.
To add a data source using the Databricks connection, click on the connection button, select the schema, and select the table or view to add.
Once the data source has been added to the Juicebox report, you can use it to configure slices.
For help setting up a Databricks connection in Juicebox, reach out to us at help@myjuicebox.io.
When providing connection details to Juice Analytics, use a secure sharing method such as 1Password's one-time sharing feature.
Additional access controls can be configured within Juicebox at the user level. This allows granular control over who can see which Juicebox reports and which filters are applied for each user within a report. These permissions are managed separately from Databricks permissions.