PySQL Connector split into connector and sqlalchemy #444

Merged: 21 commits on Dec 27, 2024

Commits (21)
9cb1ea3
Modified the gitignore file to not have .idea file
jprakash-db Aug 14, 2024
4099939
[PECO-1803] Splitting the PySql connector into the core and the non c…
jprakash-db Sep 24, 2024
a022590
Changed the folder structure such that sqlalchemy has no reference here
jprakash-db Sep 25, 2024
af47301
Fixed README.md and CONTRIBUTING.md
jprakash-db Oct 8, 2024
64b2818
Added manual publish
jprakash-db Oct 8, 2024
44b52ac
On push trigger added
jprakash-db Oct 8, 2024
8db3fd0
Manually setting the publish step
jprakash-db Oct 8, 2024
3d1ef79
Changed versioning in pyproject.toml
jprakash-db Oct 17, 2024
ee7f1e3
Bumped up the version to 4.0.0.b3 and also changed the structure to h…
jprakash-db Nov 6, 2024
608d237
Removed the sqlalchemy tests from integration.yml file
jprakash-db Nov 11, 2024
85af9c0
[PECO-1803] Print warning message if pyarrow is not installed (#468)
jackyhu-db Nov 13, 2024
38ffa95
[PECO-1803] Remove sqlalchemy and update README.md (#469)
jackyhu-db Nov 13, 2024
6ce555a
Removed all sqlalchemy related stuff
jprakash-db Nov 13, 2024
87b1251
generated the lock file
jprakash-db Nov 13, 2024
e09a880
Resolved merge conflicts
jprakash-db Dec 10, 2024
f9cafe5
Fixed failing tests
jprakash-db Dec 10, 2024
e4205cc
removed poetry.lock
jprakash-db Dec 11, 2024
3853b76
Updated the lock file
jprakash-db Dec 11, 2024
8f70b5b
Fixed poetry numpy 2.2.2 issue
jprakash-db Dec 11, 2024
3fc4e01
Workflow fixes
jprakash-db Dec 26, 2024
a63ece8
Fixed merge conflicts
jprakash-db Dec 26, 2024
51 changes: 51 additions & 0 deletions .github/workflows/code-quality-checks.yml
@@ -58,6 +58,57 @@ jobs:
#----------------------------------------------
- name: Run tests
run: poetry run python -m pytest tests/unit
run-unit-tests-with-arrow:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [ 3.8, 3.9, "3.10", "3.11" ]
steps:
#----------------------------------------------
# check-out repo and set-up python
#----------------------------------------------
- name: Check out repository
uses: actions/checkout@v2
- name: Set up python ${{ matrix.python-version }}
id: setup-python
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
#----------------------------------------------
# ----- install & configure poetry -----
#----------------------------------------------
- name: Install Poetry
uses: snok/install-poetry@v1
with:
virtualenvs-create: true
virtualenvs-in-project: true
installer-parallel: true

#----------------------------------------------
# load cached venv if cache exists
#----------------------------------------------
- name: Load cached venv
id: cached-poetry-dependencies
uses: actions/cache@v2
with:
path: .venv-pyarrow
key: venv-pyarrow-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ github.event.repository.name }}-${{ hashFiles('**/poetry.lock') }}
#----------------------------------------------
# install dependencies if cache does not exist
#----------------------------------------------
- name: Install dependencies
if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
run: poetry install --no-interaction --no-root
#----------------------------------------------
# install your root project, if required
#----------------------------------------------
- name: Install library
run: poetry install --no-interaction --all-extras
#----------------------------------------------
# run test suite
#----------------------------------------------
- name: Run tests
run: poetry run python -m pytest tests/unit
check-linting:
runs-on: ubuntu-latest
strategy:
2 changes: 0 additions & 2 deletions .github/workflows/integration.yml
@@ -55,5 +55,3 @@ jobs:
#----------------------------------------------
- name: Run e2e tests
run: poetry run python -m pytest tests/e2e
- name: Run SQL Alchemy tests
run: poetry run python -m pytest src/databricks/sqlalchemy/test_local
78 changes: 78 additions & 0 deletions .github/workflows/publish-manual.yml
@@ -0,0 +1,78 @@
name: Publish to PyPI Manual [Production]

# Allow manual triggering of the workflow
on:
workflow_dispatch: {}

jobs:
publish:
name: Publish
runs-on: ubuntu-latest

steps:
#----------------------------------------------
# Step 1: Check out the repository code
#----------------------------------------------
- name: Check out repository
uses: actions/checkout@v2 # Check out the repository to access the code

#----------------------------------------------
# Step 2: Set up Python environment
#----------------------------------------------
- name: Set up python
id: setup-python
uses: actions/setup-python@v2
with:
python-version: 3.9 # Specify the Python version to be used

#----------------------------------------------
# Step 3: Install and configure Poetry
#----------------------------------------------
- name: Install Poetry
uses: snok/install-poetry@v1 # Install Poetry, the Python package manager
with:
virtualenvs-create: true
virtualenvs-in-project: true
installer-parallel: true

# #----------------------------------------------
# # Step 4: Load cached virtual environment (if available)
# #----------------------------------------------
# - name: Load cached venv
# id: cached-poetry-dependencies
# uses: actions/cache@v2
# with:
# path: .venv # Path to the virtual environment
# key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ github.event.repository.name }}-${{ hashFiles('**/poetry.lock') }}
# # Cache key is generated based on OS, Python version, repo name, and the `poetry.lock` file hash

# #----------------------------------------------
# # Step 5: Install dependencies if the cache is not found
# #----------------------------------------------
# - name: Install dependencies
# if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true' # Only run if the cache was not hit
# run: poetry install --no-interaction --no-root # Install dependencies without interaction

# #----------------------------------------------
# # Step 6: Update the version to the manually provided version
# #----------------------------------------------
# - name: Update pyproject.toml with the specified version
# run: poetry version ${{ github.event.inputs.version }} # Use the version provided by the user input

#----------------------------------------------
# Step 7: Build and publish the first package to PyPI
#----------------------------------------------
- name: Build and publish databricks sql connector to PyPI
working-directory: ./databricks_sql_connector
run: |
poetry build
poetry publish -u __token__ -p ${{ secrets.PROD_PYPI_TOKEN }} # Publish with PyPI token
#----------------------------------------------
# Step 8: Build and publish the second package to PyPI
#----------------------------------------------

- name: Build and publish databricks sql connector core to PyPI
working-directory: ./databricks_sql_connector_core
run: |
poetry build
poetry publish -u __token__ -p ${{ secrets.PROD_PYPI_TOKEN }} # Publish with PyPI token
2 changes: 1 addition & 1 deletion .gitignore
@@ -195,7 +195,7 @@ cython_debug/
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.idea/

# End of https://www.toptal.com/developers/gitignore/api/python,macos

5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,10 @@
# Release History

# 4.0.0 (TBD)

- Split the connector into two separate packages: `databricks-sql-connector` and `databricks-sqlalchemy`. The `databricks-sql-connector` package contains the core functionality of the connector, while the `databricks-sqlalchemy` package contains the SQLAlchemy dialect for the connector.
- The `pyarrow` dependency is now optional in `databricks-sql-connector`. Users who need Arrow must install `pyarrow` explicitly, e.g. via `pip install pyarrow` or `pip install databricks-sql-connector[pyarrow]`

# 3.7.0 (2024-12-23)

- Fix: Incorrect number of rows fetched in inline results when fetching results with FETCH_NEXT orientation (databricks/databricks-sql-python#479 by @jprakash-db)
3 changes: 0 additions & 3 deletions CONTRIBUTING.md
@@ -144,9 +144,6 @@ The `PySQLStagingIngestionTestSuite` namespace requires a cluster running DBR ve

The suites marked `[not documented]` require additional configuration which will be documented at a later time.

#### SQLAlchemy dialect tests

See README.tests.md for details.

### Code formatting

23 changes: 20 additions & 3 deletions README.md
@@ -3,9 +3,9 @@
[![PyPI](https://img.shields.io/pypi/v/databricks-sql-connector?style=flat-square)](https://pypi.org/project/databricks-sql-connector/)
[![Downloads](https://pepy.tech/badge/databricks-sql-connector)](https://pepy.tech/project/databricks-sql-connector)

The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the [Python DB API 2.0 specification](https://www.python.org/dev/peps/pep-0249/) and exposes a [SQLAlchemy](https://www.sqlalchemy.org/) dialect for use with tools like `pandas` and `alembic` which use SQLAlchemy to execute DDL. Use `pip install databricks-sql-connector[sqlalchemy]` to install with SQLAlchemy's dependencies. `pip install databricks-sql-connector[alembic]` will install alembic's dependencies.
The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the [Python DB API 2.0 specification](https://www.python.org/dev/peps/pep-0249/).

This connector uses Arrow as the data-exchange format, and supports APIs to directly fetch Arrow tables. Arrow tables are wrapped in the `ArrowQueue` class to provide a natural API to get several rows at a time.
This connector uses Arrow as the data-exchange format, and supports APIs (e.g. `fetchmany_arrow`) to directly fetch Arrow tables. Arrow tables are wrapped in the `ArrowQueue` class to provide a natural API to get several rows at a time. [PyArrow](https://arrow.apache.org/docs/python/index.html) is required to enable and use these APIs; you can install it via `pip install pyarrow` or `pip install databricks-sql-connector[pyarrow]`.
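
As an illustrative sketch (not part of this PR's diff), fetching Arrow results with the connector can look like the following; the hostname, HTTP path, and access token are placeholder values, and `pyarrow` is assumed to be installed via one of the commands above:

```python
from databricks import sql

# Placeholder connection parameters; replace with your workspace values.
with sql.connect(
    server_hostname="********.databricks.com",
    http_path="/sql/1.0/warehouses/xxxxxxxxxxxxxxxx",
    access_token="dapi-placeholder-token",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT * FROM samples.nyctaxi.trips LIMIT 1000")
        # With pyarrow installed, results can be fetched directly as Arrow tables.
        arrow_table = cursor.fetchmany_arrow(100)  # first 100 rows as a pyarrow.Table
        print(arrow_table.num_rows, arrow_table.column_names)
```

Without `pyarrow` installed, results remain available through the standard DB API fetch methods (`fetchone`, `fetchmany`, `fetchall`).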

You are welcome to file an issue here for general use cases. You can also contact Databricks Support [here](help.databricks.com).

@@ -22,7 +22,12 @@ For the latest documentation, see

## Quickstart

Install the library with `pip install databricks-sql-connector`
### Installing the core library
Install using `pip install databricks-sql-connector`

### Installing the core library with PyArrow
Install using `pip install databricks-sql-connector[pyarrow]`


```bash
export DATABRICKS_HOST=********.databricks.com
@@ -60,6 +65,18 @@ or to a Databricks Runtime interactive cluster (e.g. /sql/protocolv1/o/123456789
> to authenticate the target Databricks user account and needs to open the browser for authentication. So it
> can only run on the user's machine.

## SQLAlchemy
Starting from `databricks-sql-connector` version 4.0.0, SQLAlchemy support has been extracted into a separate library, `databricks-sqlalchemy`.

- GitHub repository: [databricks-sqlalchemy](https://github.com/databricks/databricks-sqlalchemy)
- PyPI: [databricks-sqlalchemy](https://pypi.org/project/databricks-sqlalchemy/)

### Quick SQLAlchemy guide
Users can now choose between the SQLAlchemy v1 and SQLAlchemy v2 dialects with the connector core; a short usage sketch follows the install options below.

- Install the latest SQLAlchemy v1 using `pip install databricks-sqlalchemy~=1.0`
- Install SQLAlchemy v2 using `pip install databricks-sqlalchemy`
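
As an illustrative sketch (not part of this PR's diff), basic usage of the split-out dialect with the connector core might look like the following; the connection values are placeholders, and the `databricks://` URL format shown is the one documented in the `databricks-sqlalchemy` repository:

```python
from sqlalchemy import create_engine, text

# Placeholder values; replace with your workspace details.
access_token = "dapi-placeholder-token"
server_hostname = "********.databricks.com"
http_path = "/sql/1.0/warehouses/xxxxxxxxxxxxxxxx"

# Requires `databricks-sqlalchemy` installed alongside `databricks-sql-connector`.
engine = create_engine(
    f"databricks://token:{access_token}@{server_hostname}"
    f"?http_path={http_path}&catalog=samples&schema=nyctaxi"
)

with engine.connect() as connection:
    result = connection.execute(text("SELECT 1"))
    print(result.scalar())
```

Whether SQLAlchemy v1 or v2 semantics apply depends on which `databricks-sqlalchemy` release (and matching SQLAlchemy version) is installed.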


## Contributing

174 changes: 0 additions & 174 deletions examples/sqlalchemy.py

This file was deleted.
