PySQL Connector split into connector and sqlalchemy #444

Merged: 21 commits on Dec 27, 2024

Commits (21)
9cb1ea3
Modified the gitignore file to not have .idea file
jprakash-db Aug 14, 2024
4099939
[PECO-1803] Splitting the PySql connector into the core and the non c…
jprakash-db Sep 24, 2024
a022590
Changed the folder structure such that sqlalchemy has no reference here
jprakash-db Sep 25, 2024
af47301
Fixed README.md and CONTRIBUTING.md
jprakash-db Oct 8, 2024
64b2818
Added manual publish
jprakash-db Oct 8, 2024
44b52ac
On push trigger added
jprakash-db Oct 8, 2024
8db3fd0
Manually setting the publish step
jprakash-db Oct 8, 2024
3d1ef79
Changed versioning in pyproject.toml
jprakash-db Oct 17, 2024
ee7f1e3
Bumped up the version to 4.0.0.b3 and also changed the structure to h…
jprakash-db Nov 6, 2024
608d237
Removed the sqlalchemy tests from integration.yml file
jprakash-db Nov 11, 2024
85af9c0
[PECO-1803] Print warning message if pyarrow is not installed (#468)
jackyhu-db Nov 13, 2024
38ffa95
[PECO-1803] Remove sqlalchemy and update README.md (#469)
jackyhu-db Nov 13, 2024
6ce555a
Removed all sqlalchemy related stuff
jprakash-db Nov 13, 2024
87b1251
generated the lock file
jprakash-db Nov 13, 2024
e09a880
Resolved merge conflicts
jprakash-db Dec 10, 2024
f9cafe5
Fixed failing tests
jprakash-db Dec 10, 2024
e4205cc
removed poetry.lock
jprakash-db Dec 11, 2024
3853b76
Updated the lock file
jprakash-db Dec 11, 2024
8f70b5b
Fixed poetry numpy 2.2.2 issue
jprakash-db Dec 11, 2024
3fc4e01
Workflow fixes
jprakash-db Dec 26, 2024
a63ece8
Fixed merge conflicts
jprakash-db Dec 26, 2024
51 changes: 51 additions & 0 deletions .github/workflows/code-quality-checks.yml
@@ -58,6 +58,57 @@ jobs:
#----------------------------------------------
- name: Run tests
run: poetry run python -m pytest tests/unit
run-unit-tests-with-arrow:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [ 3.8, 3.9, "3.10", "3.11" ]
steps:
#----------------------------------------------
# check-out repo and set-up python
#----------------------------------------------
- name: Check out repository
uses: actions/checkout@v2
- name: Set up python ${{ matrix.python-version }}
id: setup-python
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
#----------------------------------------------
# ----- install & configure poetry -----
#----------------------------------------------
- name: Install Poetry
uses: snok/install-poetry@v1
with:
virtualenvs-create: true
virtualenvs-in-project: true
installer-parallel: true

#----------------------------------------------
# load cached venv if cache exists
#----------------------------------------------
- name: Load cached venv
id: cached-poetry-dependencies
uses: actions/cache@v2
with:
path: .venv-pyarrow
key: venv-pyarrow-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ github.event.repository.name }}-${{ hashFiles('**/poetry.lock') }}
#----------------------------------------------
# install dependencies if cache does not exist
#----------------------------------------------
- name: Install dependencies
if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
run: poetry install --no-interaction --no-root
#----------------------------------------------
# install your root project, if required
#----------------------------------------------
- name: Install library
run: poetry install --no-interaction --all-extras
#----------------------------------------------
# run test suite
#----------------------------------------------
- name: Run tests
run: poetry run python -m pytest tests/unit
check-linting:
runs-on: ubuntu-latest
strategy:
2 changes: 0 additions & 2 deletions .github/workflows/integration.yml
@@ -55,5 +55,3 @@ jobs:
#----------------------------------------------
- name: Run e2e tests
run: poetry run python -m pytest tests/e2e
- name: Run SQL Alchemy tests
run: poetry run python -m pytest src/databricks/sqlalchemy/test_local
78 changes: 78 additions & 0 deletions .github/workflows/publish-manual.yml
@@ -0,0 +1,78 @@
name: Publish to PyPI Manual [Production]

# Allow manual triggering of the workflow
on:
workflow_dispatch: {}

jobs:
publish:
name: Publish
runs-on: ubuntu-latest

steps:
#----------------------------------------------
# Step 1: Check out the repository code
#----------------------------------------------
- name: Check out repository
uses: actions/checkout@v2 # Check out the repository to access the code

#----------------------------------------------
# Step 2: Set up Python environment
#----------------------------------------------
- name: Set up python
id: setup-python
uses: actions/setup-python@v2
with:
python-version: 3.9 # Specify the Python version to be used

#----------------------------------------------
# Step 3: Install and configure Poetry
#----------------------------------------------
- name: Install Poetry
uses: snok/install-poetry@v1 # Install Poetry, the Python package manager
with:
virtualenvs-create: true
virtualenvs-in-project: true
installer-parallel: true

# #----------------------------------------------
# # Step 4: Load cached virtual environment (if available)
# #----------------------------------------------
# - name: Load cached venv
# id: cached-poetry-dependencies
# uses: actions/cache@v2
# with:
# path: .venv # Path to the virtual environment
# key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ github.event.repository.name }}-${{ hashFiles('**/poetry.lock') }}
# # Cache key is generated based on OS, Python version, repo name, and the `poetry.lock` file hash

# #----------------------------------------------
# # Step 5: Install dependencies if the cache is not found
# #----------------------------------------------
# - name: Install dependencies
# if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true' # Only run if the cache was not hit
# run: poetry install --no-interaction --no-root # Install dependencies without interaction

# #----------------------------------------------
# # Step 6: Update the version to the manually provided version
# #----------------------------------------------
# - name: Update pyproject.toml with the specified version
# run: poetry version ${{ github.event.inputs.version }} # Use the version provided by the user input

#----------------------------------------------
# Step 7: Build and publish the first package to PyPI
#----------------------------------------------
- name: Build and publish databricks sql connector to PyPI
working-directory: ./databricks_sql_connector
run: |
poetry build
poetry publish -u __token__ -p ${{ secrets.PROD_PYPI_TOKEN }} # Publish with PyPI token
#----------------------------------------------
# Step 8: Build and publish the second package to PyPI
#----------------------------------------------

- name: Build and publish databricks sql connector core to PyPI
working-directory: ./databricks_sql_connector_core
run: |
poetry build
poetry publish -u __token__ -p ${{ secrets.PROD_PYPI_TOKEN }} # Publish with PyPI token
2 changes: 1 addition & 1 deletion .gitignore
@@ -195,7 +195,7 @@ cython_debug/
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.idea/

# End of https://www.toptal.com/developers/gitignore/api/python,macos

5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,10 @@
# Release History

# 4.0.0 (TBD)

- Split the connector into two separate packages: `databricks-sql-connector` and `databricks-sqlalchemy`. The `databricks-sql-connector` package contains the core functionality of the connector, while the `databricks-sqlalchemy` package contains the SQLAlchemy dialect for the connector.
- The `pyarrow` dependency is now optional in `databricks-sql-connector`. Users who need Arrow must install `pyarrow` explicitly, e.g. via `pip install pyarrow` or `pip install databricks-sql-connector[pyarrow]`

# 3.7.0 (2024-12-23)

- Fix: Incorrect number of rows fetched in inline results when fetching results with FETCH_NEXT orientation (databricks/databricks-sql-python#479 by @jprakash-db)
3 changes: 0 additions & 3 deletions CONTRIBUTING.md
@@ -144,9 +144,6 @@ The `PySQLStagingIngestionTestSuite` namespace requires a cluster running DBR ve

The suites marked `[not documented]` require additional configuration which will be documented at a later time.

#### SQLAlchemy dialect tests

See README.tests.md for details.

### Code formatting

23 changes: 20 additions & 3 deletions README.md
@@ -3,9 +3,9 @@
[![PyPI](https://img.shields.io/pypi/v/databricks-sql-connector?style=flat-square)](https://pypi.org/project/databricks-sql-connector/)
[![Downloads](https://pepy.tech/badge/databricks-sql-connector)](https://pepy.tech/project/databricks-sql-connector)

The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the [Python DB API 2.0 specification](https://www.python.org/dev/peps/pep-0249/) and exposes a [SQLAlchemy](https://www.sqlalchemy.org/) dialect for use with tools like `pandas` and `alembic` which use SQLAlchemy to execute DDL. Use `pip install databricks-sql-connector[sqlalchemy]` to install with SQLAlchemy's dependencies. `pip install databricks-sql-connector[alembic]` will install alembic's dependencies.
The Databricks SQL Connector for Python allows you to develop Python applications that connect to Databricks clusters and SQL warehouses. It is a Thrift-based client with no dependencies on ODBC or JDBC. It conforms to the [Python DB API 2.0 specification](https://www.python.org/dev/peps/pep-0249/).

This connector uses Arrow as the data-exchange format, and supports APIs to directly fetch Arrow tables. Arrow tables are wrapped in the `ArrowQueue` class to provide a natural API to get several rows at a time.
This connector uses Arrow as the data-exchange format, and supports APIs (e.g. `fetchmany_arrow`) to directly fetch Arrow tables. Arrow tables are wrapped in the `ArrowQueue` class to provide a natural API to get several rows at a time. [PyArrow](https://arrow.apache.org/docs/python/index.html) is required to enable and use these APIs; you can install it via `pip install pyarrow` or `pip install databricks-sql-connector[pyarrow]`.
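
As an illustrative sketch (not part of this PR's diff), fetching Arrow results with the connector can look like the following; the hostname, HTTP path, and access token are placeholder values, and `pyarrow` is assumed to be installed via one of the commands above:

```python
from databricks import sql

# Placeholder connection parameters; replace with your workspace values.
with sql.connect(
    server_hostname="********.databricks.com",
    http_path="/sql/1.0/warehouses/xxxxxxxxxxxxxxxx",
    access_token="dapi-placeholder-token",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT * FROM samples.nyctaxi.trips LIMIT 1000")
        # With pyarrow installed, results can be fetched directly as Arrow tables.
        arrow_table = cursor.fetchmany_arrow(100)  # first 100 rows as a pyarrow.Table
        print(arrow_table.num_rows, arrow_table.column_names)
```

Without `pyarrow` installed, results remain available through the standard DB API fetch methods (`fetchone`, `fetchmany`, `fetchall`).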

You are welcome to file an issue here for general use cases. You can also contact Databricks Support [here](help.databricks.com).

@@ -22,7 +22,12 @@ For the latest documentation, see

## Quickstart

Install the library with `pip install databricks-sql-connector`
### Installing the core library
Install using `pip install databricks-sql-connector`

### Installing the core library with PyArrow
Install using `pip install databricks-sql-connector[pyarrow]`


```bash
export DATABRICKS_HOST=********.databricks.com
@@ -60,6 +65,18 @@ or to a Databricks Runtime interactive cluster (e.g. /sql/protocolv1/o/123456789
> to authenticate the target Databricks user account and needs to open the browser for authentication. So it
> can only run on the user's machine.

## SQLAlchemy
Starting from `databricks-sql-connector` version 4.0.0, SQLAlchemy support has been extracted into a separate library, `databricks-sqlalchemy`.

- GitHub repository: [databricks-sqlalchemy](https://github.com/databricks/databricks-sqlalchemy)
- PyPI: [databricks-sqlalchemy](https://pypi.org/project/databricks-sqlalchemy/)

### Quick SQLAlchemy guide
Users can now choose between the SQLAlchemy v1 and SQLAlchemy v2 dialects with the connector core; a short usage sketch follows the install options below.

- Install the latest SQLAlchemy v1 using `pip install databricks-sqlalchemy~=1.0`
- Install SQLAlchemy v2 using `pip install databricks-sqlalchemy`
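
As an illustrative sketch (not part of this PR's diff), basic usage of the split-out dialect with the connector core might look like the following; the connection values are placeholders, and the `databricks://` URL format shown is the one documented in the `databricks-sqlalchemy` repository:

```python
from sqlalchemy import create_engine, text

# Placeholder values; replace with your workspace details.
access_token = "dapi-placeholder-token"
server_hostname = "********.databricks.com"
http_path = "/sql/1.0/warehouses/xxxxxxxxxxxxxxxx"

# Requires `databricks-sqlalchemy` installed alongside `databricks-sql-connector`.
engine = create_engine(
    f"databricks://token:{access_token}@{server_hostname}"
    f"?http_path={http_path}&catalog=samples&schema=nyctaxi"
)

with engine.connect() as connection:
    result = connection.execute(text("SELECT 1"))
    print(result.scalar())
```

Whether SQLAlchemy v1 or v2 semantics apply depends on which `databricks-sqlalchemy` release (and matching SQLAlchemy version) is installed.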


## Contributing

174 changes: 0 additions & 174 deletions examples/sqlalchemy.py

This file was deleted.
