PySQL Connector split into connector and sqlalchemy #444
…ore part (#417) * Implemented ColumnQueue to test the fetchall without pyarrow Removed token removed token * order of fields in row corrected * Changed the folder structure and tested the basic setup to work * Refractored the code to make connector to work * Basic Setup of connector, core and sqlalchemy is working * Basic integration of core, connect and sqlalchemy is working * Setup working dynamic change from ColumnQueue to ArrowQueue * Refractored the test code and moved to respective folders * Added the unit test for column_queue Fixed __version__ Fix * venv_main added to git ignore * Added code for merging columnar table * Merging code for columnar * Fixed the retry_close sesssion test issue with logging * Fixed the databricks_sqlalchemy tests and introduced pytest.ini for the sqla_testing * Added pyarrow_test mark on pytest * Fixed databricks.sqlalchemy to databricks_sqlalchemy imports * Added poetry.lock * Added dist folder * Changed the pyproject.toml * Minor Fix * Added the pyarrow skip tag on unit tests and tested their working * Fixed the Decimal and timestamp conversion issue in non arrow pipeline * Removed not required files and reformatted * Fixed test_retry error * Changed the folder structure to src / databricks * Removed the columnar non arrow flow to another PR * Moved the README to the root * removed columnQueue instance * Revmoved databricks_sqlalchemy dependency in core * Changed the pysql_supports_arrow predicate, introduced changes in the pyproject.toml * Ran the black formatter with the original version * Extra .py removed from all the __init__.py files names * Undo formatting check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * Check * BIG UPDATE * Refeactor code * Refractor * Fixed versioning * Minor refractoring * Minor refractoring
…ave pyarrow as optional
Print warning message if pyarrow is not installed Signed-off-by: Jacky Hu <[email protected]>
Remove sqlalchemy and update README.md Signed-off-by: Jacky Hu <[email protected]>
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Major Change - v4.x.x
Related Links
databricks_sqlalchemy split is present in this PR - databricks/databricks-sqlalchemy#1
Description
The databricks-sql-python library is being split into two packages to meet business needs.
The Split
After the split, the two packages are:
databricks-sql-python
pip install databricks-sql-connector installs the lean connector, and pip install databricks-sql-connector[pyarrow] installs the complete connector. Without PyArrow, features that depend on Arrow, such as CloudFetch, are disabled and only inline results are supported.
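As a rough illustration of the optional-dependency behaviour described above, the sketch below checks whether PyArrow is importable and warns when it is not. The helper names `pyarrow_available` and `warn_if_pyarrow_missing` are hypothetical and are not taken from the connector's code base; only the standard library is used, and the connector's real check may differ.

```python
import importlib.util
import warnings


def pyarrow_available() -> bool:
    """Return True if the optional pyarrow package can be found (hypothetical helper)."""
    return importlib.util.find_spec("pyarrow") is not None


def warn_if_pyarrow_missing() -> bool:
    """Warn when pyarrow is absent and return whether it is available."""
    available = pyarrow_available()
    if not available:
        # Message wording is an assumption, modelled on the PR description.
        warnings.warn(
            "pyarrow is not installed: Arrow-dependent features such as "
            "CloudFetch are disabled and only inline results are supported. "
            "Install it with: pip install 'databricks-sql-connector[pyarrow]'",
            stacklevel=2,
        )
    return available


if __name__ == "__main__":
    print("pyarrow available:", warn_if_pyarrow_missing())
```

Calling such a check once at import time would let the lean install keep working while still telling users why Arrow-based features are unavailable.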
databricks-sqlalchemy
The databricks-sqlalchemy library has a core dependency on the connector with PyArrow, so databricks-sql-python and PyArrow are installed when installing databricks-sqlalchemy.
Install the SQLAlchemy v1 based library using pip install databricks-sqlalchemy~=1.0, or the SQLAlchemy v2 based library using pip install databricks-sqlalchemy.
Published Library on PyPI
Development Details
databricks-sql-python
will be raised on this repo
v1/main branch in the databricks-sqlalchemy repo. All future PRs must be raised against this branch
main branch in the databricks-sqlalchemy repo
PR Details
Tasks Completed
How to Test
The testing pipeline remains the same as before the split.
pytest can be used to run both the integration and the unit tests directly:
pytest [directory_name or file_name]
Performance Comparison - Benchmarking
Pre-split and post-split performance has been compared using large and small queries to make sure there is no performance regression.
A dashboard has been created so that every time the benchmarking is run the results are stored in the benchfood, and comparisons can be made easily.
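A pre-split versus post-split comparison of this kind can be sketched as below. This is only an illustrative timing harness, not the benchmarking code used for this PR: `median_runtime` and `relative_change` are hypothetical helpers, and the `small` and `large` workloads are stand-ins for real queries run against a warehouse.

```python
import statistics
import time
from typing import Callable


def median_runtime(run_query: Callable[[], object], repeats: int = 5) -> float:
    """Run the callable `repeats` times and return the median wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def relative_change(baseline: float, candidate: float) -> float:
    """Fractional change of candidate versus baseline (positive means slower)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline


if __name__ == "__main__":
    # Stand-in workloads; a real run would execute small and large SQL queries.
    small = lambda: sum(range(1_000))
    large = lambda: sum(range(1_000_000))
    for name, workload in (("small", small), ("large", large)):
        print(f"{name}: {median_runtime(workload):.6f} s (median)")
```

Using the median over several repeats keeps one slow run from dominating the comparison, and the relative change gives a single figure to track between the pre-split and post-split builds.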