# SQLAlchemy 2: Stop skipping some non-type tests #247

## Conversation
We're now in the territory of features that aren't required for SQLAlchemy 2 compatibility as of pysql==3.0.0, but we may consider adding them in the future. In this case, table comment reflection needs to be manually implemented. Index reflection would require hooking into the compiler to reflect the partition strategy.

```
test_suite.py::HasIndexTest_databricks+databricks::test_has_index[dialect] SKIPPED (Databricks does not support indexes.)
test_suite.py::HasIndexTest_databricks+databricks::test_has_index[inspector] SKIPPED (Databricks does not support indexes.)
test_suite.py::HasIndexTest_databricks+databricks::test_has_index_schema[dialect] SKIPPED (Databricks does not support indexes.)
test_suite.py::HasIndexTest_databricks+databricks::test_has_index_schema[inspector] SKIPPED (Databricks does not support indexes.)
test_suite.py::TableDDLTest_databricks+databricks::test_add_table_comment SKIPPED (Comment reflection is possible but not implemented in this dialect.)
test_suite.py::TableDDLTest_databricks+databricks::test_create_index_if_not_exists SKIPPED (Databricks does not support indexes.)
test_suite.py::TableDDLTest_databricks+databricks::test_create_table PASSED
test_suite.py::TableDDLTest_databricks+databricks::test_create_table_if_not_exists PASSED
test_suite.py::TableDDLTest_databricks+databricks::test_create_table_schema PASSED
test_suite.py::TableDDLTest_databricks+databricks::test_drop_index_if_exists SKIPPED (Databricks does not support indexes.)
test_suite.py::TableDDLTest_databricks+databricks::test_drop_table PASSED
test_suite.py::TableDDLTest_databricks+databricks::test_drop_table_comment SKIPPED (Comment reflection is possible but not implemented in this dialect.)
test_suite.py::TableDDLTest_databricks+databricks::test_drop_table_if_exists PASSED
test_suite.py::TableDDLTest_databricks+databricks::test_underscore_names PASSED
```

Signed-off-by: Jesse Whitehouse <[email protected]>
The fixes to DESCRIBE TABLE and the visit_xxx methods were necessary to get to the point where I could even determine that these tests wouldn't pass. But those changes are not currently tested in the dialect. If, in the course of reviewing the remaining tests in the compliance suite, I find that these visit_xxx methods are not tested anywhere else, then we should extend test_suite.py with our own tests to confirm the behaviour for ourselves. Signed-off-by: Jesse Whitehouse <[email protected]>
The presence of this pytest.ini file is _required_ to establish pytest's rootdir: https://docs.pytest.org/en/7.1.x/reference/customize.html#finding-the-rootdir Without it, the custom pytest plugin from SQLAlchemy can't read the contents of setup.cfg, which makes none of the tests runnable. Signed-off-by: Jesse Whitehouse <[email protected]>
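For reference, such a file can be nearly empty; a minimal sketch (the repo's actual pytest.ini may carry more settings):

```ini
[pytest]
; Even an empty [pytest] section is enough for pytest to anchor its rootdir
; in this directory, letting SQLAlchemy's plugin locate setup.cfg next to it.
```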
Signed-off-by: Jesse Whitehouse <[email protected]>
Date type work fixed this test failure Signed-off-by: Jesse Whitehouse <[email protected]>
This allows these SQLAlchemy tests to pass:

```
test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_bound_limit PASSED
test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_bound_limit_offset PASSED
test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_expr_limit_simple_offset PASSED
test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit PASSED
test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_expr_offset PASSED
test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_offset[cases0] PASSED
test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_offset[cases1] PASSED
test_suite.py::FetchLimitOffsetTest_databricks+databricks::test_simple_limit_offset[cases2] PASSED
```

This partially reverts the change introduced in #246

Signed-off-by: Jesse Whitehouse <[email protected]>
I implemented our custom DatabricksStatementCompiler so we can override the default rendering of unbounded LIMIT clauses from `LIMIT -1` to `LIMIT ALL`. We also explicitly skip the FETCH clause tests since Databricks doesn't support that syntax. Also ran Black over all the source code here. Signed-off-by: Jesse Whitehouse <[email protected]>
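A minimal sketch of the idea in this commit, assuming the standard SQLAlchemy compiler hooks; the class name matches the commit, but the body here is illustrative rather than the PR's actual implementation:

```python
from sqlalchemy.sql import compiler


class DatabricksStatementCompiler(compiler.SQLCompiler):
    """Sketch: render an offset-without-limit as LIMIT ALL, since
    Databricks rejects the LIMIT -1 form some compilers emit."""

    def limit_clause(self, select, **kw):
        text = ""
        if select._limit_clause is not None:
            text += "\n LIMIT " + self.process(select._limit_clause, **kw)
        if select._offset_clause is not None:
            if select._limit_clause is None:
                # Databricks needs an explicit unbounded LIMIT here
                text += "\n LIMIT ALL"
            text += " OFFSET " + self.process(select._offset_clause, **kw)
        return text


# The compiler is attached to the dialect via SQLAlchemy's standard hook:
# class DatabricksDialect(...):
#     statement_compiler = DatabricksStatementCompiler
```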
Add meaningful skip markers for table comment reflection and indexes Signed-off-by: Jesse Whitehouse <[email protected]>
This closes #175 Signed-off-by: Jesse Whitehouse <[email protected]>
Adding the @reflection.cache decorator to has_table is necessary to pass test_has_table_cache. Caching calls to has_table also improves the efficiency of the connector. Signed-off-by: Jesse Whitehouse <[email protected]>
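A minimal sketch of the pattern this commit describes; the lookup logic is elided, since the decorator is the relevant part:

```python
from sqlalchemy.engine import default, reflection


class DatabricksDialect(default.DefaultDialect):
    @reflection.cache
    def has_table(self, connection, table_name, schema=None, **kw):
        # SQLAlchemy's Inspector threads an info_cache through **kw;
        # @reflection.cache memoizes on these arguments, and SQLAlchemy
        # handles cache invalidation for us.
        raise NotImplementedError("actual table lookup elided in this sketch")
```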
Databricks constraint names are limited to 255 characters Signed-off-by: Jesse Whitehouse <[email protected]>
Black test_suite.py Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Turns out that none of these can pass for the same reason that the first two seemed un-runnable in db6f52b Signed-off-by: Jesse Whitehouse <[email protected]>
```python
dte_dict = {row["col_name"]: row["data_type"] for row in result}
target = [(k, v) for k, v in dte_dict.items() if "FOREIGN KEY" in v]


def extract_constraint_dict_from_target(target):
```
Style nit: If you feel like there is a concept worth extracting to a function, my preference is not to have it encapsulated inside another function, because then you can't unit test it. I would generally only use this approach for simple functions that are slightly larger than a lambda, and otherwise have the function in a scope that I can test separately. The more you can isolate units of work, the simpler your tests can be to validate their behavior.
Great feedback. I'll push this to a follow-up PR shortly.
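To illustrate the reviewer's point with hypothetical names (not code from this PR): keeping the helper at module scope means a test can import and exercise it directly.

```python
# Hypothetical illustration of the style suggestion above.
def extract_fk_names(rows):
    """Module-level helper: importable and unit-testable on its own."""
    return [name for name, dtype in rows if "FOREIGN KEY" in dtype]


def reflect_foreign_keys(rows):
    # The caller composes the helper rather than defining it inline, so a
    # test can exercise extract_fk_names without going through this wrapper.
    return {"names": extract_fk_names(rows)}
```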
Intentionally skipping failing type checks as these will be addressed in follow-ups.
Note to reviewers: The important thing here is that all these tests pass. The nitty-gritty implementation of SQLAlchemy is both esoteric and unchanging. The key things to review are the implementation of `get_pk_constraint` and `get_primary_keys`, and the modified `infer_types` logic.

## Background
On the path to full SQLAlchemy 2 compliance, I'm working through the entire dialect compliance test suite. Once all the compliance tests have been addressed as either passing or skipped with justification, the new dialect will be ready for release.
As of this PR, we pass 198 reviewed tests, skip 58 reviewed tests, and have not reviewed 1215 tests.
## Description
This pull request covers several blocks of tests that I've reviewed in detail. I'm opening it as a partial effort, since there are still more tests to consider, but I'd like my PRs to be smaller and more frequent rather than enormous and hard to review. Here is a list of the tests I enabled and anything relevant for this review.
- `TableDDLTest`: No changes to the dialect were needed to make these pass. The table comment DDL tests are marked skip, as this is not an acceptance criterion for the next release.
- `HasIndexTest`: Databricks doesn't support indexes. These tests are permanently skipped.
- `QuotedNameArgumentTest`: These tests are wacky, as they are so intermingled with different requirements, some of which Databricks supports and others it doesn't. There's no way for us to run them as-is. We can tag this as a follow-up after the initial release.
- `RowFetchTest`: No changes to the dialect were needed to make this pass.
- `FetchLimitOffsetTest`: This failed because the default approach wrote invalid Databricks SQL (it uses `-1` as the wildcard match rather than `ALL`). I had to implement our own subclass of the `SQLCompiler`, which is picked up by SQLAlchemy's visitor API.
- `FutureTableDDLTest`: No changes to the dialect were required to make these tests pass, but I had to update `requirements.py` so that SQLAlchemy's test runner would know they are runnable.
- `IdentityColumnTest`: I implemented IDENTITY writing semantics in the dialect to make these pass.
- `HasTableTest`: I had to make a few changes to bring these into a passing state. I also added the `reflection.cache` decorator to our implementation of `has_table()` so that the connector will benefit from SQLAlchemy's inspector cache. This is the customary approach for first-party and third-party dialects alike. SQLAlchemy takes care of cache invalidation for us.
- `LongNameBlowoutTest`: Databricks SQL doesn't support identifiers longer than 255 characters, so these tests are permanently skipped.
- `ExceptionTest`: There is only one test in this block, and it checks that an integrity error is raised if a primary key constraint is violated. Databricks SQL doesn't enforce informational constraints (apart from CHECK constraints, which are not handled in this PR), so this is permanently skipped.
- `LastrowidTest`: This test depends on us implementing implicit autoincrement behaviour (where we write out an IDENTITY constraint when the user didn't specify one explicitly). This behaviour is not an acceptance criterion for the first release, so the test is disabled. I did add a warning log message in case users attempt to use this. The workaround is to explicitly declare `Identity()` for a column (see the usage snippet after this list); other dialects have made this similarly explicit.
- `CompositeKeyReflectionTest`: This is the big one: foreign and primary key reflection is a significant gap in the existing Databricks SQLAlchemy dialect. To make these tests pass I implemented the methods SQLAlchemy needs to reflect the PRIMARY KEY and FOREIGN KEY constraints defined on a given table (a sketch of the reflection contract follows this list).
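For context, SQLAlchemy's reflection contract expects these dialect methods to return dicts in a specific shape. The sketch below shows that shape with hypothetical return values; the real implementation parses Databricks constraint metadata rather than returning literals:

```python
class DatabricksDialectReflectionSketch:
    def get_pk_constraint(self, connection, table_name, schema=None, **kw):
        # SQLAlchemy expects a dict with the constraint name (if any)
        # and the primary key columns, in order.
        return {
            "name": "pk_example",
            "constrained_columns": ["id", "ts"],
        }

    def get_foreign_keys(self, connection, table_name, schema=None, **kw):
        # SQLAlchemy expects one dict per FOREIGN KEY constraint.
        return [
            {
                "name": "fk_example",
                "constrained_columns": ["other_id"],
                "referred_schema": schema,
                "referred_table": "other_table",
                "referred_columns": ["id"],
            }
        ]
```

And the `LastrowidTest` workaround mentioned above, declaring identity explicitly, uses standard SQLAlchemy table definitions (shown for context; table and column names are illustrative):

```python
from sqlalchemy import BigInteger, Column, Identity, MetaData, Table

events = Table(
    "events",
    MetaData(),
    # Explicit Identity() instead of relying on implicit autoincrement;
    # the dialect writes this out as IDENTITY DDL in CREATE TABLE.
    Column("id", BigInteger, Identity(), primary_key=True),
)
```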
### Stop inferring TINYINT

After discussion with @benc-db, I've also modified the pysql `infer_types` method so that it never automatically infers a TINYINT type. We did this because non-SELECT clauses that accept bound integer inputs (like LIMIT and OFFSET) will never accept a value that was cast to TINYINT. If users of pysql require the efficiency of a TINYINT, they can explicitly declare this type when binding their parameters.
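A hedged sketch of the inference rule described above, not pysql's actual implementation: the point is simply that small Python ints map to INT (or wider), never TINYINT, unless the user asks for it explicitly.

```python
def infer_integer_type(value: int) -> str:
    # TINYINT (-128..127) is deliberately never inferred, because clauses
    # like LIMIT and OFFSET reject values bound as TINYINT.
    if -2147483648 <= value <= 2147483647:
        return "INT"
    return "BIGINT"
```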
## Approach

This PR takes the same approach as #242. As I review each block of tests, I mark it with `@pytest.mark.reviewed`. If I find that the test depends on a method existing in `requirements.py`, I update that file with its specific requirements. If the test cannot pass within reason for Databricks, I add it directly to `test_suite.py` with a skip marker stating the rationale. If it is possible to make a failing test pass, I implement that feature in the dialect. This means that when running `pytest -m reviewed`, every reviewed test is reported as either PASSED or SKIPPED with a reason, like the listings in the commit messages above.
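A hedged sketch of the review-marking pattern in `test_suite.py`; the class name mirrors the tests quoted in the commit messages, but the file's exact contents differ (`reviewed` is a custom marker registered in the project's pytest configuration):

```python
import pytest
from sqlalchemy.testing.suite import HasIndexTest as _HasIndexTest


@pytest.mark.reviewed
@pytest.mark.skip(reason="Databricks does not support indexes.")
class HasIndexTest(_HasIndexTest):
    # Subclassing the suite test lets us attach markers without copying
    # any test logic; pytest applies the skip to every method in the class.
    pass
```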