Releases: KxSystems/pykx
2.5.5
PyKX 2.5.5
Release Date
2024-11-28
Fixes and Improvements
- PyKX Pandas dependency has been raised to allow <=2.2.3 for Python>3.8.
- PyKX Pandas dependency for Python 3.8 has been clamped to <2.0 due to support being dropped for it by Pandas after 2.0.3.
For release notes of all versions see here.
3.0.1
Full up-to-date release notes are available here.
🎉 Feature 🎉
- Addition of the property day to kx.Column objects to allow users to retrieve the day of month of a timestamp.
>>> import pykx as kx
>>> tab = kx.Table(data={
... 'a': kx.random.random(100, kx.TimestampAtom.inf),
... 'b': kx.random.random([100, 3], 10.0)
... })
>>> tab.exec(kx.Column('a').day)
pykx.IntVector(pykx.q('7 10 12..'))
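As a rough plain-Python analogue of what the day property returns, the sketch below extracts the day of month from a list of timestamps using only the standard library. The sample timestamps and the helper name day_of_month are illustrative assumptions and are not part of PyKX.

```python
from datetime import datetime


def day_of_month(timestamps):
    """Return the day-of-month component of each timestamp."""
    return [ts.day for ts in timestamps]


# Hypothetical sample data, chosen to mirror the values shown above.
stamps = [
    datetime(2024, 11, 7, 9, 30),
    datetime(2024, 11, 10, 14, 0),
    datetime(2024, 11, 12, 23, 59),
]
days = day_of_month(stamps)
print(days)  # [7, 10, 12]
```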
🔧 Fixes & Improvements 🔧
- Added warning to kx.q.system.load and context registration when attempting to load a path containing a space. The warning can be suppressed by enabling PYKX_SUPPRESS_WARNINGS.
- Changed %%python heading to %%py when calling Python code during jupyter_qfirst mode so as not to conflict with inbuilt Jupyter cell magics.
- Fixed kx.license.check(format='string') to remove newline characters during comparison.
- Configuration file .pykx-config now supports use of boolean toml configuration
$ cat ~/.pykx-config
[default]
PYKX_QDEBUG = true
$ python
>>> import pykx as kx
>>> kx.config.pykx_qdebug
True
- Reintroduced unsetting/setting of PYKX_SKIP_UNDERQ in PyKXReimport; these had been removed in the 3.0.0 release.
- Added type checking for the cast flag when calling kx.toq() or creating a kx.K variable such as kx.FloatVector() or kx.DatetimeAtom().
- Removed the need to enable PYKX_BETA_FEATURES to use pykx_threading.
- Fixed a memory leak when calling pickle.loads on data previously produced by calling pickle.dumps on a PyKX object.
- Removal of column type from the return of dtypes method for kx.Table objects; previously this had raised a deprecation warning.
>>> tab = kx.q('([] a:1 2 3j;b:4 5 6i)')
>>> tab.dtypes
pykx.Table(pykx.q('
columns datatypes
---------------------
a       "kx.LongAtom"
b       "kx.IntAtom"
'))
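With boolean values now accepted in .pykx-config, a setting such as PYKX_QDEBUG may arrive either as a real boolean (from the TOML file) or as a string (from an environment variable). The sketch below shows one way such values could be normalised. The helper as_flag and the set of accepted strings are assumptions made for illustration; this is not PyKX's actual parsing logic.

```python
def as_flag(value):
    """Normalise a configuration value to a bool.

    Accepts real booleans (e.g. parsed from TOML) and a small set of
    string spellings (e.g. read from an environment variable).
    The accepted strings are an assumption, not PyKX behaviour.
    """
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        return value.strip().lower() in {"1", "true"}
    raise TypeError(f"unsupported flag value: {value!r}")


print(as_flag(True))     # True
print(as_flag("True"))   # True
print(as_flag("false"))  # False
```

Rejecting any other type with a TypeError, rather than silently coercing it, is a deliberate choice in this sketch.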
3.0.0
Full up-to-date release notes are available here.
🎉 Major Features/Changes 🎉
- Addition of functionality to allow for development of end-to-end streaming workflows consisting of data-ingestion, persistence and query. This functionality is outlined in-depth here.
- Update to the PyKX Query API to support a significantly more Python-first approach to querying kdb+ in-memory and on-disk databases.
>>> table = kx.Table(data={
... 'sym': kx.random.random(100, ['AAPL', 'GOOG', 'MSFT']),
... 'date': kx.random.random(100, kx.q('2022.01.01') + [0,1,2]),
... 'price': kx.random.random(100, 1000.0),
... 'size': kx.random.random(100, 100)
... })
>>> table.select(columns=kx.Column('price').max(), where=kx.Column('size') > 5)
>>> table.update(column=kx.Column('price').wavg(kx.Column('size')).rename('vwap'), by=kx.Column('sym'))
>>> table.delete(column=kx.Column('sym'))
>>> table.update(column=(kx.Column('price') * kx.Column('size')).rename('total'))
- Beta features available in the 2.* versions of PyKX have now been migrated to full support. The full list of these features is as follows:
- Database Creation and Management
- Compression and Encryption Module
- Remote Function Execution
- Streamlit Integration
- Multi-threaded use of PyKX
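To make the grouped wavg update in the Query API example above concrete, here is a plain-Python sketch of a size-weighted average price per symbol. It assumes, as the vwap name suggests, that price is weighted by size. The sample rows and the function name vwap_by_sym are made up for illustration, and the sketch does not use PyKX.

```python
from collections import defaultdict


def vwap_by_sym(rows):
    """Size-weighted average price per symbol: sum(price*size) / sum(size)."""
    notional = defaultdict(float)
    volume = defaultdict(float)
    for sym, price, size in rows:
        notional[sym] += price * size
        volume[sym] += size
    return {sym: notional[sym] / volume[sym] for sym in notional}


# Made-up sample rows: (sym, price, size)
rows = [
    ("AAPL", 100.0, 10),
    ("AAPL", 110.0, 30),
    ("MSFT", 200.0, 5),
]
vwap = vwap_by_sym(rows)
print(vwap)  # {'AAPL': 107.5, 'MSFT': 200.0}
```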
❓ What else? ❓
- Extension to our integration with Jupyter Notebooks by adding a q-first mode of operation which allows users working between the two languages to more easily automate workflows depending on both
- Addition of a new utility function kx.util.detect_bad_columns to validate whether the columns of a table object conform to the naming conventions supported by kdb+ and to highlight whether the table contains duplicate column names. It raises a warning indicating potential issues and returns True if the table contains invalid columns.
- When generating IPC connections with reconnection_attempts, users can now configure the initial delay between the first and second attempts, and the function which updates the delay on successive attempts, using the reconnection_delay and reconnection_function keywords. See here for a worked example.
- Two new options added on first initialisation of PyKX to allow users to:
- Use the path to their already downloaded kc.lic/k4.lic licenses without going through the “Do you want to install a license” workflow
- Persist for future use their choice to use the IPC-only unlicensed mode of PyKX; this writes a file ~/.pykx-config containing configuration denoting that unlicensed mode is to be used.
- Addition of function kx.util.install_q to allow users who do not have access to a q executable at the time of installing PyKX to install q. See here for instructions regarding its use.
- Addition of function kx.util.start_q_subprocess to allow a q process to be started on a specified port with supplied initialisation arguments.
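The kind of check that kx.util.detect_bad_columns is described as performing can be pictured with the plain-Python sketch below. The naming rule (a letter followed by letters, digits or underscores), the choice to return True for duplicates as well as invalid names, and the helper name has_bad_columns are all assumptions for illustration; the real function and the real kdb+ rules may differ.

```python
import re
import warnings

# Assumption: a valid column name starts with a letter and contains only
# letters, digits and underscores. The real kdb+ rules may differ.
_VALID_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def has_bad_columns(columns):
    """Return True (and warn) if any column name is invalid or duplicated."""
    invalid = [c for c in columns if not _VALID_NAME.match(c)]
    seen, duplicates = set(), []
    for c in columns:
        if c in seen and c not in duplicates:
            duplicates.append(c)
        seen.add(c)
    if invalid or duplicates:
        warnings.warn(
            f"Invalid columns: {invalid}; duplicate columns: {duplicates}"
        )
        return True
    return False


print(has_bad_columns(["sym", "price", "size"]))  # False
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # keep the demo output quiet
    print(has_bad_columns(["sym", "my col", "sym"]))  # True
```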
🔧 Fixes & Improvements 🔧
As with any release, PyKX 3.0 provides a significant number of bug fixes and improvements. The following are a subset:
- Addition of support for help when interacting with q keywords and operators via PyKX
- Previously loading pykx.q during q startup using QINIT or QHOME/q.q resulted in a segfault or a corruption.
- The function kx.util.debug_environment now returns the applied configuration values at startup instead of customised values
- Operations on kx.GroupbyTable objects which have been indexed previously would raise an error indicating invalid key access
- Attempts to load a database using the kx.DB module previously would raise an nyi error if the path to the database contained a space
2.5.2
PyKX 2.5.2 has been released 🎉 Full release notes for client consumption can be found here.
Highlights:
- Converting PyKX generic lists using the keyword parameter raw=True would previously return incorrect results: the values received were the memory addresses of the individual elements of the list. This has now been resolved:
>>> a = kx.q('(1; 3.4f; `asad; "asd")')
>>> a.np(raw=True)
array([1, 3.4, b'asad', b'asd'], dtype=object)
- Fix to issue where use of kx.SymbolAtom with the getitem method on kx.Table objects would return a table rather than a vector/list. The return now matches that of str type inputs:
>>> import pykx as kx
>>> tab = kx.Table(data={'x': [1, 2, 3], 'y': ['a', 'b', 'c']})
>>> tab['x']
pykx.LongVector(pykx.q('1 2 3'))
>>> tab[kx.SymbolAtom('x')]
pykx.LongVector(pykx.q('1 2 3'))
- Fix to issue where loading PyKX on Windows from 2.5.0 could result in a user's working directory being changed to site-packages/pykx
The full list including more fixes and improvements is available here.
2.5.1
PyKX 2.5.1 has been released 🎉 Full release notes for consumption can be found here.
Highlights:
- Pandas API additions: isnull, isna, notnull, notna, idxmax, idxmin, kurt, sem.
- Addition of filter_type, filter_columns, and custom parameters to QReader.csv() to add options for CSV type guessing.
>>> import pykx as kx
>>> reader = kx.QReader(kx.q)
>>> kx.q.read.csv("myFile0.csv", filter_type = "like", filter_columns="*name", custom={"SYMMAXGR":15})
pykx.Table(pykx.q('
firstname lastname
----------------------
"Frieda" "Bollay"
"Katuscha" "Paton"
"Devina" "Reinke"
"Maurene" "Bow"
"Iseabal" "Bashemeth"
..
'))
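Judging by the output above, filter_type="like" with filter_columns="*name" keeps only the columns whose names match the wildcard pattern. The sketch below imitates that selection with the standard library's fnmatch module. It is an approximation: q's like operator and fnmatch share the * and ? wildcards but are not identical, and the header list and helper name are hypothetical.

```python
from fnmatch import fnmatchcase


def filter_columns(columns, pattern):
    """Keep the column names matching a wildcard pattern such as '*name'."""
    return [c for c in columns if fnmatchcase(c, pattern)]


# Hypothetical CSV header
header = ["id", "firstname", "lastname", "email"]
selected = filter_columns(header, "*name")
print(selected)  # ['firstname', 'lastname']
```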
Other items of note:
- Fix to regression in PyKX 2.5.0 where PyKX initialisation on Windows would result in a segmentation fault when using a k4.lic license type.
- Previously users could not make direct use of kx.SymbolicFunction type objects against a remote process; this has been rectified.
- Previously use of the context interface for q primitive functions in licensed mode via IPC would partially run the function on the client rather than server, thus limiting usage for named entities on the server.
- With the release of PyKX 2.5.0 and support of PyKX usage in paths containing spaces the context interface functionality could fail to load a requested context over IPC if PyKX was not loaded on the server.
The full list including more fixes and improvements is available here.
2.5.0
PyKX 2.5.0 has been released. Full release notes can be found here.
Highlights:
- table.xbar, table.window_join, table.replace
- Added as_arrow keyword to the .pd() method to use PyArrow-backed data types rather than NumPy.
Other items of note:
- PyKX can now be installed to locations with spaces in the file path.
- Updated libq 4.0 to 2024.05.07 and 4.1 to 2024.04.29 for all supported OSs.
- IPC queries can now pass PyKX function-like objects as the first query parameter.
- k4.lic licenses can now be installed using the interactive license helper.
- To ease license updates, if PyKX fails to start due to a license error it will attempt to replace its license from KDB_LICENSE_B64 or KDB_K4LICENSE_B64 if you have one set.
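The license fallback described above can be pictured as decoding a base64 environment variable and writing the result to a license file. The sketch below is an illustration under stated assumptions: the mapping of KDB_LICENSE_B64 to kc.lic and KDB_K4LICENSE_B64 to k4.lic, the helper name restore_license, and the target directory handling are not taken from the PyKX source.

```python
import base64
import os
import tempfile
from pathlib import Path

# Assumption: KDB_LICENSE_B64 holds a kc.lic and KDB_K4LICENSE_B64 a k4.lic.
_SOURCES = (("KDB_LICENSE_B64", "kc.lic"), ("KDB_K4LICENSE_B64", "k4.lic"))


def restore_license(license_dir, env=None):
    """Write the first base64 license found in the environment to disk.

    Returns the path written, or None when neither variable is set.
    """
    env = os.environ if env is None else env
    for variable, filename in _SOURCES:
        encoded = env.get(variable)
        if encoded:
            target = Path(license_dir) / filename
            target.write_bytes(base64.b64decode(encoded))
            return target
    return None


# Demo with a fake environment and a throwaway directory.
fake_env = {"KDB_K4LICENSE_B64": base64.b64encode(b"not a real license").decode()}
with tempfile.TemporaryDirectory() as tmp:
    written = restore_license(tmp, fake_env)
    print(written.name)  # k4.lic
```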
The full list including more fixes and improvements is available here.
Examples:
table.window_join
>>> trades = kx.Table(data={
... 'sym': ['ibm', 'ibm', 'ibm'],
... 'time': kx.q('10:01:01 10:01:04 10:01:08'),
... 'price': [100, 101, 105]})
>>> quotes = kx.Table(data={
... 'sym': 'ibm',
... 'time': kx.q('10:01:01+til 9'),
... 'ask': [101, 103, 103, 104, 104, 107, 108, 107, 108],
... 'bid': [98, 99, 102, 103, 103, 104, 106, 106, 107]})
>>> windows = kx.q('{-2 1+\:x}', trades['time'])
>>> trades.window_join(quotes,
... windows,
... ['sym', 'time'],
... {'ask_minus_bid': [lambda x, y: x - y, 'ask', 'bid'],
... 'ask_max': [lambda x: max(x), 'ask']})
pykx.Table(pykx.q('
sym time     price ask_minus_bid ask_max
----------------------------------------
ibm 10:01:01 100   3 4           103
ibm 10:01:04 101   4 1 1 1       104
ibm 10:01:08 105   3 2 1 1       108
'))
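To show where the numbers in the window_join output come from, the sketch below re-creates the join in plain Python: for each trade it gathers the quotes for the same sym within 2 seconds before and 1 second after the trade time, then applies each aggregation. It is a simplified illustration based on the data in the example above, with times expressed as seconds after 10:01:00. q's window joins have additional rules about prevailing values that are not modelled here, and the function window_join below is hypothetical, not the PyKX method.

```python
def window_join(trades, quotes, before, after, aggs):
    """For each trade, aggregate the quotes for the same sym whose time
    falls inside [time - before, time + after] (inclusive)."""
    joined = []
    for trade in trades:
        lo, hi = trade["time"] - before, trade["time"] + after
        window = [
            q for q in quotes
            if q["sym"] == trade["sym"] and lo <= q["time"] <= hi
        ]
        row = dict(trade)
        for name, func in aggs.items():
            row[name] = func(window)
        joined.append(row)
    return joined


# Times are seconds after 10:01:00.
trades = [
    {"sym": "ibm", "time": 1, "price": 100},
    {"sym": "ibm", "time": 4, "price": 101},
    {"sym": "ibm", "time": 8, "price": 105},
]
asks = [101, 103, 103, 104, 104, 107, 108, 107, 108]
bids = [98, 99, 102, 103, 103, 104, 106, 106, 107]
quotes = [
    {"sym": "ibm", "time": 1 + i, "ask": a, "bid": b}
    for i, (a, b) in enumerate(zip(asks, bids))
]
result = window_join(
    trades, quotes, before=2, after=1,
    aggs={
        "ask_minus_bid": lambda w: [q["ask"] - q["bid"] for q in w],
        "ask_max": lambda w: max(q["ask"] for q in w),
    },
)
for row in result:
    print(row["time"], row["ask_minus_bid"], row["ask_max"])
# 1 [3, 4] 103
# 4 [4, 1, 1, 1] 104
# 8 [3, 2, 1, 1] 108
```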
table.xbar
>>> kx.random.seed(42)
>>> N = 5
>>> tab = kx.Table(data = {
... 'x': kx.random.random(N, 100.0),
... 'y': kx.random.random(N, 10.0)})
>>> tab
pykx.Table(pykx.q('
x y
-----------------
77.42128 8.200469
70.49724 9.857311
52.12126 4.629496
99.96985 8.518719
1.196618 9.572477
'))
>>> tab.xbar('x', 10)
pykx.Table(pykx.q('
x y
-----------
70 8.200469
70 9.857311
50 4.629496
90 8.518719
0 9.572477
'))
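The xbar result above amounts to rounding each value down to the nearest multiple of the bar width. A minimal plain-Python sketch of that arithmetic, using the x values from the example (the helper name xbar here is a stand-in, not the PyKX method):

```python
import math


def xbar(values, width):
    """Round each value down to the nearest multiple of width."""
    return [width * math.floor(v / width) for v in values]


x = [77.42128, 70.49724, 52.12126, 99.96985, 1.196618]
print(xbar(x, 10))  # [70, 70, 50, 90, 0]
```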
table.replace
>>> tab = kx.q('([] a:2 2 3; b:4 2 6; c:(1b;0b;1b); d:(`a;`b;`c); e:(1;2;`a))')
>>> tab
pykx.Table(pykx.q('
a b c d e
----------
2 4 1 a 1
2 2 0 b 2
3 6 1 c `a
'))
>>> tab.replace(2, "test")
pykx.Table(pykx.q('
a     b     c d e
---------------------
`test 4     1 a 1
`test `test 0 b `test
3     6     1 c `a
'))
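The effect of replace in the example above can be mimicked on a column-oriented dictionary in plain Python. This is a sketch of the observable behaviour only, under the assumption that boolean cells are never treated as matching a numeric value; the function replace below is hypothetical and does not use PyKX.

```python
def replace(table, old, new):
    """Return a copy of a column-oriented table with every cell equal to
    `old` swapped for `new`. Booleans are left alone so that True/False
    are never confused with 1/0."""
    def swap(cell):
        if isinstance(cell, bool):
            return cell
        return new if cell == old else cell
    return {name: [swap(c) for c in col] for name, col in table.items()}


tab = {
    "a": [2, 2, 3],
    "b": [4, 2, 6],
    "c": [True, False, True],
    "d": ["a", "b", "c"],
    "e": [1, 2, "a"],
}
out = replace(tab, 2, "test")
print(out["a"])  # ['test', 'test', 3]
print(out["b"])  # [4, 'test', 6]
print(out["e"])  # [1, 'test', 'a']
```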
2.4.0
Full details on the release can be found here.
Additions:
- Support for q/kdb+ 4.1 (documentation here) added as an opt-in capability. This functionality is enabled by setting the PYKX_4_1_ENABLED environment variable.
>>> import os
>>> os.environ['PYKX_4_1_ENABLED'] = 'True'
>>> import pykx as kx
>>> kx.q.z.K
pykx.FloatAtom(pykx.q('4.1'))
- Added support for Python 3.12.
- Support for PyArrow in this python version is currently in Beta.
- Added conversion of NumPy arrays of type datetime64[s], datetime64[ms], datetime64[us] to kx.TimestampVector
- Added Table.sort_values(), Table.nsmallest() and Table.nlargest() to the Pandas like API for sorting tables.
- Table.rename() now supports non-numerical index columns and improved the quality of errors thrown.
- Added the reconnection_attempts keyword argument to the SyncQConnection, SecureQConnection, and AsyncQConnection IPC classes. This argument allows an IPC connection to be automatically re-established when it is lost and the server has reinitialized.
>>> import pykx as kx
>>> conn = kx.SyncQConnection(port = 5050, reconnection_attempts=4)
>>> conn('1+1') # Following this call the server on port 5050 was closed for 2 seconds
pykx.LongAtom(pykx.q('2'))
>>> conn('1+2')
WARNING: Connection lost attempting to reconnect.
Failed to reconnect, trying again in 0.5 seconds.
Failed to reconnect, trying again in 1.0 seconds.
Connection successfully reestablished.
pykx.LongAtom(pykx.q('3'))
- Added --reconnection_attempts option to Jupyter %%q magic making use of the above IPC logic changes.
- Addition of environment variable/configuration value PYKX_QDEBUG which allows debugging backtrace to be displayed for all calls into q instead of requiring a user to specify debugging is enabled per-call. This additionally works for remote IPC calls and utilisation of Jupyter magic commands.
>>> import os
>>> os.environ['PYKX_QDEBUG'] = 'True'
>>> import pykx as kx
>>> kx.q('{x+1}', 'e')
backtrace:
[2] {x+1}
^
[1] (.Q.trp)
[0] {[pykxquery] .Q.trp[value; pykxquery; {if[y~();:(::)];2@"backtrace:
^
",.Q.sbt y;'x}]}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/anaconda3/lib/python3.8/site-packages/pykx/embedded_q.py", line 230, in __call__
return factory(result, False)
File "pykx/_wrappers.pyx", line 493, in pykx._wrappers._factory
File "pykx/_wrappers.pyx", line 486, in pykx._wrappers.factory
pykx.exceptions.QError: type
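The reconnection_attempts behaviour described above, with delays of 0.5 and then 1.0 seconds in the sample output, can be pictured as a retry loop with a growing delay. The sketch below is a generic illustration and not PyKX's implementation: the function call_with_reconnect, its parameters, and the default of doubling the delay are assumptions inferred from that output.

```python
import time


def call_with_reconnect(connect, attempts, delay=0.5,
                        next_delay=lambda d: d * 2, sleep=time.sleep):
    """Try `connect` up to `attempts` times, sleeping between failures.

    `delay` is the wait after the first failure; `next_delay` maps the
    current delay to the following one.
    """
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the last error
            sleep(delay)
            delay = next_delay(delay)


# Hypothetical server that refuses the first two connections.
state = {"calls": 0}


def flaky_connect():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("server unavailable")
    return "connected"


waits = []  # record the delays instead of actually sleeping
result = call_with_reconnect(flaky_connect, attempts=4, sleep=waits.append)
print(result, waits)  # connected [0.5, 1.0]
```

Passing the sleep function in as a parameter keeps the example fast and makes the sequence of delays easy to inspect.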
Fixes and Improvements:
- Resolved segfaults on Windows when PyKX calls Python functions under q.
>>> import pykx as kx
>>> kx.q('{[f;x] f x}', sum, kx.q('4 4#til 16'))
pykx.LongVector(pykx.q('24 28 32 36'))
- Updated kdb Insights Core libraries to 4.0.8, see here for more information.
- Updated libq 4.0 version to 2024.03.04 for all supported OS’s.
- Fix issue where use of valid C backed q code APIs could result in segmentation faults when called.
>>> import pykx as kx
>>> isf = kx.q('.pykx.util.isf')
>>> isf
pykx.Foreign(pykx.q('code'))
>>> isf(True)
pykx.BooleanAtom(pykx.q('0b'))
- Each call to the PyKX query API interned 3 new unique symbols. This has now been removed.
Beta Features
- Addition of Compress and Encrypt classes to allow users to set global configuration and for usage within Database partition persistence.
Standalone
>>> import pykx as kx
>>> compress = kx.Compress(algo=kx.CompressionAlgorithm.gzip, level=8)
>>> kx.q.z.zd
pykx.Identity(pykx.q('::'))
>>> compress.global_init()
pykx.LongVector(pykx.q('17 2 8'))
>>> encrypt = kx.Encrypt(path='/path/to/the.key', password='PassWord')
>>> encrypt.load_key()
Database
>>> import pykx as kx
>>> compress = kx.Compress(algo=kx.CompressionAlgorithm.lz4hc, level=10)
>>> db = kx.DB(path='/tmp/db')
>>> db.create(kx.q('([]10?1f;10?1f)'), 'tab', kx.q('2020.03m'), compress=compress)
>>> kx.q('-21!`:/tmp/db/2020.03/tab/x')
pykx.Dictionary(pykx.q('
compressedLength | 140
uncompressedLength| 96
algorithm | 4i
logicalBlockSize | 17i
zipLevel | 10i
'))
2.3.2
Merge pull request #21 from KxSystems/pykx-2.3.2 PyKX 2.3.2 release update
2.3.1
2.2.0
Release 2.2