Releases: hughperkins/coriander
v6.0.0
Changes:
- installs to ~/coriander now
  - this allows plugins to install without needing sudo
- installs using python2.7 script install_distro.py now, which:
  - should be more portable
  - handles installing the coriander-dnn plugin
  - handles downloading llvm-4.0
- plugin architecture created
- NVIDIA® CUDA™ cuDNN API partial implementation factorized into a plugin coriander-dnn
- cocl script migrated to cocl_py, which is reasonably cross-platform, can run on Windows
Under-the-hood:
- main jenkins script migrated to python, so easy-ish to run on Windows too
v5.1.2
Bug fixes:
- fix shims re-ordering, which caused runtime errors occasionally
- fix test failures in test_floatstarstar.cu, caused by virtual memory relocation
v5.1.1
Changes:
- added support for passing arrays of gpu pointers in by-value structs, ie something like:
struct MyStruct {
float *buffers[8];
};
__global__ void mykernel(struct MyStruct mystruct) {
...
}
- fixed some compile bugs for Eigen tests on Ubuntu 16.04
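As an illustration of the by-value struct case above, here is a minimal, self-contained sketch in standard CUDA. Only MyStruct, buffers and mykernel come from the release note; the kernel body, the constants NUM_BUFFERS and N, and the host code are assumptions made up for this example, and it has not been checked against Coriander itself.

```cuda
#include <cstdio>
// Note: nvcc includes cuda_runtime.h implicitly for .cu files.

#define NUM_BUFFERS 8  // matches buffers[8] in the release note
#define N 4            // elements per buffer (hypothetical, for illustration)

// A struct holding an array of GPU pointers; passed to the kernel by value.
struct MyStruct {
    float *buffers[NUM_BUFFERS];
};

// Hypothetical kernel body: each thread writes one element of every buffer.
__global__ void mykernel(struct MyStruct mystruct) {
    int tid = threadIdx.x;
    for (int b = 0; b < NUM_BUFFERS; b++) {
        mystruct.buffers[b][tid] = b * 10.0f + tid;
    }
}

int main() {
    struct MyStruct mystruct;

    // Allocate each GPU buffer and store its pointer in the host-side struct.
    for (int b = 0; b < NUM_BUFFERS; b++) {
        cudaMalloc((void **)&mystruct.buffers[b], N * sizeof(float));
    }

    // The struct, including all eight GPU pointers, is copied in by value.
    mykernel<<<1, N>>>(mystruct);

    // Copy the last buffer back to the host and print it.
    float host[N];
    cudaMemcpy(host, mystruct.buffers[NUM_BUFFERS - 1], N * sizeof(float),
               cudaMemcpyDeviceToHost);
    for (int i = 0; i < N; i++) {
        printf("buffers[%d][%d] = %g\n", NUM_BUFFERS - 1, i, host[i]);
    }

    for (int b = 0; b < NUM_BUFFERS; b++) {
        cudaFree(mystruct.buffers[b]);
    }
    return 0;
}
```

The point of the sketch is that the pointers live inside the struct itself rather than being passed as separate kernel arguments, which is the pattern this release adds support for.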
v5.0.0
Changes:
- cocl_add_executable and cocl_add_library cmake macros created
- many atomics work now
- compiles/runs on Mac Sierra, using Radeon HD450
- upgraded to use LLVM4.0, under the hood
- simplify handling/switching between 32-bit/64-bit pointer offsets
- added a bunch of maths operations (sincosf, ...)
- revamped how gpu buffers are passed into kernels, so that each buffer only passed in once
- missing function implementations now cause an obvious failure during compile, rather than weird unknown runtime issues
- added CL_GPUOFFSET=1, to choose the second gpu
- fixed some bugs involving math, so that tf.random_uniform and tf.random_normal, in tensorflow, work correctly now
v4.0.4
Fixes several Eigen tests (https://bitbucket.org/hughperkins/eigen/src/eigen-cl/unsupported/test/cuda-on-cl/?at=eigen-cl):
- argmax passes now
- cuda_nullary re-passes now (it briefly didn't, in the short-lived 4.0.1)
- reduction_tiny re-passes now (it briefly didn't, in the short-lived 4.0.1)
v4.0.0
- radical refactorization under-the-hood
- allocas work ok now
- calling functions returning pointers, possibly global, possibly not, sometimes global, sometimes not (even for exact same function name), all ok now
- address-space propagation in general: to functions, through functions, to/through phis, to/through allocas, all significantly improved/existent, compared to before
- opencl generation is at runtime now, which gives two things:
  - it's actually faster at runtime, counter-intuitively, since we only need to feed a fraction of the OpenCL to the gpu driver, just the small amount of opencl we actually need, rather than the entire program each time :-P
  - massively improves the ability to determine the address-space of pointer variables/functions/etc
v2.0.1
Fix:
- hostside_opencl_funcs_assure_initialized was being inlined, not added to libcocl.a
v2.0.0
(since following semver, and this changes public api, so bumping major version)
Changes:
- if you just want to ensure the cl context is initialized (which is entirely optional, but useful for testing), the public method now is hostside_opencl_funcs_assure_initialized, rather than just assure_initialized
...
v1.0.0
1.2