Example notebooks

These example notebooks showcase cuxfilter with cuDF. If you want to distribute your workflow across multiple GPUs, have more data than fits in memory on a single GPU, or want to visualize data spread across many files at once, use Dask-cuDF with cuxfilter. The example notebooks can be found here.
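As a minimal sketch of the choice described above, the helper below builds the same dashboard from either a cuDF or a Dask-cuDF DataFrame. The function name `build_dashboard`, the Parquet path, and the column name are hypothetical placeholders, not part of the notebooks; the imports sit inside the function so the definition itself does not require a GPU environment.

```python
def build_dashboard(parquet_path, column, use_dask=False):
    """Build a one-chart cuxfilter dashboard (illustrative sketch only).

    parquet_path and column are placeholders for your own data.
    """
    import cuxfilter

    if use_dask:
        # Dask-cuDF: partitions the data, useful for multiple GPUs,
        # larger-than-memory data, or many files matched by one path.
        import dask_cudf
        gdf = dask_cudf.read_parquet(parquet_path)
    else:
        # cuDF: the whole dataset lives in a single GPU's memory.
        import cudf
        gdf = cudf.read_parquet(parquet_path)

    # cuxfilter wraps either DataFrame type the same way.
    cux_df = cuxfilter.DataFrame.from_dataframe(gdf)
    chart = cuxfilter.charts.bar(column)
    return cux_df.dashboard([chart])


# Example call (hypothetical path and column name):
# dashboard = build_dashboard("data/*.parquet", "some_column", use_dask=True)
```

Only the loading step differs between the two branches; the charts and dashboard code stay the same.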

TRY CUXFILTER NOTEBOOKS ONLINE

  1. Mortgage_example.ipynb

    Open In Studio Lab Open In Colab

  2. NYC_taxi_example.ipynb

    Open In Studio Lab Open In Colab

  3. auto_accidents_example.ipynb

    Open In Studio Lab Open In Colab

  4. graphs.ipynb

    Open In Studio Lab Open In Colab


Set Up Remote Environments

Amazon SageMaker Studio Lab

Amazon SageMaker Studio Lab is a free ML development environment that provides compute, storage (up to 15 GB), and security, all at no cost (currently). This includes GPU notebook instances.

Once you have registered with your email address, sign in to your account, start a CPU or GPU runtime, and open your project, all in your browser.

To set up a RAPIDS environment in Studio Lab (you only need to do this the first time, since Studio Lab has 15 GB of persistent storage across sessions), open a new terminal and run the following:

conda install ipykernel

Then install cuxfilter and its dependencies by following the instructions in "Installation" in the project's main README.

Once installed, you should see a card in the launcher for that environment and kernel.

Note: It may take about one minute for the new environment to appear as a kernel option.

Google Colab

Google Colab, or "Colaboratory", allows you to write and execute Python in your browser, with:

  • Zero configuration required
  • Free access to GPUs
  • Easy sharing

To launch cuxfilter notebooks in the Colab environment, follow the RAPIDS installation instructions guide by clicking Open In Colab. Once the RAPIDS libraries are installed, you can run the cuxfilter notebooks.

Note: Unlike Studio Lab, Colab environment storage is not persistent, so each notebook needs a separate RAPIDS installation every time you start a new session.

Copy the installation notebook cells to the top of the cuxfilter notebooks and install RAPIDS before executing the cuxfilter code.
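Because the installation has to be repeated in every new session, it can help to confirm that it succeeded before running any cuxfilter cells. The check below is a small sketch using only the Python standard library; the helper name `missing_packages` is hypothetical, and it only tests that the packages can be found, not that a GPU is attached.

```python
import importlib.util


def missing_packages(names=("cudf", "cuxfilter")):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Run the RAPIDS installation cells first; missing:", ", ".join(missing))
else:
    print("RAPIDS packages found; the cuxfilter cells can be run.")
```

Placing this cell directly after the installation cells gives an early, readable message instead of an import error further down the notebook.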


Download Datasets

Note: The Auto Accidents dataset has corrupted coordinate data for the years 2012-2014.