Recognizing handwritten numbers is a piece of cake for humans, but it's a non-trivial task for machines. Nowadays, with the advancement of machine learning, people have made machines more and more capable of performing this task. We now have mobile banking apps that can scan checks in seconds and accounting software that can extract dollar amounts from thousands of contracts in minutes. If you are interested in knowing how this all works, please follow along with this code pattern as we take you through the steps to create a simple handwritten digit recognizer in Watson Studio with PyTorch.
Watson Studio is an integrated environment for data scientists, developers and domain experts to collaboratively work with data to build, train and deploy models at scale. If you are new to Watson Studio, the best way to understand it is to see it in action.
PyTorch is a relatively new deep learning framework, yet it has begun to gain adoption, especially among researchers and data scientists. The strength of PyTorch is its support for dynamic computational graphs, whereas most other deep learning frameworks are based on static computational graphs. In addition, its NumPy-like, GPU-accelerated tensor computation lets Python developers easily learn and build deep learning networks for GPUs and CPUs alike.
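To make the idea of a dynamic computational graph concrete, here is a minimal sketch, assuming PyTorch is installed. The function name `scaled_sum` and the threshold of 10 are illustrative and not part of this code pattern. Because the graph is recorded while the Python code runs, ordinary control flow such as a data-dependent `while` loop can decide the shape of the computation, and autograd still tracks it.

```python
import torch


def scaled_sum(x: torch.Tensor) -> torch.Tensor:
    """Double x until its norm reaches 10, then sum the elements.

    The number of loop iterations depends on the values in x, so the
    graph recorded by autograd can differ from one call to the next.
    Assumes x is non-zero; an all-zero input would never leave the loop.
    """
    y = x
    while y.norm() < 10:
        y = y * 2
    return y.sum()


x = torch.ones(3, requires_grad=True)
out = scaled_sum(x)
out.backward()  # gradients flow back through however many doublings ran

# Three doublings are needed here, so out is 24 and each gradient is 8.
print(out.item(), x.grad)
```

A different starting tensor would run a different number of iterations and therefore produce a different gradient, without any change to the code.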
In this code pattern, you will use a Jupyter Notebook in Watson Studio and access preinstalled and optimized PyTorch environments through the Python client library of the Watson Machine Learning service. At its core, the service has a set of REST APIs that allow users to submit training jobs, monitor status, and store and deploy models.
When you have completed this code pattern, you will understand how to:
- Create a project in Watson Studio and use Jupyter Notebooks in the project.
- Use the Python client of Cloud Object Storage to create buckets and upload data to buckets.
- Submit PyTorch training jobs to Watson Machine Learning service.
- Use the trained PyTorch model to predict handwritten digits from images.
- Log in to IBM Watson Studio
- Run the Jupyter notebook in Watson Studio
- Use PyTorch to download and process the data
- Use Watson Machine Learning to train and deploy the model
- Watson Machine Learning: Make smarter decisions, solve tough problems, and improve user outcomes.
- Watson Studio: IBM's integrated hybrid environment that provides flexible data science tools to build and train AI models and prepare and analyze data.
- Jupyter Notebooks: An open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.
- Cloud Object Storage: Provides flexible, cost-effective, and scalable cloud storage for unstructured data.
- Artificial Intelligence: Artificial intelligence can be applied to disparate solution spaces to deliver disruptive technologies.
- Python: Python is a programming language that lets you work more quickly and integrate your systems more effectively.
- PyTorch: PyTorch is an open source deep learning platform that provides a seamless path from research prototyping to production deployment.
- Sign up for Watson Studio
- Create a new project
- Create the notebook
- Create a Watson Machine Learning Service instance
- Create HMAC credentials for the Cloud Object Storage instance
- Run the notebook
- See the results
Sign up for IBM's Watson Studio. When you create a project in Watson Studio, a free-tier Object Storage service is created in your IBM Cloud account. Take note of your service names, as you will need to select them in the following steps.
Note: When creating your Object Storage service, select the Free storage type in order to avoid having to pay an upgrade fee.
From the Watson Studio home page, click on the Navigation Menu (☰) icon on the top left, expand the Project option, then click on the View all projects tab. Once you land on the My projects page, click on the New project button and then select the Create an empty project option.
- To create a project in Watson Studio, give the project a name and select an existing Cloud Object Storage from your IBM Cloud account.
- Upon successful project creation, you are taken to a dashboard view of your project. Take note of the Assets and Settings tabs; we'll be using them to associate our project with any external assets (such as notebooks) and any IBM Cloud services.
From the project dashboard view, select the Add to project tab and click on the Notebook button. Use the From URL tab to create our notebook.
- Give your notebook a name and select your desired runtime. In this case, select the Default Python 3.6 Free option.
- For URL, enter the following URL for the notebook stored in our GitHub repository:
https://raw.githubusercontent.com/IBM/pytorch-on-watson-studio/master/notebooks/use-pytorch-to-predict-handwritten-digits.ipynb
- Press the Create Notebook button.
If you have an existing running instance of the Watson Machine Learning (WML) service, you can go to the IBM Cloud Resources page and click on the desired WML service to access the service details.
If you do not already have a running instance of the WML service, follow these steps to create one.
- From the IBM Cloud Catalog, under the AI category, select Machine Learning.
- Select the Lite plan, enter a service name in the field located at the bottom of the page, then press Create.
Once the service instance is created or you have landed in the service instance page of your choice, navigate to Service credentials, view credentials and make note of them. If you don't see any credentials available, create a New credential.
If you get the error "You do not have the required permission to assign role 'Writer'. Contact the account owner to update your access.", give yourself Writer access as follows:
- Use the IBM Cloud menu (☰) and select Security.
- Click on Manage.
- Click on Identity and Access.
- Use the three dots icon to assign access to yourself.
- Click on Assign access to resources.
- Use the Services pulldown to select All Identity and Access enabled services.
- Use the checkbox to enable Writer.
- Hit Assign.
- Go back and try to create your Watson ML credentials again.
- In the notebook available with this pattern, there is a cell which requires you to enter your WML credentials. Copy and paste these credentials into that notebook cell.
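Before pasting the credentials into the notebook cell, it can help to check that nothing was lost while copying. The following is a minimal sketch using only the standard library. The helper name `load_wml_credentials` and the list of expected keys (`apikey`, `instance_id`, `url`) are assumptions for illustration; compare them with what your Service credentials page actually shows.

```python
import json

# Keys assumed to be present in the copied WML credentials. This list is
# an illustration; adjust it to match your own Service credentials page.
ASSUMED_WML_KEYS = ("apikey", "instance_id", "url")


def load_wml_credentials(raw_json: str) -> dict:
    """Parse pasted credentials JSON and fail early if a key is missing."""
    creds = json.loads(raw_json)
    missing = [key for key in ASSUMED_WML_KEYS if not creds.get(key)]
    if missing:
        raise ValueError("WML credentials are missing: " + ", ".join(missing))
    return creds


# Placeholder values only; paste your own credentials instead.
sample = '{"apikey": "xxx", "instance_id": "yyy", "url": "https://example.invalid"}'
wml_credentials = load_wml_credentials(sample)
print(sorted(wml_credentials))
```

The resulting dictionary has the shape that the notebook cell expects you to fill in with your own values.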
Execute the following steps to associate a WML service to your project:
- Go to the My projects page and click on your project.
- Click on the Settings tab.
- Click on the Add service button located in the Associated services section and then select Watson.
- Click the Add button located in the Machine Learning tile.
- Select a WML service from the drop-down menu to associate it with your project.
To run the notebook available with this pattern, you must create a Keyed-Hashing for Message Authentication (HMAC) set of credentials for your Cloud Object Storage instance.
- From the IBM Cloud Resources page, click on the Cloud Object Storage instance that you assigned to your Watson Studio project. Then click the Service credentials tab.
- Click on New Credential to initiate creating a new set of credentials. Enter a name, then expand Advanced options to turn on the Include HMAC Credential option. Press Add to create the credentials.
- Once the credentials are created, you should see a set of cos_hmac_keys values.
- In the notebook available with this pattern, there is a cell which requires you to enter your Cloud Object Storage credentials. Copy and paste these credentials into that notebook cell.
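As a rough illustration of how the HMAC values are used, the sketch below pulls the key pair out of the credentials JSON and shows one way they could be handed to the Cloud Object Storage Python client. It assumes the credentials contain a `cos_hmac_keys` object with `access_key_id` and `secret_access_key` fields and that the `ibm-cos-sdk` package (imported as `ibm_boto3`) is installed. The helper names and the endpoint URL are hypothetical, and the notebook may organize this differently.

```python
import json


def extract_hmac_keys(raw_json: str) -> dict:
    """Return the HMAC key pair as keyword arguments for an S3-style client."""
    creds = json.loads(raw_json)
    hmac_keys = creds.get("cos_hmac_keys")
    if not hmac_keys:
        raise ValueError(
            "No cos_hmac_keys found; was 'Include HMAC Credential' turned on?"
        )
    return {
        "aws_access_key_id": hmac_keys["access_key_id"],
        "aws_secret_access_key": hmac_keys["secret_access_key"],
    }


def make_cos_resource(hmac_kwargs: dict, endpoint_url: str):
    """Create a COS resource object; requires ibm-cos-sdk and real keys."""
    import ibm_boto3  # imported lazily so the parsing helper works without it

    return ibm_boto3.resource("s3", endpoint_url=endpoint_url, **hmac_kwargs)


# Placeholder credentials only; paste your own JSON instead.
sample = json.dumps(
    {
        "apikey": "xxx",
        "cos_hmac_keys": {"access_key_id": "abc", "secret_access_key": "def"},
    }
)
hmac_kwargs = extract_hmac_keys(sample)
print(sorted(hmac_kwargs))

# With real credentials, and an endpoint URL chosen for your bucket's region:
# cos = make_cos_resource(hmac_kwargs, "https://<your-cos-endpoint>")
```

If the credentials were created without the Include HMAC Credential option, the `cos_hmac_keys` entry is absent and the helper raises an error instead of failing later during upload.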
To view your notebooks, select Notebooks in the project Assets list. To run a notebook, simply click on the edit icon listed in the row associated with the notebook in the Notebooks list.
Some background on executing notebooks:
When a notebook is executed, each code cell in the notebook is executed in order, from top to bottom.
Each code cell is selectable and is preceded by a tag in the left margin. The tag format is In [x]: and, depending on the state of the notebook, the x can be:
- A blank, which indicates that the cell has never been executed.
- A number, which represents the relative order in which this code step was executed.
- A *, which indicates that the cell is currently executing.
There are several ways to execute the code cells in your notebook:
- One cell at a time.
  - Select the cell, and then press the Play button in the toolbar.
- Batch mode, in sequential order.
  - From the Cell menu bar, there are several options available. For example, you can Run All cells in your notebook, or you can Run All Below, which will start executing from the first cell under the currently selected cell, and then continue executing all cells that follow.
- At a scheduled time.
  - Press the Schedule button located in the top right section of your notebook panel. Here you can schedule your notebook to be executed once at some future time, or repeatedly at your specified interval.
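The cell tags described above are stored in the notebook file itself, which is plain JSON. As a small sketch (the helper name `describe_cells` and the sample notebook are made up for illustration), the following reads the `execution_count` field of each code cell in a version 4 notebook and reports whether it has been run. The `*` state exists only while the kernel is busy, so it does not appear in a saved file.

```python
import json


def describe_cells(notebook_json: str) -> list:
    """Report the saved execution state of each code cell, in file order."""
    notebook = json.loads(notebook_json)
    states = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells have no In [x]: tag
        count = cell.get("execution_count")
        if count is None:
            states.append("never executed")
        else:
            states.append(f"executed as step {count}")
    return states


# A tiny hand-written notebook used only for this example.
sample_notebook = json.dumps(
    {
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
            {"cell_type": "code", "execution_count": 1, "metadata": {},
             "outputs": [], "source": ["x = 1"]},
            {"cell_type": "code", "execution_count": None, "metadata": {},
             "outputs": [], "source": ["print(x)"]},
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 2,
    }
)

print(describe_cells(sample_notebook))
# prints ['executed as step 1', 'never executed']
```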
Once the model is trained, we can use it to recognize handwritten digits.
Note: With only 1 epoch, the results might be less than perfect.
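The prediction step can be sketched roughly as follows, assuming PyTorch is installed. `TinyDigitNet` is a hypothetical stand-in and not the network used in the notebook, the 28x28 grayscale input size is an assumption, and the model here is untrained, so its guesses are arbitrary. The point is only the shape of the inference code: switch to evaluation mode, disable gradient tracking, and take the most probable class.

```python
import torch
import torch.nn as nn


class TinyDigitNet(nn.Module):
    """Hypothetical stand-in for the trained model (not the notebook's network)."""

    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(),             # (N, 1, 28, 28) -> (N, 784)
            nn.Linear(28 * 28, 64),
            nn.ReLU(),
            nn.Linear(64, 10),        # one score per digit, 0 through 9
        )

    def forward(self, x):
        return self.layers(x)


def predict_digits(model: nn.Module, images: torch.Tensor):
    """Return the predicted digit and its probability for each image."""
    model.eval()                      # turn off training-only behavior
    with torch.no_grad():             # no gradients are needed for inference
        probabilities = torch.softmax(model(images), dim=1)
    confidence, digits = probabilities.max(dim=1)
    return digits, confidence


model = TinyDigitNet()                # untrained; load trained weights in practice
images = torch.rand(4, 1, 28, 28)     # random placeholders for four digit images
digits, confidence = predict_digits(model, images)
print(digits.tolist(), confidence.tolist())
```

With a model trained for a single epoch, low confidence values are a useful hint about which predictions to double-check.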
View a copy of the notebook including output here.
- Data Analytics Code Patterns: Enjoyed this Code Pattern? Check out our other Data Analytics Code Patterns
- AI and Data Code Pattern Playlist: Bookmark our playlist with all of our Code Pattern videos
- Watson Studio: Master the art of data science with IBM's Watson Studio
- Spark on IBM Cloud: Need a Spark cluster? Create up to 30 Spark executors on IBM Cloud with our Spark service
This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 (DCO) and the Apache Software License, Version 2.