A CorDapp designed to facilitate the crowdsourcing of data used to build machine learning models.
In my Medium article, I discuss:
- What common issues data scientists face when trying to build these models
- How I solved these issues using Corda
- Why Corda is the right choice for decentralized machine learning applications
From the /decentralized_corpus_manager/CorDapp/ directory:
- Create your nodes by running `./gradlew deployNodes`.
- Start your nodes by running the `build\nodes\runnodes` command.
- Build the Spring jar file by running the `./gradlew clients::bootJar` command from the /CorDapp directory.
- Start the server from the /CorDapp/clients/build/libs directory with: `java -jar .\clients-0.1.jar --server.port=8080 --config.rpc.host=localhost --config.rpc.port=10006 --config.rpc.username=user1 --config.rpc.password=test`
- If you don't have Flask installed and are using a Windows machine, watch this video.
- Copy the Classification API code to your Flask application.
- Run the `activate` command from the MyProject/Scripts/ directory to start up your virtual environment.
- Start the server with `flask run`.
- If you already have Anaconda Prompt installed, you can use it to start the Flask application. When you open Anaconda Prompt, you should be in your home directory, for example `C:\Users\John`.
- Step 1: Create a Virtual Environment
  - If you don't have virtualenv installed, run the command `pip install virtualenv`.
  - Run the command `virtualenv venv`. Note that `venv` is the name of our virtual environment. Feel free to replace `venv` with a more suitable name.
  - To activate the virtual environment, navigate to the `Scripts` folder by typing `cd venv\Scripts`. After typing that command, type `activate`.
  - If you activated your virtual environment correctly, you should see something like: `(venv) (base) C:\Users\John\venv\Scripts`
- Step 2: Running the Flask Application
  - Install Flask. Now that you are in the virtual environment through Anaconda, run the command `pip install flask`.
  - To see which version of Flask you have, run the command `flask --version`.
  - In the virtual environment, create a directory called `demoapp`. Run the command `mkdir demoapp` while in `(venv) (base) C:\Users\John\venv\Scripts`.
  - Open a code editor. I used Sublime Text. Drag the folder you created in the previous step into the Sublime Text window.
  - In Sublime Text, the left-hand side shows a "Folders" section with `demoapp` already listed. Right-click the folder and select `New File`.
  - Go to where you have the `decentralized_corpus_manager` folder on your computer. Open the folder called `flask` and then open the file called `app.py`.
  - Copy and paste that code into the new file you created in Sublime Text. Save that file as `app.py`.
  - Go to Anaconda Prompt and run the command `flask run`. If the command succeeds, you will see `Running on 127.0.0.1:5000/`. That is the localhost address to open in your browser.
  - If you execute `flask run` and it reports missing libraries, run `pip install <libraryName>` for each library named. For example, to install scikit-learn and NumPy, run `pip install scikit-learn` and `pip install numpy` respectively. Feel free to look up the pip install command for a library to make sure you install it correctly.
  - Go to your browser and enter `localhost:5000`; you should see `Hello from Flask>app.py`. If you examine the `app.py` file carefully, you will see `app.route()` used throughout. Each route tells you what to enter into the browser to execute a given function.
- Run `npm start` from the decentralized-corpus-management\react_app directory.
- issueCorpus: An endpoint to issue a corpus using JSON.
  - @Param: `corpus` is a LinkedHashMap<String, String> where the key is the data row and the value is the classification label.
  - @Param: `algorithmUsed` is a String describing the type of algorithm used to produce the model.
  - @Param: `classificationURL` is a String representing the Flask endpoint from which the classification report can be created.
  - @Param: `participants` is the list (Strings) of CordaX500 names for each party included on the transaction.
- updateCorpus: An endpoint that allows a user to propose a new corpus for the model with the intent to improve it.
  - @Param: `proposedCorpus` is a LinkedHashMap<String, String> where the key is the data row and the value is the label.
  - @Param: `corpusLinearId` is the LinearPointer used to query for the corpus state.
- updateClassificationURL: An endpoint strictly for modifying the URL that is used to build the classification report.
  - @Param: `newURL` is a String representing the new endpoint used to produce the classification report.
  - @Param: `corpusLinearId` is the LinearPointer used to query for the corpus state.
- transferOwnership: Only the owner of a corpus can "close" it or update its classification URL. This endpoint allows an owner to reassign the ownership of a corpus.
  - @Param: `newOwner` is the party representing the new owner of the corpus.
  - @Param: `corpusLinearId` is the LinearPointer used to query for the corpus state.
- closeCorpus: An endpoint for the corpus owner to prevent any further changes to a corpus by pointing to the exit state.
  - @Param: `corpusLinearId` is the LinearPointer used to query for the corpus state.
- issueCorpusWithCSV: This endpoint gives the user the option of using a CSV file delimited by "|" as the corpus when creating a corpus state.
  - @Param: `csvFile` is a multipart file in which each line is a "|" delimited utterance (data|label).
  - @Param: `algorithmUsed` is a String describing the type of algorithm used to produce the model.
  - @Param: `classificationURL` is a String representing the Flask endpoint from which the classification report can be created.
  - @Param: `participants` is the list (Strings) of CordaX500 names for each party included on the transaction.
- corpusLookup: Retrieve the most recent version of a corpus state.
  - @Param: `corpusLinearId` is the LinearPointer used to query for the corpus state.
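To make the request shapes above more concrete, here is a minimal Python sketch that parses "|" delimited lines (data|label) into an ordered mapping and assembles a JSON body using the parameter names listed for issueCorpus. It is an illustration under assumptions, not the project's client code: the helper names `parse_pipe_delimited` and `build_issue_corpus_body` are hypothetical, the idea that the JSON keys match the parameter names exactly is assumed, and the sample URL path and party names are made up. Check the Postman collection for the real request format.

```python
import csv
import io
import json


def parse_pipe_delimited(text):
    """Parse 'data|label' lines into a mapping of data row -> label.

    Insertion order is preserved, mirroring the ordered map the endpoint expects.
    """
    corpus = {}
    # QUOTE_NONE keeps any quote characters in the utterances as literal text.
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed lines
        data, label = (field.strip() for field in row)
        corpus[data] = label
    return corpus


def build_issue_corpus_body(corpus, algorithm_used, classification_url, participants):
    """Assemble a JSON body keyed by the issueCorpus parameter names (assumed keys)."""
    return json.dumps({
        "corpus": corpus,
        "algorithmUsed": algorithm_used,
        "classificationURL": classification_url,
        "participants": participants,
    })


if __name__ == "__main__":
    sample = "I love this product|positive\nThis is terrible|negative\n"
    body = build_issue_corpus_body(
        parse_pipe_delimited(sample),
        algorithm_used="Naive Bayes",  # hypothetical value
        classification_url="http://localhost:5000/classify",  # hypothetical path
        participants=["O=PartyA,L=London,C=GB"],  # hypothetical party name
    )
    print(body)
```

The same parsing step is roughly what issueCorpusWithCSV would need to do on the server side with the uploaded file, so the sketch can also be used to sanity-check a "|" delimited file before uploading it.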
Import the sample Postman requests
If you don't have Postman installed, go to this link, follow the download instructions, and open Postman.
Once Postman is open, click `Import` in the top left-hand corner and navigate to decentralized_corpus_manager/postman/postman-collection.json. If the import succeeds, you should see 9 requests: two `GET` requests in green and the rest in yellow.
You should be ready for the demo video. We hope you enjoy learning about machine learning with Corda!