TonY Notebook Submitter
NotebookSubmitter submits a Python pex file (for example, a Jupyter Notebook) to run inside a cluster.
It first starts a container inside the cluster that matches the resource request (AM GPU/memory/CPU) and runs the specified script on that node. To make Jupyter Notebook easier to use, the submitter also includes a proxy server that automatically forwards requests to that node.
Suppose you have a folder named bin/ in the root directory that contains a notebook pex file named linotebook. You can start the notebook with the following command, then follow the output message to open the Jupyter Notebook page.
CLASSPATH=$(${HADOOP_HDFS_HOME}/bin/hadoop classpath --glob):./:/home/khu/notebook/tony-cli-0.1.0-all.jar \
java com.linkedin.tony.cli.NotebookSubmitter --src_dir bin/ --executes "'bin/linotebook --ip=* $DISABLE_TOKEN'"
You should see log output similar to the following:
18/10/02 21:15:55 INFO cli.NotebookSubmitter: Starting NotebookSubmitter..
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop//share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/export/home/test_user/notebook/tony-cli-0.1.2-all.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/10/02 21:15:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/10/02 21:15:57 INFO cli.NotebookSubmitter: Copying /export/home/test_user/notebook/tony-cli-0.1.2-all.jar to: hdfs://nn:9000/user/test_user/.tony/640c7b44-5039-4a6e-a6c4-363cf924b504
18/10/02 21:15:57 INFO tony.TonyClient: TonY heartbeat interval [1000]
18/10/02 21:15:57 INFO tony.TonyClient: TonY max heartbeat misses allowed [25]
18/10/02 21:15:57 INFO tony.TonyClient: Starting client..
18/10/02 21:15:57 INFO client.RMProxy: Connecting to ResourceManager at rm/10.150.1.183:8032
18/10/02 21:15:57 INFO conf.Configuration: found resource resource-types.xml at file:/hadoop/resource-types.xml
18/10/02 21:15:57 INFO resource.ResourceUtils: Adding resource type - name = yarn.io/gpu, units = , type = COUNTABLE
18/10/02 21:15:57 INFO resource.ResourceUtils: Adding resource type - name = memory-mb, units = Mi, type = COUNTABLE
18/10/02 21:15:57 INFO resource.ResourceUtils: Adding resource type - name = vcores, units = , type = COUNTABLE
18/10/02 21:16:01 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 109572593 for test_user on nn:9000
18/10/02 21:16:01 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 120134930 for test_user on nn:9000
18/10/02 21:16:01 INFO tony.TonyClient: Successfully fetched tokens.
18/10/02 21:16:01 INFO tony.TonyClient: Completed setting up app master command {{JAVA_HOME}}/bin/java -Xmx1638m -Dyarn.app.container.log.dir=<LOG_DIR> com.linkedin.tony.TonyApplicationMaster --executes 'chmod +x -R bin/ && bin/linotebook --ip=* ' --hdfs_classpath hdfs://nn:9000/user/test_user/.tony/640c7b44-5039-4a6e-a6c4-363cf924b504 --shell_env LD_LIBRARY_PATH=/usr/java/latest/jre/lib/amd64/server:/export/apps/hadoop/latest/lib/native/: --container_env TONY_CONF_PATH=hdfs://nn:9000/user/test_user/.tony/application_1538503238688_0041/tony-final.xml --container_env TONY_CONF_TIMESTAMP=1538514961731 --container_env TF_ZIP_LENGTH=82542716 --container_env TF_ZIP_TIMESTAMP=1538514961651 --container_env TF_ZIP_PATH=hdfs://nn:9000/user/test_user/.tony/application_1538503238688_0041/tf.zip --container_env TONY_CONF_LENGTH=185881 --container_env CLASSPATH={{CLASSPATH}}<CPS>./*<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/*<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*<CPS>/export/apps/hadoop/site/lib/* 1><LOG_DIR>/amstdout.log 2><LOG_DIR>/amstderr.log
18/10/02 21:16:01 INFO tony.TonyClient: Submitting YARN application
18/10/02 21:16:01 INFO impl.YarnClientImpl: Submitted application application_1538503238688_0041
18/10/02 21:16:02 INFO tony.TonyClient: URL to track running application (will proxy to TensorBoard once it has started): wp/proxy/application_1538503238688_0041/
18/10/02 21:16:02 INFO tony.TonyClient: ResourceManager web address for application: http://rm:8088/cluster/app/application_1538503238688_0041
18/10/02 21:16:06 INFO tony.TonyClient: AM host: node
18/10/02 21:16:06 INFO tony.TonyClient: AM RPC port: 14357
18/10/02 21:16:06 INFO client.RMProxy: Connecting to ResourceManager at rm/10.150.1.183:8032
18/10/02 21:16:06 INFO tony.TonyClient: Logs for driver 0 at: http://node:8042/node/containerlogs/container_e08_1538503238688_0041_01_000001/test_user
18/10/02 21:16:06 INFO tony.TonyClient: Logs for notebook 0 at: node:23617
18/10/02 21:16:06 INFO cli.NotebookSubmitter: If you are running NotebookSubmitter in your local box, please open [localhost:12190] in your browser to visit the page. Otherwise, if you're running NotebookSubmitter in a remote machine (like a gateway), please run [ssh -L 18888:localhost:12190 name_of_this_host] in your laptop and open [localhost:18888] in your browser to visit Jupyter Notebook. If the 18888 port is occupied, replace that number with another number.
18/10/02 21:16:06 INFO tonyproxy.ProxyServer: Starting proxy for node:23617 on port 12190
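The instructions for opening the notebook are in the cli.NotebookSubmitter log line near the end of the output, the one that begins "If you are running NotebookSubmitter in your local box".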
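When NotebookSubmitter runs on a remote gateway, the port-forwarding step described in that log line can be scripted. The following is a minimal sketch, not part of TonY: the function names proxy_port_from_log and tunnel_command are hypothetical, and it assumes the log message keeps the "please open [localhost:PORT]" wording shown above.

```shell
# Hypothetical helpers; the names are illustrative and not part of TonY.

# Read NotebookSubmitter log text on stdin and print the proxy port from the
# first "please open [localhost:PORT]" message.
proxy_port_from_log() {
  sed -n 's/.*please open \[localhost:\([0-9][0-9]*\)].*/\1/p' | head -n 1
}

# Print the ssh command to run on your laptop.
#   $1: proxy port on the gateway
#   $2: gateway host name
#   $3: free local port (defaults to 18888, as suggested in the log)
tunnel_command() {
  printf 'ssh -L %s:localhost:%s %s\n' "${3:-18888}" "$1" "$2"
}
```

For the log above, passing port 12190 and the gateway's host name to tunnel_command prints the same ssh -L command that the log suggests. If local port 18888 is occupied, pass a different port as the third argument.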