
Getting Spark job logs on Google Dataproc with ML_DSL


Using Magic Functions

Here is an example of how to get the logs of a Spark job running on a Google Dataproc cluster.

# GCP project that hosts the Dataproc cluster
project_id = "gd-mldsl-test"

# Cloud Logging filter: Dataproc cluster logs for a specific cluster and YARN application
filters = """resource.type:cloud_dataproc_cluster
             AND timestamp>2020-02-16
             AND resource.labels.cluster_name:test_cluster
             AND jsonPayload.application:application_1000000000000_0001"""

Get the logs using the %logging magic function:

%logging -p $project_id -f $filters

Output:

02/17/2020, 09:33:40	Setting up env variables
02/17/2020, 09:33:40	Setting up job resources
02/17/2020, 09:33:40	Launching container
02/17/2020, 09:33:41	Class path contains multiple SLF4J bindings.
02/17/2020, 09:33:41	Found binding in [jar:file:/usr/lib/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
02/17/2020, 09:33:41	Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
02/17/2020, 09:33:41	See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
02/17/2020, 09:33:41	Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
...
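
The %logging magic queries Google Cloud Logging with the filter defined above. As a point of reference, the sketch below shows roughly the same query issued directly with the google-cloud-logging Python client; this is an illustration of the underlying query, not the magic's actual implementation. Reading the log line from entry.payload["message"] is an assumption about how Dataproc structures its jsonPayload.

# Minimal sketch, assuming google-cloud-logging is installed and
# application-default credentials are configured for project_id.
from google.cloud import logging as cloud_logging

client = cloud_logging.Client(project=project_id)

for entry in client.list_entries(filter_=filters, order_by=cloud_logging.ASCENDING):
    # Structured (jsonPayload) entries come back as dict-like payloads;
    # the "message" key is an assumption about the Dataproc payload layout.
    payload = entry.payload
    message = payload.get("message", payload) if isinstance(payload, dict) else payload
    print(entry.timestamp, message)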