Spark Logging


Contents

  1. Log storage paths

  2. Changing the log level (info, debug, error, etc.)

    a. spark-submit
      Via log4j.properties (local and cluster modes)

    b. spark-shell
      Via setLogLevel (local and cluster modes)
      Via log4j.properties (local and cluster modes)

  3. Appendix

    a. Sample log4j.properties

    b. Sample log4j.xml

 

  In many cases we need to look at the logs that the driver and the executors produce while a Spark application runs; these logs matter a great deal when debugging and tracking down problems.

1. Log storage paths

  Where exactly Spark stores its logs depends on the deployment mode:

  1. In Spark Standalone mode, you can view an application's logs directly in the Master UI. By default they are stored under the work directory on each worker node, a location you can change with the SPARK_WORKER_DIR setting (see the sketch after this list).

  2. In Mesos mode, you can likewise see an application's logs in the Mesos Master UI; they are stored in the work directory of the Mesos slaves.

  3. In YARN mode, the simplest way to collect logs is YARN's log collection tool (yarn logs -applicationId <app ID>), which gathers all the logs of your application. It has limitations, though: the application must have finished running, because YARN aggregates the logs only after completion, and log aggregation must be enabled (yarn.log-aggregation-enable, which defaults to false). Once aggregation is on, the logs can be viewed through YARN, as sketched below.
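
  A minimal sketch of both settings; the work-directory path and the application ID below are hypothetical placeholders (list real application IDs with yarn application -list, or look them up in the ResourceManager UI):

# Standalone: choose where workers keep application work dirs, in conf/spark-env.sh
export SPARK_WORKER_DIR=/data/spark/work

# YARN: collect the aggregated logs of a finished application
# (requires yarn.log-aggregation-enable=true on the cluster)
yarn logs -applicationId application_1537000000000_0001 > app.log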

2. Changing the log level

  1. spark-submit

  1.1 Via log4j.properties

  1.1.1 Local mode

    Putting a log4j.properties file under resources (or src) is enough to change the log level. If log4j.properties lives somewhere other than src or resources, you can point to its location with the following code:

import org.apache.log4j.PropertyConfigurator

// Load log4j.properties from an explicit path instead of the classpath
PropertyConfigurator.configure("E:\\Workspace\\300\\cedc\\aquarius\\src\\main\\resources\\log4j.properties");

    You may run into the following problem:

    a. With log4j.properties configured under resources, it sometimes controls Spark's logging and sometimes does not. This is usually a classpath-ordering issue: another log4j.properties (for example the one in Spark's conf directory, or one bundled inside a dependency jar) is found first and wins.
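
    One workaround, as a sketch (the path, main class, and jar name below are hypothetical placeholders): hand your file to the driver JVM explicitly so it takes precedence over anything else on the classpath:

spark-submit \
    --driver-java-options "-Dlog4j.configuration=file:/path/to/log4j.properties" \
    --class com.example.Main \
    app.jar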

  1.1.2 Cluster mode

    A complete example command looks like this:

# Note: --driver-java-options "-Dlog4j.configuration=file:log4j.properties"
# is equivalent to setting spark.driver.extraJavaOptions
spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
    --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
    --files "/absolute/path/to/your/log4j.properties" \
    --class com.github.atais.Main \
    "SparkApp.jar"

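    Note that --files ships log4j.properties into the working directory of every YARN container, which is why the relative file:log4j.properties URL in the two extraJavaOptions settings resolves on both the driver (which runs inside a container in cluster mode) and the executors.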
 

  2. spark-shell

  2.1 Via setLogLevel

export SPARK_MAJOR_VERSION=2
spark-shell --master yarn --deploy-mode client

......
scala> sc.setLogLevel("DEBUG")

scala> 18/09/17 07:58:57 DEBUG Client: IPC Client (1024266763) connection to newhwx1.example.com/10.10.10.10:8032 from spark sending #69
18/09/17 07:58:57 DEBUG Client: IPC Client (1024266763) connection to newhwx1.example.com/10.10.10.10:8032 from spark got value #69
18/09/17 07:58:57 DEBUG ProtobufRpcEngine: Call: getApplicationReport took 8ms
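
    setLogLevel takes effect immediately, without restarting the shell. As a quick sketch, it accepts the standard log4j level names:

// Valid levels: ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN
sc.setLogLevel("DEBUG")  // verbose output while investigating
sc.setLogLevel("WARN")   // quiet things back down afterwards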

  2.2 Via log4j.properties
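
    In client mode the driver picks up log4j.properties from the local path, while --files ships a copy to the executors' containers: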

spark-shell --master yarn --deploy-mode client \
    --files ./log4j.properties \
    --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
    --driver-java-options "-Dlog4j.configuration=log4j.properties"

3. Appendix

  1. Sample log4j.properties

################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##############################################################################

# Root logger: valid levels include error, debug, info.
# Console only:
log4j.rootLogger=error, myConsoleAppender
# Console and file:
# log4j.rootLogger=error, myConsoleAppender,RollingAppender
# log4j.rootCategory=info, DRFA
# Route the noisier Spark logs to their own appenders (examples, commented out):
# log4j.logger.spark.storage=INFO, myConsoleAppender
# log4j.additivity.spark.storage=false
# log4j.logger.spark.scheduler=INFO, myConsoleAppender
# log4j.additivity.spark.scheduler=false
# log4j.logger.spark.CacheTracker=INFO, myConsoleAppender
# log4j.additivity.spark.CacheTracker=false
# log4j.logger.spark.CacheTrackerActor=INFO, myConsoleAppender
# log4j.additivity.spark.CacheTrackerActor=false
# log4j.logger.spark.MapOutputTrackerActor=INFO, myConsoleAppender
# log4j.additivity.spark.MapOutputTrackerActor=false
# log4j.logger.spark.MapOutputTracker=INFO, myConsoleAppender
# log4j.additivity.spark.MapOutputTracker=false

log4j.appender.myConsoleAppender=org.apache.log4j.ConsoleAppender
log4j.appender.myConsoleAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.myConsoleAppender.layout.ConversionPattern=%d [%t] %-5p %c - %m%n
log4j.appender.RollingAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppender.File=/var/log/spark.log
log4j.appender.RollingAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppender.layout.ConversionPattern=[%p] %d %c %M - %m%n
log4j.appender.RollingAppenderU=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppenderU.File=/var/log/sparkU.log
log4j.appender.RollingAppenderU.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppenderU.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppenderU.layout.ConversionPattern=[%p] %d %c %M - %m%n

  2. Sample log4j.xml
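
    A minimal log4j 1.x XML configuration equivalent to the properties file above, offered as a sketch (console appender active; the rolling file appender is defined for reference):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">

    <!-- Console appender, mirroring myConsoleAppender above -->
    <appender name="myConsoleAppender" class="org.apache.log4j.ConsoleAppender">
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d [%t] %-5p %c - %m%n"/>
        </layout>
    </appender>

    <!-- Daily rolling file appender, mirroring RollingAppender above -->
    <appender name="RollingAppender" class="org.apache.log4j.DailyRollingFileAppender">
        <param name="File" value="/var/log/spark.log"/>
        <param name="DatePattern" value="'.'yyyy-MM-dd"/>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="[%p] %d %c %M - %m%n"/>
        </layout>
    </appender>

    <!-- Console only by default; add <appender-ref ref="RollingAppender"/> here for file output -->
    <root>
        <priority value="error"/>
        <appender-ref ref="myConsoleAppender"/>
    </root>

</log4j:configuration>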

