Spark

刘超 · 2 days ago

  Tracking Apache Spark exceptions; tracking HDP Spark exceptions.

Category | Version | Error | Resolved | Usage
scala
  Caused by: java.lang.NoSuchMethodException: com.opera.adx.domain.AdxClick.<init>()
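
This exception usually means something instantiated AdxClick through reflection and needed a zero-argument constructor, which Scala case classes do not generate. A minimal sketch, with hypothetical fields:

    // Hypothetical fields; the real AdxClick definition is not shown here.
    case class AdxClick(var id: Long, var url: String) {
      // Auxiliary zero-arg constructor so reflection-based code
      // (Class.newInstance, bean encoders, some JSON parsers) can instantiate it.
      def this() = this(0L, "")
    }
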
2.0.2 Declaration is never used (still in question)
Cannot resolve symbol spark 
spark streaming updateStateByKey
Why is the time difference after spark-submit so large?
mismatched input '<' expecting {')', ','}(line 1, pos 75)
SparkException: Task not serializable
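
The closure usually drags in a non-serializable enclosing object. A common fix, sketched with hypothetical names, is to copy the needed field into a local val so only that value is captured:

    // `Pipeline` itself is not Serializable; referencing `threshold` inside the
    // closure directly would pull the whole object into the task.
    class Pipeline(threshold: Int) {
      def run(rdd: org.apache.spark.rdd.RDD[Int]): org.apache.spark.rdd.RDD[Int] = {
        val t = threshold   // copy to a local val; only `t` is serialized
        rdd.filter(_ > t)
      }
    }
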
Failed to execute user defined function($anonfun$getOSVersionRange$1: (string) => string)
After configuring log4j.properties under resources, it sometimes controls Spark logging and sometimes does not
Disabling scientific notation in Spark
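
A hedged sketch of two common approaches, with a hypothetical column name: format the number as a string, or cast to a decimal type so the default double-to-string conversion never kicks in.

    import org.apache.spark.sql.functions.{col, format_number}
    import org.apache.spark.sql.types.DecimalType

    df.select(format_number(col("revenue"), 4)).show()                  // fixed 4 decimal places
    df.withColumn("revenue", col("revenue").cast(DecimalType(38, 10)))  // avoids 1.0E7-style output
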
Resolving Spark application JAR conflicts
Spark streaming job out of memory
The application is finished, but the log aggregation status is not updated for a long time. Not sure whether the log
java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
WARN TaskSetManager: Lost task 69.2 in stage 7.0 (TID 1145, 192.168.47.217): java.io.IOException: Connection from /192.168.47.21
WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, aa.local) ExecutorLostFailure (executor lost)
ERROR TransportChannelHandler: Connection to has been quiet for 120000 ms while there are outstanding requests.
org.apache.spark.shuffle.FetchFailedException: Connection from n35-03.fn.ams.osa/172.17.30.15:44122 closed
This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf
User class threw exception org.apache.spark.sql.AnalysisException Both sides of this join are outside the broadcasting threshold
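
In Spark 2.x this message continues by pointing at the cartesian-product check; either give the join a real join condition or opt in explicitly:

    // Opt-in shown for completeness; a proper join condition is usually the real fix.
    spark.conf.set("spark.sql.crossJoin.enabled", "true")
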
ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() on RPC id 5765311539245480646 org.apache.spark.storage.BlockNotFoundException: Block broadcast_49_piece0 not found
ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() on RPC id 6213350919299289336 java.io.StreamCorruptedException: invalid stream header: 0001000B
SparkException: Failed to get broadcast_49_piece0 of broadcast_49
ERROR RetryingBlockFetcher: Exception while beginning fetch of 520 outstanding blocks (after 1 retries)
ERROR ShuffleBlockFetcherIterator: Failed to get block(s) from n35-03.fn.ams.osa:38397
ERROR TransportRequestHandler: Error sending result ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=575253708868, chunkIn
ERROR OneForOneBlockFetcher: Failed while starting block fetches
ERROR DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /data02/hadoop/yarn/log/usercache/sdev/ap
WARN OneWayOutboxMessage: Failed to send one-way RPC
ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from /172.17.30.13:33690 is closed
ERROR CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
org.apache.spark.sql.AnalysisException: Window Frame ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING must match the required frame ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
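
Ranking functions such as row_number() only accept the frame ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so an explicitly widened frame triggers this error. A sketch with hypothetical columns:

    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.{col, last, row_number}

    val w = Window.partitionBy("site_id").orderBy("ts")
    df.withColumn("rn", row_number().over(w))   // no explicit rowsBetween for ranking functions

    // The full frame is fine for ordinary aggregates such as last():
    val full = w.rowsBetween(Window.unboundedPreceding, Window.unboundedFollowing)
    df.withColumn("last_ts", last(col("ts")).over(full))
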
ExitCodeException exitCode=13
java.lang.AssertionError: assertion failed: Incorrect number of children
Failed to remove broadcast 14 with removeFromMaster = true - org.apache.spark.rpc.RpcEnvStoppedException: RpcEnv already stopped
application_1582757194275_146060 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1582757194275_146060_000001 exited with  exitCode: 0
Unable to instantiate SparkSession with Hive support because Hive classes are not found
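
The spark-hive module has to be on the classpath before enableHiveSupport() can work. A minimal sketch (versions are an assumption; match them to the cluster):

    // build.sbt: libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.4.4"
    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("hive-example")
      .enableHiveSupport()   // throws this error when the Hive classes are missing
      .getOrCreate()
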
Broadcast  
Futures timed out after [300 seconds]
Caused by: org.apache.spark.SparkException: Could not execute broadcast in 300 secs. You can increase the timeout for broadcasts via spark.sql.broadcastTimeout or disable broadcast join by setting spark.sql.autoBroadcastJoinThreshold to -1
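
The message names both knobs; a sketch of each (the values are examples):

    spark.conf.set("spark.sql.broadcastTimeout", "600")            // seconds; default is 300
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")   // disables broadcast joins
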
metadata  
It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved
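
Both forms named in the message, with a hypothetical table name:

    spark.sql("REFRESH TABLE adx.log_adx_click")      // SQL form
    spark.catalog.refreshTable("adx.log_adx_click")   // programmatic form
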
 
Message bus
ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorMetricsUpdate(8,WrappedArray())
ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerTaskEnd(2,3,ShuffleMapTask,FetchFailed(BlockManagerId(128, n49-04.fn.ams.osa, 41355),0,10,6,org.apache.spark.shuffle.FetchFailedException: java.io.FileNotFoundExceptio
20/03/31 09:44:11 ERROR LiveListenerBus: Dropping SparkListenerEvent because no remaining room in event queue. This likely means one of the SparkListeners is too slow and cannot keep up with the rate at which tasks are being started by the scheduler
Fields
Spark/Hive column names starting with a digit
Misaligned rows in a union changing column data types
Data types
StructType
error: value fromDDL is not a member of object org.apache.spark.sql.types.StructType
error: type mismatch; found : org.apache.spark.sql.types.StructField required: org.apache.spark.sql.types.DataType
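
The type mismatch usually means a StructField was passed where a DataType belongs: nested fields must be wrapped in their own StructType. A sketch with hypothetical field names (StructType.fromDDL in the first error requires Spark 2.3+):

    import org.apache.spark.sql.types._

    val schema = StructType(Seq(
      StructField("id", LongType),
      // Wrap nested fields in a StructType, not a bare StructField:
      StructField("detail", StructType(Seq(
        StructField("question_index", IntegerType)
      )))
    ))
    // On Spark 2.3+ the same schema can come from DDL:
    // val schema2 = StructType.fromDDL("id BIGINT, detail STRUCT<question_index: INT>")
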
case class
scala.collection.mutable.ListBuffer[Survey] forSome { type Survey <: Product with Serializable{val question_index:
schema  
Unable to infer schema for ORC at hdfs://opera/apps/hive/warehouse/adx.db/log_adx_click/*/*/*/day=20220506. It must be specified manually
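
This often means the path matched no readable files (an empty partition, for example), so there was nothing to sample the schema from. Supplying the schema explicitly, as the message asks, sidesteps inference; the columns below are hypothetical:

    import org.apache.spark.sql.types._

    val clickSchema = StructType(Seq(
      StructField("id", LongType),
      StructField("url", StringType)
    ))
    val df = spark.read.schema(clickSchema)
      .orc("hdfs://opera/apps/hive/warehouse/adx.db/log_adx_click/*/*/*/day=20220506")
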
Schema for type Any is not supported
dataframe
java.lang.UnsupportedOperationException: Cannot evaluate expression: rownumber()
orc
Unions can only be performed on tables with the same number of columns, but one table has '44' columns and another table has '43' columns
User class threw exception: java.lang.UnsupportedOperationException: empty.reduceLeft
java.lang.IndexOutOfBoundsException: Index: 9, Size: 9
shuffle
org.apache.spark.shuffle.FetchFailedException: Connection from n35-03.fn.ams.osa/172.17.30.15:44122 closed
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0
windows
Relative path in absolute URI: file:E:/Workspace/Eclipse_Daimler_bak/Daimler/spark-warehouse
Functions
org.apache.spark.sql.AnalysisException: cannot resolve 'size('qwert')' due to data type mismatch: argument 1 requires (array or map) type, however, ''qwert'' is of string type
cannot resolve 'length(`tuples`)' due to data type mismatch: argument 1 requires (string or binary) type, however, '`tuples`' is of array> type
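
size() is for arrays and maps, length() for strings and binary; pick by column type (column names hypothetical):

    import org.apache.spark.sql.functions.{col, length, size}

    df.select(size(col("tuples")))   // tuples is an array column
    df.select(length(col("name")))   // name is a string column
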
updateStateByKey
The method updateStateByKey in the type JavaPairDStream is not applicable for the arguments
countDistinct
org.apache.spark.sql.AnalysisException: Distinct window functions are not supported: count(distinct _w0#39668) windowspecdefinition(site_id#198
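
A common workaround is taking the size of a collect_set over the same window, since collect_set already deduplicates (columns hypothetical):

    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.{col, collect_set, size}

    val w = Window.partitionBy("site_id")
    df.withColumn("distinct_users", size(collect_set(col("user_id")).over(w)))
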
isin
Exception in thread "main" java.lang.RuntimeException: Unsupported literal type class scala.collection.immutable.$colon$colon List(a, b, c)
max
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve 'max(`detail`)' due to data type mismatch: function max does not support ordering on type ArrayType(MapType(StringType,ArrayType(StringType,true),true),true);;
collect_set
org.apache.spark.sql.AnalysisException: cannot resolve 'collect_set(`detail`)' due to data type mismatch: collect_set() cannot have map type data;;
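
collect_set rejects map values because it has to deduplicate them; collect_list accepts maps at the cost of keeping duplicates. A sketch (grouping column hypothetical):

    import org.apache.spark.sql.functions.{col, collect_list}

    df.groupBy("site_id").agg(collect_list(col("detail")))   // works where collect_set does not
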
udf udaf  Aggregator
unresolved operator 'Aggregate [aggregatorexample(net.itdiandi.spark.sql.udf.udaf.AggregatorExample$@6dd38df2, None, assertnotnull(input[0, net.itdiandi.spark.sql.udf.udaf.Average, true], top level non-flat input object).sum AS sum#34L
Data types
scala.collection.mutable.ListBuffer[Survey] forSome { type Survey <: Product with Serializable{val question_index:
org.apache.spark.sql.AnalysisException: cannot resolve 'UDF(pv_info)' due to data type mismatch: argument 1 requires array> type, however, '`pv_info`' is of array> type.;;
java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to scala.collection.LinearSeqOptimized
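
Spark hands array columns to a UDF as WrappedArray, which is a Seq but not a List; declaring the parameter as Seq avoids the cast (a sketch):

    import org.apache.spark.sql.functions.udf

    // Take Seq[String], not List[String]; WrappedArray satisfies Seq.
    val firstTag = udf((tags: Seq[String]) => tags.headOption.getOrElse(""))
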
driver
java.lang.OutOfMemoryError: GC overhead limit exceeded
executor  
ERROR YarnClusterScheduler: Lost executor 240 on n45-16.fn.ams.osa: Container container_e41_1582757194275_57891_02_000347 exited from explicit termination request
ERROR YarnClusterScheduler: Lost executor 596 on n07-13.fn.ams.osa: Executor heartbeat timed out after 3655577 ms
Resubmitted (resubmitted due to lost executor)
JAR dependencies
The type scala.reflect.api.TypeTags$TypeTag cannot be resolved
scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
Parameters
spark.network.timeout Time must be specified as seconds (s), milliseconds (ms), microseconds (us), minutes (m or min), hour (h), or day (d). E.g. 50s
spark.streaming.kafka.consumer.poll.ms Spark Streaming, Kafka receiver, Failed to get records for ... after polling for 512
Network
spark.executor.heartbeatInterval Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
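
Two points from the entries above: these durations want an explicit unit suffix, and spark.network.timeout should stay comfortably larger than spark.executor.heartbeatInterval. Example values only:

    val conf = new org.apache.spark.SparkConf()
      .set("spark.network.timeout", "300s")            // unit suffix avoids the "Time must be specified as..." error
      .set("spark.executor.heartbeatInterval", "60s")  // keep well below spark.network.timeout
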
Resources
  WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Files
User class threw exception: org.apache.hadoop.security.AccessControlException: Permission denied: user=sdev, access=EXECUTE, inode=
User class threw exception: org.apache.spark.sql.AnalysisException: Text data source supports only a single column, and you have 0 columns
Caused by: org.apache.spark.sql.AnalysisException: Path does not exist: hdfs://opera/apps/hive/warehouse/adx_dws.db/dws_adx_up_respone/year=2020/month=05/week=18/day=20200506/*
Ports
FAILED ServerConnector@5c120d47{HTTP/1.1}{0.0.0.0:8088} java.net.BindException: Address already in use
jvm
Spark SQL SqlParser parse exception: java.lang.StackOverflowError
PermSize
java.lang.OutOfMemoryError: PermGen space
Heap
java.lang.OutOfMemoryError: Java heap space 
memory
Required executor memory (1024+384 MB) is above the max threshold(512 MB) of this cluster
fastjson: parsing JSON with fastjson inside a UDF
error: ambiguous reference to overloaded definition
{"traversableAgain":true,"empty":false}
value length is not a member of com.alibaba.fastjson.JSONArray
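
com.alibaba.fastjson.JSONArray implements java.util.List, so the element count comes from size(), not a Scala-style length member (a sketch):

    import com.alibaba.fastjson.JSON

    val arr = JSON.parseArray("""["a","b"]""")
    val n = arr.size()   // JSONArray has size(), not length
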
netty
java.lang.OutOfMemoryError: GC overhead limit exceeded at io.netty.util.internal.MpscLinkedQueue.offer(MpscLinkedQueue.java:126)
Integration with other components
mysql
Incorrect string value: '\xC4\x81\xE1\xB8\x91' for column 'field_value' at row 14
java.sql.BatchUpdateException: Incorrect string value: '\xE7\xA9\xBA\xE5\x86\x9B...' for column 'pv_info' at row 1
tidb
java.util.NoSuchElementException: spark.tispark.pd.addresses
Spark accessing Hive reports cannot resolve '`advIndustry`' given input columns
sparksql
Calling cache() in Spark SQL does not take effect
value $ is not a member of StringContext
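
The $ interpolator comes from the session's implicits; importing them is the usual fix:

    val spark = org.apache.spark.sql.SparkSession.builder().getOrCreate()
    import spark.implicits._   // enables $"columnName"

    df.select($"site_id")      // df and site_id are hypothetical
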
Error: Cluster deploy mode is not applicable to Spark shell
Spark:java.net.BindException Address already in use Service 'SparkUI' failed after 16 retries!
sparkstreaming
AM Container for exited with exitCode 15
hdfs
Got error, status message opReadBlock received exception org.apache.hadoop.hdfs.server.datanode.ReplicaNotFoundException: Replic
Failed to CREATE_FILE  date=20191022 for DFSClient_NONMAPREDUCE_241717241_1 on 172.17.30.206 because DFSClient_NONMAPREDUCE_241717241_1 is already the current lease holder
hive
2.0.2 org.apache.spark.sql.AnalysisException: Saving data in MetastoreRelation adx_app, r_day_stat_gps_density  is not supported
Window function rownumber() requires window to be ordered, please add ORDER BY clause.
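
As the message says, ranking functions need an ORDER BY in the window definition (columns hypothetical):

    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions.row_number

    val w = Window.partitionBy("site_id").orderBy("ts")   // orderBy is mandatory here
    df.withColumn("rn", row_number().over(w))
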
Spark accessing Hive reports cannot resolve '`advIndustry`' given input columns
org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'adx_app' not found
In yarn-cluster mode, an error that table "test.user" cannot be found
Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_SERVER_ADDRESS
Tez
java.lang.ClassNotFoundException: org.apache.tez.dag.api.SessionNotRunning
java.lang.ClassNotFoundException: org.apache.tez.mapreduce.hadoop.MRHelpers
ERROR ApplicationMaster: User class threw exception: java.lang.reflect.InvocationTargetException
org.apache.tez.dag.api.TezUncheckedException: Invalid configuration of tez jars, tez.lib.uris is not defined in the configuration
2.4.4
org.apache.spark.sql.AnalysisException: The format of the existing table r_day_stat_gps_density is `HiveFileFormat`. It doesn't match the specified format `ParquetFileFormat`.
redis
redis.clients.jedis.exceptions.JedisNoReachableClusterNodeException: No reachable node in cluster
hbase Unable to set watcher on znode (/hbase-unsecure/hbaseid) org.apache.zookeeper.KeeperException$ConnectionLossException:
hdfs ERROR ShortCircuitCache: ShortCircuitCache(0x3168a3c9): failed to release short-circuit shared memory slot Slot(slotIdx=2, shm=DfsClientShm(0101141fecb10c56bdd91c3f817b0bcd)) by sending ReleaseShortCircuitAccessRequestProto to /var/lib/hadoop-hdfs/dn_socke
kafka could not find a 'kafkaclient' entry in the jaas configuration
mongo com.mongodb.MongoCommandException: Command failed with error -1: 'unrecognized field "allowDiskUse'
Platform
2.0.1
LiveListenerBus Listener EventLoggingListener threw an exception
2.5
ClassNotFoundException com.sun.jersey.api.client.config.ClientConfig
PKIX path building failed:unable to find valid certification path to requested target
PermGen space
ClassNotFoundException com.sun.jersey.api.client.config.ClientConfig
YarnAllocator Container marked as failed container_e10_1531289274251_1867_02_000002 on host kafka248. Exit status 1. Diagnostics Exception from container-launch
3.0
Hive databases and tables not visible in spark-sql
Master dies, and restarting the standby also fails
Worker dies or becomes unresponsive
cluster.YarnScheduler: Lost executor 2 on zdbdsps025.iccc.com: Container marked as failed
Machines
Cannot resolve the host name for 10-10-135-166/10.10.135.166 because of javax.naming.NameNotFoundException: DNS name not found


Note: This article belongs to its author and may not be reproduced without the author's permission.
