Spark SQL does not support regex column selection (before 2.3)

刘超 · 5 days ago · 29 reads

Tested on Spark 2.0.2.2.5.6.0-40

scala> df.createOrReplaceTempView("temp");

scala> spark.sql("select * from temp limit 1").show()
+----+-----------+-------------+--------+---------------+---------------+-------------+--------------+----------+--------------+-----+------+--------+---------------+--------------+-------------+--------+----------------------+-----------------------------+-----------------------------+------------------+-------------------------+-------------------------+---------------------+---------------------+-----------------+------------------------+------------------------+-------------+--------------------+--------------------+----------------+----------------+-------------+----------------+-----------+-----------------------+-----------------------+------------------+---------------------+---------------------+-----------+------------------+------------------+--------------+----------+
|test|countryCode|       slotId|bucketId|    publisherId|          appId|trafficSource|          adId|chargeMode|    materialId|mType|siteId|agencyId|   advertiserId|advertiserType|      orderId|operator|prev_valid_impressions|phase1_prev_valid_impressions|phase2_prev_valid_impressions|impression_revenue|phase1_impression_revenue|phase2_impression_revenue|impression_up_revenue|impression_commission|prev_valid_clicks|phase1_prev_valid_clicks|phase2_prev_valid_clicks|click_revenue|phase1_click_revenue|phase2_click_revenue|click_up_revenue|click_commission|install_count|conversion_count|pixel_count|phase1_conversion_count|phase2_conversion_count|conversion_revenue|conversion_up_revenue|conversion_commission|total_price|phase1_total_price|phase2_total_price|total_up_price|commission|
+----+-----------+-------------+--------+---------------+---------------+-------------+--------------+----------+--------------+-----+------+--------+---------------+--------------+-------------+--------+----------------------+-----------------------------+-----------------------------+------------------+-------------------------+-------------------------+---------------------+---------------------+-----------------+------------------------+------------------------+-------------+--------------------+--------------------+----------------+----------------+-------------+----------------+-----------+-----------------------+-----------------------+------------------+---------------------+---------------------+-----------+------------------+------------------+--------------+----------+
|   0|         NG|s838936514880|       0|pub236088034304|app496319027008|            0|a1269979849024|         2|m1269997916544|    1|      |        |adv386831107200|             5|o392026937792|        |                     2|                            0|                            0|               0.0|                      0.0|                      0.0|                  0.0|                  0.0|             null|                    null|                    null|         null|                null|                null|            null|            null|         null|            null|       null|                   null|                   null|              null|                 null|                 null|       null|              null|              null|          null|      null|
+----+-----------+-------------+--------+---------------+---------------+-------------+--------------+----------+--------------+-----+------+--------+---------------+--------------+-------------+--------+----------------------+-----------------------------+-----------------------------+------------------+-------------------------+-------------------------+---------------------+---------------------+-----------------+------------------------+------------------------+-------------+--------------------+--------------------+----------------+----------------+-------------+----------------+-----------+-----------------------+-----------------------+------------------+---------------------+---------------------+-----------+------------------+------------------+--------------+----------+

scala> spark.sql("SET hive.support.quoted.identifiers=none")
res16: org.apache.spark.sql.DataFrame = [key: string, value: string]

scala> spark.sql("SET spark.sql.parser.quotedRegexColumnNames=true")
res17: org.apache.spark.sql.DataFrame = [key: string, value: string]

scala> spark.sql("SELECT `(commission)?+.+` FROM temp limit 1")
org.apache.spark.sql.AnalysisException: cannot resolve '```(commission)?+.+```' given input columns: [advertiserId, prev_valid_clicks, conversion_count, advertiserType, phase2_total_price, agencyId, bucketId, phase2_prev_valid_impressions, click_commission, trafficSource, chargeMode, countryCode, publisherId, impression_revenue, conversion_up_revenue, phase1_prev_valid_impressions, prev_valid_impressions, phase2_click_revenue, slotId, commission, total_price, phase1_prev_valid_clicks, phase2_prev_valid_clicks, click_up_revenue, phase1_impression_revenue, impression_up_revenue, phase2_conversion_count, appId, conversion_revenue, install_count, materialId, phase1_conversion_count, conversion_commission, orderId, click_revenue, phase1_total_price, test, total_up_price, phase2_impression_revenue, operator, pixel_count, phase1_click_revenue, mType, impression_commission, adId, siteId]; line 1 pos 7;
'GlobalLimit 1
+- 'LocalLimit 1
   +- 'Project ['`(commission)?+.+`]
      +- SubqueryAlias temp
         +- Relation[test#0,countryCode#1,slotId#2,bucketId#3,publisherId#4,appId#5,trafficSource#6,adId#7,chargeMode#8,materialId#9,mType#10,siteId#11,agencyId#12,advertiserId#13,advertiserType#14,orderId#15,operator#16,prev_valid_impressions#17L,phase1_prev_valid_impressions#18L,phase2_prev_valid_impressions#19L,impression_revenue#20,phase1_impression_revenue#21,phase2_impression_revenue#22,impression_up_revenue#23,... 22 more fields] parquet

  at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:77)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:74)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:308)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:308)
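Since the Spark 2.0 parser rejects the quoted-regex syntax, a version-independent workaround is to filter `df.columns` on the driver side with an ordinary Scala regex and pass the surviving names to `select`. The sketch below (column list abbreviated, names illustrative) shows only the name filtering, so it runs without a live SparkSession:

```scala
// Workaround sketch: exclude columns by regex on the driver side,
// then select the remainder. Column list abbreviated for illustration.
val columns = Seq("test", "countryCode", "total_price", "commission")

// Keep every column whose full name does not match the exclusion pattern.
val excluded = "commission".r
val kept = columns.filterNot(c => excluded.pattern.matcher(c).matches)

println(kept.mkString(","))

// With a real DataFrame this would become (not executed here):
//   df.select(kept.map(org.apache.spark.sql.functions.col): _*)
```

This sidesteps the parser entirely, so it behaves the same on Spark 2.0 and 2.3.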

This works on Spark 2.3 (with `spark.sql.parser.quotedRegexColumnNames=true`).

https://stackoverflow.com/questions/52526768/does-spark-sql-supports-hive-select-all-query-with-except-columns-using-regex-sp


Note: this article belongs to the author and may not be reproduced without permission.
