To read Hive data across clusters in Spark, you can use the HiveWarehouseConnector to connect to the remote Hive warehouse. The example below shows how to read a Hive table from another cluster in Spark:
```scala
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder
  .appName("Read from Hive")
  .config("spark.sql.hive.metastore.version", "3.0.0")
  .config("spark.sql.hive.metastore.jars", "/path/to/hive-jars")
  .enableHiveSupport()
  .getOrCreate()
// Connect to the remote Hive database through the HiveWarehouseConnector
val hiveTable = spark.read.format("com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector")
  .option("url", "jdbc:hive2://<hiveserver2-host>:<port>") // placeholder: HiveServer2 JDBC URL of the remote cluster
  .option("dbcp.username", "<username>")                   // placeholder credentials
  .option("dbcp.password", "<password>")
  .option("dbcp.driver", "org.apache.hive.jdbc.HiveDriver")
  .option("database", "<database-name>")                   // placeholder database and table names
  .option("table", "<table-name>")
  .load()

hiveTable.show()
```

Replace the placeholder values in angle brackets with the actual connection details of the remote cluster. The resulting `hiveTable` is a regular DataFrame and can be used in any subsequent Spark transformations.
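As an alternative to the `spark.read.format(...)` style, the Hive Warehouse Connector also provides a session-style API. Below is a minimal sketch assuming the HWC jars are on the classpath and the remote HiveServer2 URL has been set via the `spark.sql.hive.hiveserver2.jdbc.url` configuration; the database and table names are hypothetical placeholders:

```scala
import com.hortonworks.hwc.HiveWarehouseSession

// Build an HWC session on top of the existing SparkSession.
// The remote HiveServer2 JDBC URL is taken from the
// spark.sql.hive.hiveserver2.jdbc.url configuration.
val hive = HiveWarehouseSession.session(spark).build()

// Run a query against the remote cluster and get back a DataFrame
// (some_db.some_table is a hypothetical example table)
val df = hive.executeQuery("SELECT * FROM some_db.some_table LIMIT 10")
df.show()
```

The session-style API is convenient when you want to push full SQL queries down to the remote cluster rather than loading a whole table.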