91超碰碰碰碰久久久久久综合_超碰av人澡人澡人澡人澡人掠_国产黄大片在线观看画质优化_txt小说免费全本

溫馨提示×

溫馨提示×

您好,登錄后才能下訂單哦!

密碼登錄×
登錄注冊×
其他方式登錄
點擊 登錄注冊 即表示同意《億速云用戶服務條款》

sqoop相關整理記錄

發布時間:2020-06-09 16:34:13 來源:網絡 閱讀:3577 作者:紅塔山lvs 欄目:大數據

生產背景:

在從mysql導入到hive中,遇到如下問題:

從hive導出到mysql中,遇到如下問題:

sqoop 缺點:

1  基于命令行的操作方式,易出錯,且不安全。
2  數據傳輸和數據格式是緊耦合的,這使得connector無法支持所有的數據格式
3  用戶名和密碼暴漏出來
4  sqoop安裝需要root權限

Sqoop優點:

1   高效可控的利用資源,任務并行度,超時時間。
2   數據類型映射與轉化,可自動進行,用戶也可自定義 .
3   支持多種主流數據庫,MySQL,Oracle,SQL Server,DB2等等 。

Sqoop原理:

Sqoop的inport原理:

Sqoop的export原理:

驗證sqoop的各種報錯:

1 mysql字段太短
2 hive的空字段轉換
3 分隔符錯誤
4  mysql的網絡不在集群網絡中
5  mysql停止服務
6 mysql utf8編碼只是3個字節,可能是因為某些unicode字符轉成utf8之后變成了4個字節,需要mysql支持utf8mb4
7  Sqoop調式信息
8 修改生成的Java類,重新打包。

Sqoop命令行說明



生產背景:

 在從mysql導入到hive,遇到如下問題:

       1) 源mysql和集群機器不在同一個網段中,導致執行導入命令,網絡連接失敗。

       2) 某些字符導入到hive中,出現報錯終止。 

    2.1  sqoop使用的JDBC-connector 版本太低(更換版本)。

hive導出到mysql中,遇到如下問題:

 1)某些字符插入mysql,出現報錯終止。

   1.1 可能mysql本身編碼的限制,某些字符不支持,比如uft8utf8mb4

   1.2  sqoop使用的JDBC-connector 版本太低(更換版本)。

 

sqoop 缺點:

 1  基于命令行的操作方式,易出錯,且不安全。

 2  數據傳輸和數據格式是緊耦合的,這使得connector無法支持所有的數據格式

 3  用戶名和密碼暴漏出來

 4  sqoop安裝需要root權限

Sqoop優點:

 

1   高效可控的利用資源,任務并行度,超時時間。

2   數據類型映射與轉化,可自動進行,用戶也可自定義 .


3   支持多種主流數據庫,MySQL,OracleSQL ServerDB2等等 

 

 

Sqoop原理:

 

Sqoopinport原理:

    Sqoopimport時,需要制定split-by參數。Sqoop根據不同的split-by參數值來進行切分,然后將切分出來的區域分配到不同map中。每個map中再處理數據庫中獲取的一行一行的值,寫入到HDFS中。同時split-by根據不同的參數類型有不同的切分方法,如比較簡單的int型,Sqoop會取最大和最小split-by字段值,然后根據傳入的num-mappers來確定劃分幾個區域。 比如select max(split_by),min(split-by) from得到的max(split-by)min(split-by)分別為10001,而num-mappers2的話,則會分成兩個區域(1,500)(501-100),同時也會分成2sql2map去進行導入操作,分別為select XXX from table where split-by>=1 and split-by<500select XXX from table where split-by>=501 and split-by<=1000。最后每個map各自獲取各自SQL中的數據進行導入工作。

Sqoopexport原理:根據mysql表名稱,生成一個以表名稱命名的Java類,該類繼承了sqoopRecord的,是一個只有MapMR,且自定義了輸出字段。

   

sqoop export --connect jdbc:mysql://$url:3306/$3?characterEncoding=utf8 --username $username --password $password --table $1 --export-dir $2 --input-fields-terminated-by '|' --null-non-string '0' --null-string '0';

驗證sqoop的各種報錯:


mysql表

create  table dm_trlog (

plat     varchar(20),

user_id  varchar(20),

click_time   varchar(20),

click_url    varchar(200)

)

hive 表

CREATE TABLE TRLOG

(

PLATFORM string,

USER_ID int,

CLICK_TIME string,

CLICK_URL string

)

row format delimited

fields terminated by '\t';

sqoop list-databases –connect jdbc:mysql://192.168.119.129:3306/ –username li72 –password 123

1  mysql字段太短

 

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

 

Warning: $HADOOP_HOME is deprecated.

14/11/06 01:42:32 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 01:42:32 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 01:42:32 INFO tool.CodeGenTool: Beginning code generation

14/11/06 01:42:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:42:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:42:34 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/bigdata/hadoop

Note: /tmp/sqoop-root/compile/d5e37c20a9231b3253c97fc27d16d8a9/dm_trlog.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

14/11/06 01:42:38 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/d5e37c20a9231b3253c97fc27d16d8a9/dm_trlog.jar

14/11/06 01:42:38 INFO mapreduce.ExportJobBase: Beginning export of dm_trlog

14/11/06 01:42:43 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:42:43 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:42:43 INFO util.NativeCodeLoader: Loaded the native-hadoop library

14/11/06 01:42:43 WARN snappy.LoadSnappy: Snappy native library not loaded

14/11/06 01:42:44 INFO mapred.JobClient: Running job: job_201411060114_0001

14/11/06 01:42:45 INFO mapred.JobClient:  map 0% reduce 0%

14/11/06 01:43:15 INFO mapred.JobClient: Task Id : attempt_201411060114_0001_m_000000_0, Status : FAILED

java.io.IOException: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'click_time' at row 1

        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:192)

        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'click_time' at row 1

        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4118)

        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4052)

        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2503)

        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2664)

        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2794)

        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2155)

        at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1379)

        at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233)

mysql字段太短了

drop table  dm_trlog;

create  table dm_trlog (

plat     varchar(20),

user_id  varchar(20),

click_time   varchar(200),

click_url    varchar(200)

)

 

2 hive的空字段轉換 

由于Hive的NULL用\N來表示,字段用\001來分割,換行用\n來換行,導出分隔符一定要和hive表保持一致,如果為空可以指定轉換為0,有些mysql數字字段不能插入\N

加上兩個參數:--input-null-string '\\N' --input-null-non-string '\\N'多加一個'\',是為轉義

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

14/10/23 04:53:47 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/915d24128af59c9e517580e0f07411d4/dm_pc_play_kpi.jar 
14/10/23 04:53:47 INFO mapreduce.ExportJobBase: Beginning export of dm_pc_play_kpi 
14/10/23 04:53:48 INFO input.FileInputFormat: Total input paths to process : 1 
14/10/23 04:53:48 INFO input.FileInputFormat: Total input paths to process : 1 
14/10/23 04:53:48 WARN snappy.LoadSnappy: Snappy native library is available 
14/10/23 04:53:48 INFO util.NativeCodeLoader: Loaded the native-hadoop library 
14/10/23 04:53:48 INFO snappy.LoadSnappy: Snappy native library loaded 
14/10/23 04:53:48 INFO mapred.JobClient: Running job: job_201408301703_84117 
14/10/23 04:53:49 INFO mapred.JobClient: map 0% reduce 0% 
14/10/23 04:55:45 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_0, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.lang.NumberFormatException: For input string: "N" 
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241) 
at java.lang.Float.valueOf(Float.java:417) 
at dm_pc_play_kpi.__loadFromFields(dm_pc_play_kpi.java:335) 
at dm_pc_play_kpi.parse(dm_pc_play_kpi.java:282) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) 
... 10 more 

14/10/23 04:55:53 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_1, Status : FAILED 
14/10/23 04:55:58 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_2, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.lang.NumberFormatException: For input string: "N" 
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241) 
at java.lang.Float.valueOf(Float.java:417) 
at dm_pc_play_kpi.__loadFromFields(dm_pc_play_kpi.java:335) 
at dm_pc_play_kpi.parse(dm_pc_play_kpi.java:282) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) 
... 10 more 

 

3 分隔符錯誤

Hive中的分隔符是’\t’ 但是導出寫成’|’

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '|' --null-non-string '0' --null-string '0';

 為了測試,特意把分隔符改成 "|"


14/11/06 01:50:19 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 01:50:19 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 01:50:19 INFO tool.CodeGenTool: Beginning code generation

14/11/06 01:50:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:50:21 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:50:21 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/bigdata/hadoop

Note: /tmp/sqoop-root/compile/e474b3f8292f91dd4134b302ae35df19/dm_trlog.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

14/11/06 01:50:25 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/e474b3f8292f91dd4134b302ae35df19/dm_trlog.jar

14/11/06 01:50:25 INFO mapreduce.ExportJobBase: Beginning export of dm_trlog

14/11/06 01:50:45 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:50:45 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:50:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library

14/11/06 01:50:45 WARN snappy.LoadSnappy: Snappy native library not loaded

14/11/06 01:50:51 INFO mapred.JobClient: Running job: job_201411060114_0003

14/11/06 01:50:52 INFO mapred.JobClient:  map 0% reduce 0%

14/11/06 01:51:20 INFO mapred.JobClient: Task Id : attempt_201411060114_0003_m_000000_0, Status : FAILED

java.io.IOException: Can't export data, please check task tracker logs

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)

        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: java.util.NoSuchElementException

        at java.util.AbstractList$Itr.next(AbstractList.java:350)

        at dm_trlog.__loadFromFields(dm_trlog.java:252)

        at dm_trlog.parse(dm_trlog.java:201)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)

        ... 10 more

 

14/11/06 01:51:21 INFO mapred.JobClient: Task Id : attempt_201411060114_0003_m_000001_0, Status : FAILED

java.io.IOException: Can't export data, please check task tracker logs

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)

        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: java.util.NoSuchElementException

        at java.util.AbstractList$Itr.next(AbstractList.java:350)

        at dm_trlog.__loadFromFields(dm_trlog.java:252)

        at dm_trlog.parse(dm_trlog.java:201)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)

        ... 10 more

 

4  mysql的網絡不在集群網絡中

 

sqoop export --connect jdbc:mysql://192.168.119.1:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

 Warning: $HADOOP_HOME is deprecated.

14/11/06 02:04:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 02:04:30 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 02:04:30 INFO tool.CodeGenTool: Beginning code generation

14/11/06 02:07:40 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:605)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:628)

        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:235)

        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:219)

        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:283)

        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1255)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1072)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection timed out

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        ... 32 more

 

5  mysql停止服務

14/11/06 04:55:25 DEBUG tool.BaseSqoopTool: Enabled debug logging.

14/11/06 04:55:25 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory

14/11/06 04:55:25 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:mysql:

14/11/06 04:55:25 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.MySQLManager@8a0d5d

14/11/06 04:55:25 INFO tool.CodeGenTool: Beginning code generation

14/11/06 04:55:26 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.

14/11/06 04:55:27 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:605)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:628)

        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:235)

        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:219)

        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:283)

        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1255)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1072)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection refused

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        ... 32 more

14/11/06 04:55:27 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.

14/11/06 04:55:27 ERROR manager.CatalogQueryManager: Failed to list columns from query: SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = (SELECT SCHEMA())   AND TABLE_NAME = 'dm_trlog' 

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.CatalogQueryManager.getColumnNames(CatalogQueryManager.java:147)

        at org.apache.sqoop.orm.ClassWriter.getColumnNames(ClassWriter.java:1222)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1074)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection refused

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        ... 28 more

14/11/06 04:55:27 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

 

6  mysql utf8編碼只是3個字節,可能是因為某些unicode字符轉成utf8之后變成了4個字節,需要mysql支持utf8mb4

Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail. 
Please set $HCAT_HOME to the root of your HCatalog installation. 
14/11/09 07:00:33 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
14/11/09 07:00:33 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
14/11/09 07:00:33 INFO tool.CodeGenTool: Beginning code generation 
14/11/09 07:00:33 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_go_snger` AS t LIMIT 1 
14/11/09 07:00:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_go_snger` AS t LIMIT 1 
14/11/09 07:00:34 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop/hadoop-1.1.2 
Note: /tmp/sqoop-hadoop/compile/211b9679d4ac771d3ff710cbbd1c7277/dm_go_snger.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
14/11/09 07:00:35 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/211b9679d4ac771d3ff710cbbd1c7277/dm_go_snger.jar 
14/11/09 07:00:35 INFO mapreduce.ExportJobBase: Beginning export of dm_go_snger 
14/11/09 07:00:36 INFO input.FileInputFormat: Total input paths to process : 1 
14/11/09 07:00:36 INFO input.FileInputFormat: Total input paths to process : 1 
14/11/09 07:00:36 WARN snappy.LoadSnappy: Snappy native library is available 
14/11/09 07:00:36 INFO util.NativeCodeLoader: Loaded the native-hadoop library 
14/11/09 07:00:36 INFO snappy.LoadSnappy: Snappy native library loaded 
14/11/09 07:00:37 INFO mapred.JobClient: Running job: job_201408301703_121362 
14/11/09 07:00:38 INFO mapred.JobClient: map 0% reduce 0% 
14/11/09 07:00:52 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_0, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220) 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46) 
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639) 
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84) 
... 10 more 
Caused by: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495) 
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959) 
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113) 
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693) 
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102) 
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364) 
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233) 

14/11/09 07:00:59 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_1, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220) 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46) 
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639) 
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84) 
... 10 more 
Caused by: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495) 
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959) 
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113) 
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693) 
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102) 
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364) 
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233) 

14/11/09 07:01:07 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_2, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220) 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46) 
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639) 
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84) 
... 10 more 
Caused by: java.sql.SQLException: Incorrect string value: 'xF3x90x8Cx92xEFxBF...' for column 'singer' at row 1 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495) 
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959) 
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113) 
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693) 
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102) 
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364) 
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233) 

14/11/09 07:01:20 INFO mapred.JobClient: Job complete: job_201408301703_121362 
14/11/09 07:01:20 INFO mapred.JobClient: Counters: 8 
14/11/09 07:01:20 INFO mapred.JobClient: Job Counters 
14/11/09 07:01:20 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=31500 
14/11/09 07:01:20 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0 
14/11/09 07:01:20 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0 
14/11/09 07:01:20 INFO mapred.JobClient: Rack-local map tasks=3 
14/11/09 07:01:20 INFO mapred.JobClient: Launched map tasks=4 
14/11/09 07:01:20 INFO mapred.JobClient: Data-local map tasks=1 
14/11/09 07:01:20 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0 
14/11/09 07:01:20 INFO mapred.JobClient: Failed map tasks=1 
14/11/09 07:01:20 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 44.4052 seconds (0 bytes/sec) 
14/11/09 07:01:20 INFO mapreduce.ExportJobBase: Exported 0 records. 
14/11/09 07:01:20 ERROR tool.ExportTool: Error during export: Export job failed! 
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. 
Logging initialized using configuration in jar:file:/home/hadoop/hadoop/hive-0.10.0.20140629/lib/hive-common-0.10.0.jar!/hive-log4j.properties 
Hive history file=/tmp/hadoop/hive_job_log_hadoop_201411090701_1423933997.txt 

7  Sqoop調式信息

 增加關鍵字--verbose

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --verbose --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

8 修改生成的Java類,重新打包。

每次通過sqoop導入MySql的時,都會生成一個以MySql表命名的.java文件,然后打成JAR包,給sqoop提交給hadoop MR來解析Hive表中的數據。那可以根據報的錯誤,找到對應的行,改寫該文件,編譯,重新打包,sqoop可以通過 -jar-file --class-name 組合讓我們指定運行自己的jar包中的某個class。來解析該hive表中的每行數據。腳本如下:一個完整的例子如下: 
./bin/sqoop export --connect "jdbc:mysql://192.168.119.129:3306/student?useUnicode=true&characterEncoding=utf-8" 
--username li72 --password 123 --table dm_trlog 
--export-dir /hive/warehouse/trlog --input-fields-terminated-by '\t' 
--input-null-string '\\N' --input-null-non-string '\\N' 
--class-name com.li72.trlog 
--jar-file /tmp/sqoopTempjar/trlog.jar 
上面--jar-file 參數指定jar包的路徑。--class-name 指定jar包中的class 
這樣就可以解決所有解析異常了。 

Sqoop命令行說明

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
Common arguments:
   --connect <jdbc-uri> Specify JDBC connect
                                                string
   --connection-manager <class-name> Specify connection manager
                                                class name
   --connection-param-file <properties-file> Specify connection
                                                parameters file
   --driver <class-name> Manually specify JDBC
                                                driver class to use
   --hadoop-home <hdir> Override
                                                $HADOOP_MAPRED_HOME_ARG
   --hadoop-mapred-home <dir> Override
                                                $HADOOP_MAPRED_HOME_ARG
   --help Print usage instructions
-P Read password from console
   --password <password> Set authentication
                                                password
   --password-file <password-file> Set authentication
                                                password file path
   --relaxed-isolation Use read-uncommitted
                                                isolation for imports
   --skip-dist-cache Skip copying jars to
                                                distributed cache
   --username <username> Set authentication
                                                username
   --verbose Print more information
                                                while working
  
Import control arguments:
   --append Imports data
                                                              in append
                                                              mode
   --as-avrodatafile Imports data
                                                              to Avro data
                                                              files
   --as-sequencefile Imports data
                                                              to
                                                              SequenceFile
                                                              s
   --as-textfile Imports data
                                                              as plain
                                                              text
                                                              (default)
   --boundary-query <statement> Set boundary
                                                              query for
                                                              retrieving
                                                              max and min
                                                              value of the
                                                              primary key
   --columns <col,col,col...> Columns to
                                                              import from
                                                              table
   --compression-codec <codec> Compression
                                                              codec to use
                                                              for import
   --delete-target-dir Imports data
                                                              in delete
                                                              mode
   --direct Use direct
                                                              import fast
                                                              path
   --direct-split-size <n> Split the
                                                              input stream
                                                              every 'n'
                                                              bytes when
                                                              importing in
                                                              direct mode
-e,--query <statement> Import
                                                              results of
                                                              SQL
                                                              'statement'
   --fetch-size <n> Set number
                                                              'n' of rows
                                                              to fetch
                                                              from the
                                                              database
                                                              when more
                                                              rows are
                                                              needed
   --inline-lob-limit <n> Set the
                                                              maximum size
                                                              for an
                                                              inline LOB
-m,--num-mappers <n> Use 'n' map
                                                              tasks to
                                                              import in
                                                              parallel
   --mapreduce-job-name <name> Set name for
                                                              generated
                                                              mapreduce
                                                              job
   --split-by <column-name> Column of
                                                              the table
                                                              used to
                                                              split work
                                                              units
   --table <table-name> Table to
                                                              read
   --target-dir <dir> HDFS plain
                                                              table
                                                              destination
   --validate Validate the
                                                              copy using
                                                              the
                                                              configured
                                                              validator
   --validation-failurehandler <validation-failurehandler> Fully
                                                              qualified
                                                              class name
                                                              for
                                                              ValidationFa
                                                              ilureHandler
   --validation-threshold <validation-threshold> Fully
                                                              qualified
                                                              class name
                                                              for
                                                              ValidationTh
                                                              reshold
   --validator <validator> Fully
                                                              qualified
                                                              class name
                                                              for the
                                                              Validator
   --warehouse-dir <dir> HDFS parent
                                                              for table
                                                              destination
   --where <where clause> WHERE clause
                                                              to use
                                                              during
                                                              import
-z,--compress Enable
                                                              compression
  
Incremental import arguments:
   --check-column <column> Source column to check for incremental
                                  change
   --incremental <import-type> Define an incremental import of type
                                  'append' or 'lastmodified'
   --last-value <value> Last imported value in the incremental
                                  check column
  
Output line formatting arguments:
   --enclosed-by <char> Sets a required field enclosing
                                      character
   --escaped-by <char> Sets the escape character
   --fields-terminated-by <char> Sets the field separator character
   --lines-terminated-by <char> Sets the end-of-line character
   --mysql-delimiters Uses MySQL's default delimiter set:
                                      fields: , lines: \n escaped-by: \
                                      optionally-enclosed-by: '
   --optionally-enclosed-by <char> Sets a field enclosing character
  
Input parsing arguments:
   --input-enclosed-by <char> Sets a required field encloser
   --input-escaped-by <char> Sets the input escape
                                            character
   --input-fields-terminated-by <char> Sets the input field separator
   --input-lines-terminated-by <char> Sets the input end-of-line
                                            char
   --input-optionally-enclosed-by <char> Sets a field enclosing
                                            character
  
Hive arguments:
   --create-hive-table Fail if the target hive
                                               table exists
   --hive-database <database-name> Sets the database name to
                                               use when importing to hive
   --hive-delims-replacement <arg> Replace Hive record \0x01
                                               and row delimiters (\n\r)
                                               from imported string fields
                                               with user-defined string
   --hive-drop-import-delims Drop Hive record \0x01 and
                                               row delimiters (\n\r) from
                                               imported string fields
   --hive-home <dir> Override $HIVE_HOME
   --hive-import Import tables into Hive
                                               (Uses Hive's default
                                               delimiters if none are
                                               set.)
   --hive-overwrite Overwrite existing data in
                                               the Hive table
   --hive-partition-key <partition-key> Sets the partition key to
                                               use when importing to hive
   --hive-partition-value <partition-value> Sets the partition value to
                                               use when importing to hive
   --hive-table <table-name> Sets the table name to use
                                               when importing to hive
   --map-column-hive <arg> Override mapping for
                                               specific column to hive
                                               types.
  
HBase arguments:
   --column-family <family> Sets the target column family for the
                               import
   --hbase-bulkload Enables HBase bulk loading
   --hbase-create-table If specified, create missing HBase tables
   --hbase-row-key <col> Specifies which input column to use as the
                               row key
   --hbase-table <table> Import to <table> in HBase
  
HCatalog arguments:
   --hcatalog-database <arg> HCatalog database name
   --hcatalog-home <hdir> Override $HCAT_HOME
   --hcatalog-table <arg> HCatalog table name
   --hive-home <dir> Override $HIVE_HOME
   --hive-partition-key <partition-key> Sets the partition key to
                                               use when importing to hive
   --hive-partition-value <partition-value> Sets the partition value to
                                               use when importing to hive
   --map-column-hive <arg> Override mapping for
                                               specific column to hive
                                               types.
  
HCatalog import specific options:
   --create-hcatalog-table Create HCatalog before import
   --hcatalog-storage-stanza <arg> HCatalog storage stanza for table
                                      creation
  
Accumulo arguments:
   --accumulo-batch-size <size> Batch size in bytes
   --accumulo-column-family <family> Sets the target column family for
                                         the import
   --accumulo-create-table If specified, create missing
                                         Accumulo tables
   --accumulo-instance <instance> Accumulo instance name.
   --accumulo-max-latency <latency> Max write latency in milliseconds
   --accumulo-password <password> Accumulo password.
   --accumulo-row-key <col> Specifies which input column to
                                         use as the row key
   --accumulo-table <table> Import to <table> in Accumulo
   --accumulo-user <user> Accumulo user name.
   --accumulo-visibility <vis> Visibility token to be applied to
                                         all rows imported
   --accumulo-zookeepers <zookeepers> Comma-separated list of
                                         zookeepers (host:port)
  
Code generation arguments:
   --bindir <dir> Output directory for compiled
                                         objects
   --class-name <name> Sets the generated class name.
                                         This overrides --package-name.
                                         When combined with --jar-file,
                                         sets the input class.
   --input-null-non-string <null-str> Input null non-string
                                         representation
   --input-null-string <null-str> Input null string representation
   --jar-file <file> Disable code generation; use
                                         specified jar
   --map-column-java <arg> Override mapping for specific
                                         columns to java types
   --null-non-string <null-str> Null non-string representation
   --null-string <null-str> Null string representation
   --outdir <dir> Output directory for generated
                                         code
   --package-name <name> Put auto-generated classes in
                                         this package


向AI問一下細節

免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。

AI

荣成市| 万安县| 麻阳| 鱼台县| 浪卡子县| 吉隆县| 泉州市| 大同县| 德保县| 嘉善县| 洛南县| 辉南县| 贵港市| 香港| 泰兴市| 三原县| 吕梁市| 南昌县| 定远县| 嘉禾县| 湘潭县| 雷州市| 天柱县| 绿春县| 德庆县| 兰溪市| 太仓市| 大安市| 芦溪县| 宽城| 荔浦县| 孟州市| 宜丰县| 汾西县| 青海省| 甘德县| 新密市| 肥乡县| 罗平县| 杭锦后旗| 德昌县|