Testing insert, update, and delete in Hive 0.14
First, create a table with the most ordinary CREATE TABLE statement:
hive>create table test(id int,name string)row format delimited fields terminated by ',';
Testing insert:
insert into table test values (1,'row1'),(2,'row2');
The statement fails with an error:
java.io.FileNotFoundException: File does not exist: hdfs://127.0.0.1:9000/home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/curator-client-2.6.0.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
......
Hive appears to be looking for the jar on HDFS. A minor problem: just upload the jars under lib to HDFS:
hadoop fs -mkdir -p /home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/
hadoop fs -put $HIVE_HOME/lib/* /home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/
Running the insert again now works. Next, test delete:
hive>delete from test where id = 1;
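This fails:
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
The error says that the transaction manager in use does not support update and delete.
It turns out that supporting update and delete requires some additional configuration.
Following the hints, configure hive-site.xml:
hive.support.concurrency – true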
< – true
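For reference, the Hive transactions documentation lists the settings below as the ones needed to enable insert, update, and delete in this release. The following is a sketch of the matching hive-site.xml fragment, not a verified copy of the configuration used in this test. The worker thread count of 1 is an illustrative assumption, and the two compactor settings belong on the metastore service side.

```xml
<!-- Sketch only: place these inside the existing <configuration> element of hive-site.xml. -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<property>
  <!-- Replaces the default transaction manager, which rejects update/delete (Error 10294). -->
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <!-- Assumption: one worker thread; any positive number on at least one metastore instance works. -->
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```

The services need a restart after the change so that the new transaction manager is picked up.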
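With that configured I expected everything to run smoothly, but the following error appeared instead:
FAILED: LockException [Error 10280]: Error communicating with the metastore
Something went wrong when talking to the metastore database. Switching the log level to DEBUG shows the actual error: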
2014-11-04 14:20:14,367 DEBUG [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(265)) - Going to execute query <select cq_id, cq_database, cq_table, cq_partition, cq_type, cq_run_as from COMPACTION_QUEUE where cq_state = 'r'>
2014-11-04 14:20:14,367 ERROR [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(285)) - Unable to select next element Table 'hive.COMPACTION_QUEUE' doesn't exist
2014-11-04 14:20:14,367 DEBUG [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(287)) - Going to rollback
2014-11-04 14:20:14,368 ERROR [Thread-8]: compactor.Cleaner (Cleaner.java:run(143)) - Caught an exception in the main loop of compactor cleaner, MetaException(message:Unable to connect to transaction database com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 'hive.COMPACTION_QUEUE' doesn't exist
at ...newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
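The COMPACTION_QUEUE table cannot be found in the metastore database. I checked MySQL right away, and the table is indeed missing. Why would it be missing? After a long search turned up no cause, I went to the source code.
The CREATE TABLE statements are in the TxnDbUtil class under org.apache.hadoop.hive.metastore.txn. Tracing the callers leads to the following method, which runs them: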
private void checkQFileTestHack() {
boolean hackOn = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST) ||
HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEZ_TEST);
if (hackOn) {
LOG.info("Hacking in canned values for transaction manager");
// Set up the transaction/locking db in the derby metastore
TxnDbUtil.setConfValues(conf);
try {
TxnDbUtil.prepDb();
} catch (Exception e) {
// We may have already created the tables and thus don't need to redo it.
if (!e.getMessage().contains("already exists")) {
throw new RuntimeException("Unable to set up transaction database for" +
" testing: " + e.getMessage());
}
}
}
}
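In other words, the CREATE TABLE statements run only under one more condition: HIVE_IN_TEST or HIVE_IN_TEZ_TEST must be set. So delete and update only work in a test environment, which is understandable since the feature is not fully developed yet.
With the cause found, the fix is simple: add the following to hive-site.xml: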
<property>
<name>hive.in.test</name>
<value>true</value>
</property>
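OK, restart the service and run the delete again:
hive>delete from test where id = 1;
Another error:
FAILED: SemanticException [Error 10297]: Attempt to do update or delete on table test that does not use an AcidOutputFormat or is not bucketed
It says that the table test targeted by the delete does not use an AcidOutputFormat or is not bucketed. Presumably the output format must be an AcidOutputFormat and the table must also be bucketed.
Searching online confirms this. Moreover, ORC is currently the only file format that supports AcidOutputFormat, and the table must also be created with the property 'transactional'='true'. Rather cumbersome.
So I created the table following an example found online: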
hive>create table test(id int ,name string )clustered by (id) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');
insert
hive>insert into table test values (1,'row1'),(2,'row2'),(3,'row3');
delete
hive>delete from test where id = 1;
update
hive>update test set name = 'Raj' where id = 2;
OK! Everything runs, but it seems quite slow: each statement takes around 30 s. This can probably be tuned and needs more investigation.
One last problem: show tables reports an error:
hive> show tables;
OK
tab_name
Failed with exception java.io.IOException:java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: fcitx-socket-:0
Time taken: 0.064 seconds
This seems to be related to the file name fcitx-socket-:0 under /tmp/. Still to be resolved.