Big Data Tutorial (12.4): Hive in Practice, Cascading Sums

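In this section I share a case that comes up often at work: producing a report with a cascading (cumulative) sum. The requirements are as follows: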

(1) Given the following visitor access-count table, t_access_times:

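+----------+----------+---------+
| Visitor  | Month    | Visits  |
+----------+----------+---------+
| A        | 2015-01  | 5       |
| A        | 2015-01  | 15      |
| B        | 2015-01  | 5       |
| A        | 2015-01  | 8       |
| B        | 2015-01  | 25      |
| A        | 2015-01  | 5       |
| A        | 2015-02  | 4       |
| A        | 2015-02  | 6       |
| B        | 2015-02  | 10      |
| B        | 2015-02  | 5       |
| ...      | ...      | ...     |
+----------+----------+---------+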

(2) The required output report, t_access_times_accumulate:

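+----------+----------+----------------+-------------------+
| Visitor  | Month    | Monthly total  | Cumulative total  |
+----------+----------+----------------+-------------------+
| A        | 2015-01  | 33             | 33                |
| A        | 2015-02  | 10             | 43                |
| ...      | ...      | ...            | ...               |
| B        | 2015-01  | 30             | 30                |
| B        | 2015-02  | 15             | 45                |
| ...      | ...      | ...            | ...               |
+----------+----------+----------------+-------------------+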

(3) Implementation steps

create table t_access_times(username string,month string,salary int)
row format delimited fields terminated by ',';

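load data local inpath '/home/hadoop/t_access_times.dat' into table t_access_times;

Contents of t_access_times.dat (the third field is the visit count, stored in the column named salary):
A,2015-01,5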
A,2015-01,15
B,2015-01,5
A,2015-01,8
B,2015-01,25
A,2015-01,5
A,2015-02,4
A,2015-02,6
B,2015-02,10
B,2015-02,5


1. Step one: compute each user's monthly total
select username,month,sum(salary) as salary from t_access_times group by username,month;

+-----------+----------+---------+--+
| username  |  month   | salary  |
+-----------+----------+---------+--+
| A         | 2015-01  | 33      |
| A         | 2015-02  | 10      |
| B         | 2015-01  | 30      |
| B         | 2015-02  | 15      |
+-----------+----------+---------+--+
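
To make the grouping in step one concrete, here is a minimal Python sketch that reproduces the same aggregation on the ten sample rows. It only illustrates what `group by username,month` with `sum(salary)` computes, not how Hive executes it; the names `ROWS` and `monthly_totals` are hypothetical and do not appear in the original post.

```python
from collections import defaultdict

# The ten sample rows from t_access_times.dat: (username, month, salary).
ROWS = [
    ("A", "2015-01", 5), ("A", "2015-01", 15), ("B", "2015-01", 5),
    ("A", "2015-01", 8), ("B", "2015-01", 25), ("A", "2015-01", 5),
    ("A", "2015-02", 4), ("A", "2015-02", 6), ("B", "2015-02", 10),
    ("B", "2015-02", 5),
]


def monthly_totals(rows):
    """Sum salary per (username, month), like GROUP BY username, month."""
    totals = defaultdict(int)
    for username, month, salary in rows:
        totals[(username, month)] += salary
    # Sort by (username, month) so the output order is deterministic.
    return [(u, m, s) for (u, m), s in sorted(totals.items())]


if __name__ == "__main__":
    for row in monthly_totals(ROWS):
        print(row)
```

Running the script prints one tuple per (username, month) pair, which can be compared against the four rows in the table above.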

2. Step two: join the monthly-total table with itself (a self-join)
select *
from 
(select username,month,sum(salary) as salary from t_access_times group by username,month) A 
inner join 
(select username,month,sum(salary) as salary from t_access_times group by username,month) B
on
A.username=B.username;
+-------------+----------+-----------+-------------+----------+-----------+--+
| a.username  | a.month  | a.salary  | b.username  | b.month  | b.salary  |
+-------------+----------+-----------+-------------+----------+-----------+--+
| A           | 2015-01  | 33        | A           | 2015-01  | 33        |
| A           | 2015-01  | 33        | A           | 2015-02  | 10        |
| A           | 2015-02  | 10        | A           | 2015-01  | 33        |
| A           | 2015-02  | 10        | A           | 2015-02  | 10        |
| B           | 2015-01  | 30        | B           | 2015-01  | 30        |
| B           | 2015-01  | 30        | B           | 2015-02  | 15        |
| B           | 2015-02  | 15        | B           | 2015-01  | 30        |
| B           | 2015-02  | 15        | B           | 2015-02  | 15        |
+-------------+----------+-----------+-------------+----------+-----------+--+
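
The self-join can be pictured as pairing every monthly row of a user with every monthly row of the same user. The sketch below is an assumption-laden illustration in plain Python (the names `MONTHLY` and `self_join` are hypothetical), starting from the four monthly totals produced in step one.

```python
# Monthly totals from step one: (username, month, salary).
MONTHLY = [
    ("A", "2015-01", 33), ("A", "2015-02", 10),
    ("B", "2015-01", 30), ("B", "2015-02", 15),
]


def self_join(rows):
    """Inner join of rows with itself on username.

    Each output tuple is (a.username, a.month, a.salary,
                          b.username, b.month, b.salary).
    """
    return [a + b for a in rows for b in rows if a[0] == b[0]]


if __name__ == "__main__":
    for row in self_join(MONTHLY):
        print(row)
```

A user with n distinct months contributes n * n joined rows (2 * 2 = 4 per user here, 8 in total), so the intermediate result grows quadratically with the number of months per user.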

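3. Step three: take the result of the previous step
and group it by a.username and a.month.
To get the cumulative value for each month, sum b.salary over all rows where b.month <= a.month: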
select A.username,A.month,max(A.salary) as salary,sum(B.salary) as accumulate
from 
(select username,month,sum(salary) as salary from t_access_times group by username,month) A 
inner join 
(select username,month,sum(salary) as salary from t_access_times group by username,month) B
on
A.username=B.username
where B.month <= A.month
group by A.username,A.month
order by A.username,A.month;
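
As a quick way to check the logic of the step-three query without a Hive cluster, the sketch below runs essentially the same SQL against an in-memory SQLite database from Python's standard library. SQLite is only a stand-in here and is not Hive: dialects and execution differ, and this particular query simply happens to be portable. The names `ROWS`, `QUERY` and `run_cumulative_query` are hypothetical.

```python
import sqlite3

# The ten sample rows from t_access_times.dat: (username, month, salary).
ROWS = [
    ("A", "2015-01", 5), ("A", "2015-01", 15), ("B", "2015-01", 5),
    ("A", "2015-01", 8), ("B", "2015-01", 25), ("A", "2015-01", 5),
    ("A", "2015-02", 4), ("A", "2015-02", 6), ("B", "2015-02", 10),
    ("B", "2015-02", 5),
]

# Same shape as the step-three query: self-join the monthly totals,
# keep only pairs where B's month is not later than A's, then aggregate.
QUERY = """
select A.username, A.month, max(A.salary) as salary, sum(B.salary) as accumulate
from
(select username, month, sum(salary) as salary
   from t_access_times group by username, month) A
inner join
(select username, month, sum(salary) as salary
   from t_access_times group by username, month) B
on A.username = B.username
where B.month <= A.month
group by A.username, A.month
order by A.username, A.month
"""


def run_cumulative_query(rows):
    """Load rows into an in-memory table and return the query result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table t_access_times(username text, month text, salary integer)"
        )
        conn.executemany("insert into t_access_times values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_cumulative_query(ROWS):
        print(row)
```

Two details are worth noting. The comparison `B.month <= A.month` is a string comparison, which orders months correctly only because they are zero-padded in `yyyy-MM` form. And `max(A.salary)` is used because every joined row in a group carries the same A.salary, so any aggregate that returns that single value would do.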

(4) Execution results

0: jdbc:hive2://centos-aaron-h1:10000> create table t_access_times(username string,month string,salary int)
0: jdbc:hive2://centos-aaron-h1:10000> row format delimited fields terminated by ',';
OK
No rows affected (0.908 seconds)
0: jdbc:hive2://centos-aaron-h1:10000> load data local inpath '/home/hadoop/t_access_times.dat' into table t_access_times;
Loading data to table default.t_access_times
Table default.t_access_times stats: [numFiles=1, totalSize=123]
OK
INFO  : Loading data to table default.t_access_times from file:/home/hadoop/t_access_times.dat
INFO  : Table default.t_access_times stats: [numFiles=1, totalSize=123]
No rows affected (2.88 seconds)
0: jdbc:hive2://centos-aaron-h1:10000> select username,month,sum(salary) as salary from t_access_times group by username,month;
Query ID = hadoop_20190212052316_64866ab3-25a5-4f1e-8ae4-7b2dcfcc1c1f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0001, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0001/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0001
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0001
INFO  : The url to track the job: http://centos-aaron-h1:8088/proxy/application_1549919838832_0001/
INFO  : Starting Job = job_1549919838832_0001, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0001/
INFO  : Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2019-02-12 05:23:38,459 Stage-1 map = 0%,  reduce = 0%
INFO  : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
INFO  : 2019-02-12 05:23:38,459 Stage-1 map = 0%,  reduce = 0%
2019-02-12 05:23:55,144 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.51 sec
INFO  : 2019-02-12 05:23:55,144 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.51 sec
2019-02-12 05:24:01,293 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.37 sec
INFO  : 2019-02-12 05:24:01,293 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.37 sec
MapReduce Total cumulative CPU time: 3 seconds 370 msec
Ended Job = job_1549919838832_0001
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 3.37 sec   HDFS Read: 7681 HDFS Write: 52 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 370 msec
OK
INFO  : MapReduce Total cumulative CPU time: 3 seconds 370 msec
INFO  : Ended Job = job_1549919838832_0001
+-----------+----------+---------+--+
| username  |  month   | salary  |
+-----------+----------+---------+--+
| A         | 2015-01  | 33      |
| A         | 2015-02  | 10      |
| B         | 2015-01  | 30      |
| B         | 2015-02  | 15      |
+-----------+----------+---------+--+
4 rows selected (46.143 seconds)
0: jdbc:hive2://centos-aaron-h1:10000> select *
0: jdbc:hive2://centos-aaron-h1:10000> from 
0: jdbc:hive2://centos-aaron-h1:10000> (select username,month,sum(salary) as salary from t_access_times group by username,month) A 
0: jdbc:hive2://centos-aaron-h1:10000> inner join 
0: jdbc:hive2://centos-aaron-h1:10000> (select username,month,sum(salary) as salary from t_access_times group by username,month) B
0: jdbc:hive2://centos-aaron-h1:10000> on
0: jdbc:hive2://centos-aaron-h1:10000> A.username=B.username;
Query ID = hadoop_20190212052542_208d2ee5-d122-4a12-a0d6-aa6ec90e031a
Total jobs = 5
Launching Job 1 out of 5
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0002, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0002/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0002
INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0002
INFO  : The url to track the job: http://centos-aaron-h1:8088/proxy/application_1549919838832_0002/
INFO  : Starting Job = job_1549919838832_0002, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0002/
INFO  : Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2019-02-12 05:25:55,359 Stage-1 map = 0%,  reduce = 0%
INFO  : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
INFO  : 2019-02-12 05:25:55,359 Stage-1 map = 0%,  reduce = 0%
2019-02-12 05:26:05,614 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.61 sec
INFO  : 2019-02-12 05:26:05,614 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.61 sec
2019-02-12 05:26:11,762 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.39 sec
INFO  : 2019-02-12 05:26:11,762 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.39 sec
MapReduce Total cumulative CPU time: 3 seconds 390 msec
Ended Job = job_1549919838832_0002
Launching Job 2 out of 5
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
INFO  : MapReduce Total cumulative CPU time: 3 seconds 390 msec
INFO  : Ended Job = job_1549919838832_0002
INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0003, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0003/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0003
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0003
INFO  : The url to track the job: http://centos-aaron-h1:8088/proxy/application_1549919838832_0003/
INFO  : Starting Job = job_1549919838832_0003, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0003/
INFO  : Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0003
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 1
2019-02-12 05:26:34,772 Stage-3 map = 0%,  reduce = 0%
INFO  : Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 1
INFO  : 2019-02-12 05:26:34,772 Stage-3 map = 0%,  reduce = 0%
2019-02-12 05:26:43,958 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 0.86 sec
INFO  : 2019-02-12 05:26:43,958 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 0.86 sec
2019-02-12 05:26:52,127 Stage-3 map = 100%,  reduce = 100%, Cumulative CPU 1.73 sec
INFO  : 2019-02-12 05:26:52,127 Stage-3 map = 100%,  reduce = 100%, Cumulative CPU 1.73 sec
MapReduce Total cumulative CPU time: 1 seconds 730 msec
Ended Job = job_1549919838832_0003
Stage-7 is selected by condition resolver.
Stage-8 is filtered out by condition resolver.
Stage-2 is filtered out by condition resolver.
Execution log at: /tmp/hadoop/hadoop_20190212052542_208d2ee5-d122-4a12-a0d6-aa6ec90e031a.log
INFO  : MapReduce Total cumulative CPU time: 1 seconds 730 msec
INFO  : Ended Job = job_1549919838832_0003
INFO  : Stage-7 is selected by condition resolver.
INFO  : Stage-8 is filtered out by condition resolver.
INFO  : Stage-2 is filtered out by condition resolver.
2019-02-12 05:26:58     Starting to launch local task to process map join;      maximum memory = 518979584
2019-02-12 05:26:59     Dump the side-table for tag: 1 with group count: 2 into file: file:/tmp/hadoop/2d536889-8e64-4ece-91b9-6ae10c4ff631/hive_2019-02-12_05-25-42_429_4168125726438328049-1/-local-10005/HashTable-Stage-4/MapJoin-mapfile01--.hashtable
2019-02-12 05:26:59     Uploaded 1 File to: file:/tmp/hadoop/2d536889-8e64-4ece-91b9-6ae10c4ff631/hive_2019-02-12_05-25-42_429_4168125726438328049-1/-local-10005/HashTable-Stage-4/MapJoin-mapfile01--.hashtable (346 bytes)
2019-02-12 05:26:59     End of local task; Time Taken: 1.103 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 4 out of 5
Number of reduce tasks is set to 0 since there's no reduce operator
INFO  : Execution completed successfully
INFO  : MapredLocal task succeeded
INFO  : Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1549919838832_0004, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0004/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0004
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0004
INFO  : The url to track the job: http://centos-aaron-h1:8088/proxy/application_1549919838832_0004/
INFO  : Starting Job = job_1549919838832_0004, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0004/
INFO  : Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0004
Hadoop job information for Stage-4: number of mappers: 1; number of reducers: 0
2019-02-12 05:27:14,062 Stage-4 map = 0%,  reduce = 0%
INFO  : Hadoop job information for Stage-4: number of mappers: 1; number of reducers: 0
INFO  : 2019-02-12 05:27:14,062 Stage-4 map = 0%,  reduce = 0%
2019-02-12 05:27:22,362 Stage-4 map = 100%,  reduce = 0%, Cumulative CPU 0.83 sec
INFO  : 2019-02-12 05:27:22,362 Stage-4 map = 100%,  reduce = 0%, Cumulative CPU 0.83 sec
MapReduce Total cumulative CPU time: 830 msec
Ended Job = job_1549919838832_0004
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 3.39 sec   HDFS Read: 7283 HDFS Write: 208 SUCCESS
Stage-Stage-3: Map: 1  Reduce: 1   Cumulative CPU: 1.73 sec   HDFS Read: 7285 HDFS Write: 208 SUCCESS
Stage-Stage-4: Map: 1   Cumulative CPU: 0.83 sec   HDFS Read: 5188 HDFS Write: 208 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 950 msec
OK
INFO  : MapReduce Total cumulative CPU time: 830 msec
INFO  : Ended Job = job_1549919838832_0004
+-------------+----------+-----------+-------------+----------+-----------+--+
| a.username  | a.month  | a.salary  | b.username  | b.month  | b.salary  |
+-------------+----------+-----------+-------------+----------+-----------+--+
| A           | 2015-01  | 33        | A           | 2015-01  | 33        |
| A           | 2015-01  | 33        | A           | 2015-02  | 10        |
| A           | 2015-02  | 10        | A           | 2015-01  | 33        |
| A           | 2015-02  | 10        | A           | 2015-02  | 10        |
| B           | 2015-01  | 30        | B           | 2015-01  | 30        |
| B           | 2015-01  | 30        | B           | 2015-02  | 15        |
| B           | 2015-02  | 15        | B           | 2015-01  | 30        |
| B           | 2015-02  | 15        | B           | 2015-02  | 15        |
+-------------+----------+-----------+-------------+----------+-----------+--+
8 rows selected (101.008 seconds)
0: jdbc:hive2://centos-aaron-h1:10000> 

0: jdbc:hive2://centos-aaron-h1:10000> select A.username,A.month,max(A.salary) as salary,sum(B.salary) as accumulate
0: jdbc:hive2://centos-aaron-h1:10000> from 
0: jdbc:hive2://centos-aaron-h1:10000> (select username,month,sum(salary) as salary from t_access_times group by username,month) A 
0: jdbc:hive2://centos-aaron-h1:10000> inner join 
0: jdbc:hive2://centos-aaron-h1:10000> (select username,month,sum(salary) as salary from t_access_times group by username,month) B
0: jdbc:hive2://centos-aaron-h1:10000> on
0: jdbc:hive2://centos-aaron-h1:10000> A.username=B.username
0: jdbc:hive2://centos-aaron-h1:10000> where B.month <= A.month
0: jdbc:hive2://centos-aaron-h1:10000> group by A.username,A.month
0: jdbc:hive2://centos-aaron-h1:10000> order by A.username,A.month;
Query ID = hadoop_20190212053047_a2bcb673-b252-4277-85dd-8085248520aa
Total jobs = 7
Launching Job 1 out of 7
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0005, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0005/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0005
INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0005
INFO  : The url to track the job: http://centos-aaron-h1:8088/proxy/application_1549919838832_0005/
INFO  : Starting Job = job_1549919838832_0005, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0005/
INFO  : Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2019-02-12 05:30:59,370 Stage-1 map = 0%,  reduce = 0%
INFO  : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
INFO  : 2019-02-12 05:30:59,370 Stage-1 map = 0%,  reduce = 0%
2019-02-12 05:31:06,540 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.49 sec
INFO  : 2019-02-12 05:31:06,540 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.49 sec
2019-02-12 05:31:12,653 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2.29 sec
INFO  : 2019-02-12 05:31:12,653 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2.29 sec
MapReduce Total cumulative CPU time: 2 seconds 290 msec
Ended Job = job_1549919838832_0005
Launching Job 2 out of 7
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0006, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0006/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0006
INFO  : MapReduce Total cumulative CPU time: 2 seconds 290 msec
INFO  : Ended Job = job_1549919838832_0005
INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0006
INFO  : The url to track the job: http://centos-aaron-h1:8088/proxy/application_1549919838832_0006/
INFO  : Starting Job = job_1549919838832_0006, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0006/
INFO  : Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0006
Hadoop job information for Stage-5: number of mappers: 1; number of reducers: 1
2019-02-12 05:31:37,597 Stage-5 map = 0%,  reduce = 0%
INFO  : Hadoop job information for Stage-5: number of mappers: 1; number of reducers: 1
INFO  : 2019-02-12 05:31:37,597 Stage-5 map = 0%,  reduce = 0%
2019-02-12 05:31:49,323 Stage-5 map = 100%,  reduce = 0%, Cumulative CPU 3.24 sec
INFO  : 2019-02-12 05:31:49,323 Stage-5 map = 100%,  reduce = 0%, Cumulative CPU 3.24 sec
2019-02-12 05:31:55,512 Stage-5 map = 100%,  reduce = 100%, Cumulative CPU 4.02 sec
MapReduce Total cumulative CPU time: 4 seconds 20 msec
Ended Job = job_1549919838832_0006
Stage-9 is selected by condition resolver.
Stage-10 is filtered out by condition resolver.
Stage-2 is filtered out by condition resolver.
INFO  : 2019-02-12 05:31:55,512 Stage-5 map = 100%,  reduce = 100%, Cumulative CPU 4.02 sec
INFO  : MapReduce Total cumulative CPU time: 4 seconds 20 msec
INFO  : Ended Job = job_1549919838832_0006
INFO  : Stage-9 is selected by condition resolver.
INFO  : Stage-10 is filtered out by condition resolver.
INFO  : Stage-2 is filtered out by condition resolver.
Execution log at: /tmp/hadoop/hadoop_20190212053047_a2bcb673-b252-4277-85dd-8085248520aa.log
2019-02-12 05:32:00     Starting to launch local task to process map join;      maximum memory = 518979584
2019-02-12 05:32:01     Dump the side-table for tag: 1 with group count: 2 into file: file:/tmp/hadoop/e8520f79-8d60-4b0e-a593-b9cfbad8463e/hive_2019-02-12_05-30-47_300_154987026293528311-4/-local-10007/HashTable-Stage-6/MapJoin-mapfile21--.hashtable
2019-02-12 05:32:01     Uploaded 1 File to: file:/tmp/hadoop/e8520f79-8d60-4b0e-a593-b9cfbad8463e/hive_2019-02-12_05-30-47_300_154987026293528311-4/-local-10007/HashTable-Stage-6/MapJoin-mapfile21--.hashtable (346 bytes)
2019-02-12 05:32:01     End of local task; Time Taken: 0.824 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 4 out of 7
Number of reduce tasks is set to 0 since there's no reduce operator
INFO  : Execution completed successfully
INFO  : MapredLocal task succeeded
INFO  : Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1549919838832_0007, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0007/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0007
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0007
INFO  : The url to track the job: http://centos-aaron-h1:8088/proxy/application_1549919838832_0007/
INFO  : Starting Job = job_1549919838832_0007, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0007/
INFO  : Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0007
Hadoop job information for Stage-6: number of mappers: 1; number of reducers: 0
2019-02-12 05:32:16,644 Stage-6 map = 0%,  reduce = 0%
INFO  : Hadoop job information for Stage-6: number of mappers: 1; number of reducers: 0
INFO  : 2019-02-12 05:32:16,644 Stage-6 map = 0%,  reduce = 0%
2019-02-12 05:32:27,075 Stage-6 map = 100%,  reduce = 0%, Cumulative CPU 1.06 sec
INFO  : 2019-02-12 05:32:27,075 Stage-6 map = 100%,  reduce = 0%, Cumulative CPU 1.06 sec
MapReduce Total cumulative CPU time: 1 seconds 60 msec
Ended Job = job_1549919838832_0007
Launching Job 5 out of 7
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0008, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0008/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0008
INFO  : MapReduce Total cumulative CPU time: 1 seconds 60 msec
INFO  : Ended Job = job_1549919838832_0007
INFO  : Number of reduce tasks not specified. Estimated from input data size: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0008
INFO  : The url to track the job: http://centos-aaron-h1:8088/proxy/application_1549919838832_0008/
INFO  : Starting Job = job_1549919838832_0008, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0008/
INFO  : Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0008
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 1
2019-02-12 05:32:44,584 Stage-3 map = 0%,  reduce = 0%
INFO  : Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 1
INFO  : 2019-02-12 05:32:44,584 Stage-3 map = 0%,  reduce = 0%
2019-02-12 05:32:56,191 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 2.87 sec
INFO  : 2019-02-12 05:32:56,191 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 2.87 sec
2019-02-12 05:33:02,318 Stage-3 map = 100%,  reduce = 100%, Cumulative CPU 3.62 sec
INFO  : 2019-02-12 05:33:02,318 Stage-3 map = 100%,  reduce = 100%, Cumulative CPU 3.62 sec
MapReduce Total cumulative CPU time: 3 seconds 620 msec
Ended Job = job_1549919838832_0008
Launching Job 6 out of 7
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1549919838832_0009, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0009/
Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0009
INFO  : MapReduce Total cumulative CPU time: 3 seconds 620 msec
INFO  : Ended Job = job_1549919838832_0008
INFO  : Number of reduce tasks determined at compile time: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1549919838832_0009
INFO  : The url to track the job: http://centos-aaron-h1:8088/proxy/application_1549919838832_0009/
INFO  : Starting Job = job_1549919838832_0009, Tracking URL = http://centos-aaron-h1:8088/proxy/application_1549919838832_0009/
INFO  : Kill Command = /home/hadoop/apps/hadoop-2.9.1/bin/hadoop job  -kill job_1549919838832_0009
Hadoop job information for Stage-4: number of mappers: 1; number of reducers: 1
2019-02-12 05:33:15,716 Stage-4 map = 0%,  reduce = 0%
INFO  : Hadoop job information for Stage-4: number of mappers: 1; number of reducers: 1
INFO  : 2019-02-12 05:33:15,716 Stage-4 map = 0%,  reduce = 0%
2019-02-12 05:33:22,868 Stage-4 map = 100%,  reduce = 0%, Cumulative CPU 0.75 sec
INFO  : 2019-02-12 05:33:22,868 Stage-4 map = 100%,  reduce = 0%, Cumulative CPU 0.75 sec
2019-02-12 05:33:28,985 Stage-4 map = 100%,  reduce = 100%, Cumulative CPU 1.59 sec
MapReduce Total cumulative CPU time: 1 seconds 590 msec
Ended Job = job_1549919838832_0009
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 2.29 sec   HDFS Read: 7284 HDFS Write: 208 SUCCESS
Stage-Stage-5: Map: 1  Reduce: 1   Cumulative CPU: 4.02 sec   HDFS Read: 7284 HDFS Write: 208 SUCCESS
Stage-Stage-6: Map: 1   Cumulative CPU: 1.06 sec   HDFS Read: 5638 HDFS Write: 212 SUCCESS
Stage-Stage-3: Map: 1  Reduce: 1   Cumulative CPU: 3.62 sec   HDFS Read: 5420 HDFS Write: 212 SUCCESS
Stage-Stage-4: Map: 1  Reduce: 1   Cumulative CPU: 1.59 sec   HDFS Read: 5597 HDFS Write: 64 SUCCESS
Total MapReduce CPU Time Spent: 12 seconds 580 msec
OK
INFO  : 2019-02-12 05:33:28,985 Stage-4 map = 100%,  reduce = 100%, Cumulative CPU 1.59 sec
INFO  : MapReduce Total cumulative CPU time: 1 seconds 590 msec
INFO  : Ended Job = job_1549919838832_0009
+-------------+----------+---------+-------------+--+
| a.username  | a.month  | salary  | accumulate  |
+-------------+----------+---------+-------------+--+
| A           | 2015-01  | 33      | 33          |
| A           | 2015-02  | 10      | 43          |
| B           | 2015-01  | 30      | 30          |
| B           | 2015-02  | 15      | 45          |
+-------------+----------+---------+-------------+--+
4 rows selected (162.792 seconds)
0: jdbc:hive2://centos-aaron-h1:10000> 
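
The final query above launched seven jobs and took about 163 seconds on four rows of monthly totals, largely because of the self-join. The same cumulative figures can be obtained with a single ordered pass per user, which is the idea behind SQL windowing (Hive offers windowing functions of the form `sum(...) over (partition by ... order by ...)`, though the original post does not use them). The following is only a plain-Python sketch of that running-total idea, not a Hive implementation; `MONTHLY` and `running_totals` are hypothetical names.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Monthly totals from step one: (username, month, salary).
MONTHLY = [
    ("A", "2015-01", 33), ("A", "2015-02", 10),
    ("B", "2015-01", 30), ("B", "2015-02", 15),
]


def running_totals(rows):
    """Return (username, month, salary, accumulate) with a per-user running sum."""
    out = []
    # Sort by username, then month, so each user's months are in order.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for username, group in groupby(ordered, key=itemgetter(0)):
        months = list(group)
        sums = accumulate(salary for _, _, salary in months)
        for (_, month, salary), total in zip(months, sums):
            out.append((username, month, salary, total))
    return out


if __name__ == "__main__":
    for row in running_totals(MONTHLY):
        print(row)
```

Unlike the self-join, this touches each monthly row once after sorting, so the work grows roughly linearly (plus the sort) rather than quadratically with the number of months per user.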

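A final word: that is everything for this post. If you found it useful, please give it a like. If you are interested in my other posts on server-side big data technology, please follow the blog, and feel free to get in touch at any time.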


Reposted from: https://my.oschina.net/u/2371923/blog/3009017

