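This article walks through the Hadoop WordCount example. The explanation is short and easy to follow: the listing below is the complete job, with comments at the map and reduce stages describing what each phase does.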
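import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // With the default TextInputFormat, each call to map() receives one line of input.
        // The map phase splits that line into tokens and emits (word, 1) for every token.
        @Override
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            StringTokenizer line = new StringTokenizer(value.toString());
            while (line.hasMoreTokens()) {
                word.set(line.nextToken());
                context.write(word, one);
            }
        }
    }

    // Before reduce() runs, the sorted map output is copied to the reduce tasks (the shuffle).
    // During the copy, background threads merge the data so that all values sharing a key
    // are grouped into one iterable. reduce() is then called once per key; here it walks
    // that iterable, sums the values, and writes the total.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Strip generic Hadoop options (-D, -files, ...) and keep the job's own arguments.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        // Job.getInstance replaces the deprecated new Job(conf, name) constructor.
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        // The reducer doubles as the combiner because summing is associative and commutative.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}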
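
To see the data flow of the listing above without a cluster, the three stages can be imitated with ordinary Java collections. The sketch below is only an illustration under simplifying assumptions: it runs in one process, holds everything in memory, and skips the combiner, partitioning and all I/O. The class name WordCountSketch and its methods map, shuffle, reduce and run are hypothetical names chosen for this example and are not part of Hadoop.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the WordCount job (no Hadoop classes involved).
final class WordCountSketch {

    // Map step: split one line on whitespace and emit a (word, 1) pair per token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            out.add(new SimpleEntry<>(tokens.nextToken(), 1));
        }
        return out;
    }

    // Shuffle step (simplified): group all values by key, with keys kept in sorted order.
    static SortedMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: sum the grouped values that belong to a single key.
    static int reduce(Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Chains the three steps over a list of input lines and returns word -> count.
    static SortedMap<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        SortedMap<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("hello world", "hello hadoop")));
    }
}
```

Running main on the two sample lines prints {hadoop=1, hello=2, world=1}. In the real job the grouping step is done by the framework across machines, which is why the reducer only ever sees one key together with its values.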
Thanks for reading. That concludes this analysis of the Hadoop WordCount example. To confirm how it behaves in practice, run the job yourself on your own input.