The company's real-time streaming pipeline uses Spark Streaming to process data ingested into Kinesis. A streaming job terminated once before and no alert was sent. On top of that, the company's big data department currently consists of a single person. To avoid being paged one night to restart the service, we decided to ...
Posted on Sat, 30 Nov 2019 03:06:47 -0800 by sapna
drwxrwxr-x 8 hadoop hadoop 4096 Apr 16 04:45 ./
drwxr-xr-x 28 hadoop hadoop 4096 Apr 16 07:04 ../
drwxrwxr-x 3 hadoop hadoop 4096 Apr 16 04:45 bin/
drwxrwxr-x 2 hadoop hadoop 4096 Apr 16 07:02 conf/
drwxrwxr-x 4 hadoop hadoop 4096 Apr 16 04:45 examples/
Posted on Wed, 27 Nov 2019 12:35:35 -0800 by dr4296
Origin
After services were connected to Pinpoint for monitoring, the data volume soared, with HBase growing by about 20 GB per day on average. At that volume the data must be cleaned up regularly, otherwise the availability of the monitoring degrades. Because the previous e ...
Posted on Wed, 20 Nov 2019 12:13:25 -0800 by landung
After starting spark-shell, querying a table in Hive reports an error:
spark.sql("select * from student.student ").show()
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStor ...
Posted on Wed, 06 Nov 2019 08:12:21 -0800 by Jagarm
Prerequisites: three CentOS virtual machines with the JDK and ZooKeeper installed
1) Download the compressed package
[feng@hadoop129 software]$ ls
[feng@hadoop129 software ...
Posted on Tue, 05 Nov 2019 10:31:51 -0800 by Xorandnotor
Importing the dataset into HDFS
From the command line, list the dataset just uploaded to HDFS:
[hadoop@master hadoop-2.6.0]$ bin/hdfs dfs -ls /weather/
Compiling and running the MapReduce program:
Step 1: in the Map stage, extract the weather station and temperature data
public static class TemperatureMapper extends Mapper< LongWritable, ...
Posted on Mon, 04 Nov 2019 07:53:28 -0800 by cyberdesi
I. HDFS NameNode HA
In Hadoop 1.0, the NameNode is a single point of failure in an HDFS cluster. When the NameNode is unavailable, the whole HDFS cluster service is unavailable. In addition, if the NameNode has to be stopped temporarily for maintenance, the HDFS cluster cannot be used while it is down. Through ...
Posted on Sat, 02 Nov 2019 16:24:43 -0700 by newbie8899
Code download address: https://github.com/tazhigang/big-data-github.git
I. Requirement: write the statistical results to different files (partitions) according to the province each mobile phone number belongs to
II. Data preparation
Input data: phoneData.txt from case 2
The province is determined by the first three digits of the phone number
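The prefix-based routing described above can be sketched in plain Java, without any Hadoop dependency. This is only an illustration under stated assumptions: the class name ProvincePartitionSketch, the method partitionFor, and the prefix table (136 to 139) are hypothetical and are not taken from the post's project, whose actual mapping is not shown in this excerpt.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: maps a phone number to a partition index by its first three digits.
class ProvincePartitionSketch {

    // Assumed prefix table for illustration only; the real prefix-to-province mapping is not shown.
    private static final Map<String, Integer> PREFIX_TO_PARTITION = new HashMap<>();

    static {
        PREFIX_TO_PARTITION.put("136", 0);
        PREFIX_TO_PARTITION.put("137", 1);
        PREFIX_TO_PARTITION.put("138", 2);
        PREFIX_TO_PARTITION.put("139", 3);
    }

    // Every prefix that is not in the table falls into this catch-all partition.
    static final int DEFAULT_PARTITION = 4;

    // Returns the partition index for a phone number, based on its first three digits.
    static int partitionFor(String phoneNumber) {
        if (phoneNumber == null || phoneNumber.length() < 3) {
            return DEFAULT_PARTITION;
        }
        Integer partition = PREFIX_TO_PARTITION.get(phoneNumber.substring(0, 3));
        return partition == null ? DEFAULT_PARTITION : partition;
    }

    public static void main(String[] args) {
        String[] samples = {"13612345678", "13912345678", "15012345678"};
        for (String phone : samples) {
            System.out.println(phone + " -> partition " + partitionFor(phone));
        }
    }
}
```

In a real job, logic like this would typically live in the getPartition method of a subclass of org.apache.hadoop.mapreduce.Partitioner, registered with job.setPartitionerClass, and the number of reduce tasks would be set to match the number of partitions (five in this sketch).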
III. Create a Maven project
Posted on Fri, 01 Nov 2019 16:50:43 -0700 by biz0r
I. Basic overview of serialization
1. What is serialization
Serialization is the conversion of objects in memory into a byte sequence (or into another data transfer format), so that they can be persisted to disk or transmitted over the network.
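As a minimal sketch of this definition, the following example round-trips an object through a byte array using the JDK's built-in java.io serialization. The class SerializationSketch and the record type Reading are hypothetical names chosen for illustration; the post itself may go on to use a different mechanism.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationSketch {

    // Hypothetical record type used only for this illustration.
    static class Reading implements Serializable {
        private static final long serialVersionUID = 1L;
        final String station;
        final int temperature;

        Reading(String station, int temperature) {
            this.station = station;
            this.temperature = temperature;
        }
    }

    // Serialization: object in memory -> byte sequence.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Deserialization: byte sequence -> object in memory.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Reading original = new Reading("station-01", 25);
        byte[] bytes = toBytes(original);          // these bytes could be written to disk or sent over a socket
        Reading copy = (Reading) fromBytes(bytes); // rebuilt from the bytes alone
        System.out.println(copy.station + " " + copy.temperature);
    }
}
```

The byte array is the only thing that needs to cross the disk or network boundary; the receiving side reconstructs an equivalent object from it.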
2. Why serialization is needed
Generally, objects are only stored in local memory, and onl ...
Posted on Mon, 28 Oct 2019 22:13:00 -0700 by magic-eyes
Using the exec family of functions in a process
Questions that came up while learning:
How are they used?
What are the differences between them?
What are they for?
1. How are they used?
On Linux, we can look them up with the man command:
Viewing the usage of the man command
The standard sections o ...
Posted on Fri, 25 Oct 2019 22:53:58 -0700 by countcet