SparkStreaming auto restart

The company's real-time streaming pipeline uses Spark Streaming to process data ingested into Kinesis. A streaming job once terminated without any alert being sent. On top of that, the company's big data department currently has only one person. To avoid being woken by an alert one night to restart the service, we decided to ...

Posted on Sat, 30 Nov 2019 03:06:47 -0800 by sapna

Construction and basic use of hive environment

hadoop@vm2:~/apache-hive-0.14.0-bin$ ll
total 400
drwxrwxr-x  8 hadoop hadoop 4096 Apr 16 04:45 ./
drwxr-xr-x 28 hadoop hadoop 4096 Apr 16 07:04 ../
drwxrwxr-x  3 hadoop hadoop 4096 Apr 16 04:45 bin/
drwxrwxr-x  2 hadoop hadoop 4096 Apr 16 07:02 conf/
drwxrwxr-x  4 hadoop hadoop 4096 Apr 16 04:45 examples/
drwxrwxr-x ...

Posted on Wed, 27 Nov 2019 12:35:35 -0800 by dr4296

Modify TTL value of hbase table by pinpoint

Reference document: https://greatwqs.iteye.com/blog/1741330
Origin: After Pinpoint was connected for service monitoring, the data volume soared, with HBase data growing by about 20 GB per day on average. With this much data, it has to be cleaned up regularly; otherwise the availability of monitoring degrades. Because the previous e ...

Posted on Wed, 20 Nov 2019 12:13:25 -0800 by landung

Spark SQL error report summary

Error 1: After starting spark-shell, querying a table in Hive reports an error.
$SPARK_HOME/bin/spark-shell
spark.sql("select * from student.student ").show()
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at org.apache.hadoop.hive.metastore.MetaStor ...

Posted on Wed, 06 Nov 2019 08:12:21 -0800 by Jagarm

My big data Tour - Kafka environment construction

Environment preparation: three CentOS virtual machines with JDK and ZooKeeper installed.
Environment setup:
1) Download the compressed package: https://www.apache.org/dyn/closer.cgi?path=/kafka/2.2.0/kafka_2.12-2.2.0.tgz
2) Decompress it:
[feng@hadoop129 software]$ ls
kafka_2.11-2.2.0.tgz
[feng@hadoop129 software ...

Posted on Tue, 05 Nov 2019 10:31:51 -0800 by Xorandnotor

Analyze American average temperature project and master MapReduce programming

Importing the dataset into HDFS
List the dataset just uploaded to HDFS from the command line:
[hadoop@master hadoop-2.6.0]$ bin/hdfs dfs -ls /weather/
Compiling and running the MapReduce program:
Step 1: In the Map stage, extract the weather station and temperature data.
public static class TemperatureMapper extends Mapper< LongWritable, ...

Posted on Mon, 04 Nov 2019 07:53:28 -0800 by cyberdesi

III. zookeeper -- implement HA of NN and RM

I. HDFS NameNode HA
1. Overview
In Hadoop 1.0, the NameNode is a single point of failure in an HDFS cluster: when the NameNode is unavailable, the whole HDFS cluster service is unavailable. In addition, if you need to temporarily reconfigure or otherwise work on the NameNode, the HDFS cluster cannot be used once the NameNode is stopped. Through ...

Posted on Sat, 02 Nov 2019 16:24:43 -0700 by newbie8899

Big data case - MapReduce traffic statistics case - partition

Code download: https://github.com/tazhigang/big-data-github.git
I. Requirement: write the statistics to different files (partitions) according to the province each mobile phone number belongs to.
II. Data preparation: phoneData.txt from case 2, partitioned by the first three digits of the phone number.
III. Create a Maven project P ...
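
The partitioning rule described in this excerpt (route each record by the first three digits of the phone number) can be sketched roughly as follows. This is only an illustration in plain Python, not the article's MapReduce code: the prefix table, the catch-all partition, and the function names are all hypothetical, since the excerpt does not give the real province mapping.

```python
# Hypothetical prefix-to-partition table; the real province mapping
# is not shown in the excerpt, so these values are assumptions.
PREFIX_TO_PARTITION = {"136": 0, "137": 1, "138": 2, "139": 3}
DEFAULT_PARTITION = 4  # assumed catch-all partition for every other prefix


def get_partition(phone_number: str) -> int:
    """Return the partition index based on the first three digits."""
    return PREFIX_TO_PARTITION.get(phone_number[:3], DEFAULT_PARTITION)


def partition_records(phone_numbers):
    """Group phone numbers by partition index (one output file per partition)."""
    partitions = {}
    for number in phone_numbers:
        partitions.setdefault(get_partition(number), []).append(number)
    return partitions
```

In a real job the same decision would live in a custom partitioner, and the number of reduce tasks would need to match the number of partitions this function can return.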

Posted on Fri, 01 Nov 2019 16:50:43 -0700 by biz0r

12. Serialization of hadoop

I. Basic overview of serialization
1. What is serialization?
Serialization converts objects in memory into byte sequences (or into another data transfer format) so that they can be persisted to disk or sent over the network.
2. Why is serialization needed?
Generally, objects are only stored in local memory, and onl ...
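
As a rough illustration of the definition above (an in-memory object turned into a byte sequence and back), here is a minimal Python sketch using the standard `struct` module. It is not Hadoop's serialization API; the `FlowRecord` class and its two fields are hypothetical names chosen for the example.

```python
import struct
from dataclasses import dataclass

# Fixed big-endian layout: two signed 64-bit integers (assumed fields).
_LAYOUT = struct.Struct(">qq")


@dataclass
class FlowRecord:
    """Hypothetical record with two counters, used only for illustration."""
    up_flow: int
    down_flow: int

    def to_bytes(self) -> bytes:
        # Serialize: write the fields in a fixed order.
        return _LAYOUT.pack(self.up_flow, self.down_flow)

    @classmethod
    def from_bytes(cls, data: bytes) -> "FlowRecord":
        # Deserialize: read the fields back in the same order.
        up_flow, down_flow = _LAYOUT.unpack(data)
        return cls(up_flow, down_flow)
```

The key point is that the writer and the reader must agree on field order and width; the resulting bytes can then be stored on disk or sent to another machine and rebuilt there.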

Posted on Mon, 28 Oct 2019 22:13:00 -0700 by magic-eyes

The use of exec function

On using the exec functions in a process
Questions that came up while learning: How are they used? How do they differ? What do they do?
1. How are they used?
On Linux, the man command shows the documentation. For example, viewing the usage of the man command itself:
MANUAL SECTIONS: The standard sections o ...

Posted on Fri, 25 Oct 2019 22:53:58 -0700 by countcet