For offline tasks, re-running data is routine: the program may hang partway through, or the output may be incorrect and need to be re-run and checked.
Before re-running, though, check whether data has already been written to HBase, or whether that day's Hive partition already contains data.
If hive h ...
Posted on Thu, 03 Oct 2019 19:06:21 -0700 by aunquarra
1. single version
DHCP: the source address is 0.0.0.0 and the destination address is 255.255.255.255.
//The ports are UDP 67 and UDP 68, one for each direction. The client broadcasts its
//configuration request to port 67 (bootps), and the server broadcasts its response to port 68 (bootpc).
//Default format ...
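The addressing described above can be sketched with the standard java.net classes. This is only an illustration under stated assumptions: the class name DhcpAddressing and the method clientBroadcast are hypothetical, the payload is a placeholder rather than an encoded DHCP message, and nothing is actually sent. The port numbers follow the usual bootps/bootpc assignment (67 for the server, 68 for the client).

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.UnknownHostException;

final class DhcpAddressing {
    static final int SERVER_PORT = 67; // bootps: where the server listens
    static final int CLIENT_PORT = 68; // bootpc: where the client listens

    /**
     * Wraps a payload in a datagram addressed the way a client request is:
     * destination 255.255.255.255, destination port 67. The payload is a
     * placeholder here, not a real DHCP message.
     */
    static DatagramPacket clientBroadcast(byte[] payload) {
        try {
            InetAddress broadcast = InetAddress.getByAddress(
                    new byte[] {(byte) 255, (byte) 255, (byte) 255, (byte) 255});
            return new DatagramPacket(payload, payload.length, broadcast, SERVER_PORT);
        } catch (UnknownHostException e) {
            // Cannot happen for a well-formed 4-byte address.
            throw new IllegalStateException(e);
        }
    }
}
```

A real client would also bind its socket to port 68 on the unspecified address 0.0.0.0. That step is left out because binding a port below 1024 normally needs elevated privileges.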
Posted on Wed, 02 Oct 2019 19:19:24 -0700 by azunoman
1. Overview of MapReduce
Hadoop MapReduce is a distributed computing framework for writing batch applications. Programs written with it can be submitted to a Hadoop cluster for parallel processing of large datasets.
A MapReduce job splits the input dataset into independent blocks, which are processed by map tasks in parallel, and the framework sorts the ...
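The split, map, sort and reduce flow can be imitated in a single JVM to make the idea concrete. The sketch below does not use the Hadoop API. The class MiniMapReduce and its methods are hypothetical names, word counting is an assumed example workload, and the sorted grouping by key stands in for what the framework does between the map and reduce phases.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class MiniMapReduce {
    /** Map phase: emit one (word, 1) pair for every word in a single input split. */
    static List<Map.Entry<String, Integer>> mapSplit(String split) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    /** Maps all splits in parallel, then groups the pairs by sorted key and sums them. */
    static Map<String, Integer> run(List<String> splits) {
        List<Map.Entry<String, Integer>> mapped = splits.parallelStream()
                .flatMap(s -> mapSplit(s).stream())
                .collect(Collectors.toList());
        // TreeMap keeps keys in sorted order, mimicking the sort before reduce.
        Map<String, Integer> reduced = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : mapped) {
            reduced.merge(pair.getKey(), pair.getValue(), Integer::sum); // reduce phase
        }
        return reduced;
    }
}
```

For example, MiniMapReduce.run(Arrays.asList("a b a", "b c")) returns a map with a=2, b=2, c=1.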
Posted on Fri, 13 Sep 2019 09:21:27 -0700 by tharagleb
The java.nio package is a newer API that Java provides for IO. It re-implements IO operations around channels, selectors and related models.
DirectByteBuffer is one of the classes in the nio package. It holds byte data, and what makes it special is that it stores that data in off-heap memory. Unlike traditional objects, objects are in the ...
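A minimal sketch of the difference, using only the public java.nio.ByteBuffer API. DirectByteBuffer itself is not public, so a direct buffer is obtained through ByteBuffer.allocateDirect. The class name DirectBufferDemo and the method roundTrip are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class DirectBufferDemo {
    /** Copies bytes into a freshly allocated direct (off-heap) buffer and reads them back. */
    static byte[] roundTrip(byte[] input) {
        // allocateDirect requests memory outside the garbage-collected heap.
        ByteBuffer direct = ByteBuffer.allocateDirect(input.length);
        direct.put(input);
        direct.flip(); // switch the buffer from writing to reading
        byte[] out = new byte[direct.remaining()];
        direct.get(out);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);         // backed by a byte[] on the heap
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // backed by off-heap memory
        System.out.println("heap.isDirect()   = " + heap.isDirect());
        System.out.println("direct.isDirect() = " + direct.isDirect());
        byte[] copy = roundTrip("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(copy, StandardCharsets.UTF_8));
    }
}
```

The isDirect() check is the portable way to tell the two kinds of buffer apart.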
Posted on Thu, 05 Sep 2019 17:50:11 -0700 by mulysa
Link to the original text: http://www.cnblogs.com/DamianZhou/p/4184026.html
Hadoop Cluster Modification and Cluster Version Adjustment
1. JDK modification
Posted on Wed, 17 Jul 2019 13:02:17 -0700 by pesoto74
HMaster is responsible for distributing regions evenly across the region servers. One of the HMaster's background threads is dedicated to balancing and runs every five minutes by default.
Each load balancing operation can be divided into two steps:
Generating the load balancing plan
Executing the plan through the AssignmentManager class
Let's go into ...
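The two steps can be illustrated with a toy example. This is not HBase's balancer: the class SimpleBalancer, the Move type and the greedy "busiest to idlest" rule are assumptions made purely for illustration, and the region counts are plain integers rather than real region assignments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SimpleBalancer {
    /** One planned move of a single region from one server to another. */
    static final class Move {
        final String from;
        final String to;
        Move(String from, String to) { this.from = from; this.to = to; }
    }

    /** Step 1: build a plan by repeatedly moving one region from the busiest to the idlest server. */
    static List<Move> generatePlan(Map<String, Integer> regionCounts) {
        Map<String, Integer> counts = new TreeMap<>(regionCounts); // plan on a copy
        List<Move> plan = new ArrayList<>();
        if (counts.size() < 2) {
            return plan; // nothing to balance
        }
        while (true) {
            String max = null;
            String min = null;
            for (String server : counts.keySet()) {
                if (max == null || counts.get(server) > counts.get(max)) max = server;
                if (min == null || counts.get(server) < counts.get(min)) min = server;
            }
            if (counts.get(max) - counts.get(min) <= 1) {
                return plan; // counts differ by at most one: balanced
            }
            counts.merge(max, -1, Integer::sum);
            counts.merge(min, 1, Integer::sum);
            plan.add(new Move(max, min));
        }
    }

    /** Step 2: apply the plan to the live counts. */
    static void execute(List<Move> plan, Map<String, Integer> regionCounts) {
        for (Move move : plan) {
            regionCounts.merge(move.from, -1, Integer::sum);
            regionCounts.merge(move.to, 1, Integer::sum);
        }
    }
}
```

Keeping plan generation separate from execution means the plan can be inspected or discarded before any region is moved.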
Posted on Sat, 13 Jul 2019 15:15:04 -0700 by phpnewbie8
Recently, for the company's unified log collection and processing platform, Elasticsearch was the clear technology choice, because it can search system logs quickly, which speeds up both troubleshooting from logs and tracing calls along the business chain. Some fields of the company's application logs, such as content, do not need to be stored in ES. A ...
Posted on Mon, 24 Jun 2019 17:09:12 -0700 by Tr4mpldUndrfooT
Use the list command to list all tables
hbase(main):001:0> list
Listing tables using the Java API
The following program uses the Java API to list all tables in HBase.
import org.apach ...
Posted on Fri, 17 May 2019 16:29:48 -0700 by po
1. Persistence operator cache
Introduction: Normally, an RDD does not contain real data, only metadata describing the RDD. Calling the cache method on the RDD does not materialize any data either. It is not until the first action operator triggers generation of the RDD's data that the cache operati ...
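The lazy behaviour described here can be mimicked in plain Java. The sketch below is not the Spark API: LazyCached, its cache() and collect() methods, and the computations counter are hypothetical stand-ins that show only the idea that marking a dataset as cached computes nothing until an action runs.

```java
import java.util.function.Supplier;

final class LazyCached<T> {
    private final Supplier<T> compute; // the recipe for the data, like RDD metadata
    private boolean cacheRequested;
    private boolean hasCached;
    private T cached;
    int computations; // how many times the recipe actually ran

    LazyCached(Supplier<T> compute) {
        this.compute = compute;
    }

    /** Stand-in for cache(): only records the request, computes nothing. */
    LazyCached<T> cache() {
        cacheRequested = true;
        return this;
    }

    /** Stand-in for an action: triggers computation and stores the result if caching was requested. */
    T collect() {
        if (hasCached) {
            return cached; // later actions reuse the stored data
        }
        T result = compute.get();
        computations++;
        if (cacheRequested) {
            cached = result;
            hasCached = true;
        }
        return result;
    }
}
```

With caching requested, two calls to collect() run the supplier once. Without it, each call runs it again.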
Posted on Sun, 05 May 2019 01:32:37 -0700 by techker
1. Configuring Flume files
2. Getting the Data Collection Pipeline Working
2.1 Start the ZooKeeper cluster
2.2 Start the Kafka cluster
2.3 Start the Flume cluster
2.4 Produce data
3 Preparing the Data Consumption Environment
3.1 Add maven configuration
3.2 Add maven configuration
4 Consumer Data Tools
4.1 PropertiesUti ...
Posted on Mon, 22 Apr 2019 18:06:34 -0700 by softnmedia