RDD programming learning notes (3): data reading and writing

Local read:
scala> var textFile = sc.textFile("file:///root/1.txt")
textFile: org.apache.spark.rdd.RDD[String] = file:///root/1.txt MapPartitionsRDD[57] at textFile at <console>:24
scala> textFile.saveAsTextFile("file:///root/writeback")
scala> textFile.foreach(println)
hadoop
hello
bi
...
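The explicit file:// scheme in the paths above is what forces a local read: without it, Spark resolves the path against the cluster's default filesystem (typically HDFS). A small plain-Java sketch of what such a URI resolves to locally (the path /root/1.txt is the example value from the transcript):

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LocalPathDemo {
    // sc.textFile needs the explicit file:// scheme to force a local read;
    // this helper shows the local path that URI denotes.
    static Path toLocalPath(String sparkStylePath) {
        return Paths.get(URI.create(sparkStylePath));
    }

    public static void main(String[] args) {
        System.out.println(toLocalPath("file:///root/1.txt")); // /root/1.txt
    }
}
```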

Posted on Wed, 29 Jan 2020 05:34:55 -0800 by dizel247

Process and key code analysis of HBase Client scan

1. HBase Connection
What kind of connection is created by org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(Configuration, ExecutorService, User)?
// Initialized by ConnectionImplementation by default
String className = conf.get(ClusterConnection.HBASE_CLIENT_CONNECTION_I ...
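The excerpt shows the factory reading the implementation class name from configuration and falling back to a default. That reflection-based factory pattern can be sketched stand-alone — the interface, classes, and the config key "client.connection.impl" below are stand-ins for illustration, not HBase's real names:

```java
import java.util.HashMap;
import java.util.Map;

public class ConnectionFactorySketch {
    interface Connection { String name(); }

    // Default implementation, used when the config key is absent
    // (playing the role of HBase's ConnectionImplementation).
    public static class DefaultConnection implements Connection {
        public DefaultConnection() {}
        public String name() { return "DefaultConnection"; }
    }

    // Mimics: String className = conf.get(<impl key>, <default class name>)
    // followed by reflective instantiation of that class.
    static Connection createConnection(Map<String, String> conf) {
        String className = conf.getOrDefault("client.connection.impl",
                DefaultConnection.class.getName());
        try {
            return (Connection) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // No key set in the config, so the default implementation is chosen.
        System.out.println(createConnection(new HashMap<>()).name());
    }
}
```

The point of the pattern is that a test or an alternative client can swap in its own Connection class purely through configuration.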

Posted on Mon, 20 Jan 2020 07:51:08 -0800 by manianprasanna

How to connect HBase with Phoenix JDBC

1. Introduce the Maven dependency
<!-- https://mvnrepository.com/artifact/org.apache.phoenix/phoenix-core -->
<dependency>
    <groupId>org.apache.phoenix</groupId>
    <artifactId>phoenix-core</artifactId>
    <version>4.14.0-HBase-1.3</version>
</dependency>
2. Establish J ...
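The step after the dependency is a JDBC connection, which hinges on Phoenix's URL scheme: jdbc:phoenix:&lt;zookeeper quorum&gt;:&lt;zk port&gt;. A small sketch assembling such a URL — the hostnames are made-up values, only the prefix and layout follow Phoenix's documented format:

```java
public class PhoenixUrl {
    // Phoenix JDBC URLs take the form jdbc:phoenix:<zk quorum>:<zk port>,
    // where the quorum is a comma-separated list of zookeeper hosts.
    static String phoenixUrl(String[] zkHosts, int zkPort) {
        return "jdbc:phoenix:" + String.join(",", zkHosts) + ":" + zkPort;
    }

    public static void main(String[] args) {
        // Hypothetical three-node zookeeper quorum on the default port.
        System.out.println(phoenixUrl(new String[]{"zk1", "zk2", "zk3"}, 2181));
        // jdbc:phoenix:zk1,zk2,zk3:2181
    }
}
```

This string is what would be handed to DriverManager.getConnection(...) once phoenix-core is on the classpath.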

Posted on Fri, 13 Dec 2019 11:21:25 -0800 by god_zun

HBase page filter ideas + codes

What needs can the paging filter address? For example: split a table into multiple pages, 5 rows per page; now query all the information on page 3.
Idea:
1. Target: determine the content to be queried from the entered page number and number of rows.
2. The starting output position can be controlled by the setStartRow met ...
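The two steps above — derive the slice from the page number and page size, then control where output starts — can be simulated in memory. The row keys below are invented; against a real table the same arithmetic would pick the scan's start row, with a PageFilter stopping the scan after pageSize rows:

```java
import java.util.ArrayList;
import java.util.List;

public class PagingSketch {
    // Return the rows belonging to the given 1-based page, pageSize rows per page.
    // In HBase terms: scan from the computed start position and stop after pageSize rows.
    static List<String> page(List<String> sortedRows, int pageNo, int pageSize) {
        int start = (pageNo - 1) * pageSize;              // start position, as in step 2
        int end = Math.min(start + pageSize, sortedRows.size());
        return start >= end ? new ArrayList<>()
                            : new ArrayList<>(sortedRows.subList(start, end));
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= 17; i++) rows.add(String.format("row%03d", i));
        System.out.println(page(rows, 3, 5)); // [row011, row012, row013, row014, row015]
    }
}
```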

Posted on Tue, 10 Dec 2019 13:31:58 -0800 by Braveheart

Basic operations of HBase

1. Shell operations
Common commands:
[root@hadoop01 ~]# hbase shell          # enter the HBase client
hbase(main):001:0> help "dml"           # get the prompt for a group of commands
hbase(main):001:0> help "put"           # get the prompt for a single command
hbase(main):001:0> exit                 # exit the client
# view all tables in hbase
hbase(main):001:0> list
Cr ...

Posted on Mon, 09 Dec 2019 10:46:36 -0800 by corsc

Modifying the TTL value of HBase tables for Pinpoint

Reference document: https://greatwqs.iteye.com/blog/1741330
Origin: after services were connected to Pinpoint monitoring, the data volume soared, with an average daily HBase data increment of about 20 GB. The data volume is too large, and the data needs to be cleaned up regularly, otherwise monitoring availability degrades. Because the previous e ...
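HBase TTLs are specified in seconds, so picking one means converting a retention window and sanity-checking the steady-state volume it implies. A small sketch of that arithmetic, using the ~20 GB/day figure from the post (the 7-day window is only an example value):

```java
public class TtlEstimate {
    // HBase column-family TTLs are given in seconds.
    static long ttlSeconds(int days) {
        return days * 24L * 60 * 60;
    }

    // Rough steady-state volume kept for a retention window, given a daily increment.
    static long retainedGb(int days, long dailyIncrementGb) {
        return days * dailyIncrementGb;
    }

    public static void main(String[] args) {
        System.out.println(ttlSeconds(7));     // 604800
        System.out.println(retainedGb(7, 20)); // 140 (GB kept at ~20 GB/day)
    }
}
```

The resulting seconds value is what gets applied to the column family, e.g. via an alter with TTL => 604800 in the HBase shell.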

Posted on Wed, 20 Nov 2019 12:13:25 -0800 by landung

How to Rerun Data

For offline tasks, re-running data is a normal thing: for example, the program hangs while running, or the output turns out to be incorrect and needs to be re-run after investigation. But when you re-run, it is important to check that no stale data remains in HBase, and whether the Hive partition for that day already has data. If Hive h ...
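The caution above amounts to making the job idempotent: clear the day's output before writing, so a re-run replaces rather than appends. That can be sketched with an in-memory stand-in for a partitioned store (all names here are invented for illustration):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RerunSketch {
    // Partition key (e.g. a date string) -> that day's output rows.
    private final Map<String, List<String>> store = new HashMap<>();

    // Overwrite semantics: drop whatever a failed run left behind, then write.
    // Re-running the same day therefore always converges to the same final state.
    void runForDay(String day, List<String> freshOutput) {
        store.remove(day);       // like dropping the Hive partition / deleting HBase rows
        store.put(day, freshOutput);
    }

    List<String> read(String day) {
        return store.get(day);
    }

    public static void main(String[] args) {
        RerunSketch job = new RerunSketch();
        job.runForDay("2019-10-03", List.of("partial", "bad")); // first, failed run
        job.runForDay("2019-10-03", List.of("good"));           // re-run replaces it
        System.out.println(job.read("2019-10-03")); // [good]
    }
}
```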

Posted on Thu, 03 Oct 2019 19:06:21 -0700 by aunquarra

Hadoop distributed file system

1. Single-version DHCP
The source address is 0.0.0.0 and the destination address is 255.255.255.255. The ports are UDP 67 and UDP 68, one for sending and one for receiving. The client broadcasts its configuration request to port 67 (bootps), and the server broadcasts its response back to port 68 (bootpc).
Default format ...

Posted on Wed, 02 Oct 2019 19:19:24 -0700 by azunoman

Hadoop Series - Distributed Computing Framework MapReduce

1. Overview of MapReduce
Hadoop MapReduce is a distributed computing framework for writing batch applications. Programs written with it can be submitted to a Hadoop cluster for parallel processing of large datasets. A MapReduce job splits the input dataset into independent blocks, which are processed by the map tasks in parallel, and the framework sorts the ...
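The split → parallel map → sort/shuffle → reduce pipeline described above can be illustrated in miniature with a word count — no Hadoop APIs, with splits and tasks simulated by plain collections:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MiniMapReduce {
    // map: one input line -> (word, 1) pairs
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : line.split("\\s+"))
            if (!w.isEmpty()) out.add(Map.entry(w, 1));
        return out;
    }

    static Map<String, Integer> run(List<String> splits) {
        // shuffle: group intermediate values by key, sorted (the framework's sort/merge step)
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String split : splits)            // each split would be mapped in parallel
            for (Map.Entry<String, Integer> kv : map(split))
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        // reduce: sum the grouped counts per key
        Map<String, Integer> result = new TreeMap<>();
        grouped.forEach((k, vs) -> result.put(k, vs.stream().mapToInt(Integer::intValue).sum()));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("hadoop hello", "hello mapreduce")));
        // {hadoop=1, hello=2, mapreduce=1}
    }
}
```

A real job distributes the map calls across the cluster and streams the shuffle over the network, but the data flow is the same.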

Posted on Fri, 13 Sep 2019 09:21:27 -0700 by tharagleb

DirectByteBuffer and File IO Details

The java.nio package is a newer API that Java uses to handle IO; it re-implements IO operations using channels, selectors, and other models. DirectByteBuffer is one of the classes under the nio package. This class is used to hold byte arrays, and is special because it stores its data in off-heap memory. Unlike traditional objects, objects are in the ...
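A minimal demonstration of the off-heap allocation the post describes — this is java.nio's real public API (DirectByteBuffer itself is the package-private class behind allocateDirect); only the buffer sizes and contents are arbitrary:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DirectBufferDemo {
    public static void main(String[] args) {
        // allocateDirect places the backing memory outside the Java heap;
        // isDirect() distinguishes it from a heap-backed allocate().
        ByteBuffer direct = ByteBuffer.allocateDirect(64);
        ByteBuffer heap = ByteBuffer.allocate(64);
        System.out.println(direct.isDirect()); // true
        System.out.println(heap.isDirect());   // false

        // Write bytes, then flip() to switch the buffer from writing to reading.
        direct.put("nio".getBytes(StandardCharsets.UTF_8));
        direct.flip();
        byte[] back = new byte[direct.remaining()];
        direct.get(back);
        System.out.println(new String(back, StandardCharsets.UTF_8)); // nio
    }
}
```

Because the bytes already live outside the heap, file and socket channels can hand them to the OS without the extra copy a heap array would need — which is the file-IO angle the post goes on to detail.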

Posted on Thu, 05 Sep 2019 17:50:11 -0700 by mulysa