How to read and write Aliyun HBase using MaxCompute Spark

Background: Spark on MaxCompute can access instances (e.g. ECS, HBase, RDS) inside a VPC on Alibaba Cloud. The underlying MaxCompute network is isolated from external networks by default, and Spark on MaxCompute provides a solution through the configuration spark.hadoop.odps.cupid.vpc.domain.list to access HBase inside Alibaba Cloud's VPC network ...
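
As a rough illustration of the approach in that post, here is a minimal Java sketch, assuming a hypothetical HBase endpoint, table name, and VPC description; the exact JSON schema for spark.hadoop.odps.cupid.vpc.domain.list should be taken from the MaxCompute documentation, and the property is normally supplied at submit time (spark-defaults.conf or --conf) rather than in code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.SparkConf;
import org.apache.spark.sql.SparkSession;

public class MaxComputeHBaseSketch {
    public static void main(String[] args) throws Exception {
        // The VPC whitelist. The JSON below is illustrative only -- the region, VPC id,
        // and the HBase ZooKeeper domains/ports must match your own instance.
        String vpcDomainList = "{\"regionId\":\"cn-beijing\",\"vpcs\":[{\"vpcId\":\"vpc-xxxxxx\","
                + "\"zones\":[{\"urls\":[{\"domain\":\"hb-xxxxxx-master1-001.hbase.rds.aliyuncs.com\",\"port\":2181}]}]}]}";

        SparkConf sparkConf = new SparkConf()
                .setAppName("MaxComputeToHBase")
                .set("spark.hadoop.odps.cupid.vpc.domain.list", vpcDomainList);
        SparkSession spark = SparkSession.builder().config(sparkConf).getOrCreate();

        // Plain HBase client write; the ZooKeeper quorum host and "test_table" are placeholders.
        Configuration hbaseConf = HBaseConfiguration.create();
        hbaseConf.set("hbase.zookeeper.quorum", "hb-xxxxxx-master1-001.hbase.rds.aliyuncs.com");
        hbaseConf.set("hbase.zookeeper.property.clientPort", "2181");

        try (Connection conn = ConnectionFactory.createConnection(hbaseConf);
             Table table = conn.getTable(TableName.valueOf("test_table"))) {
            Put put = new Put(Bytes.toBytes("row1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("hello"));
            table.put(put);
        }

        spark.stop();
    }
}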

Posted on Mon, 01 Jun 2020 23:53:10 -0700 by Assorro

Build Hive clusters based on different versions of Hadoop (with configuration files)

Follow the official account: Alliance of Java Architects, daily technical updates. This tutorial covers two scenarios: one uses hive-1.21 with Hadoop 2.6.5, the other mainly covers building Hive on Hadoop 3.x. First things first. 1. Local (embedded Derby) steps: this storage mode requires running a MySQL server locally and configuring ...

Posted on Thu, 28 May 2020 09:18:13 -0700 by adam119

Big data learning (1) Hadoop installation

Cluster architecture. The installation of Hadoop is essentially the configuration of the HDFS and YARN clusters. As the architecture diagram shows, every DataNode in HDFS needs to be configured with the location of the NameNode; similarly, every NodeManager in YARN needs to be configured with the location of the ResourceManager. NameNode and Resour ...
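
To make those two dependencies concrete, here is a small illustrative Java snippet with hypothetical host names; on a real cluster these keys are set in core-site.xml and yarn-site.xml on every node rather than in client code:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class ClusterLocationSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Every DataNode (and every client) finds the NameNode through fs.defaultFS
        conf.set("fs.defaultFS", "hdfs://namenode-host:9000");
        // Every NodeManager finds the ResourceManager through this key
        conf.set("yarn.resourcemanager.hostname", "resourcemanager-host");

        // Quick sanity check from a client: resolve the configured NameNode URI
        FileSystem fs = FileSystem.get(conf);
        System.out.println("HDFS entry point: " + fs.getUri());
        fs.close();
    }
}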

Posted on Mon, 18 May 2020 08:25:53 -0700 by EZbb

Using the wagon-maven-plugin to automatically deploy a project

The Maven dependency of this plug-in is: <dependency> <groupId>org.codehaus.mojo</groupId> <artifactId>wagon-maven-plugin</artifactId> <version>1.0</version> </dependency> The documentation address of the plug-in is: http://www.mojohaus.org/wagon-maven-plugin/ ...

Posted on Sun, 03 May 2020 15:55:34 -0700 by Sanoz0r

Build a fully distributed Hadoop 2.6 environment under CentOS 7

1. Download the Hadoop package and the JDK. 1) Hadoop download address: https://archive.apache.org/dist/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz 2) JDK download link: https://pan.baidu.com/s/1lbu7eBEtgjeGIi2bWthLnA extraction code: 0j0j 2. Prepare the virtual machines. 1) Create a new virtual machine (CentOS 7) in VMware, which is omitted here. 2) ...

Posted on Sat, 02 May 2020 21:01:08 -0700 by calavera

Detailed steps for Hadoop installation

Before we begin: if you want to successfully build a Hadoop cluster locally by following this blog, you first need to work through the video course Three-day Starter Big Data Practice Course and build a local cluster environment. The chapters you need to study in that video course are: Course objectives, VMware Workstation installation, Create a virtual machine, Ins ...

Posted on Tue, 28 Apr 2020 10:28:10 -0700 by Twentyoneth

Day02 -- Python data types: list, tuple, dictionary and set

Lists in Python. # A list is an instance of the list class. # Enclosed in square brackets, with elements separated by commas; the elements of a list can be numbers, strings, lists, Booleans, etc. # Lists can also be nested. ========= Basic operations on lists ========= (1) Common list operations: list1 = [11,22,33,44,55] # len: view the number of elements in the list print(l ...

Posted on Thu, 23 Apr 2020 08:36:01 -0700 by djr587

IDEA: SparkSql reads data from Hive

The traditional Hive execution engine is MapReduce. SparkSql was officially released after Spark 1.3 and is basically compatible with Apache Hive. Thanks to Spark's powerful computing capability, processing Hive data with Spark is far faster than with traditional Hive. Using SparkSql in IDEA to read the data in H ...
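
A minimal Java sketch of that idea, assuming hive-site.xml is on the project classpath (e.g. under src/main/resources in the IDEA project) and a hypothetical table default.student:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkSqlHiveRead {
    public static void main(String[] args) {
        // enableHiveSupport() makes the session use the Hive metastore described in hive-site.xml
        SparkSession spark = SparkSession.builder()
                .appName("SparkSqlHiveRead")
                .master("local[*]")          // local mode, convenient for running inside the IDE
                .enableHiveSupport()
                .getOrCreate();

        // "default.student" is a placeholder table name used only for illustration
        Dataset<Row> df = spark.sql("SELECT * FROM default.student");
        df.show();

        spark.stop();
    }
}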

Posted on Mon, 30 Mar 2020 14:23:09 -0700 by bl00dshooter

Implementation, cluster submission and running of wordcount in MapReduce

This example counts the words in all files under a directory on HDFS. Three Java classes are used: WordcountMapper is responsible for the map task, WordcountReducer for the reduce task, and WordcountDriver for submitting the job to YARN. Related code: Mapper packag ...
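
The teaser cuts off before the code, but a compact, self-contained version of the three classes it names might look like the sketch below (the original article splits them into separate files, and its exact class bodies may differ):

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordcountDriver {

    public static class WordcountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer it = new StringTokenizer(value.toString());
            while (it.hasMoreTokens()) {
                word.set(it.nextToken());
                context.write(word, ONE);   // emit (word, 1) for every token
            }
        }
    }

    public static class WordcountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();             // add up the 1s for this word
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration());
        job.setJarByClass(WordcountDriver.class);
        job.setMapperClass(WordcountMapper.class);
        job.setReducerClass(WordcountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // args[0]: HDFS input directory, args[1]: output directory (must not exist yet)
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Submit to the cluster (YARN) and block until the job finishes
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}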

Posted on Sun, 29 Mar 2020 09:02:12 -0700 by tech603

Zookeeper Getting Started

Zookeeper is middleware that provides consistency and coordination services for distributed applications. It mainly solves consistency problems frequently encountered in distributed applications, such as unified naming services, state synchronization, cluster management, management of distributed application configuration items, et ...
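
As a small illustration of the configuration-management use case, here is a sketch using the plain ZooKeeper Java client, assuming a server reachable at localhost:2181 and a hypothetical znode name:

import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkConfigSketch {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);

        // Connect to a (hypothetical) ZooKeeper server; wait until the session is established
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();

        // Publish a configuration item as a persistent znode at the root
        zk.create("/app-config", "timeout=5000".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // Read it back; other clients could set a watch here to be notified of changes
        byte[] data = zk.getData("/app-config", false, null);
        System.out.println("config: " + new String(data));

        zk.close();
    }
}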

Posted on Thu, 26 Mar 2020 19:42:20 -0700 by kaumilpatel