XPath for Python 3 crawlers

1. Introduction: XPath is a language for finding information in XML documents. It can be used to traverse elements and attributes in an XML document. XPath is a major element of the W3C's XSLT standard, and XQuery and XPointer are built on top of XPath expressions. 2. Installation: pip3 install lxml 3. Usage: Selecting nodes. Common path expressions ...
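The node selection that the excerpt describes can be tried out with a minimal sketch. It makes two assumptions that are not from the original post: it uses the standard library's xml.etree.ElementTree, which supports only a limited subset of XPath, rather than the lxml package the post installs, and the sample XML document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the original post.
SAMPLE_XML = """
<bookstore>
  <book category="web">
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <price>29.99</price>
  </book>
</bookstore>
"""

root = ET.fromstring(SAMPLE_XML)

# ".//title" selects every <title> element anywhere below the root.
titles = [t.text for t in root.findall(".//title")]

# A predicate in square brackets filters on an attribute value.
web_books = root.findall("./book[@category='web']")

# Attributes of a selected element are read with .get().
first_lang = root.find("./book/title").get("lang")

print(titles)
print(len(web_books), first_lang)
```

With lxml installed, similar expressions can be passed to an element's xpath() method, which accepts full XPath 1.0 rather than the subset used here.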

Posted on Fri, 05 Jun 2020 20:15:25 -0700 by BinaryDragon

Kafka cluster setup and essential knowledge

Kafka cluster deployment and startup: This article first demonstrates how to build a Kafka cluster and then briefly introduces some basics of Kafka clusters. It covers only clustering and does not explain Kafka's basic concepts in depth. It is assumed that the reader has some basic kn ...

Posted on Thu, 04 Jun 2020 18:21:26 -0700 by Stonewall

Hadoop example: calculating the total amount of stock transactions

The final assignment for the elective course Cloud Computing and Big Data Overview requires a stock case study with the following specific requirements: Case: The attached document TextData.txt (shown in Fig. 1) shows the daily stock trading data from 2011-1 to today, the trading data of the daily stocks, the t ...

Posted on Thu, 04 Jun 2020 11:54:49 -0700 by poppy

CDH6.3.2: enabling Kerberos authentication

Tags (space separated): building big data platform 1: How to install and configure the KDC service 2: How to enable Kerberos through CDH 3: How to log in to Kerberos and access Hadoop-related services 1: How to install and configure the KDC service 1.1 System environment 1. Operating system: CentOS 7.5 x64 2. CDH6.3.2 3. Use root user fo ...

Posted on Sat, 30 May 2020 08:55:47 -0700 by radhoo

CDH6.3.2: deploying a big data platform

Tags (space separated): building big data platform 1: Environment initialization 2: Installation of CDH6.3.2 1: Environment initialization 1.1 Environment introduction System: CentOS 7.5 x64 cat /etc/hosts ---- 192.168.11.160 dev01.lanxintec.cn 192.168.11.161 dev02.lanxintec.cn 192.168.11.162 dev03.lanxintec.cn ---- 1.2: Keyless l ...

Posted on Thu, 28 May 2020 08:02:40 -0700 by tomhilton

Quick Start for Kafka - Introduction to Confluent Kafka

Quick Start for Kafka (8) - Introduction to Confluent Kafka 1. Introduction to Confluent Kafka 1. Introduction to Confluent Kafka In 2014, Kafka's founders Jay Kreps, Neha Narkhede and Jun Rao left LinkedIn to create Confluent, which focuses on providing Kafka-based enterprise stream processing solutions and released Confluent Kafka. Confluent ...

Posted on Wed, 27 May 2020 09:17:52 -0700 by ranam

Kafka Quick Start - Kafka Monitoring

Kafka Quick Start (7) - Kafka Monitoring 1. Kafka Monitoring Indicators 1. Kafka host monitoring indicators Host monitoring tracks the performance of the node machines where the Kafka cluster's Brokers reside. Common host monitoring metrics include: (1) Machine load (2) CPU utilization (3) Memory usage, including Free Memory and Used Memory (4) Disk ...

Posted on Mon, 25 May 2020 10:37:33 -0700 by meckr

Spring Boot 2.x integration with Elasticsearch 6.x

Integrating Spring Boot with Elasticsearch is simple: add the spring-data-elasticsearch dependency and a small amount of configuration. Elasticsearch has many versions (7.5.x is now available), and each spring-data-elasticsearch release is compatible with specific Elasticsearch versions, as follows: Project structure pom.xml <?xml ver ...

Posted on Sun, 17 May 2020 11:14:30 -0700 by Houdini

PySpark: saving DataFrame data as a Hive partitioned table

Create a SparkSession: from pyspark.sql import SparkSession, HiveContext spark = SparkSession.builder.enableHiveSupport().appName('test_app').getOrCreate() sc = spark.sparkContext hc = HiveContext(sc) 1. Creating a partitioned table with Spark # You can change append to overwrite, so that if the table already exists, the previous table will be deleted and a ...

Posted on Mon, 11 May 2020 01:18:45 -0700 by [xNet]DrDre

Python notes: making a dashboard to monitor the sales achievement rate

Our department holds an operations analysis meeting every month, so a PPT report has to be prepared. As we all know, citing data in a PPT makes it more persuasive, and adding some attractive, well-chosen charts gives it even more visual impact! First, install the pyecharts and Gauge m ...

Posted on Tue, 05 May 2020 16:41:59 -0700 by invisionx