Hands-on series | Installation experience: OceanBase version 2.2 - OCP 2.3

OB Jun: Good news! "OceanBase version 2.2" is officially available on the official website! OceanBase 2.2 is the stable version that successfully supported Tmall Double 11 in 2019, and it is also the version used for the TPC-C test that ranked first in TPC-C performance. Over the coming period we will keep publishing the "OceanBase 2.2 hands-on series" to walk you through the powerful features of OceanBase 2.2. Stay tuned!

1. Introduction

The OceanBase version 2.2 software package has recently been made available through the official OceanBase website. It includes the installation files and container images for the OceanBase database and for OCP, OceanBase's automated operations and maintenance platform. The version 2.2 package on the official website can be used free of charge in learning, development and test environments, and does not differ significantly in function from the commercial version.

2. Installation planning

The OceanBase database must be installed. OCP is optional but recommended. If you do not install OCP, you need to install the OceanBase database manually; for details, refer to "OceanBase 2.x experience: manually building an OceanBase cluster".

The download from the official website contains the OceanBase version 2.2 installation manual. This article shares the OCP installation experience. If you run into technical problems, you can join the OceanBase technical exchange DingTalk group (search for group number 21949783 and add the note "2.2").

2.1 Container introduction

Compared with 1.x, OCP 2.x has been refactored and has a much simpler architecture. An OCP deployment usually includes three types of Docker containers: the OCP application container, the OBProxy container and the metadatabase OB container. Docker is used mainly to make automated deployment easier.

Of the three, the OBProxy container is the simplest: it runs a single OBProxy process, so strictly speaking it could be replaced by a standalone OBProxy process. The other two containers are a little more complex. The containers start in this order: the metadatabase OB container, then the OBProxy container, and finally the OCP container.

In the actual installation steps, a temporary OCP container that provides only API functions is started before the OBProxy container. Once OBProxy is running normally, the temporary OCP container is replaced with the formal OCP container.

The OCP metadatabase provided on the official website has only one node. In the commercial version it is a three-node OB cluster whose three metadatabase containers start on three machines. There are then three OBProxy and three OCP instances, and their access addresses need to be unified. There are usually two ways to do this, a VIP or DNS for load balancing, so the commercial OCP installation includes a step that starts a load-balancing container (essentially an HAProxy or DNS service).

2.2 Machine initialization requirements

Of the three types of containers, the OB container stores metadata, and its database files live in a directory on the host. You therefore need to plan the OB database file directories on the host. A common best practice is to use separate file systems for data files and transaction log files.

The OB and OBProxy processes run as the user admin, so the admin user must be created on the host before installation, and the default resource limits for admin also need to be adjusted.

The kernel parameters, firewall and security settings of the host also need to be adjusted to suit a database workload.

In essence, a database automation operations platform executes specific scripts on the database nodes remotely over ssh, from one central operations node or a group of them. All nodes must therefore be able to reach each other over the network, run the sshd service and listen on the same port (the default is 22; this installation does not support a custom ssh port), and no firewall may block the relevant ports between these nodes.

OCP and OB listen on many ports. For simplicity, shut down the host firewall directly; in a production network, access to the relevant ports is opened in advance at the switch level. Configure every machine with the same time synchronization source. This is very important for the normal operation of an OB cluster.

3. Machine initialization

The OceanBase version 2.2 installation package is about 1.9G. After downloading it, it is recommended to extract it and run the related commands as root or as an account with sudo permission.

[root@xxx /root]
#tar zxvf oceanbase_trial.tar.gz
[root@xxx /root]
#ls -lrth oceanbase_trial/

3.1 List of installation files

Among the extracted files, the installation packages for OB, OBProxy and OBClient end with .rpm. The files that begin with install and end with .sh are the shell scripts for automatic installation; if you are interested in the installation logic, you can read these scripts.

obcluster.conf is the configuration file used during installation, and it needs to be modified.

The clonescripts directory is used for host initialization. It is recommended that every OB machine be initialized once with the scripts in this directory.

[root@xxx /root]
#cd oceanbase_trial/clonescripts/
[root@xxx /root/oceanbase_trial/clonescripts]
#ls
auto_clone.sh  clone.conf  clone_remote.sh  clone.sh  db_ob_v1  pre_check.sh

Initialization is executed through the script clone.sh, which will read the content of the configuration file clone.conf.

[root@xxx /root/oceanbase_trial/clonescripts]
#./clone.sh -h
Usage:  ./clone.sh [OPTIONS]

Options:
  -h, --help                   Print help and exit
  -a, --auto-config            Automatically generate clone config, use -d | --dry-run to get the auto config result
  -r, --role                   Set machine role, by default is ocp, could be set as ob or obbackup
  -d, --dry-run                Print dry run information
  -V, --version                Print version
  -p, --disk-part              Only part disk
  -u, --add-user               Only add user
  -c, --run-config             Only run OS parameters config
  -i, --install-docker         Only install docker
  -m, --install-rpms           Only install dependent rpms
  -t, --pre-check              Pre-check after clone

If the machine is an Ant internal model, automatic configuration (-a) is the fastest way to initialize it. For external customer models, it is recommended to run the initialization tasks one at a time.

The recommended order of initialization is: modify configuration file, install rpm package, modify kernel parameters, create new user, partition disk, install docker (optional).
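The recommended order can be sketched as a small wrapper around clone.sh. This is only an illustration, assuming clone.conf has already been edited by hand; the variables CLONE, DRY_RUN and INSTALL_DOCKER are hypothetical names introduced here, and only the flags -m, -c, -u, -p and -i come from the help output above. By default it prints the commands instead of running them, because the partition step (-p) can destroy data.

```shell
#!/bin/sh
# Sketch: run the clone.sh sub-steps in the recommended order.
# CLONE, DRY_RUN and INSTALL_DOCKER are illustrative names, not part of the package.
CLONE="${CLONE:-./clone.sh}"
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the commands
INSTALL_DOCKER="${INSTALL_DOCKER:-0}"   # 1 = also run the optional docker step

run_clone_steps() {
    # -m dependent rpms, -c OS parameters, -u add user, -p partition disk
    flags="-m -c -u -p"
    if [ "$INSTALL_DOCKER" = "1" ]; then
        flags="$flags -i"               # -i install docker (optional)
    fi
    for flag in $flags; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "$CLONE $flag"
        else
            "$CLONE" "$flag" || { echo "step $flag failed" >&2; return 1; }
        fi
    done
}

run_clone_steps
```

Stopping at the first failed step matters most for -p, since a half-finished partition run has to be cleaned up by hand before retrying.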

3.2 Modify the configuration file clone.conf

Each setting in the configuration file corresponds to an initialization task. If you skip a task, its parameters can be ignored. For example, if a machine does not need Docker installed, the Docker-related parameters can be ignored.

1) Machine role (required)

## machine role, value could be ob, ocp or obbackup
machineRole=ocp

Setting the machine role is required. Currently there are three roles:

  • ocp: an OCP machine. The directories /docker, /data/1, /data/log1 and /home (optional) will be initialized.
  • ob: an OB machine. The /docker directory will not be initialized. The /data/1, /data/log1 and /home (optional) directories will be initialized.
  • obbackup: a backup machine, which will be covered separately later.

2) Partition configuration (optional)

Partition configuration is mainly used to create the corresponding directories for different models. Customer models vary widely. The configuration and methods here are not necessarily applicable to all models. Customers can partition and create relevant directories according to the characteristics of the local disk.

## part disk mode, value either lvm or parted
diskPartMode=lvm

There are currently two ways to set up partitions:

  • lvm: use the lvm commands to aggregate multiple disks into one VG and then divide it into several LVs for the data directory, log directory, home directory (optional, determined by parameters) and Docker directory (optional, determined by parameters). lvm is recommended.
  • parted: use the parted command to divide one large disk into several raw devices for the data directory, log directory, home directory (optional, determined by parameters) and Docker directory (optional, determined by parameters).

For both lvm and parted, the configuration file has a corresponding section that defines the size of each directory.

Here are several points to note:

The log directory mainly stores transaction logs. Its recommended size is 3 to 4 times the memory size, especially for performance testing. If machine resources are limited and you only want to try out OB's functions, the log directory can be about twice the memory size.

It is strongly recommended that the data directory and the log directory be on different LVs or raw devices, that is, that they do not share a file system. If a machine has only one large disk that is also used by the operating system, the partition command cannot be run, and the directories have to be created manually.
lvm partitioning is usually recommended for convenience.

############ lvm settings begin ###################
## could be partition or device name
devArray=(sdb sdc sdd sde sdf sdg sdh sdi sdj sdk sdl)

This is how to specify multiple disks. The values are taken from the unused PVs listed by the pvs command.
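The log directory sizing rule from the notes above (3 to 4 times memory for performance testing, about 2 times for a functional trial) is easy to work out before filling in the sizes. A minimal sketch, assuming a Linux host where /proc/meminfo is readable; the function name log_dir_size_gb is made up for illustration:

```shell
#!/bin/sh
# Sketch: suggest a transaction log directory size from host memory.
# log_dir_size_gb MEM_GB FACTOR -> prints MEM_GB * FACTOR
log_dir_size_gb() {
    echo $(( $1 * $2 ))
}

# Total memory in whole GB from /proc/meminfo (Linux only); 0 if unavailable.
mem_kb=0
if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
fi
mem_gb=$(( ${mem_kb:-0} / 1024 / 1024 ))

echo "memory: ${mem_gb}G"
echo "log directory, functional trial: about $(log_dir_size_gb "$mem_gb" 2)G"
echo "log directory, performance test: $(log_dir_size_gb "$mem_gb" 3)G to $(log_dir_size_gb "$mem_gb" 4)G"
```

For a host with 128G of memory, the performance-test range works out to 384G to 512G.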

3) Operating system type

## os, option support centos, alios and redhat
os=alios

This is the operating system type. External customer machines generally choose centos or redhat.

The setting mainly selects the software packages for the matching operating system when packages are installed. However, the installation script does not contain every package file; it also relies on the yum install command on the customer's machine.

It is therefore recommended that all customer machines have a YUM source configured. If no remote source is available, you can create a local YUM source that points to the installation disc image.

4) Home directory

OCP, OB and the backup software are installed under the admin user and use admin's default directory, /home/admin. In particular, the OB operation logs (observer.log, rootservice.log, etc.) are written under /home/admin, so the /home partition is recommended to be larger than 100G. A test environment can use less, but then keep an eye on space exhaustion: the default OB log level is INFO, and the logs grow very fast.

## if make home disk, value could be yes or no. If yes selected, the script will make home disk, if no selected, the script will not make home disk, in which case you already have home disk made and do not want re-make it again
makeHomeDisk=no

If the local /home partition is large enough, you do not need to initialize the /home directory; otherwise, set this to yes to initialize it automatically. Before initialization, the existing data in /home is backed up to /tmp, but it is not restored afterwards. Pay attention to this!

3.3 Install RPM packages

First, make sure that the local yum source is available.

[root@xxx /root/oceanbase_trial/clonescripts]
#yum list

Then run the initialization script:

[root@xxx /root/oceanbase_trial/clonescripts]
#./clone.sh -m

3.4 Modify kernel parameters

This step mainly modifies the kernel parameters in /etc/sysctl.conf.

[root@xxx /root/oceanbase_trial/clonescripts]
#./clone.sh -c

3.5 Create the user

Create a new user admin with uid 500 (an Alibaba internal convention). When directories are later mapped between the host and the Docker containers, no uid mapping is performed, so the owner of the directories on both sides is expected to be admin with uid 500. Otherwise, you may run into directory permission problems.

If the host already has an admin user, it will be deleted.

[root@xxx /root/oceanbase_trial/clonescripts]
#./clone.sh -u
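Because the containers expect admin to have uid 500, it can be worth confirming the uid after this step. A small sketch using the standard id command; check_admin_uid is a hypothetical helper name, and 500 is the value stated above:

```shell
#!/bin/sh
# check_admin_uid ACTUAL_UID [EXPECTED_UID]: succeed only if they match (expected defaults to 500).
check_admin_uid() {
    expected="${2:-500}"
    if [ "$1" = "$expected" ]; then
        echo "ok: admin uid is $1"
    else
        echo "mismatch: admin uid is '${1:-missing}', expected $expected" >&2
        return 1
    fi
}

# id -u prints the numeric uid and fails if the user does not exist.
uid=$(id -u admin 2>/dev/null) || uid=""
check_admin_uid "$uid" || echo "admin is missing or has a different uid; see the note on uid 500 above"
```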

3.6 Partition the disk

Make sure the settings in the configuration file above are correct; otherwise existing data may be damaged.

[root@xxx /root/oceanbase_trial/clonescripts]
#./clone.sh -p

This step does not support reruns. If it fails partway, the disks may be left in an intermediate state, and you need to manually delete the related raw devices or LV/VG/PV.

Here is an example after initialization.

3.7 Install Docker (optional)

This step installs the Docker packages and starts the docker service. The directory /docker is used by default, so before this step make sure that the disk has been partitioned and that /docker has enough space.

[root@xxx /root/oceanbase_trial/clonescripts]
#./clone.sh -i

Check whether the installation succeeded with the following commands:

[root@xxx /root/oceanbase_trial/clonescripts]
#systemctl status docker
●  docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2020-01-18 15:54:48 CST; 2 weeks 2 days ago
     Docs: http://docs.docker.com
 Main PID: 13771 (dockerd)
   Memory: 2.4G
   CGroup: /system.slice/docker.service
           ├─13771 /usr/bin/dockerd -H tcp://0.0.0.0:4243 -H unix:///var/run/docker.sock --selinux-enabled=false --log-opt max-size=1g --graph=/docker
           └─13791 docker-containerd --config /var/run/docker/containerd/containerd.toml

Jan 31 11:22:36 xxx dockerd[13771]: time="2020-01-31T11:22:36+08:00" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/28379e1464373c7249a7...ks" pid=71070
Jan 31 11:26:19 xxx dockerd[13771]: time="2020-01-31T11:26:19+08:00" level=error msg="stat cgroup 28379e1464373c7249a76c03162a7dd6e6d285618fd98346180e57dbfc271621" error=""/...ave 4 fields"
Jan 31 11:26:19 xxx dockerd[13771]: time="2020-01-31T11:26:19+08:00" level=info msg="shim reaped" id=28379e1464373c7249a76c03162a7dd6e6d285618fd98346180e57dbfc271621 module=...ainerd/tasks"
Jan 31 11:26:19 xxx dockerd[13771]: time="2020-01-31T11:26:19.780741127+08:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="...s.TaskDelete"
Jan 31 11:26:20 xxx dockerd[13771]: time="2020-01-31T11:26:20+08:00" level=error msg="stat cgroup fb959a57982a6312804c0ff765f345d8cf6f007b3d6d222d30eaa0573dceb44f" error=""/...ave 4 fields"
Jan 31 11:26:20 xxx dockerd[13771]: time="2020-01-31T11:26:20+08:00" level=info msg="shim reaped" id=fb959a57982a6312804c0ff765f345d8cf6f007b3d6d222d30eaa0573dceb44f module=...ainerd/tasks"
Jan 31 11:26:20 xxx dockerd[13771]: time="2020-01-31T11:26:20.997790335+08:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="...s.TaskDelete"
Jan 31 11:26:25 xxx dockerd[13771]: time="2020-01-31T11:26:25+08:00" level=error msg="stat cgroup 175364fb736988d24c05261497b9d4160188cd8d53e5b2f1b695e79cd4ac5d9c" error=""/...ave 4 fields"
Jan 31 11:26:25 xxx dockerd[13771]: time="2020-01-31T11:26:25+08:00" level=info msg="shim reaped" id=175364fb736988d24c05261497b9d4160188cd8d53e5b2f1b695e79cd4ac5d9c module=...ainerd/tasks"
Jan 31 11:26:25 xxx dockerd[13771]: time="2020-01-31T11:26:25.837715440+08:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="...s.TaskDelete"
Hint: Some lines were ellipsized, use -l to show in full.

[root@xxx /root/oceanbase_trial/clonescripts]
#docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

3.8 Final inspection

As the last step, ./clone.sh -t checks whether the preceding configuration meets the requirements and gives suggestions. In addition, the network latency and the time synchronization offset between OB machines need to be checked separately.
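The extra latency and time synchronization checks are not part of clone.sh, so the following is only one possible way to do them by hand. It assumes the Linux iputils ping (whose summary line starts with "rtt min/avg/max/mdev") and, optionally, chronyc if chrony is the time service; avg_rtt_ms and check_hosts are names made up for this sketch.

```shell
#!/bin/sh
# avg_rtt_ms: read ping output on stdin and print the average round-trip time in ms.
# Relies on the summary line "rtt min/avg/max/mdev = a/b/c/d ms" of Linux iputils ping.
avg_rtt_ms() {
    awk -F'/' '/^rtt / {print $5}'
}

# check_hosts HOST...: report latency to each peer, then the local clock offset if chronyc exists.
check_hosts() {
    for host in "$@"; do
        rtt=$(ping -c 5 -q "$host" 2>/dev/null | avg_rtt_ms)
        if [ -n "$rtt" ]; then
            echo "$host avg rtt: $rtt ms"
        else
            echo "$host: no reply"
        fi
    done
    if command -v chronyc >/dev/null 2>&1; then
        chronyc tracking | grep 'System time' || echo "chronyc reported no offset"
    fi
}

# Pass the IPs of the other OB machines as arguments; with no arguments nothing is run.
if [ "$#" -gt 0 ]; then
    check_hosts "$@"
fi
```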

4. Install OCP

4.1 OCP installation configuration file

The OCP installation configuration file is obcluster.conf in the extracted directory.

It needs to be modified slightly according to the customer's machine information.

1) OCP operation mode

##
SINGLE_OCP_MODE=TRUE
################################   Must be modified according to the environment / MUST CHANGE ACCORDING ENVIRONMENT   ################################
############  Fill in the machine IP and root/admin Password / Edit Machine IP and Password Of root/admin  ############
ZONE1_RS_IP=11.xxx.xxx.5
OBSERVER01_ROOTPASS=root087005
OBSERVER01_ADMINPASS=admin087005

SINGLE_OCP_MODE specifies whether OCP runs on a single node. If not, it runs in three-node cluster mode by default.

ZONE1_RS_IP is the IP of the single node. The OBSERVER01 entries that follow are the passwords of the root and admin users on this node. You can change the passwords after installation.

For three-node operation, three groups of such settings are specified here, distinguished by 1, 2 and 3. The trial version downloaded from the official website supports only single-node mode, which is sufficient for learning and POC verification.

2) Container resources

Allocate container resources according to server
OB_docker_cpus=18
OB_docker_memory=106G
OCP_docker_cpus=8
OCP_docker_memory=16G
OBProxy_docker_cpus=4
OBProxy_docker_memory=6G

OCP consists of three containers: the OB container, the OCP container and the OBProxy container. Here you specify the CPU and memory for each container. If the host has plenty of resources, you can allocate more; if the host is small, the only option is to reduce the container resources here. The OB container needs at least 32G of memory and the OCP container at least 16G. The OBProxy container can be skipped by installing the OBProxy RPM package directly.

The host configured above is 32C128G.
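Before running the installer it can help to add up the container allocations and compare them with the host. A small arithmetic sketch using the numbers from the example above; fits_host is a hypothetical helper, not something shipped in the package:

```shell
#!/bin/sh
# fits_host HOST_TOTAL VALUE...: succeed if the sum of the values does not exceed the host total.
fits_host() {
    total=$1; shift
    sum=0
    for v in "$@"; do
        sum=$(( sum + v ))
    done
    echo "requested $sum of $total"
    [ "$sum" -le "$total" ]
}

# Values from the example above (32C128G host); memory in G.
fits_host 32 18 8 4        # CPUs for OB, OCP, OBProxy
fits_host 128 106 16 6     # memory for OB, OCP, OBProxy
```

With the example values this reports 30 of 32 CPUs and 128 of 128G, so the memory settings already use the whole host and should be scaled down on anything smaller.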

3) Container image information

This section specifies the image file information for the three containers: file name, REPO and TAG.

############  fill in OCP Version information of each component container / Edit Docker Image, Repo And Tag of OCP Components  ############
# OB docker
docker_image_package=observer1478.tar.gz
OB_image_REPO=reg.docker.alibaba-inc.com/antman/ob-docker
OB_image_TAG=OB1478_20200102_2113
# OCP docker
ocp_docker_image_package=ocp233.tar.gz
OCP_image_REPO=reg.docker.alibaba-inc.com/antman/ocp-all-in-one
OCP_image_TAG=OCP233_20200102_2122
# OBPROXY docker
obproxy_docker_image_package=obproxy156.tar.gz
obproxy_image_REPO=reg.docker.alibaba-inc.com/antman/obproxy
obproxy_image_TAG=OBP156_20200102_2110

If it is the installation file downloaded from the official website, this part of the configuration information does not need to be modified.

However, if the image file is updated later, the information here may need to be modified. To view the REPO and TAG information of the image file, you can first load the image (docker load), and then view the image information (docker images).

[root@xxx /root/oceanbase_trial]
#ls *.gz
obproxy156.tar.gz  observer1478.tar.gz  ocp233.tar.gz

[root@xxx /root/oceanbase_trial]
#for img in *.gz; do echo "$img"; docker load < "$img"; done
obproxy156.tar.gz
Loaded image: reg.docker.alibaba-inc.com/antman/obproxy:OBP156_20200102_2110
observer1478.tar.gz
Loaded image: reg.docker.alibaba-inc.com/antman/ob-docker:OB1478_20200102_2113
ocp233.tar.gz
Loaded image: reg.docker.alibaba-inc.com/antman/ocp-all-in-one:OCP233_20200102_2122

[root@xxx /root/oceanbase_trial]
#docker images
REPOSITORY                                         TAG                    IMAGE ID            CREATED             SIZE
reg.docker.alibaba-inc.com/antman/ocp-all-in-one   OCP233_20200102_2122   967f985769d2        4 weeks ago         1.8GB
reg.docker.alibaba-inc.com/antman/ob-docker        OB1478_20200102_2113   5351e5990e0f        4 weeks ago         780MB
reg.docker.alibaba-inc.com/antman/obproxy          OBP156_20200102_2110   c4076538faa1        4 weeks ago         282MB

The remaining settings further down in the file do not need to be modified.

4.2 Start the OCP installation

1) Installation command

The installation is divided into several steps, which are listed in the help output. The steps have been described earlier and are not repeated here.

[root@xxx /root/oceanbase_trial]
#./install.sh -h

Usage:  ./install.sh [OPTIONS]

Options:
  -h, --help                   Print help and exit
  -d, --debug                  Print debug information
  -V, --version                Print version
  -i, --install-steps string   For example 1,3-5,7-
  -c, --clear-steps string     For example 1,3-5,7-
  -f, --config-file string     Read in a config file
  -l, --load-balance           Load balance mode

Steps:

    1. ssh authorization
    2. install load balancer
    3. install ob server
    4. init ocp metadb
    5. install temp OCP
    6. install obproxy
    7. install OCP
    8. POSTCHECK

Note that the installation option is -i and the cleanup option is -c. Steps can be installed or cleaned as a continuous range or one at a time.

Each step writes a corresponding log file in which you can follow its progress.
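The step specification accepted by -i and -c ("For example 1,3-5,7-") reads as a comma-separated list of single steps and ranges, where an open-ended range runs through the last step. The sketch below shows one way such a spec expands, assuming 8 steps as listed in the help output; it is not the installer's own parser, and expand_steps is a name invented for this illustration.

```shell
#!/bin/sh
# expand_steps SPEC MAX: expand a spec such as "1,3-5,7-" into a space-separated list.
# An open-ended range "N-" runs to MAX.
expand_steps() {
    spec=$1; max=$2; out=""
    old_ifs=$IFS; IFS=,
    for part in $spec; do
        case $part in
            *-*) first=${part%%-*}; last=${part#*-}
                 [ -n "$last" ] || last=$max ;;
            *)   first=$part; last=$part ;;
        esac
        i=$first
        while [ "$i" -le "$last" ]; do
            out="$out $i"
            i=$(( i + 1 ))
        done
    done
    IFS=$old_ifs
    echo "${out# }"
}

expand_steps "1,3-5,7-" 8   # prints: 1 3 4 5 7 8
expand_steps "1-" 8         # prints: 1 2 3 4 5 6 7 8
```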

2) Automatic continuous installation

[root@xxx /root/oceanbase_trial]
#./install.sh -i 1-
run install.sh with DEBUG=FALSE, INSTALL_STEPS=1 2 3 4 5 6 7 8 CLEAR_STEPS= CONFIG_FILE=/root/oceanbase_trial/obcluster.conf
[2020-02-04 17:25:26.261595] INFO [start antman API service]
LB_MODE=none
[2020-02-04 17:25:26.353919] INFO [step1: making ssh authorization, logfile: /root/oceanbase_trial/logs/ssh_auth.log]
[2020-02-04 17:25:26.832011] INFO [step1: ssh authorization done]
[2020-02-04 17:25:26.847573] INFO [step2: no action need when LB_MODE=none]
[2020-02-04 17:25:26.850503] INFO [step3: check whether OBSERVER port 2881,2882 are in use or not on 11.xxx.xxx.5]
[2020-02-04 17:25:27.197266] INFO [step3: OBSERVER port 2881,2882 are idle on 11.xxx.xxx.5]
[2020-02-04 17:25:27.199657] INFO [step3: installing ob cluster, logfile: /root/oceanbase_trial/logs/install_ob.log]
[2020-02-04 17:30:26.007264] INFO [step3: installation of ob cluster done]
[2020-02-04 17:30:26.009985] INFO [step4: initializing ocp metadb, logfile: /root/oceanbase_trial/logs/init_metadb.log]
[2020-02-04 17:30:26.941684] INFO [step4: initialization of ocp metadb done]
[2020-02-04 17:30:26.944131] INFO [step5: check whether OCP port 8080 is in use or not on 11.xxx.xxx.5]
[2020-02-04 17:30:27.308464] INFO [step5: OCP port 8080 is idle on 11.xxx.xxx.5]
[2020-02-04 17:30:27.311137] INFO [step5: installing temporary ocp, logfile: /root/oceanbase_trial/logs/install_tmp_ocp.log]
[2020-02-04 17:34:36.886177] INFO [step5: installation of temporary ocp done]
[2020-02-04 17:34:36.889325] INFO [step6: check whether OBPROXY port 2883 is in use or not on 11.xxx.xxx.5]
[2020-02-04 17:34:37.084248] INFO [step6: OBPROXY port 2883 is idle on 11.xxx.xxx.5]
[2020-02-04 17:34:37.087163] INFO [step6: installing obproxy, logfile: /root/oceanbase_trial/logs/install_obproxy.log]
[2020-02-04 17:35:09.314143] INFO [step6: installation of obproxy done]
[2020-02-04 17:35:09.316600] INFO [step7: installing ocp, logfile: /root/oceanbase_trial/logs/install_ocp.log]
[2020-02-04 17:36:19.347975] INFO [step7: installation of ocp done]
[2020-02-04 17:36:19.350974] INFO [step8: post-checking service, logfile: /root/oceanbase_trial/logs/post_check_service.log]
[2020-02-04 17:36:27.431920] INFO [step8: post check done]

Automatic continuous cleaning:

[root@xxx /root/oceanbase_trial]
#./install.sh -c 1-
run install.sh with DEBUG=FALSE, INSTALL_STEPS= CLEAR_STEPS=8 7 6 5 4 3 2 1 CONFIG_FILE=/root/oceanbase_trial/obcluster.conf
[2020-02-04 17:24:41.127969] INFO [start antman API service]
LB_MODE=none
[2020-02-04 17:24:41.183515] INFO [clear_step8: no need to clear for step network post-check]
[2020-02-04 17:24:41.186271] INFO [clear_step7: uninstalling ocp and remove docker, logfile: /root/oceanbase_trial/logs/uninstall_ocp.log]
[2020-02-04 17:24:42.034660] INFO [clear_step7: uninstallation of ocp done]
[2020-02-04 17:24:42.036854] INFO [clear_step6: uninstalling obproxy and remove docker, logfile: /root/oceanbase_trial/logs/uninstall_obproxy.log]
[2020-02-04 17:24:42.886604] INFO [clear_step6: uninstallation of obproxy done]
[2020-02-04 17:24:42.888851] INFO [clear_step5: uninstalling ocp and remove docker, logfile: /root/oceanbase_trial/logs/uninstall_tmp_ocp.log]
[2020-02-04 17:24:43.738130] INFO [clear_step5: uninstallation of temporary ocp done]
[2020-02-04 17:24:43.740634] INFO [clear_step4: drop ocp meta db/tenant/user/resource, logfile: /root/oceanbase_trial/logs/uninit_metadb.log]
[2020-02-04 17:24:43.813946] INFO [clear_step4: uninit of metadb done]
[2020-02-04 17:24:43.816483] INFO [clear_step3: uninstalling OB server and remove docker, logfile: /root/oceanbase_trial/logs/uninstall_ob.log]
[2020-02-04 17:24:44.832929] INFO [clear_step3: uninstallation of ob done]
[2020-02-04 17:24:44.835425] INFO [clear_step1: no need to clear for step ssh authorization]

3) Step-by-step installation or cleanup (optional)

For example, to redo step 3. Note that when reinstalling step 3, it is recommended to clean up step 3 and all subsequent steps first.

[root@xxx /root/oceanbase_trial]
#./install.sh -c 3- -i 3

In theory, every step can be reinstalled, but there may be dependencies between steps. The key question is whether the configuration changed between the original installation and the reinstallation. If the configuration file was changed, the metadatabase step must be performed again.

4) Inspection after installation

[root@xxx /root/oceanbase_trial]
#docker ps
CONTAINER ID        IMAGE                                                                   COMMAND                  CREATED             STATUS              PORTS               NAMES
04d585acbeed        reg.docker.alibaba-inc.com/antman/ocp-all-in-one:OCP233_20200102_2122   "/bin/sh -c '/usr/bi..."   12 minutes ago      Up 12 minutes                           ocp
17097103b74b        reg.docker.alibaba-inc.com/antman/obproxy:OBP156_20200102_2110          "sh start_obproxy.sh"    13 minutes ago      Up 13 minutes                           obproxy
dca537d692ea        reg.docker.alibaba-inc.com/antman/ob-docker:OB1478_20200102_2113        "/usr/bin/supervisor..."   22 minutes ago      Up 22 minutes                           META_OB_ZONE_1

Open a browser and access http://11.xxx.87.5:8080. If the login page appears, the installation has basically succeeded.

After OCP is installed, to install an OB cluster, refer to the 2.2 installation document in the package downloaded from the official website.

5. Common problems and diagnosis methods

5.1 Common diagnosis methods

1) View the corresponding log

Each installation step has a corresponding log file. Start by getting information from that log file.

[root@xxx /root/oceanbase_trial]
#tail /root/oceanbase_trial/logs/install_tmp_ocp.log
[2020-02-04 17:20:40.305403] INFO [waiting ocp to be ready on host 11.xxx.xxx.5 for 7 Minites]
[2020-02-04 17:21:10.313486] INFO [waiting ocp to be ready on host 11.xxx.xxx.5 for 8 Minites]
[2020-02-04 17:21:40.321695] INFO [waiting ocp to be ready on host 11.xxx.xxx.5 for 8 Minites]
[2020-02-04 17:22:10.330312] INFO [waiting ocp to be ready on host 11.xxx.xxx.5 for 9 Minites]
[2020-02-04 17:22:40.338612] INFO [waiting ocp to be ready on host 11.xxx.xxx.5 for 9 Minites]
[2020-02-04 17:23:10.346736] INFO [waiting ocp to be ready on host 11.xxx.xxx.5 for 10 Minites]
[2020-02-04 17:23:40.355044] ERROR [ANTMAN-503: timeout( 10 Minites) on waiting ocp ready, URL=http://11.xxx.xxx.5:8080/services?Action=GetObProxyConfig&User_ID=admin&UID=alibaba]
[2020-02-04 17:23:40.360255] WARN [error ERROR exists in /root/oceanbase_trial/logs/install_tmp_ocp.log]
[2020-02-04 17:23:40.817147] INFO [install_ocp.sh finished and reg.docker.alibaba-inc.com/antman/ocp-all-in-one:OCP233_20200102_2122 started on 11.xxx.xxx.5]
[2020-02-04 17:23:40.819777] ERROR [ANTMAN-308: tmp_ocp docker on 11.xxx.xxx.5 is NOT started]
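When several steps have run, scanning all step logs for ERROR entries at once is quicker than tailing each file. A sketch that relies only on the "] ERROR [" marker visible in the excerpt above; first_errors and LOG_DIR are illustrative names, and -H is the common grep option for always printing the file name:

```shell
#!/bin/sh
# first_errors LOGDIR: print file name, line number and text of every ERROR line in the step logs.
first_errors() {
    # -F treats the pattern as a fixed string, so the brackets need no escaping.
    grep -HnF '] ERROR [' "$1"/*.log 2>/dev/null
}

LOG_DIR="${LOG_DIR:-/root/oceanbase_trial/logs}"
first_errors "$LOG_DIR" || echo "no ERROR lines found under $LOG_DIR"
```

Matching on the bracketed severity field skips lines such as the WARN entry above, which only mentions the word ERROR in its message.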

2) View installation script (optional)

Using the information in the log file, read the corresponding logic in the installation script to understand how the error is raised. The installation scripts are not complicated.

3) View container run log

[root@xxx /root/oceanbase_trial]
#docker logs ocp

Look for clues in the log.

4) Reinstall one or more steps

Based on the clues above, work out which steps the problem may relate to, then clean up those steps and run them again.

[root@xxx /root/oceanbase_trial]
#./install.sh -c 3- -i 3

5.2 Common problems

In general, the problems come down to resources and environment (listening processes, the MySQL client, and so on).

1) waiting on observer ready on timeout failed

The OB container installation step starts an OBServer process. Before startup there is an IO performance test and an OB initialization (bootstrap) operation, so this step takes 5-10 minutes. The main installation program checks whether the process is ready by connecting with the MySQL client using an account name and password.

If you see this timeout error, first check whether the OB container is running normally and whether OBServer's listening ports 2881 and 2882 are up. If the container exists but its status is Exited, start it manually once.

[root@xxx /root/oceanbase_trial]
#docker ps -a
CONTAINER ID        IMAGE                                                                   COMMAND                  CREATED              STATUS                          PORTS               NAMES
4eaf2d600d31        reg.docker.alibaba-inc.com/antman/ob-docker:OB1478_20200102_2113        "/usr/bin/supervisor..."   6 minutes ago        Up 6 minutes                                        META_OB_ZONE_1

-- If the container META_OB_ZONE_1 exists but its status is Exited, start it manually once

[root@xxx /root/oceanbase_trial]
#docker start META_OB_ZONE_1

[root@xxx /root/oceanbase_trial]
#netstat -ntlp |grep 2881
[root@xxx /root/oceanbase_trial]
#which mysql
[root@xxx /root/oceanbase_trial]
#mysql -h127.1 -uroot@sys -P2881 -psys1234

If the container and ports are normal, check whether the MySQL client environment on the host is broken.

If the OBServer process or its listening ports are abnormal, check whether it is a resource problem. If the OB container is given too little memory, the OBServer process may fail to start.

For specific confirmation, you need to enter the container to view the operation log of the OBServer process.
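The readiness check described above (poll with the MySQL client until the OBServer answers or a timeout is reached) can be reproduced by hand with a generic retry loop. This is a sketch of the idea, not the installer's code; wait_until is a made-up helper, and the mysql command line in the comment reuses the connection parameters shown above.

```shell
#!/bin/sh
# wait_until TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts.
wait_until() {
    tries=$1; delay=$2; shift 2
    n=1
    while :; do
        if "$@" >/dev/null 2>&1; then
            echo "ready after $n attempt(s)"
            return 0
        fi
        if [ "$n" -ge "$tries" ]; then
            echo "timeout after $tries attempt(s)" >&2
            return 1
        fi
        n=$(( n + 1 ))
        sleep "$delay"
    done
}

# Example (not executed here): poll the OBServer every 30 seconds, up to 20 times.
# wait_until 20 30 mysql -h127.1 -uroot@sys -P2881 -psys1234 -e 'select 1'
```

If the loop times out while the ports are listening, the mysql client itself is the next thing to check, as noted above.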

2) waiting ocp to be ready on host timeout failed

In the step of temp OCP installation, the main program will frequently check whether the API of OCP container is ready. The detection method is as follows:

[root@xxx /root/oceanbase_trial]
#curl -s "http://11.xxx.87.5:8080/services?Action=GetObProxyConfig&User_ID=admin&UID=alibaba"
{"Code":200,"Cost":4,"Data":{"ObRootServiceInfoUrlList":[{"ObRegion":"obcluster","ObRootServiceInfoUrl":"http://11.xxx.87.5:8080/services?Action=ObRootServiceInfo&User_ID=admin&UID=alibaba&ObRegion=obcluster"}],"ObProxyBinUrl":"http://11.xxx.87.5:8080/client?Action=GetObProxy&User_ID=admin&UID=alibaba","Version":"7a7b4f210f02e4c9549fc9c6085e4299","ObProxyDatabaseInfo":{"MetaDataBase":"http://11.xxx.87.5:8080/services?Action=ObRootServiceInfo&User_ID=admin&UID=alibaba&ObRegion=obdv1","DataBase":"obproxy","User":"root@obproxy","Password":"**********"}},"Message":"successful","Success":true,"Trace":"11.xxx.87.5:11.xxx.87.5:1580808939674"}

If the command returns output, the API is normal. Otherwise, the wait eventually times out.

[root@xxx /root/oceanbase_trial]
#docker ps -a
CONTAINER ID        IMAGE                                                                   COMMAND                  CREATED              STATUS                          PORTS               NAMES
78c80e4ad822        reg.docker.alibaba-inc.com/antman/ocp-all-in-one:OCP233_20200102_2122   "/bin/sh -c '/usr/bi..."   About a minute ago   Exited (2) About a minute ago                       ocp
4eaf2d600d31        reg.docker.alibaba-inc.com/antman/ob-docker:OB1478_20200102_2113        "/usr/bin/supervisor..."   6 minutes ago        Up 6 minutes                                        META_OB_ZONE_1

-- If the container ocp exists but its status is Exited, start it manually once

[root@xxx /root/oceanbase_trial]
#docker start ocp

To analyze this problem, check which TCP ports are currently being listened on, and see whether an old process or some other process occupies the ports that OCP uses (8001 and 8080).
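A plain grep for the port number also matches longer numbers such as 18080, so the sketch below filters the netstat -ntlp output on the local address column instead. It assumes the usual net-tools column layout (local address in the fourth column); listeners_on is a name made up for this example.

```shell
#!/bin/sh
# listeners_on PORT: filter `netstat -ntlp` output on stdin, keeping only sockets
# whose local address (4th column) ends in :PORT.
listeners_on() {
    awk -v port="$1" '$4 ~ (":" port "$")'
}

# Show which processes, if any, hold the OCP ports.
if command -v netstat >/dev/null 2>&1; then
    for port in 8001 8080; do
        echo "port $port:"
        netstat -ntlp 2>/dev/null | listeners_on "$port"
    done
fi
```

Any line printed under a port before OCP is installed points to a leftover or unrelated process that has to be stopped first.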

Apply for a free trial of OceanBase version 2.2 now

"OceanBase version 2.2" is officially available on the official website! OceanBase version 2.2 is the stable version that successfully supported Tmall's Double 11 promotion in 2019, and it is also the version used for the TPC-C test that ranked first in TPC-C performance. OceanBase version 2.2 is widely used not only in Ant Financial Services and online banking, but also in a number of financial institutions.

Want to try OceanBase version 2.2 now?

Free download link: https://oceanbase.alipay.com/download/resource

If you run into problems during installation or use and would like to talk with OceanBase's front-line experts, join the OceanBase technical exchange DingTalk group: open DingTalk, search for group number 21949783 and add the note "2.2" to join.

We attach great importance to the experience of every developer and user, and look forward to your valuable feedback.

Tags: Database Docker ssh MySQL

Posted on Wed, 04 Mar 2020 22:37:06 -0800 by truckie2