OtterTune source code analysis

Before you can conveniently modify and hack on OtterTune, you first need to know its source structure and pipeline.

 

OtterTune is divided into two parts:

Server side: a MySQL database (stores the tuning data used by the ML models), Django (the front-end web interface), and Celery (schedules the ML tasks);

Client side: the target DBMS (stores the user's business data; multiple DBMSs are supported), the controller (controls the target DBMS), and the Driver (calls the controller; its entry point is fabfile.py).

 

For the meaning of each operation code (S1-S4 on the server side, C1-C6 on the client side), please refer to: https://www.cnblogs.com/pdev/p/10903628.html

 

Client Side

1. Driver

The Driver is the user's entry point for running the client side. The user does not execute the controller's commands directly, but controls it through the Driver.

The Driver is written with Python's Fabric library. The author has preset many common commands in it (such as switching the target DBMS, running OLTP-Bench, etc.).
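To see the mechanism, here is a minimal sketch of a Fabric 1.x task. This is illustration only, not OtterTune code; the task name db_status and the command it runs are made up:

# minimal Fabric 1.x sketch -- illustration only, not from OtterTune;
# the task name `db_status` and the command it runs are made up
from fabric.api import task, local

@task
def db_status():
    # running `fab db_status` from the fabfile's directory executes this
    local('service postgresql status')

OtterTune's loop and run_loops tasks below are defined the same way, which is why they are invoked as `fab loop` and `fab run_loops:max_iter=10`.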

fabfile.py

This is the core of the Driver. You call this file to run the last operation on the client side (C6).

In the C6 operation, you use `fab loop` and `fab run_loops` to periodically collect knob/metric samples. In each loop, the Driver collects the target DBMS's info, uploads it to the server, gets a new recommended configuration, installs that configuration, and restarts the DBMS. Users keep running loops until they are satisfied with the recommended configuration.

  fab loop               runs one single loop. 

@task
def loop():
    # free cache: clean the Linux page cache
    free_cache()

    # remove the oltpbench log and controller log
    clean_logs()

    # restart the database: shell "sudo service postgresql restart"
    restart_database()

    # check whether there is enough free space on disk
    if check_disk_usage() > MAX_DISK_USAGE:
        LOG.warning('Exceeds max disk usage %s', MAX_DISK_USAGE)

    # run the controller as another process. Run the following command line in the "../controller" folder:
    # sudo gradle run -PappArgs="-c CONF_controller_config -d output/" --no-daemon > CONF_controller_log
    p = Process(target=run_controller, args=())
    p.start()
    LOG.info('Run the controller')

    # check whether the controller is ready (has created the log files)
    while not _ready_to_start_oltpbench():
        pass
    # run oltpbench as a background job. Run the following command line in the CONF_oltpbench_home folder:
    # ./oltpbenchmark -b CONF_oltpbench_workload -c CONF_oltpbench_config --execute=true -s 5 -o outputfile > CONF_oltpbench_log 2>&1 &
    run_oltpbench_bg()
    LOG.info('Run OLTP-Bench')

    # the controller starts the first collection

    # check whether 'Warmup complete, starting measurements' is in the CONF_oltpbench_log file
    while not _ready_to_start_controller():
        pass
    # shell 'sudo kill -2 CTL_PID'
    # send SIGINT to the process CTL_PID, where CTL_PID is the content of '../controller/pid.txt'
    signal_controller()
    LOG.info('Start the first collection')

    # stop the experiment

    # check whether 'Output Raw data into file' is in the CONF_oltpbench_log file
    while not _ready_to_shut_down_controller():
        pass
    # shell 'sudo kill -2 CTL_PID'
    # send SIGINT to the process CTL_PID, where CTL_PID is the content of '../controller/pid.txt'
    signal_controller()
    LOG.info('Start the second collection, shut down the controller')

    p.join()

    # add a user-defined target objective
    # add_udf()

    # save the result files: 'knobs.json', 'metrics_after.json', 'metrics_before.json', 'summary.json'
    save_dbms_result()

    # upload the result to the Django web interface
    upload_result()

    # get the result
    # shell 'python3 ../../script/query_and_get.py CONF_upload_url CONF_upload_code 5'
    get_result()

    # change the target DBMS config
    # shell 'sudo python3 PostgresConf.py next_config CONF_database_conf'
    change_conf()
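The _ready_to_* checks above spin until a marker string shows up in a log file. Based purely on the comments in loop(), and assuming the driver keeps its settings in a CONF dict (the CONF_* placeholders in the comments suggest this), one of them likely looks roughly like the following; check fabfile.py for the real body:

import os

# rough reconstruction from the comments above -- the real helper lives in
# fabfile.py; CONF['oltpbench_log'] stands in for the CONF_oltpbench_log
# placeholder used in the comments
def _ready_to_start_controller():
    if not os.path.exists(CONF['oltpbench_log']):
        return False
    with open(CONF['oltpbench_log']) as f:
        return 'Warmup complete, starting measurements' in f.read()

Also note that `sudo kill -2` sends SIGINT, not a hard kill: judging from the log messages, the controller takes the first SIGINT as the signal to take the first snapshot and the second as the signal to take the final snapshot and shut down, which is why signal_controller() is called twice.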

 
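save_dbms_result() leaves four JSON files behind, and upload_result() hands them to the Django server. Below is a simplified sketch of what that upload amounts to, assuming a new_result/ endpoint on the server and best-guess field names (see fabfile.py for the exact ones):

import requests

RESULT_FILES = ('knobs', 'metrics_after', 'metrics_before', 'summary')

def upload_result(result_dir, upload_code, upload_url):
    # POST the four result files plus the user's upload code to the server;
    # the 'new_result/' endpoint and field names are assumptions
    files = {name: open('{}/{}.json'.format(result_dir, name), 'rb')
             for name in RESULT_FILES}
    try:
        return requests.post(upload_url + '/new_result/',
                             files=files, data={'upload_code': upload_code})
    finally:
        for f in files.values():
            f.close()

get_result() then polls the server (via script/query_and_get.py) until the recommended configuration is ready.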

 fab run_loops:max_iter=10    runs 10 loops. Set max_iter to change the number of iterations.

# interval (in loops) between database restores
RELOAD_INTERVAL = 10

@task
def run_loops(max_iter=1):
    # dump the database if it has not been dumped before.
    # shell 'PGPASSWORD=CONF_password pg_dump -U CONF_username -F c -d CONF_database_name > CONF_database_save_path/CONF_database_name.dump'
    dump = dump_database()

    for i in range(int(max_iter)):
        # restore the database every RELOAD_INTERVAL loops
        # shell these operations:
        #     PGPASSWORD=CONF_password dropdb -e --if-exists CONF_database_name -U CONF_username
        #     PGPASSWORD=CONF_password createdb -e CONF_database_name -U CONF_username
        #     PGPASSWORD=CONF_password pg_restore -U CONF_username -j 8 -F c -d CONF_database_name CONF_database_save_path/CONF_database_name.dump
        if RELOAD_INTERVAL > 0:
            if i % RELOAD_INTERVAL == 0:
                if i == 0 and dump is False:
                    restore_database()
                elif i > 0:
                    restore_database()

        LOG.info('The %s-th Loop Starts / Total Loops %s', i + 1, max_iter)
        loop()
        LOG.info('The %s-th Loop Ends / Total Loops %s', i + 1, max_iter)
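For example, with fab run_loops:max_iter=25 and RELOAD_INTERVAL = 10, restore_database() fires before loops 11 and 21 (i = 10 and 20). It also fires before loop 1, but only when dump_database() returned False; judging from its comment, that means it skipped dumping because a dump from an earlier run already existed, so the database may be dirty and gets reset first.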

 

 

Server Side

