Author Topic: replicate mysql with MariaDB-Galera-cluster on 3 nodes  (Read 12511 times)

golfreeze

  • Administrator
  • Hero Member
  • *****
  • Posts: 2145
    • View Profile
    • Meditation, the Four Satipatthana, temple visits and recommendations, free Dhamma CDs
    • Email
replicate mysql with MariaDB-Galera-cluster on 3 nodes
« on: September 27, 2015, 08:47:49 AM »
Installing MariaDB Galera MySQL Cluster on Debian 7 VPS
http://blog.networkpresence.co/?p=4297

golfreeze

Re: replicate mysql with MariaDB-Galera-cluster on 3 nodes
« Reply #1 on: September 27, 2015, 08:48:32 AM »
At Network Presence we don’t meter or charge for data exchanged between customer VPS, so you can purchase 3 x VPS from us and use open-source Clustered Database technologies described below to create a High Availability and scaled MySQL compatible database for your sites and content. And it’s a Cluster that doesn’t need any shared, distributed or clustered filesystem for the storage of the MySQL Databases themselves (that’s because this data is replicated to each node by the MariaDB Galera software itself).

Here are the steps (down to the commands to run as root) to install and configure the MariaDB Galera Clustered Database server software on Debian 7 running across a 3-Node Cluster (3 nodes improve DB consistency and response time, and reduce the split-brain occurrences that force full DB re-syncs).

Also note that currently (in MariaDB 5.5 at least) the MariaDB Galera Cluster only supports the InnoDB/XtraDB storage engine, so ensure that your Databases to be Clustered are InnoDB based for DB type.
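To check in advance for tables that would not replicate, you can list non-InnoDB tables via information_schema. A sketch follows; the query is standard SQL, and the root password placeholder matches the one used later in this guide:

```shell
# Build the query that lists tables not using the InnoDB engine
# (system schemas excluded). Such tables would not be replicated by
# Galera and should be converted before clustering.
CHECK_SQL="SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE engine <> 'InnoDB'
  AND table_schema NOT IN ('mysql','information_schema','performance_schema');"
echo "$CHECK_SQL"
# Once MariaDB is installed, run it on each node:
# mysql -uroot -pMYSQLROOTPASSWD -e "$CHECK_SQL"
```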

Some Linux or system level requirements or pre-install setup tasks are:

1) Ensure the /etc/hosts files are the same on each Node and that it contains entries for each DB Node’s IP addresses and hostname (both the node hostname and its fully qualified hostname).

eg:
cat << EOF >> /etc/hosts
192.168.0.1 node1 node1.domain.name
192.168.0.2 node2 node2.domain.name
192.168.0.3 node3 node3.domain.name
EOF

2) Ensure that each Node’s Firewall permits the other Nodes’ IP addresses access to ports 4444 and 4567, as well as the standard MySQL port 3306.
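As a sketch, the matching firewall rules can be generated with a small shell loop. The IPs are the example node addresses from step 1, and the function only prints the iptables commands so you can review them before applying:

```shell
# Print (do not apply) iptables ACCEPT rules for Galera traffic
# between the 3 example nodes on ports 3306, 4444 and 4567.
galera_fw_rules() {
  for ip in 192.168.0.1 192.168.0.2 192.168.0.3; do
    for port in 3306 4444 4567; do
      printf 'iptables -A INPUT -p tcp -s %s --dport %s -j ACCEPT\n' "$ip" "$port"
    done
  done
}
galera_fw_rules        # review the output, then: galera_fw_rules | sh
```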

Then, run/do the following on each Node to get the MariaDB Galera software installed:

a) Install prerequisite Debian Packages:

apt-get -y install perl python-software-properties rsync

b) Setup the MariaDB 5.5 Repository (localised for Australia):

apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 0xcbcb082a1bb943db
add-apt-repository 'deb http://mirror.aarnet.edu.au/pub/MariaDB/repo/5.5/debian wheezy main'

Note: using MariaDB version 5.5 as 10.0 is still alpha/beta at this time.

c) Install MariaDB Galera:

apt-get install mariadb-galera-server galera

d) Ensure that there are no “empty” (anonymous) MySQL users:

mysql -e "SET wsrep_on = OFF; DELETE FROM mysql.user WHERE user = '';"

Once all 3 Nodes have the above software installed and the initial MySQL settings applied, bring them all up and online (mysql running with the default Debian settings on each node is ok at this point).

Install the relevant Database content on the “first” Node, which we’ll call “Node1” or reference as “node1”; this may well involve knowing the MySQL ‘root’ username and its password.

After that, on Node 1, reconfigure its MySQL software as the initiating node of a Galera Cluster by installing the following configuration file and restarting Node1’s MySQL server software, as per:

d) cat << EOF > /etc/mysql/conf.d/cluster.cnf
[server]
bind_address = 0.0.0.0 ## Be reachable via the network
#
[mysqld]
##
## gcomm:// only for initial start of first Node
## After that it should list each Node's IP as it's joined to the Cluster, ending up with:
##wsrep_cluster_address = 'gcomm://192.168.0.2,192.168.0.3'
wsrep_cluster_address = 'gcomm://'
wsrep_provider = /usr/lib/galera/libgalera_smm.so
##wsrep_retry_autocommit = 0
wsrep_sst_method = rsync
binlog_format=ROW
query_cache_size=0
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
#
# optional additional suggested settings:
##innodb_buffer_pool_size=28G
##innodb_log_file_size=100M
##innodb_file_per_table
##innodb_flush_log_at_trx_commit=2
#
[mysqld_safe]
log-error=/var/log/mysql/mysqld_safe.log
EOF

e) Restart the MySQL Server on Node 1 with the above /etc/mysql/conf.d/cluster.cnf file installed:

/etc/init.d/mysql restart

f) Check how the MySQL Galera Cluster server is doing after the restart with:

tail -f /var/log/mysql/mysqld_safe.log

g) Check how the MariaDB Galera Server is running as a Cluster with the local mysql client command:

mysql -uroot -pMYSQLROOTPASSWD -e "SHOW STATUS LIKE 'wsrep_%';"

You should see output like:

+----------------------------+--------------------------------------+
| Variable_name | Value |
+----------------------------+--------------------------------------+
| wsrep_local_state_uuid | cafebeef-0123-1234-9876-5ebcdef01234 |
| wsrep_protocol_version | 4 |
| wsrep_last_committed | 1 |
| wsrep_replicated | 0 |
| wsrep_replicated_bytes | 0 |
| wsrep_received | 2 |
| wsrep_received_bytes | 232 |
| wsrep_local_commits | 0 |
| wsrep_local_cert_failures | 0 |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_avg | 0.500000 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_avg | 0.000000 |
| wsrep_flow_control_paused | 0.000000 |
| wsrep_flow_control_sent | 0 |
| wsrep_flow_control_recv | 0 |
| wsrep_cert_deps_distance | 0.000000 |
| wsrep_apply_oooe | 0.000000 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 0.000000 |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 0.000000 |
| wsrep_local_state | 4 |
| wsrep_local_state_comment | Synced |
| wsrep_cert_index_size | 0 |
| wsrep_causal_reads | 0 |
| wsrep_incoming_addresses | 192.168.0.1:3306 |
| wsrep_cluster_conf_id | 4 |
| wsrep_cluster_size | 1 |
| wsrep_cluster_state_uuid | cafebeef-0123-1234-9876-5ebcdef01234 |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_local_index | 1 |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy |
| wsrep_provider_version | 23.2.7-wheezy(r) |
| wsrep_ready | ON |
+----------------------------+--------------------------------------+

With Node 1 now running a single-node Galera Cluster, you can configure and start Nodes 2 and 3 with the following commands, done and verified on Node 2 and then on Node 3:

h) Install the MariaDB Cluster config file, listing Node 1’s IP address as the initial ‘gcomm’ Cluster address; on Node 2 do:

cat << EOF > /etc/mysql/conf.d/cluster.cnf
[server]
bind_address = 0.0.0.0 ## Be reachable via the network
#
[mysqld]
##
## gcomm:// only for initial start of first Node
## After that it should list each Node's IP as it's joined to the Cluster, ending up with:
##wsrep_cluster_address = 'gcomm://192.168.0.1,192.168.0.3' (for Node 2) and
##wsrep_cluster_address = 'gcomm://192.168.0.1,192.168.0.2' (for Node 3) once all nodes are in the Cluster.
wsrep_cluster_address = 'gcomm://192.168.0.1'
wsrep_provider = /usr/lib/galera/libgalera_smm.so
##wsrep_retry_autocommit = 0
wsrep_sst_method = rsync
binlog_format=ROW
query_cache_size=0
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
#
# optional additional suggested settings:
##innodb_buffer_pool_size=28G
##innodb_log_file_size=100M
##innodb_file_per_table
##innodb_flush_log_at_trx_commit=2
#
[mysqld_safe]
log-error=/var/log/mysql/mysqld_safe.log
EOF

i) Restart the MySQL Server on Node 2 with the above /etc/mysql/conf.d/cluster.cnf file installed:

/etc/init.d/mysql restart

j) Check how the MySQL Galera Cluster server is doing after the restart with:
Note: any UUIDs are replaced with ‘…’ in the following output.

tail -f /var/log/mysql/mysqld_safe.log

You should see log file entries on Node 1 about each node joining and being synchronised into the Galera MySQL Cluster, entries like:

[Note] WSREP: declaring ... stable
[Note] WSREP: Node ... state prim
[Note] WSREP: (..., 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://192.168.0.3:4567
[Note] WSREP: view(view_id(PRIM,...,11) memb {
...,
...,
} joined {
} left {
} partitioned {
...,
})
[Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
[Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
[Note] WSREP: forgetting ... (tcp://192.168.0.3:4567)
[Note] WSREP: (..., 'tcp://0.0.0.0:4567') turning message relay requesting off
[Note] WSREP: STATE EXCHANGE: sent state msg: ...
[Note] WSREP: STATE EXCHANGE: got state msg: ... from 0 (node1.domain.name)
[Note] WSREP: STATE EXCHANGE: got state msg: ... from 1 (node2.domain.name)
[Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 8,
members = 2/2 (joined/total),
act_id = 162,
last_appl. = 0,
protocols = 0/4/2 (gcs/repl/appl),
group UUID = ...
[Note] WSREP: Flow-control interval: [23, 23]
[Note] WSREP: New cluster view: global state: ...:162, view# 9: Primary, number of nodes: 2, my index: 1, protocol version 2
[Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
[Note] WSREP: Assign initial position for certification: 162, protocol version: 2
14:24:37 [Note] WSREP: cleaning up ... (tcp://192.168.0.3:4567)
[Note] WSREP: declaring ... stable
[Note] WSREP: declaring ... stable
[Note] WSREP: Node ... state prim
[Note] WSREP: view(view_id(PRIM,...,12) memb {
...,
...,
...,
} joined {
} left {
} partitioned {
})
[Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 3
[Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
[Note] WSREP: STATE EXCHANGE: sent state msg: ...
[Note] WSREP: STATE EXCHANGE: got state msg: ... from 0 (node1.domain.name)
[Note] WSREP: STATE EXCHANGE: got state msg: ... from 1 (node2.domain.name)
[Note] WSREP: STATE EXCHANGE: got state msg: ... from 2 (node3.domain.name)
[Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 9,
members = 3/3 (joined/total),
act_id = 162,
last_appl. = 0,
protocols = 0/4/2 (gcs/repl/appl),
group UUID = ...
[Note] WSREP: Flow-control interval: [28, 28]
[Note] WSREP: New cluster view: global state: ...:162, view# 10: Primary, number of nodes: 3, my index: 1, protocol version 2
[Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
[Note] WSREP: Assign initial position for certification: 162, protocol version: 2
[Note] WSREP: Member 2 (node3.domain.name) synced with group.

k) Again, use the mysql client on Node 2 to query Node 2’s Galera Server for its view of the Cluster membership and size.

Run on Node 2:

mysql -uroot -pMYSQLROOTPASSWD -e "SHOW STATUS LIKE 'wsrep_%';"

With Nodes 1 and 2 now running a dual-node Galera Cluster, you can configure and start Node 3 with the following commands, done and verified on Node 3:

l) Install the MariaDB Cluster config file, listing Node 1’s and Node 2’s IP addresses as the initial ‘gcomm’ Cluster addresses; on Node 3 do:

cat << EOF > /etc/mysql/conf.d/cluster.cnf
[server]
bind_address = 0.0.0.0 ## Be reachable via the network
#
[mysqld]
##
## gcomm:// only for initial start of first Node
## After that it should list each Node's IP as it's joined to the Cluster, ending up with:
##wsrep_cluster_address = 'gcomm://192.168.0.1,192.168.0.2' (for Node 3) once all nodes are in the Cluster.
wsrep_cluster_address = 'gcomm://192.168.0.1,192.168.0.2'
wsrep_provider = /usr/lib/galera/libgalera_smm.so
##wsrep_retry_autocommit = 0
wsrep_sst_method = rsync
binlog_format=ROW
query_cache_size=0
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
#
# optional additional suggested settings:
##innodb_buffer_pool_size=28G
##innodb_log_file_size=100M
##innodb_file_per_table
##innodb_flush_log_at_trx_commit=2
#
[mysqld_safe]
log-error=/var/log/mysql/mysqld_safe.log
EOF

m) Restart the MySQL Server on Node 3 with the above /etc/mysql/conf.d/cluster.cnf file installed:

/etc/init.d/mysql restart

n) Check the mysqld_safe.log logfiles on all nodes to see that Node 3 also joins and synchronises with the Cluster.

At this point you have all 3 x Nodes in the Galera MySQL Cluster and you can now update Node 1 and Node 2’s cluster.cnf file to set the IP Addresses of all Nodes in the Cluster.

o) On Node 1, change the line in /etc/mysql/conf.d/cluster.cnf that says:

From:
wsrep_cluster_address = 'gcomm://'

To:
wsrep_cluster_address = 'gcomm://192.168.0.1,192.168.0.2,192.168.0.3'

Then restart MySQL on Node 1, checking the relevant log file and querying the Cluster membership with the mysql client once Node 1’s mysql server software has restarted:

/etc/init.d/mysql restart
tail -f /var/log/mysql/mysqld_safe.log
mysql -uroot -pMYSQLROOTPASSWD -e "SHOW STATUS LIKE 'wsrep_%';"

Once Node 1 has been reconfigured to know of all Cluster Members, do the same step above on Node 2 and Node 3, setting Node 2 and Node 3’s wsrep_cluster_address in cluster.cnf to:

wsrep_cluster_address = 'gcomm://192.168.0.1,192.168.0.2,192.168.0.3'

Then restart their MySQL Servers and check that each of them rejoins the live Cluster.
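The wsrep_cluster_address edit on each node can also be scripted; a sketch using sed (the helper function and its arguments are illustrative, and the file path is the cluster.cnf used throughout this guide):

```shell
# Rewrite the wsrep_cluster_address line of a cluster.cnf-style file in place.
# $1 = path to the config file, $2 = comma-separated list of node IPs
set_gcomm_addr() {
  sed -i "s|^wsrep_cluster_address =.*|wsrep_cluster_address = 'gcomm://$2'|" "$1"
}
# On each node, then restart mysql:
# set_gcomm_addr /etc/mysql/conf.d/cluster.cnf 192.168.0.1,192.168.0.2,192.168.0.3
```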

You should now have a live 3-node MariaDB Galera MySQL Cluster; the status on each Node should list 3 Cluster members, via:

mysql -uroot -pMYSQLROOTPASSWD -e "SHOW STATUS LIKE 'wsrep_%';" | egrep "wsrep_incoming_addresses|wsrep_cluster_size"
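For ongoing monitoring, the same check can be wrapped in a small function that succeeds only when the expected member count is reported; a sketch (the function name is illustrative, and the parsing assumes the pipe-formatted table shown earlier; plain `mysql -e` output may be tab-separated instead, in which case adjust the awk field separator):

```shell
# Succeed only if the SHOW STATUS output on stdin reports the expected
# cluster size. $1 = expected member count.
check_cluster_size() {
  size=$(awk -F'|' '/wsrep_cluster_size/ {gsub(/[[:space:]]/,"",$3); print $3}')
  [ "$size" = "$1" ]
}
# Example, suitable for a cron-driven health check:
# mysql -uroot -pMYSQLROOTPASSWD -e "SHOW STATUS LIKE 'wsrep_%';" | check_cluster_size 3
```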

FYI and regards,
Richard.

golfreeze

Re: replicate mysql with MariaDB-Galera-cluster on 3 nodes
« Reply #2 on: July 01, 2017, 10:54:59 AM »
I ran into a problem with the MariaDB Galera cluster when all of the cluster nodes were shut down at the same time and then left offline for a long time.
The last-synced position can drift, or stopping the mariadb service on every node at once can leave mysql shut down uncleanly, so that when the nodes are brought back up the MariaDB cluster fails to start.
Error:
It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates.
To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1.

Jul  1 10:26:28  mysqld: 2017-07-01 10:26:28  [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
Jul  1 10:26:28  mysqld: #011 at gcomm/src/pc.cpp:connect():158


Checking with netstat will likewise show that nothing is listening on port 4567.

This is easy to fix. First, stay calm; every problem has a solution.
Go to the Primary node and open /var/lib/mysql/grastate.dat.
Edit it with vi and change safe_to_bootstrap from 0 to 1, so that mariadb-cluster repairs the cluster position before it starts up:

# GALERA saved state
version: 2.1
uuid:    9acf4d34-acdb-11e6-bcc3-d3e36276629f
seqno:   15
safe_to_bootstrap: 1
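The edit can also be done non-interactively; a sketch with sed (the function name is illustrative; run this only on the node chosen to bootstrap, after confirming it holds the most recent seqno):

```shell
# Flip safe_to_bootstrap from 0 to 1 in a grastate.dat-style file.
# $1 = path to grastate.dat (normally /var/lib/mysql/grastate.dat)
allow_bootstrap() {
  sed -i 's/^safe_to_bootstrap: 0/safe_to_bootstrap: 1/' "$1"
}
# allow_bootstrap /var/lib/mysql/grastate.dat
```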

After that, run the command to bootstrap the cluster again, following the debug log at the same time:
#service mysql bootstrap ; tail -f /var/log/syslog

If it completes without errors, the position is repaired and the cluster is up and running again.
After that, start the remaining nodes in the cluster:

node2#/etc/init.d/mysql start
node3#/etc/init.d/mysql start

And that’s it. : )

golfreeze

Re: replicate mysql with MariaDB-Galera-cluster on 3 nodes
« Reply #3 on: March 17, 2018, 09:34:17 PM »
Encountered a problem where the MySQL slave stops syncing due to HA_ERR_KEY_NOT_FOUND:
Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000854, end_log_pos 859094925, Error_code: 1032

https://www.fromdual.ch/replication-troubleshooting-classic-vs-gtid
https://www.opsdash.com/blog/mysql-replication-howto.html
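For a classic (non-GTID) slave hitting error 1032, one common workaround is to skip the single failing event and let replication resume; a sketch (password placeholder as elsewhere in this guide; skipping events can hide real data drift, so re-seed the slave from a fresh dump if the rows genuinely differ):

```shell
# Build the statements that skip one event on a stopped slave, then resume.
# Use with care: each skipped event is data the slave will never apply.
SKIP_SQL="STOP SLAVE; SET GLOBAL sql_slave_skip_counter = 1; START SLAVE;"
echo "$SKIP_SQL"   # review first
# mysql -uroot -pMYSQLROOTPASSWD -e "$SKIP_SQL"
```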