1. POSTGRES-BDR WITH GOOGLE CLOUD PLATFORM
ROCKPLACE
Email: sjyun@rockplace.co.kr
http://rockplace.co.kr
Copyright ⓒ 2016 Rockplace Inc. All rights Reserved
2. 윤성재 (공작명왕)
gongjak@gmail.com https://gongjak.me
• Linux engineer
• Service operations administrator
• Database operations administrator
• Cloud solutions architect
• 1984: first encounter with the Apple II
• 1994: first encounter with Linux
• 2015: first applied PostgreSQL to a service
• 2016: "빠르게 훑어보는 구글 클라우드 플랫폼" (A Quick Look at Google Cloud Platform)
3. AGENDA
Google Cloud Platform
- Why it was chosen as the test environment
Postgres-BDR
- Installation & Setup
- Create Cluster
pgbench Test
- Running the tests
- Results
5. OpenSource PLACE, ROCKPLACE
Copyright ⓒ 2016 Rockplace Inc. All rights Reserved
Create Instance on Google Cloud Platform
• Regions: asia-east, eu-west, us-central
• 3 instances per region, 9 instances in total
• In each region, 2 instances run BDR and 1 is used for the pgbench tests
• OS: Debian 8 (Jessie)
9. Install Postgres-BDR
The BDR extension must be installed. (When installing from packages, this is postgresql-bdr-contrib.)
sudo sh -c 'echo "deb [arch=amd64] http://packages.2ndquadrant.com/bdr/apt/ jessie-2ndquadrant main" >> /etc/apt/sources.list.d/2ndquadrant.list'
wget --quiet -O - http://packages.2ndquadrant.com/bdr/apt/AA7A6805.asc | sudo apt-key add -
sudo apt-get update
sudo apt-get -y install postgresql-bdr-9.4-bdr-plugin
10. Reinitialize PostgreSQL
Use this when PostgreSQL seems to be in a bad state and you want to start over. These commands delete the existing 9.4 cluster, including all of its data.
sudo service postgresql stop
sudo rm -rf /var/lib/postgresql/9.4
sudo rm -rf /etc/postgresql/9.4/main
sudo rm -rf /service/db/pgsql/9.4/main
sudo chown -R postgres:postgres /var/lib/postgresql
sudo pg_createcluster -d /var/lib/postgresql/9.4/main 9.4 main
sudo service postgresql start
11. Add DB account
Username: dbadmin Password: dbadmin1234
sudo su - postgres
psql -d postgres -c "CREATE ROLE dbadmin LOGIN PASSWORD 'dbadmin1234' SUPERUSER;"
12. PostgreSQL settings for BDR
/etc/postgresql/9.4/main/postgresql.conf
• listen_addresses = '*'
• shared_preload_libraries = 'bdr'
• wal_level = 'logical'
• track_commit_timestamp = on
• max_connections = 100
• max_wal_senders = 10
• max_replication_slots = 10
• max_worker_processes = 10
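Before restarting PostgreSQL it can help to confirm that the three settings BDR depends on are really present in the file. The following is a minimal sketch, not part of the original deck: the function names are hypothetical, the parser only understands plain `key = value` lines, and it treats `#` as a comment marker everywhere (so a `#` inside a quoted value would confuse it).

```python
# Settings from the slide that BDR depends on, with the values shown there.
REQUIRED = {
    "shared_preload_libraries": "bdr",
    "wal_level": "logical",
    "track_commit_timestamp": "on",
}


def parse_conf(text):
    """Parse simple 'key = value' lines, ignoring comments and blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def missing_bdr_settings(text):
    """Return the required settings that are absent or set to another value."""
    found = parse_conf(text)
    problems = []
    for key, expected in REQUIRED.items():
        actual = found.get(key)
        if key == "shared_preload_libraries":
            # This setting is a comma-separated list that must contain 'bdr'.
            libs = [lib.strip() for lib in (actual or "").split(",")]
            ok = expected in libs
        else:
            ok = actual == expected
        if not ok:
            problems.append(key)
    return problems


sample = """
listen_addresses = '*'
shared_preload_libraries = 'bdr'
wal_level = 'logical'
track_commit_timestamp = on
max_connections = 100
"""
print(missing_bdr_settings(sample))
```

An empty list means all three settings match the slide; any names returned are the ones to fix before starting the server.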
13. PostgreSQL settings for BDR
/etc/postgresql/9.4/main/pg_hba.conf
host all all 10.128.0.0/9 md5
# The standby server must connect with a user that has replication privileges.
# for BDR setting
host replication postgres 10.128.0.0/9 trust
# Set the IPv4 entry to 10.128.0.0/9 as below so that hosts on the internal network can also connect
host all all 10.128.0.0/9 trust
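To see which client addresses the 10.128.0.0/9 entries actually match, the range can be checked with Python's standard `ipaddress` module. This is only an illustrative sketch; the sample addresses are hypothetical and are not taken from the test environment.

```python
import ipaddress

# The CIDR block used in the pg_hba.conf entries above.
BDR_NET = ipaddress.ip_network("10.128.0.0/9")


def allowed(addr):
    """Return True if addr falls inside the pg_hba.conf range."""
    return ipaddress.ip_address(addr) in BDR_NET


# First and last address covered by the block.
print(BDR_NET[0], BDR_NET[-1])

# Hypothetical client addresses, for illustration only.
for addr in ["10.140.0.2", "10.0.0.5", "192.168.0.10"]:
    print(addr, allowed(addr))
```

A /9 block starting at 10.128.0.0 covers the upper half of 10.0.0.0/8, so addresses below 10.128.0.0 are not matched by these entries.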
15. Setup Postgres-BDR
All nodes
sudo su - postgres
createdb testdb
psql
postgres=# \connect testdb
You are now connected to database "testdb" as user "postgres".
testdb=# CREATE EXTENSION btree_gist;
CREATE EXTENSION
testdb=# CREATE EXTENSION bdr;
CREATE EXTENSION
16. Setup Postgres-BDR
Instance creation order and cluster join
1) bdr-asia-1
2) bdr-asia-2
3) bdr-us-1
4) bdr-us-2
5) bdr-eu-1
6) bdr-eu-2
17. Setup Postgres-BDR
Internal layout of the BDR cluster
[Diagram of the six nodes: bdr-asia-1, bdr-asia-2, bdr-us-1, bdr-us-2, bdr-eu-1, bdr-eu-2]
18. Setup Postgres-BDR
Configuration on bdr-asia-1 (create the main node group)
testdb=# SELECT bdr.bdr_group_create(
local_node_name := 'asia-node-001',
node_external_dsn := 'host=bdr-asia-1 port=5432 dbname=testdb'
);
bdr_group_create
------------------
(1 row)
19. Setup Postgres-BDR
Configuration on bdr-asia-1 (verify)
testdb=# SELECT bdr.bdr_node_join_wait_for_ready();
bdr_node_join_wait_for_ready
------------------------------
(1 row)
20. Setup Postgres-BDR
Configuration on bdr-asia-2 and the remaining servers (join the main group)
testdb=# SELECT bdr.bdr_group_join(
local_node_name := 'asia-node-002',
node_external_dsn := 'host=bdr-asia-2 port=5432 dbname=testdb',
join_using_dsn := 'host=bdr-asia-1 port=5432 dbname=testdb'
);
bdr_group_join
----------------
(1 row)
21. Setup Postgres-BDR
Configuration on bdr-asia-2 and the remaining servers (verify)
testdb=# SELECT bdr.bdr_node_join_wait_for_ready();
bdr_node_join_wait_for_ready
------------------------------
(1 row)
22. Add node
• Take a full backup from the SOURCE node with pg_basebackup
• Use the bdr_init_copy command
• Key configuration settings: set them generously large
max_wal_senders = 10
max_replication_slots = 10
max_worker_processes = 10
• Check bdr_init_copy_postgres.log
23. Remove node
• Nodes are supposed to be easy to remove with a function, but in practice the node's entry is still there when you check after running it. (select * from bdr.bdr_nodes;)
• Remove a single node
SELECT bdr.bdr_part_by_node_names(ARRAY['node-1']);
• Remove several nodes at once
SELECT bdr.bdr_part_by_node_names(ARRAY['node-1', 'node-2', 'node-3']);
• Remove asia-node-003
SELECT bdr.bdr_part_by_node_names(ARRAY['asia-node-003']);
24. Monitoring
• Monitoring nodes
SELECT * FROM bdr.bdr_nodes;
• Monitoring connected peers using pg_stat_replication
SELECT * FROM pg_stat_replication;
SELECT pg_xlog_location_diff(pg_current_xlog_insert_location(), flush_location) AS lag_bytes,
pid, application_name
FROM pg_stat_replication;
• Monitoring replication slots
SELECT * FROM pg_replication_slots;
SELECT slot_name, database, active,
pg_xlog_location_diff(pg_current_xlog_insert_location(), restart_lsn) AS retained_bytes
FROM pg_replication_slots
WHERE plugin = 'bdr';
• Monitoring global DDL locks
SELECT * FROM bdr.bdr_global_locks;
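The lag_bytes and retained_bytes columns above are byte distances between two WAL positions (LSNs). A PostgreSQL LSN is a 64-bit position printed as two hexadecimal numbers separated by a slash, so the same subtraction can be sketched outside the database. The helper names and sample LSN values below are hypothetical.

```python
def lsn_to_bytes(lsn):
    """Convert an LSN such as '0/3000060' into an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the upper 32 bits, the part after is the lower 32 bits.
    return (int(high, 16) << 32) + int(low, 16)


def lsn_diff(current, other):
    """Byte distance between two LSNs: current position minus the other one."""
    return lsn_to_bytes(current) - lsn_to_bytes(other)


# Hypothetical values: current insert location versus a peer's flush location.
print(lsn_diff("0/3000060", "0/3000000"))
```

A value that keeps growing for one peer or slot suggests that node is falling behind or is disconnected while WAL is being retained for it.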
27. Running the pgbench test
pgbench -U postgres -p 5432 -i testdb
NOTICE: table "pgbench_history" does not exist, skipping
NOTICE: table "pgbench_tellers" does not exist, skipping
NOTICE: table "pgbench_accounts" does not exist, skipping
NOTICE: table "pgbench_branches" does not exist, skipping
creating tables...
100000 of 100000 tuples (100%) done (elapsed 0.15 s, remaining 0.00 s).
vacuum...
set primary keys...
done.
-U postgres : connect as the postgres user
-p 5432 : connect on port 5432
-i : initialize the test tables
testdb : run against the testdb database
28. Running the pgbench test
pgbench -U postgres -p 5432 -S -c 10 -t 10000 testdb
starting vacuum...end.
transaction type: SELECT only
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 100000/100000
latency average: 0.000 ms
tps = 11496.597122 (including connections establishing)
tps = 11520.436044 (excluding connections establishing)
-U postgres : connect as the postgres user
-p 5432 : connect on port 5432
-S : run the built-in SELECT-only test
-c 10 : 10 concurrent clients
-t 10000 : 10000 transactions per client
testdb : run against the testdb database
29. pgbench test with a custom query
• create table tbl_bdr(c1 int);
• vi test.sql
insert into tbl_bdr values (3);
• pgbench -U postgres -n -S -T 60 -c 10 -f test.sql testdb
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 60 s
number of transactions actually processed: 379920
latency average: 1.579 ms
tps = 6331.344284 (including connections establishing)
tps = 6332.955115 (excluding connections establishing)
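When the same test is repeated on several servers, it is convenient to pull the summary figures out of the pgbench output rather than copy them by hand. The sketch below is an illustration only: `parse_pgbench` is a hypothetical helper, and the patterns assume the output wording shown on this slide, which may differ in other pgbench versions.

```python
import re

# Patterns for the summary lines printed in the pgbench output shown above.
PATTERNS = {
    "clients": (r"number of clients: (\d+)", int),
    "transactions": (r"number of transactions actually processed: (\d+)", int),
    "latency_ms": (r"latency average: ([\d.]+) ms", float),
    "tps_including": (r"tps = ([\d.]+) \(including connections establishing\)", float),
    "tps_excluding": (r"tps = ([\d.]+) \(excluding connections establishing\)", float),
}


def parse_pgbench(output):
    """Extract the summary figures from pgbench output into a dict."""
    result = {}
    for key, (pattern, convert) in PATTERNS.items():
        match = re.search(pattern, output)
        if match:
            result[key] = convert(match.group(1))
    return result


# The summary from the custom-query run on this slide.
sample_output = """transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 60 s
number of transactions actually processed: 379920
latency average: 1.579 ms
tps = 6331.344284 (including connections establishing)
tps = 6332.955115 (excluding connections establishing)
"""
print(parse_pgbench(sample_output))
```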
31. Ping test results between regions
Count: 100
Direction   Sent  Received  Loss  Time      RTT min/avg/max/mdev (ms)
Asia -> US  100   100       0%    99131 ms  153.147/153.280/154.927/0.547
US -> Asia  100   100       0%    99132 ms  153.192/153.291/153.849/0.403
Asia -> EU  100   100       0%    99134 ms  257.257/257.393/258.276/0.412
EU -> Asia  100   100       0%    99135 ms  257.297/257.418/258.220/0.582
US -> EU    100   100       0%    99175 ms  105.583/105.661/106.485/0.435
EU -> US    100   100       0%    99075 ms  105.564/105.661/106.434/0.132
32. Case 1. Inserting data on a single server only
Clients  Time (s)  Transactions processed  Latency average  tps (a)      tps (b)
1        600       608506                  0.986 ms         1014.175220  1014.186607
5        600       2042736                 1.469 ms         3404.536435  3404.757975
10       600       3928831                 1.527 ms         6547.974117  6548.823522
* tps (a) : including connections establishing
* tps (b) : excluding connections establishing
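The columns in the case 1 table are tied together by simple arithmetic: tps is roughly the transaction count divided by the run time, and the average latency is roughly the number of clients divided by tps. The sketch below checks this against the reported figures. The relations are approximations stated here as an assumption about how pgbench derives its summary, not something the deck itself claims.

```python
# (clients, seconds, transactions, reported tps (a)) copied from the case 1 table.
ROWS = [
    (1, 600, 608506, 1014.175220),
    (5, 600, 2042736, 3404.536435),
    (10, 600, 3928831, 6547.974117),
]


def tps_from_count(transactions, seconds):
    """Approximate throughput from the transaction count and run time."""
    return transactions / seconds


def latency_ms(clients, tps):
    """Approximate average latency: each client completes tps / clients transactions per second."""
    return clients / tps * 1000


for clients, seconds, transactions, reported_tps in ROWS:
    print(
        clients,
        round(tps_from_count(transactions, seconds), 1),
        round(latency_ms(clients, reported_tps), 3),
    )
```

Read this way, going from 1 to 10 clients raises throughput about 6.5 times while per-transaction latency rises only from about 1.0 ms to about 1.5 ms.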
35. Target: 1 server / single table / 10 clients / 600 s
[Graph; time markers: 8:38:43, 8:45:08, 8:45:37]
36. Case 2-1. Inserting data into the same table on all 6 servers
1 client
Server  Transactions processed  Latency average  tps (a)     tps (b)
asia-1  534279                  1.123 ms         890.464274  890.475745
asia-2  502395                  1.194 ms         837.324570  837.335128
us-1    599298                  1.001 ms         998.827496  998.839511
us-2    579556                  1.035 ms         965.924930  965.935587
eu-1    536775                  1.118 ms         894.624563  894.635521
eu-2    491666                  1.220 ms         819.441429  819.451690
* tps (a) : including connections establishing
* tps (b) : excluding connections establishing
38. Case 2-2. Inserting data into the same table on all 6 servers
5 clients
Server  Transactions processed  Latency average  tps (a)      tps (b)
asia-1  1473954                 2.035 ms         2456.558290  2456.709572
asia-2  1419295                 2.114 ms         2365.475865  2365.627880
us-1    1885541                 1.591 ms         3142.556035  3142.733625
us-2    1963248                 1.528 ms         3272.055383  3272.237804
eu-1    1463636                 2.050 ms         2439.364252  2439.493710
eu-2    1327172                 2.260 ms         2211.920534  2212.054668
* tps (a) : including connections establishing
* tps (b) : excluding connections establishing
40. Case 2-3. Inserting data into the same table on all 6 servers
10 clients
Server  Transactions processed  Latency average  tps (a)      tps (b)
asia-1  2463532                 2.436 ms         4105.810928  4106.300082
asia-2  2331712                 2.573 ms         3886.117520  3886.644086
us-1    3116742                 1.925 ms         5194.523249  5195.134521
us-2    3368447                 1.781 ms         5614.055016  5614.691748
eu-1    2276560                 2.636 ms         3794.106359  3794.560195
eu-2    2085027                 2.878 ms         3475.015874  3475.501669
* tps (a) : including connections establishing
* tps (b) : excluding connections establishing
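One rough way to read the 10-client, same-table results is to add up the locally measured tps of the six servers and compare the total with the single-server run from case 1. The sketch below does only that arithmetic. The fourth row is taken to be us-2 (an assumption), and the sum says nothing about how quickly each insert is replicated to the other nodes, since pgbench measures local commits only.

```python
# tps (a) per server for the 10-client, same-table run, copied from the table above.
SAME_TABLE_10_CLIENTS = {
    "asia-1": 4105.810928,
    "asia-2": 3886.117520,
    "us-1": 5194.523249,
    "us-2": 5614.055016,  # assumed label for the fourth row
    "eu-1": 3794.106359,
    "eu-2": 3475.015874,
}

# tps (a) for the single-server run with 10 clients (case 1).
SINGLE_SERVER_10_CLIENTS = 6547.974117

total_tps = sum(SAME_TABLE_10_CLIENTS.values())
ratio = total_tps / SINGLE_SERVER_10_CLIENTS

print(f"aggregate tps: {total_tps:.1f}")
print(f"ratio to single server: {ratio:.2f}")
```

Each server is individually slower than the stand-alone run, which is consistent with every node also having to apply the changes made on its five peers.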
42. Case 3-1. Inserting data into a different table on each of the 6 servers
1 client
Server  Transactions processed  Latency average  tps (a)     tps (b)
asia-1  529021                  1.134 ms         881.701129  881.711758
asia-2  522471                  1.148 ms         870.784077  870.795609
us-1    550149                  1.091 ms         916.913836  916.924565
us-2    575519                  1.043 ms         959.198086  959.209793
eu-1    532446                  1.127 ms         887.408480  887.418299
eu-2    484087                  1.239 ms         806.810680  806.820653
* tps (a) : including connections establishing
* tps (b) : excluding connections establishing
44. Case 3-2. Inserting data into a different table on each of the 6 servers
5 clients
Server  Transactions processed  Latency average  tps (a)      tps (b)
asia-1  1431422                 2.096 ms         2385.673194  2385.813896
asia-2  1427147                 2.102 ms         2378.573572  2378.737027
us-1    1859986                 1.613 ms         3099.954957  3100.141470
us-2    1926080                 1.558 ms         3210.126357  3210.306652
eu-1    1445888                 2.075 ms         2409.783223  2409.921491
eu-2    1432652                 2.094 ms         2387.676745  2387.819102
* tps (a) : including connections establishing
* tps (b) : excluding connections establishing
46. Case 3-3. Inserting data into a different table on each of the 6 servers
10 clients
Server  Transactions processed  Latency average  tps (a)      tps (b)
asia-1  2371002                 2.531 ms         3951.535516  3952.012653
asia-2  2287929                 2.622 ms         3813.095796  3813.547846
us-1    3132266                 1.916 ms         5220.347514  5221.013372
us-2    3130898                 1.916 ms         5218.109569  5218.779341
eu-1    2564901                 2.339 ms         4274.759315  4275.233359
eu-2    2244939                 2.673 ms         3741.478971  3741.955598
* tps (a) : including connections establishing
* tps (b) : excluding connections establishing
48. Wrapping up
• Multi-master support
• Multi-region replication
• Effect of network latency
• Community support
Email: bdr-list@2ndquadrant.com
Forum: https://groups.google.com/a/2ndquadrant.com/forum/#!forum/bdr-list
Google's infrastructure is built as a single network, which gives it ideal conditions for testing and running BDR.
# Accept connections on all interfaces
listen_addresses = '*'
# BDR: this comma-separated list must include bdr. The parameter can only be changed at server start.
shared_preload_libraries = 'bdr'
# Must be set to logical for both BDR and UDR
wal_level = 'logical'
# Must be true to use BDR and false when using UDR. The documentation and the actual config file do not match; set it to on.
track_commit_timestamp = on
max_connections = 100
# Number of standby connections allowed; it seems to be (number of standbys + 2), possibly with spares for backups, but this is not confirmed
max_wal_senders = 10
# Number of nodes + 1
max_replication_slots = 10
# For BDR, set this large enough for one worker per configured database plus one worker per connection.
max_worker_processes = 10
Create the bdr-asia-1 server
Join bdr-asia-2 through bdr-asia-1
Join bdr-us-1 through bdr-asia-1
Join bdr-us-2 through bdr-us-1
Join bdr-eu-1 through bdr-us-1
Join bdr-eu-2 through bdr-eu-1
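The join order in these notes can be turned into the corresponding bdr.bdr_group_create and bdr.bdr_group_join statements mechanically. The following sketch only generates SQL text for review and connects to nothing. The node names for the us and eu servers are an assumption extrapolated from asia-node-001 and asia-node-002; the deck does not show them.

```python
# (host, join target) pairs in the order given in the notes; None means "creates the group".
JOIN_PLAN = [
    ("bdr-asia-1", None),
    ("bdr-asia-2", "bdr-asia-1"),
    ("bdr-us-1", "bdr-asia-1"),
    ("bdr-us-2", "bdr-us-1"),
    ("bdr-eu-1", "bdr-us-1"),
    ("bdr-eu-2", "bdr-eu-1"),
]


def dsn(host, dbname="testdb", port=5432):
    """Connection string in the form used on the slides."""
    return f"host={host} port={port} dbname={dbname}"


def node_name(host):
    """Map 'bdr-asia-1' to 'asia-node-001' (naming pattern assumed for us and eu)."""
    _, region, index = host.split("-")
    return f"{region}-node-{int(index):03d}"


def join_sql(host, target):
    """Build the statement to run on host: create the group, or join it via target."""
    args = [
        f"local_node_name := '{node_name(host)}'",
        f"node_external_dsn := '{dsn(host)}'",
    ]
    func = "bdr.bdr_group_create"
    if target is not None:
        func = "bdr.bdr_group_join"
        args.append(f"join_using_dsn := '{dsn(target)}'")
    return f"SELECT {func}(\n    " + ",\n    ".join(args) + "\n);"


for host, target in JOIN_PLAN:
    print(f"-- run on {host}")
    print(join_sql(host, target))
    print("SELECT bdr.bdr_node_join_wait_for_ready();")
```

Printing the statements instead of executing them keeps each step visible, and each node can be confirmed with bdr_node_join_wait_for_ready before the next one joins.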
max_wal_senders : number of standby connections allowed (number of standbys + 2)
For BDR this needs to be set big enough so that every connection to this node has a free wal sender process.
max_replication_slots : number of nodes + 1
For BDR this needs to be set big enough so that every connection to this node has a free replication slot.
max_worker_processes : set to a sufficiently large value
For BDR this has to be set to a big enough value to have one worker per configured database, and one worker per connection.
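The rules of thumb in these notes can be written down as a small calculation to check whether the value 10 used in the tests leaves enough room for a six-node group. This is a sketch of the notes' heuristics only, not an official sizing formula: it assumes every other node in the group counts as one connection, and `bdr_minimums` is a hypothetical helper name.

```python
def bdr_minimums(nodes, databases=1):
    """Lower bounds derived from the rules of thumb in the notes above."""
    peers = nodes - 1  # assumption: one connection per other node in the group
    return {
        # "number of standbys + 2", treating each peer as a standby
        "max_wal_senders": peers + 2,
        # "number of nodes + 1"
        "max_replication_slots": nodes + 1,
        # one worker per configured database plus one worker per connection
        "max_worker_processes": databases * (1 + peers),
    }


configured = 10  # value used for all three settings in the deck
for name, minimum in bdr_minimums(nodes=6).items():
    print(name, minimum, "ok" if minimum <= configured else "too small")
```

Under these assumptions the six-node, single-database group stays below 10 for all three settings, so there is headroom for another node or a backup connection.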