您好,登錄后才能下訂單哦!
postgres-xl是個好東西?為什么呢?
它可以搞oltp, 抗衡mysql, mysql沒有分析函數功能
它可以搞oltp, 抗衡oracle, oracle生態弱, 沒法實時
它對json支持好, 抗衡mongodb, mongodb用得人越來越少了。。。
它還恐怖地支持各種語言擴展, java, javascript, r, python, haskell。。。
并且開源免費,簡單強大的沒朋友。。。
它比greenpulm版本新: 它跟greenpulm本是一家人,都是MPP架構的postgresql集群
但是原生改造, 版本基本上與postgresql一致, greenpulm的版本升不動啊。。。
它比oracle RAC/Teradata便宜,免費使用
它比hadoop省資源,沒有GC,基于C語言資源利用率高,并且生態圈豐富,可視化方便
它出道早,版本穩定性強。
postgres-xl分為以下組件:
a. gtm 負責全局事務
b. coordinator 處理分發執行
c. datanode 負責底層處理
datanode跟coordinator都 連接到gtm,
客戶端連接到coordinator運行sql,
coordinator使用gtm進行一些事務功能分發給datanode執行
大數據的發展方向
什么是MPP?
MPP (Massively Parallel Processing),即大規模并行處理,是分布式、并行、結構化數據庫集群,具備高性能、高可用、高擴展特性,可以為超大規模數據管理提供高性價比的通用計算平臺,廣泛用于支撐各類數據倉庫系統、BI 系統。
MPP架構特征:
?任務并行執行?
?數據分布式存儲(本地化)?
?分布式計算?
?橫向擴展
思考對比
Oracle集群的特點:每個節點全量存儲
Mysql熱備的特點:主從,每個節點全量存儲
MPP——分布式關系數據庫
GreenPlum:基于Postgres XL 8.2進行封裝,多年沒升級過。
Mysql Cluster:剛出來兩年,穩定性和性能比較差。
Postgres XL 與 Postgres LL ,是Postgresql數據庫兩種集群模式,Postgres XL是當今主流MPP
Postgres XL 集群架構
GTM:Global Transaction Manager
Coordinator:協調器
Datanode:數據節點
GTM-Proxy:GTM代理器
組件介紹
Global Transaction Monitor (GTM)
全局事務管理器,確保群集范圍內的事務一致性。 GTM負責發放事務ID和快照作為其多版本并發控制的一部分。
集群可選地配置一個備用GTM(GTM Standby),以改進可用性。此外,可以在協調器間配置代理GTM, 可用于改善可擴展性,減少GTM的通信量。
GTM Standby
GTM的備節點,在pgxc,pgxl中,GTM控制所有的全局事務分配,如果出現問題,就會導致整個集群不可用,為了增加可用性,增加該備用節點。當GTM出現問題時,GTM Standby可以升級為GTM,保證集群正常工作。
GTM-Proxy
GTM需要與所有的Coordinators通信,為了降低壓力,可以在每個Coordinator機器上部署一個GTM-Proxy。
Coordinator
協調員管理用戶會話,并與GTM和數據節點進行交互。協調員解析,產生查詢計劃,并給語句中的每一個組件發送下一個序列化的全局性計劃。
通常此服務和數據節點部署在一起。
正式安裝
Note: 其實在生產環境,如果你集群的數量少于20臺的話,甚至可以不需要使用gtm-proxy
#1)System?Initialization?Optimization?on?every?nodes cat?>>?/etc/security/limits.conf?<<?EOF *?hard?memlock?unlimited *?soft?memlock?unlimited *?-?nofile?65535 EOF setenforce?0 sed?-i?'s/^SELINUX=.*$/SELINUX=disabled/'?/etc/selinux/config ? systemctl?stop?firewalld.service systemctl?disable?firewalld.service cat?>/etc/hosts?<<EOF 172.31.1.81?neo4j01 172.31.4.146?neo4j02 172.31.3.178?neo4j03 172.31.8.178?neo4j04 EOF #2)create?postgres?user?on?every?nodes useradd?postgres echo?Ad@sd119|passwd?--stdin?postgres echo?'postgres?ALL=(ALL)?NOPASSWD:?ALL'?>>/etc/sudoers ###################################################################### #3)Configure?ssh?authentication?to?avoid?inputing?password?for?pgxc_ctl(run?this?commad?on?every?nodes) ###################################################################### su?-?postgres ssh-keygen?-t?rsa??? cat?~/.ssh/id_rsa.pub?>>?~/.ssh/authorized_keys chmod?600?authorized_keys cat?~/.ssh/id_rsa.pub?|?ssh?neo4j01?'cat?>>?~/.ssh/authorized_keys' ############################################ #4)Install?dependency?packages?on?every?nodes# ############################################ sudo?yum?install?-y?flex?bison?readline-devel?zlib-devel?openjade?docbook-style-dsssl?gcc?bzip2?e2fsprogs-devel?uuid-devel?libuuid-devel?make?wget? wget?-c?http://download.cashalo.com/schema/postgres-xl-9.5r1.6.tar.bz2?&&?tar?jxf?postgres-xl-9.5r1.6.tar.bz2 cd?postgres-xl-9.5r1.6 ./configure?--prefix=/home/postgres/pgxl9.5?--with-uuid=ossp?--with-uuid=ossp?&&?make?&&?make?install?&&?cd?contrib/?&&?make?&&?make?install ? #5)Configuring?environment?variables?on?every?nodes? cat?>>/home/postgres/.bashrc?<<EOF export?PGHOME=/home/postgres/pgxl9.5 export?LD_LIBRARY_PATH=\$PGHOME/lib:\$LD_LIBRARY_PATH export?PATH=\$PGHOME/bin:\$PATH EOF source?/home/postgres/.bashrc #6)create?data?dirsctory?on?every?nodes mkdir?-p?/home/postgres/data/ #7)?just?run?command?pgxc_ctl?on?neo4j01 pgxc_ctl #?then?input?command?prepare? prepare #?finally?run?q?to?exit? q #8)?Edit?file?pgxc_ctl.conf?(just?do?int?on?neo4j01) vim?/home/postgres/pgxc_ctl/pgxc_ctl.conf ########################################### #!/usr/bin/env?bash pgxcInstallDir=/home/postgres/pgxl9.5 pgxlDATA=/home/postgres/data #----?OVERALL?----------------------------------------------------------------------------- # pgxcOwner=postgres??????????????????????#?owner?of?the?Postgres-XC?databaseo?cluster.??Here,?we?use?this ????????????????????????????????????????????????#?both?as?linus?user?and?database?user.??This?must?be ????????????????????????????????????????????????#?the?super?user?of?each?coordinator?and?datanode. pgxcUser=postgres???????????????#?OS?user?of?Postgres-XC?owner tmpDir=/tmp?????????????????????????????????????#?temporary?dir?used?in?XC?servers localTmpDir=$tmpDir?????????????????????#?temporary?dir?used?here?locally configBackup=n??????????????????????????????????#?If?you?want?config?file?backup,?specify?y?to?this?value. #configBackupHost=pgxc-linker???#?host?to?backup?config?file #configBackupDir=$HOME/pgxc?????????????#?Backup?directory #configBackupFile=pgxc_ctl.bak??#?Backup?file?name?-->?Need?to?synchronize?when?original?changed. #----?GTM?------------------------------------------------------------------------------------ #?GTM?is?mandatory.??You?must?have?at?least?(and?only)?one?GTM?master?in?your?Postgres-XC?cluster. #?If?GTM?crashes?and?you?need?to?reconfigure?it,?you?can?do?it?by?pgxc_update_gtm?command?to?update #?GTM?master?with?others.???Of?course,?we?provide?pgxc_remove_gtm?command?to?remove?it.??This?command #?will?not?stop?the?current?GTM.??It?is?up?to?the?operator. #----?GTM?Master?----------------------------------------------- #----?Overall?---- gtmName=gtm1 gtmMasterServer=neo4j01 gtmMasterPort=6666 gtmMasterDir=$pgxlDATA/gtm1 #----?Configuration?--- gtmExtraConfig=none?????????????????????#?Will?be?added?gtm.conf?for?both?Master?and?Slave?(done?at?initilization?only) gtmMasterSpecificExtraConfig=none???????#?Will?be?added?to?Master's?gtm.conf?(done?at?initialization?only) #----?GTM?Slave?----------------------------------------------- #?Because?GTM?is?a?key?component?to?maintain?database?consistency,?you?may?want?to?configure?GTM?slave #?for?backup. #----?Overall?------ gtmSlave=y??????????????????????????????????????#?Specify?y?if?you?configure?GTM?Slave.???Otherwise,?GTM?slave?will?not?be?configured?and ????????????????????????????????????????????????????????#?all?the?following?variables?will?be?reset. gtmSlaveName=gtm2 gtmSlaveServer=neo4j02??????????#?value?none?means?GTM?slave?is?not?available.??Give?none?if?you?don't?configure?GTM?Slave. gtmSlavePort=6666???????????????????????#?Not?used?if?you?don't?configure?GTM?slave. gtmSlaveDir=$pgxlDATA/gtm2??????#?Not?used?if?you?don't?configure?GTM?slave. #?Please?note?that?when?you?have?GTM?failover,?then?there?will?be?no?slave?available?until?you?configure?the?slave #?again.?(pgxc_add_gtm_slave?function?will?handle?it) #----?Configuration?---- gtmSlaveSpecificExtraConfig=none?#?Will?be?added?to?Slave's?gtm.conf?(done?at?initialization?only) #----?GTM?Proxy?------------------------------------------------------------------------------------------------------- #?GTM?proxy?will?be?selected?based?upon?which?server?each?component?runs?on. #?When?fails?over?to?the?slave,?the?slave?inherits?its?master's?gtm?proxy.??It?should?be #?reconfigured?based?upon?the?new?location. # #?To?do?so,?slave?should?be?restarted.???So?pg_ctl?promote?->?(edit?postgresql.conf?and?recovery.conf)?->?pg_ctl?restart # #?You?don't?have?to?configure?GTM?Proxy?if?you?dont'?configure?GTM?slave?or?you?are?happy?if?every?component?connects #?to?GTM?Master?directly.??If?you?configure?GTL?slave,?you?must?configure?GTM?proxy?too. #----?Shortcuts?------ gtmProxyDir=$pgxlDATA/gtm_proxy #----?Overall?------- gtmProxy=n??????????????????????????????#?Specify?y?if?you?conifugre?at?least?one?GTM?proxy.???You?may?not?configure?gtm?proxies ????????????????????????????????????????????????#?only?when?you?dont'?configure?GTM?slaves. ????????????????????????????????????????????????#?If?you?specify?this?value?not?to?y,?the?following?parameters?will?be?set?to?default?empty?values. ????????????????????????????????????????????????#?If?we?find?there're?no?valid?Proxy?server?names?(means,?every?servers?are?specified ????????????????????????????????????????????????#?as?none),?then?gtmProxy?value?will?be?set?to?"n"?and?all?the?entries?will?be?set?to ????????????????????????????????????????????????#?empty?values. #gtmProxyNames=(gtm_pxy1?gtm_pxy2?gtm_pxy3?gtm_pxy4)????#?No?used?if?it?is?not?configured #gtmProxyServers=(neo4j01?neo4j02?neo4j03?neo4j04)??????????????????????#?Specify?none?if?you?dont'?configure?it. #gtmProxyPorts=(6660?6666?6666?6666)????????????????????????????#?Not?used?if?it?is?not?configured. #gtmProxyDirs=($gtmProxyDir?$gtmProxyDir?$gtmProxyDir?$gtmProxyDir)?????#?Not?used?if?it?is?not?configured. #----?Configuration?---- gtmPxyExtraConfig=none??????????#?Extra?configuration?parameter?for?gtm_proxy.??Coordinator?section?has?an?example. gtmPxySpecificExtraConfig=(none?none?none?none) #----?Coordinators?---------------------------------------------------------------------------------------------------- #----?shortcuts?---------- coordMasterDir=$pgxlDATA/coord coordSlaveDir=$pgxlDATA/coord_slave coordArchLogDir=$pgxlDATA/coord_archlog #----?Overall?------------ coordNames=(coord1?coord2?coord3?coord4)????????????????#?Master?and?slave?use?the?same?name coordPorts=(5432?5432?5432?5432)????????????????????????#?Master?ports poolerPorts=(6667?6667?6667?6667)???????????????????????#?Master?pooler?ports #coordPgHbaEntries=(192.168.29.0/24)????????????????????????????#?Assumes?that?all?the?coordinator?(master/slave)?accepts coordPgHbaEntries=(0.0.0.0/0) ????????????????????????????????????????????????????????????????????????????????????????????????#?the?same?connection ????????????????????????????????????????????????????????????????????????????????????????????????#?This?entry?allows?only?$pgxcOwner?to?connect. ????????????????????????????????????????????????????????????????????????????????????????????????#?If?you'd?like?to?setup?another?connection,?you?should ????????????????????????????????????????????????????????????????????????????????????????????????#?supply?these?entries?through?files?specified?below. #?Note:?The?above?parameter?is?extracted?as?"host?all?all?0.0.0.0/0?trust".???If?you?don't?want #?such?setups,?specify?the?value?()?to?this?variable?and?suplly?what?you?want?using?coordExtraPgHba #?and/or?coordSpecificExtraPgHba?variables. #coordPgHbaEntries=(::1/128)????#?Same?as?above?but?for?IPv6?addresses #----?Master?------------- coordMasterServers=(neo4j01?neo4j02?neo4j03?neo4j04)????????????#?none?means?this?master?is?not?available coordMasterDirs=($coordMasterDir?$coordMasterDir?$coordMasterDir?$coordMasterDir) coordMaxWALsernder=0????#?max_wal_senders:?needed?to?configure?slave.?If?zero?value?is?specified, ????????????????????????????????????????????????#?it?is?expected?to?supply?this?parameter?explicitly?by?external?files ????????????????????????????????????????????????#?specified?in?the?following.???If?you?don't?configure?slaves,?leave?this?value?to?zero. coordMaxWALSenders=($coordMaxWALsernder?$coordMaxWALsernder?$coordMaxWALsernder?$coordMaxWALsernder) ????????????????????????????????????????????????#?max_wal_senders?configuration?for?each?coordinator. #----?Slave?------------- coordSlave=n????????????????????#?Specify?y?if?you?configure?at?least?one?coordiantor?slave.??Otherwise,?the?following ????????????????????????????????????????????????#?configuration?parameters?will?be?set?to?empty?values. ????????????????????????????????????????????????#?If?no?effective?server?names?are?found?(that?is,?every?servers?are?specified?as?none), ????????????????????????????????????????????????#?then?coordSlave?value?will?be?set?to?n?and?all?the?following?values?will?be?set?to ????????????????????????????????????????????????#?empty?values. #coordSlaveSync=y???????????????#?Specify?to?connect?with?synchronized?mode. #coordSlaveServers=(node07?node08?node09?node06)????????????????????????#?none?means?this?slave?is?not?available #coordSlavePorts=(20004?20005?20004?20005)??????????????????????#?Master?ports #coordSlavePoolerPorts=(20010?20011?20010?20011)????????????????????????#?Master?pooler?ports #coordSlaveDirs=($coordSlaveDir?$coordSlaveDir?$coordSlaveDir?$coordSlaveDir) #coordArchLogDirs=($coordArchLogDir?$coordArchLogDir?$coordArchLogDir?$coordArchLogDir) #----?Configuration?files--- #?Need?these?when?you'd?like?setup?specific?non-default?configuration #?These?files?will?go?to?corresponding?files?for?the?master. #?You?may?supply?your?bash?script?to?setup?extra?config?lines?and?extra?pg_hba.conf?entries #?Or?you?may?supply?these?files?manually. coordExtraConfig=coordExtraConfig???????#?Extra?configuration?file?for?coordinators. ????????????????????????????????????????????????#?This?file?will?be?added?to?all?the?coordinators' ????????????????????????????????????????????????#?postgresql.conf #?Pleae?note?that?the?following?sets?up?minimum?parameters?which?you?may?want?to?change. #?You?can?put?your?postgresql.conf?lines?here. cat?>?$coordExtraConfig?<<EOF #================================================ #?Added?to?all?the?coordinator?postgresql.conf #?Original:?$coordExtraConfig log_destination?=?'stderr' logging_collector?=?on log_directory?=?'pg_log' listen_addresses?=?'*' max_connections?=?50 EOF #?Additional?Configuration?file?for?specific?coordinator?master. #?You?can?define?each?setting?by?similar?means?as?above. coordSpecificExtraConfig=(none?none?none?none) coordExtraPgHba=none????#?Extra?entry?for?pg_hba.conf.??This?file?will?be?added?to?all?the?coordinators'?pg_hba.conf coordSpecificExtraPgHba=(none?none?none?none) #-----?Additional?Slaves?----- # #?Please?note?that?this?section?is?just?a?suggestion?how?we?extend?the?configuration?for #?multiple?and?cascaded?replication.???They're?not?used?in?the?current?version. # #coordAdditionalSlaves=n????????????????#?Additional?slave?can?be?specified?as?follows:?where?you #coordAdditionalSlaveSet=(cad1)?????????#?Each?specifies?set?of?slaves.???This?case,?two?set?of?slaves?are #???????????????????????????????????????????????????????????????????????????????????????#?configured #cad1_Sync=n????????????????????????????#?All?the?slaves?at?"cad1"?are?connected?with?asynchronous?mode. #???????????????????????????????????????????????????????#?If?not,?specify?"y" #???????????????????????????????????????????????????????#?The?following?lines?specifies?detailed?configuration?for?each #???????????????????????????????????????????????????????#?slave?tag,?cad1.??You?can?define?cad2?similarly. #cad1_Servers=(node08?node09?node06?node07)?????#?Hosts #cad1_dir=$HOME/pgxc/nodes/coord_slave_cad1 #cad1_Dirs=($cad1_dir?$cad1_dir?$cad1_dir?$cad1_dir) #cad1_ArchLogDir=$HOME/pgxc/nodes/coord_archlog_cad1 #cad1_ArchLogDirs=($cad1_ArchLogDir?$cad1_ArchLogDir?$cad1_ArchLogDir?$cad1_ArchLogDir) #----?Datanodes?------------------------------------------------------------------------------------------------------- #----?Shortcuts?-------------- datanodeMasterDir=$pgxlDATA/dn_master datanodeSlaveDir=$pgxlDATA/dn_slave datanodeArchLogDir=$pgxlDATA/datanode_archlog #----?Overall?--------------- #primaryDatanode=datanode1??????????????????????????????#?Primary?Node. #?At?present,?xc?has?a?priblem?to?issue?ALTER?NODE?against?the?primay?node.??Until?it?is?fixed,?the?test?will?be?done #?without?this?feature. primaryDatanode=dn1?????????????????????????????#?Primary?Node. datanodeNames=(dn1?dn2?dn3?dn4) datanodePorts=(5433?5433?5433?5433)?????#?Master?ports datanodePoolerPorts=(6668?6668?6668?6668)???????#?Master?pooler?ports datanodePgHbaEntries=(0.0.0.0/0)????????#?Assumes?that?all?the?coordinator?(master/slave)?accepts ????????????????????????????????????????????????????????????????????????????????#?the?same?connection ????????????????????????????????????????????????????????????????????????????????#?This?list?sets?up?pg_hba.conf?for?$pgxcOwner?user. ????????????????????????????????????????????????????????????????????????????????#?If?you'd?like?to?setup?other?entries,?supply?them ????????????????????????????????????????????????????????????????????????????????#?through?extra?configuration?files?specified?below. #?Note:?The?above?parameter?is?extracted?as?"host?all?all?0.0.0.0/0?trust".???If?you?don't?want #?such?setups,?specify?the?value?()?to?this?variable?and?suplly?what?you?want?using?datanodeExtraPgHba #?and/or?datanodeSpecificExtraPgHba?variables. #datanodePgHbaEntries=(::1/128)?#?Same?as?above?but?for?IPv6?addresses #----?Master?---------------- datanodeMasterServers=(neo4j01?neo4j02?neo4j03?neo4j04)?#?none?means?this?master?is?not?available. ????????????????????????????????????????????????????????????????????????????????????????????????????????#?This?means?that?there?should?be?the?master?but?is?down. ????????????????????????????????????????????????????????????????????????????????????????????????????????#?The?cluster?is?not?operational?until?the?master?is ????????????????????????????????????????????????????????????????????????????????????????????????????????#?recovered?and?ready?to?run. datanodeMasterDirs=($datanodeMasterDir?$datanodeMasterDir?$datanodeMasterDir?$datanodeMasterDir) datanodeMaxWalSender=0??????????????????????????????????????????????????????????#?max_wal_senders:?needed?to?configure?slave.?If?zero?value?is ????????????????????????????????????????????????????????????????????????????????????????????????????????#?specified,?it?is?expected?this?parameter?is?explicitly?supplied ????????????????????????????????????????????????????????????????????????????????????????????????????????#?by?external?configuration?files. ????????????????????????????????????????????????????????????????????????????????????????????????????????#?If?you?don't?configure?slaves,?leave?this?value?zero. datanodeMaxWALSenders=($datanodeMaxWalSender?$datanodeMaxWalSender?$datanodeMaxWalSender?$datanodeMaxWalSender) ????????????????????????????????????????????????#?max_wal_senders?configuration?for?each?datanode #----?Slave?----------------- datanodeSlave=n?????????????????#?Specify?y?if?you?configure?at?least?one?coordiantor?slave.??Otherwise,?the?following ????????????????????????????????????????????????#?configuration?parameters?will?be?set?to?empty?values. ????????????????????????????????????????????????#?If?no?effective?server?names?are?found?(that?is,?every?servers?are?specified?as?none), ????????????????????????????????????????????????#?then?datanodeSlave?value?will?be?set?to?n?and?all?the?following?values?will?be?set?to ????????????????????????????????????????????????#?empty?values. #datanodeSlaveServers=(node07?node08?node09?node06)?????#?value?none?means?this?slave?is?not?available #datanodeSlavePorts=(20008?20009?20008?20009)???#?value?none?means?this?slave?is?not?available #datanodeSlavePoolerPorts=(20012?20013?20012?20013)?????#?value?none?means?this?slave?is?not?available #datanodeSlaveSync=y????????????#?If?datanode?slave?is?connected?in?synchronized?mode #datanodeSlaveDirs=($datanodeSlaveDir?$datanodeSlaveDir?$datanodeSlaveDir?$datanodeSlaveDir) #datanodeArchLogDirs=(?$datanodeArchLogDir?$datanodeArchLogDir?$datanodeArchLogDir?$datanodeArchLogDir?) #?----?Configuration?files?--- #?You?may?supply?your?bash?script?to?setup?extra?config?lines?and?extra?pg_hba.conf?entries?here. #?These?files?will?go?to?corresponding?files?for?the?master. #?Or?you?may?supply?these?files?manually. datanodeExtraConfig=none????????#?Extra?configuration?file?for?datanodes.??This?file?will?be?added?to?all?the ????????????????????????????????????????????????????????#?datanodes'?postgresql.conf datanodeSpecificExtraConfig=(none?none?none?none) datanodeExtraPgHba=none?????????#?Extra?entry?for?pg_hba.conf.??This?file?will?be?added?to?all?the?datanodes'?postgresql.conf datanodeSpecificExtraPgHba=(none?none?none?none) #-----?Additional?Slaves?----- datanodeAdditionalSlaves=n??????#?Additional?slave?can?be?specified?as?follows:?where?you #?datanodeAdditionalSlaveSet=(dad1?dad2)????????????????#?Each?specifies?set?of?slaves.???This?case,?two?set?of?slaves?are ????????????????????????????????????????????????????????????????????????????????????????#?configured #?dad1_Sync=n???????????????????????????#?All?the?slaves?at?"cad1"?are?connected?with?asynchronous?mode. ????????????????????????????????????????????????????????#?If?not,?specify?"y" ????????????????????????????????????????????????????????#?The?following?lines?specifies?detailed?configuration?for?each ????????????????????????????????????????????????????????#?slave?tag,?cad1.??You?can?define?cad2?similarly. #?dad1_Servers=(node08?node09?node06?node07)????#?Hosts #?dad1_dir=$HOME/pgxc/nodes/coord_slave_cad1 #?dad1_Dirs=($cad1_dir?$cad1_dir?$cad1_dir?$cad1_dir) #?dad1_ArchLogDir=$HOME/pgxc/nodes/coord_archlog_cad1 #?dad1_ArchLogDirs=($cad1_ArchLogDir?$cad1_ArchLogDir?$cad1_ArchLogDir?$cad1_ArchLogDir) #----?WAL?archives?------------------------------------------------------------------------------------------------- walArchive=n????#?If?you'd?like?to?configure?WAL?archive,?edit?this?section. ####################################################### #8)?when?u?first?time?to?setup?cluster(run?it?just?on?neo4j01) pgxc_ctl?-c?/home/postgres/pgxc_ctl/pgxc_ctl.conf?init?all? #9)?start?cluster(run?it?just?on?neo4j01) pgxc_ctl?-c?/home/postgres/pgxc_ctl/pgxc_ctl.conf?start?all ? #10)?stop?cluster(run?it?just?on?neo4j01) pgxc_ctl?-c?/home/postgres/pgxc_ctl/pgxc_ctl.conf?stop?all? #11)?check?every?one?is?running(run?it?just?on?neo4j01) pgxc_ctl monitor?all #12)?view?System?Table?(run?it?just?on?neo4j01) psql?-p5432 select?*?from?pgxc_node;
postgres-XL 下存在兩種數據表,分別是replication表和distribute表
REPLICATION復制表:各個datanode節點中,表的數據完全相同,也就是說,插入數據時,會分別在每個datanode節點插入相同數據。讀數據時,只需要讀任意一個datanode節點上的數據。小表采用。
建表語法:
postgres=#?create?table?rep(col1?int,col2?int)distribute?by?replication;
DISTRIBUTE表 :會將插入的數據,按照拆分規則,分配到不同的datanode節點中存儲,也就是sharding技術。每個datanode節點只保存了部分數據,通過coordinate節點可以查詢完整的數據視圖。分布式存儲,大表采用,默認
postgres=#??CREATE?TABLE?dist(col1?int,?col2?int)?DISTRIBUTE?BY?HASH(col1);
如何驗證分布式存儲?
分別插入100行數據:
postgres=#?INSERT?INTO?rep?SELECT?generate_series(1,100),?generate_series(101,?200); postgres=#?INSERT?INTO?dist?SELECT?generate_series(1,100),?generate_series(101,?200);
psql -p 5432,通過Coordinater 訪問查詢完整的數據視圖;
psql -p 5433,5433是Datanode的端口,此時只訪問該單個節點
如何鏈接到指定的數據庫呢?看下面的例子
psql -p 5432 aa? ,aa為指定的庫名,不指定時默認是postgres庫,相當于hive里的default庫
查詢這個分布表數據在每個節點的分布:
postgres=#?SELECT?xc_node_id,?count(*)?FROM?dist?GROUP?BY?xc_node_id; ?xc_node_id?|?count ------------+------- ?-700122826?|????19 ??352366662?|????27 ?-560021589?|????23 ??823103418?|????31 (4?rows)
查詢每個節點的ID信息
postgres=#?select?*?from?pgxc_node;
我們再來看看復制表
postgres=#?select?xc_node_id,count(*)?from?rep?group?by?xc_node_id; ?xc_node_id?|?count ------------+------- ?-560021589?|???100 (1?row)
因為我們是在neo4j01節點上查詢的,所以顯示的id就是neo4j01節點的id;同理,如果我們是在其它節點查詢的話那就顯示其它節點的id。也就是說當我們查詢復制表的時候,它只會走一個節點,不會走多個節點。針對這個特點,如果未來數據量很大,我們查詢的時候,可以走負載均衡。
https://www.cnblogs.com/sfnz/p/7908380.html
Psql是PostgreSQL的一個命令行交互式客戶端工具。PostgreSQL 一些命令、用法、語法,在Postgres xl集群都是通用的。
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。