Author: yhl71 · 2019-05-13 09:16
Database Architect · (unnamed company)

Installing the pureScale Feature of Db2 Developer Edition on a Single RedHat Virtual Machine


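1. Introduction

This article walks through installing the pureScale feature of Db2 Developer Edition on RedHat, deploying just one member and one CF. The goal is to let you install pureScale on a single virtual machine, without a large set of machines, storage, and network equipment, so that you can get familiar with installing and using pureScale quickly.
This installation uses Db2 Developer Edition, which can be obtained from the URL below. After installation no license has to be registered, and there is no 90-day expiry.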
https://www.ibm.com/us-en/marketplace/ibm-db2-direct-and-developer-editions

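2. Virtual machine memory size

About 6 GB is recommended. With 4.5 GB, the author found that the later activate database step fails because the CF reports a memory allocation error.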
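
Before going further it can help to confirm how much memory the guest actually sees. The sketch below is an illustration, not part of the original procedure: the helper names mem_total_mb and has_min_memory_mb are hypothetical, and the 5500 MB threshold in the usage line is an assumption (the kernel reports slightly less than the configured VM size, so a 6 GB guest shows somewhat under 6144 MB).

```shell
#!/bin/sh
# Print MemTotal from a meminfo-style file, converted from kB to MB.
# Defaults to /proc/meminfo, the standard Linux source for this value.
mem_total_mb() {
    awk '$1 == "MemTotal:" { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Succeed only if the reported memory is at least the given number of MB.
# Usage: has_min_memory_mb MIN_MB [MEMINFO_FILE]
has_min_memory_mb() {
    [ "$(mem_total_mb "$2")" -ge "$1" ]
}
```

For example, `has_min_memory_mb 5500 || echo "guest has less than about 6 GB"` prints a warning on an undersized VM.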

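3. Supported operating systems

For installation on RedHat, the supported operating system versions are 6.7 through 7.4.
If you have Db2 10.5 on hand, the supported RedHat versions are 5.9 through 6.5. However, 10.5 has no Developer Edition, so a license must be registered after installation.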

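4. Add a disk to the virtual machine

In the virtual machine's VM/Settings dialog, select Disk and click the Add button to add a 3 GB disk. This disk will later hold the pureScale instance shared directory.

After the VM boots, run fdisk -l and you will see the disk /dev/sdb. The pureScale instance will later live in the GPFS file system created on this disk.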
[root@node01 Desktop]# fdisk -l
Disk /dev/sdb: 3221 MB, 3221225472 bytes
255 heads, 63 sectors/track, 391 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

5. Upload the installation package

sftp root@192.168.1.6

put v11.1_linuxx64_dec.tar.gz
Then extract the installation package:
cd /root/db2
gzip -d v11.1_linuxx64_dec.tar.gz
tar xvf v11.1_linuxx64_dec.tar

6. Pre-installation preparation

1) Modify the operating system kernel parameters

Edit /etc/sysctl.conf and add the following:
kernel.shmmni=1024
kernel.shmmax=4294967296
kernel.shmall=2097152
kernel.sem=250 256000 32 1024
kernel.msgmni=4096
kernel.msgmax=65536
kernel.msgmnb=65536
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.swappiness = 5
vm.overcommit_memory = 0
After you run sysctl -p the parameters take effect immediately; no reboot is needed.
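
To confirm that the values really are live, each setting in the file can be compared with what the kernel reports. This is a minimal sketch, assuming the file contains only key=value lines and comments; the function names are hypothetical and not part of any Db2 tooling.

```shell
#!/bin/sh
# Normalize one "key = value" line: drop spaces around "=" and squeeze
# runs of whitespace inside the value down to single spaces.
normalize_sysctl_line() {
    printf '%s\n' "$1" | sed -e 's/[[:space:]]*=[[:space:]]*/=/' \
        -e 's/[[:space:]][[:space:]]*/ /g' -e 's/^ //' -e 's/ $//'
}

# Compare every setting in a sysctl.conf-style file with the live value
# from "sysctl -n"; print mismatches and return non-zero if any exist.
check_sysctl_file() {
    conf_file=$1
    rc=0
    while IFS= read -r line; do
        case $line in ''|'#'*) continue ;; esac
        entry=$(normalize_sysctl_line "$line")
        key=${entry%%=*}
        want=${entry#*=}
        # sysctl separates multi-value settings (kernel.sem) with tabs.
        have=$(sysctl -n "$key" 2>/dev/null | tr -s '[:space:]' ' ' | sed 's/ $//')
        if [ "$have" != "$want" ]; then
            echo "MISMATCH $key: want '$want', have '$have'"
            rc=1
        fi
    done < "$conf_file"
    return $rc
}
```

Running `check_sysctl_file /etc/sysctl.conf` after sysctl -p should print nothing when every value has been applied.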

2) Create the users and groups required for the instance

groupadd --gid 1001 db2iadm1
groupadd --gid 1002 db2fadm1

useradd --uid 1004 -g db2iadm1 -m -d /home/db2sdin1 db2sdin1
useradd --uid 1003 -g db2fadm1 -m -d /home/db2sdfe1 db2sdfe1

passwd db2sdin1
passwd db2sdfe1
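
A short check that the accounts ended up with the intended IDs can save a failed db2icrt run later. The helper below is a sketch with a hypothetical name (check_user); it relies only on the standard id command.

```shell
#!/bin/sh
# Verify that a user exists with the expected UID and primary group.
# Usage: check_user NAME UID GROUP
check_user() {
    name=$1; want_uid=$2; want_group=$3
    have_uid=$(id -u "$name" 2>/dev/null) || { echo "user $name is missing"; return 1; }
    have_group=$(id -gn "$name")
    if [ "$have_uid" = "$want_uid" ] && [ "$have_group" = "$want_group" ]; then
        return 0
    fi
    echo "user $name: uid=$have_uid group=$have_group (expected uid=$want_uid group=$want_group)"
    return 1
}
```

With the values used above, the checks would be `check_user db2sdin1 1004 db2iadm1` and `check_user db2sdfe1 1003 db2fadm1`.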

3) Install and configure OpenSSH

(1) Edit the following files and uncomment these settings:

File: /etc/ssh/ssh_config
Port 22
Protocol 2,1

File: /etc/ssh/sshd_config
PermitRootLogin yes
PasswordAuthentication no

(2) Set up public-key authentication for the root and db2sdin1 users

[root@node01 ~]# pwd
/root
[root@node01 ~]# mkdir -p .ssh
[root@node01 ~]# chmod 700 .ssh
[root@node01 ~]# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
SHA256:nxIn2PWF/w+3ANlLrhLlibu+cTbMW2a7cgbRvHeasfA root@node01
The key's randomart image is:
+---[DSA 1024]----+
| |
| . |
| . + . |
| o . +o= |
| . S =o+oo |
| Bo==.oo.|
| ..=*.Xo=+|
| ++.BoEoo|
| .++o+o...|
+----[SHA256]-----+

[root@node01 ~]# cd .ssh
[root@node01 .ssh]# cat id_dsa.pub >> authorized_keys
[root@node01 .ssh]# chmod 644 authorized_keys

[root@node01 .ssh]# ssh node01 hostname
The authenticity of host 'node01 (fe80::c460:f21f:70bf:afdb%ens33)' can't be established.
ECDSA key fingerprint is SHA256:TnZKcs1N8mOUb4g+Bbx2azWbApf/1ZBe5ILEkbgq7yw.
ECDSA key fingerprint is MD5:9f:b8:49:f3:80:d8:d7:e6:53:49:84:29:b9:1a:8c:46.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node01,fe80::c460:f21f:70bf:afdb%ens33' (ECDSA) to the list of known hosts.
node01
[root@node01 .ssh]# ssh node01 hostname
node01
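The last step no longer asks for a password, which shows that passwordless login is configured correctly.
Follow the same steps to set up public-key authentication for the db2sdin1 user.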
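
Because db2prereqcheck later rejects a .ssh directory that is not mode 700, it is worth checking the permissions right after setting up the keys. The helper name check_mode is hypothetical; it simply compares the permission string printed by ls -ld.

```shell
#!/bin/sh
# Compare the permission string of a path (first 10 characters of
# "ls -ld") with the expected one; print the result, fail on mismatch.
# Usage: check_mode PATH EXPECTED    e.g. check_mode ~/.ssh drwx------
check_mode() {
    path=$1; want=$2
    have=$(ls -ld "$path" | cut -c1-10)
    if [ "$have" = "$want" ]; then
        echo "OK  $path $have"
        return 0
    fi
    echo "BAD $path $have (expected $want)"
    return 1
}
```

For the db2sdin1 user this would be `check_mode /home/db2sdin1/.ssh drwx------`, and for the key list as set above, `check_mode /home/db2sdin1/.ssh/authorized_keys -rw-r--r--`.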

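4) Configure a local yum repository

Mount the installation ISO (create the mount point first if it does not exist):
mkdir -p /mnt/cdrom
mount -o loop /root/RHEL-6.7-20150702.0-Server-x86_64-dvd1.iso /mnt/cdrom

Back up the files under /etc/yum.repos.d/:

mkdir /etc/yum.repos.d/backup

mv /etc/yum.repos.d/rh* /etc/yum.repos.d/backup/

Create a new repo file under /etc/yum.repos.d/. The file name is arbitrary but must end in .repo:

vi /etc/yum.repos.d/RHEL-ISO.repo

In vi, enter the following content into RHEL-ISO.repo: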

[base]
name=iso
baseurl=file:///mnt/cdrom
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release

Clear the cache:
yum clean all
Test that the repository works:
yum install httpd
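
If you prefer not to edit the file by hand, the same repo definition can be written from a script. This is only a sketch: write_iso_repo is a hypothetical helper, and the content it writes is the repo definition shown above with the mount point passed in as a parameter.

```shell
#!/bin/sh
# Write a yum repo definition that points at a locally mounted ISO.
# Usage: write_iso_repo REPO_FILE MOUNT_POINT
write_iso_repo() {
    repo_file=$1
    mount_point=$2
    cat > "$repo_file" <<EOF
[base]
name=iso
baseurl=file://$mount_point
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-redhat-release
EOF
}
```

For the layout in this article the call would be `write_iso_repo /etc/yum.repos.d/RHEL-ISO.repo /mnt/cdrom`.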

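5) Install the required operating system packages listed at the following URL

https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.qb.server.doc/doc/r0057441.html
The author found that a default operating system installation lacks the following packages: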
yum groupinstall 'Infiniband Support'
yum install gcc
yum install cpp
yum install gcc-c++
yum install kernel-devel
yum install ksh
yum install ntp
yum install sg3_utils
yum install libstdc++.so.6
yum install pam32*
yum install libcxgb*
yum install m4
yum install binutils-devel
yum install patch
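
To see at a glance which packages are still absent, rpm -q can be run over the list. The function name below is hypothetical, and the package list in the usage line is just the plain package names from the commands above.

```shell
#!/bin/sh
# Print the name of every given package that "rpm -q" reports as not installed.
missing_packages() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
```

For example: `missing_packages gcc cpp gcc-c++ kernel-devel ksh ntp sg3_utils m4 binutils-devel patch`. An empty result means all of them are installed.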

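When installing on RedHat 6.7, Db2 pureScale does not need the operating system package perl-Sys-Syslog. On RedHat 7.4, however, the author found that this package is required; without it the TSA installation fails.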

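6) Change the host name

Edit the /etc/sysconfig/network file and set:
HOSTNAME=node01
After the change, restart the network:
/etc/init.d/network restart
Edit /etc/hosts:
192.168.1.6 node01 node01
However, the author found that the new host name only takes effect after a reboot.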

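7) Disable SELinux

On RHEL systems, if Security-Enhanced Linux (SELinux) is enabled and in enforcing mode, the installer may fail because of SELinux restrictions.
To determine whether SELinux is installed and in enforcing mode, use one of the following:

- Check the /etc/sysconfig/selinux file.
- Run the sestatus command.

Check the SELinux status: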
[root@node01 lib]# sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: enforcing
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Max kernel policy version: 28

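To disable SELinux, use one of the following methods:

- Switch it to permissive mode by running setenforce 0 as the superuser.
- Edit /etc/sysconfig/selinux (set SELINUX=disabled) and reboot the machine.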

[root@node01 server_dec]# sestatus
SELinux status: disabled
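
The mode that will apply after the next reboot is whatever the configuration file says, so it is worth reading it back. A minimal sketch, assuming the usual SELINUX=<mode> line; the function name is hypothetical.

```shell
#!/bin/sh
# Print the SELINUX= mode configured in an SELinux config file
# (enforcing, permissive, or disabled). SELINUXTYPE= lines are ignored.
selinux_config_mode() {
    sed -n 's/^SELINUX=[[:space:]]*\([a-z]*\).*/\1/p' "$1" | tail -n 1
}
```

`selinux_config_mode /etc/sysconfig/selinux` should print disabled once the file has been edited as described.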

8) Run db2prereqcheck from the installation package and confirm that every check passes

./db2prereqcheck -v 11.1.4.4 -p -o /tmp/report3.out

Checking prerequisites for DB2 installation. Version "11.1.4.4". Operating system "Linux"

Validating "Linux distribution " ...
Required minimum operating system distribution: "RHEL"; Version: "6"; Service pack: "7".
Actual operating system distribution Version: "6"; Service pack: "7".
Requirement matched.

Validating "kernel level " ...
Required minimum operating system kernel level: "2.6.16".
Actual operating system kernel level: "2.6.32".
Requirement matched.

Validating "C++ Library version " ...
Required minimum C++ library: "libstdc++.so.6"
Standard C++ library is located in the following directory: "/usr/lib64/libstdc++.so.6.0.13".
Actual C++ library: "CXXABI_1.3.1"
Requirement matched.

Validating "32 bit version of "libstdc++.so.6" " ...
Found the 32 bit "/usr/lib/libstdc++.so.6" in the following directory "/usr/lib".
Requirement matched.

Validating "libaio.so version " ...
DBT3553I The db2prereqcheck utility successfully loaded the libaio.so.1 file.
Requirement matched.

Validating "libnuma.so version " ...
DBT3610I The db2prereqcheck utility successfully loaded the libnuma.so.1 file.
Requirement matched.

Validating "/lib/libpam.so*" ...
Requirement matched.
DBT3533I The db2prereqcheck utility has confirmed that all installation prerequisites were met.

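Later, while installing on RedHat 7.4, the author found that db2prereqcheck passes even when the operating system package m4 is not installed, yet instance creation then fails. It is therefore advisable to check the pureScale prerequisites strictly against the official documentation; otherwise instance creation will fail.
Note that the .ssh directory under the db2sdin1 home directory must have permission 700; otherwise db2prereqcheck reports errors.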
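
The report can be long, so a small filter that lists only the failed checks is handy. This sketch assumes the report written with -o uses the same "Validating ..." and "Requirement not matched" wording as the console output quoted in this article; prereq_failures is a hypothetical name.

```shell
#!/bin/sh
# List every failed check in a db2prereqcheck report together with the
# title of the check it belongs to. Returns non-zero if any check failed.
prereq_failures() {
    awk '
        /^Validating /            { title = $0 }
        /Requirement not matched/ { print title " -> " $0; bad = 1 }
        END                       { exit (bad ? 1 : 0) }
    ' "$1"
}
```

For the command above: `prereq_failures /tmp/report3.out && echo "all requirements matched"`.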

9) Edit the file /var/ct/cfg/netmon.cf

Add the following line:
!REQD eth0 192.168.1.6
Do not change anything else in the file.
TSA uses this file to monitor the network status of each node.
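
Appending the line from a script avoids accidental edits to the rest of the file. The helper below is a sketch with a hypothetical name; it adds the entry only if an identical line is not already present. The interface name must match what the system actually reports (on RedHat 7.4 the author's adapter was ens33 rather than eth0).

```shell
#!/bin/sh
# Append a "!REQD <interface> <ip>" line to a netmon.cf-style file
# unless exactly that line already exists.
# Usage: add_netmon_entry FILE INTERFACE IP
add_netmon_entry() {
    file=$1; iface=$2; ip=$3
    entry="!REQD $iface $ip"
    grep -qxF -- "$entry" "$file" 2>/dev/null || printf '%s\n' "$entry" >> "$file"
}
```

With the values from this article: `add_netmon_entry /var/ct/cfg/netmon.cf eth0 192.168.1.6`.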

7. Install the Db2 product

Run ./db2_install:
./db2_install
Read the license agreement file in the db2/license directory.


To accept those terms, enter "yes". Otherwise, enter "no" to cancel the install process. [yes/no]
yes

Default directory for installation of products - /opt/ibm/db2/V11.1


Install into default directory (/opt/ibm/db2/V11.1) ? [yes/no]
yes

Specify one of the following keywords to install DB2 products.

SERVER
CLIENT
RTCL

Enter "help" to redisplay product names.

Enter "quit" to exit.


SERVER


Do you want to install the DB2 pureScale Feature? [yes/no]
yes
After the installation finishes, check the log files under /tmp and confirm that no errors occurred.

8. Create the instance shared directory

[root@node01 instance]# ./db2cluster_prepare -instance_shared_dev /dev/sdb -instance_shared_mount /db2sd -t /tmp/sd_prepare.trc -l /tmp/sd_prepare.log

/db2sd is the instance shared directory that db2icrt uses later.
DBI1446I The db2cluster_prepare command is running.

DB2 installation is being initialized.

Total number of tasks to be performed: 1
Total estimated time for all tasks to be performed: 60 second(s)

Task #1 start
Description: Creating IBM General Parallel File System (GPFS) Cluster and Filesystem
Estimated time 60 second(s)
Task #1 end

The execution completed successfully.

For more information see the DB2 installation log at "/tmp/sd_prepare.log".
DBI1070I Program db2cluster_prepare completed successfully.
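Note: in a real production or test environment you do not need db2cluster_prepare to create GPFS; db2icrt automatically creates a GPFS file system for the instance shared directory.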

9. Create the instance

./db2icrt -m node01 -mnet node01 -cf node01 -cfnet node01 -instance_shared_dir /db2sd -tbdev node01 -u db2sdfe1 -d db2sdin1
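Here an IP address is used as the tiebreaker in place of a tiebreaker disk.
When creation finishes, check the db2icrt log files under /tmp and confirm that instance creation reported no errors.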

10. Check the license

$db2licm -l

Product name: "IBM DB2 Developer-C Edition"
License type: "Community"
Expiry date: "Permanent"
Product identifier: "db2dec"
Version information: "11.1"
Max amount of memory (GB): "16"
Max number of cores: "4"
Max amount of table space (GB): "100"
The license is permanent; the last three lines are the limits of the Developer Edition.

11. Create the database

The virtual machine uses a TCP/IP network rather than InfiniBand or 10 GbE RoCE adapters, so the DB2_SD_ALLOW_SLOW_NETWORK registry variable must be set:
[db2sdin1@node01 ~]$ db2set -lr | grep -i sd
DB2_SD_ALLOW_SLOW_NETWORK
[db2sdin1@node01 ~]$ db2set DB2_SD_ALLOW_SLOW_NETWORK=on
[db2sdin1@node01 ~]$ db2 terminate
DB20000I The TERMINATE command completed successfully.
[db2sdin1@node01 ~]$ db2start
[db2sdin1@node01 ~]$ db2 create db psdb
DB20000I The CREATE DATABASE command completed successfully.
[db2sdin1@node01 ~]$ db2 activate db psdb
DB20000I The ACTIVATE DATABASE command completed successfully.

12. Check the cluster instance status

[db2sdin1@node01 ~]$ db2instance -list
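
A healthy single-host instance shows the member in STARTED state and the CF in PRIMARY state. The filter below is a sketch for scripting that check: instance_healthy is a hypothetical name, and treating PEER as an acceptable CF state (for setups with a secondary CF) is an assumption beyond what this article covers.

```shell
#!/bin/sh
# Read "db2instance -list" output on stdin and succeed only if every
# MEMBER row is STARTED and every CF row is PRIMARY or PEER.
instance_healthy() {
    awk '
        $2 == "MEMBER" && $3 != "STARTED"                { bad = 1 }
        $2 == "CF" && $3 != "PRIMARY" && $3 != "PEER"    { bad = 1 }
        $2 == "MEMBER" || $2 == "CF"                     { seen = 1 }
        END { exit ((bad || !seen) ? 1 : 0) }
    '
}
```

Typical use as the instance owner: `db2instance -list | instance_healthy || db2cluster -list -alert`.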

13. Problems encountered during installation

1) After public-key authentication was set up for the db2sdin1 user, db2prereqcheck failed with the following errors:

The db2prereqcheck tool detected Interface Adapter "node01" is an Ethernet adapter configured for Sockets. Configure it in dat.conf if it is RDMA capable on host machine named "node01".
The db2prereqcheck tool detected Interface Adapter "node01" is an Ethernet adapter configured for Sockets. Configure it in dat.conf if it is RDMA capable on host machine named "node01".
Validating "passwordless ssh" ...
DBT3567E The db2prereqcheck utility found that db2locssh is not configured and passwordless SSH is not enabled between host "node01" and host "node01".
ERROR : Requirement not matched.

Validating "PING TEST" ...
DBT3572W The db2prereqcheck utility found that netname "node01" is not pingable from host "node01".
DBT3572W The db2prereqcheck utility found that netname "node01" is not pingable from host "node01".
WARNING : Requirement not matched.

DBT3572W The db2prereqcheck utility found that netname "node01" is not pingable from host "node01".
Solution:
The permissions on the .ssh directory were wrong:
drwxr-xr-x 2 db2sdin1 db2iadm1 4096 May 9 07:26 .ssh
The .ssh directory must have permission 700:
[db2sdin1@node01 ~]$ chmod 700 .ssh

2) db2icrt reports the following error in its log

The author hit this problem on RedHat 7.4.
ERROR: A reachable IP address could not be automatically determined that did
not belong to one of the hosts in the DB2 pureScale instance. There may be a
problem with the network adapters gateway IP address or with the hosts network
connection. Verify connectivity for the hosts and manually edit the
configuration file /var/ct/cfg/netmon.cf on each host to include an IP on the
network outside of the DB2 pureScale instance that can be reached by the ping
command so that DB2 may ensure network connectivity. Hosts: "node01 ". The
format of /var/ct/cfg/netmon.cf lines is as follows: !REQD eth1 9.26.123.245
The virtual network adapter is named ens33 rather than a name of the form eth1.
[root@node01 tmp]# ifconfig -a
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500

    inet 192.168.1.5  netmask 255.255.255.0  broadcast 192.168.1.255

[root@node01 tmp]# cat /var/ct/cfg/netmon.cf
!REQD ens33 192.168.1.5
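It is unclear why pureScale does not accept a name such as ens33.
One workaround is to set the environment variable below so that the check is skipped during instance creation: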
export SKIP_NETMON_VALIDATION=YES

3) Activating the database fails with SQL2049N

db2 activate db psdb
SQL2049N Database activation failed because there is insufficient CF memory.
Reason code = "1".
Run db2 ? SQL2049N to get the description of the error.
Solution:

db2 update dbm cfg using numdb 1

Then increase the VM memory from 4.5 GB to 6 GB.

4) When installing on RedHat 7.4, db2_install cannot install the GPFS package gpfs.ext-4.2.3-0.x86_64.rpm

Solution:

Run installGPFS in debug mode:
cd server_dec/db2/linuxamd64/gpfs
./installGPFS -a -f -d > /tmp/gpfs.debug.out 2>&1
Open /tmp/gpfs.debug.out; it contains the following error:
+ rpm -i gpfs.ext-4.2.3-0.x86_64.rpm
error: Failed dependencies:
m4 is needed by gpfs.ext-4.2.3-0.x86_64
Running yum list m4 then shows that the package is not installed.
Either of the following two commands installs the m4 operating system package:
yum install m4
or
yum groupinstall 'Infiniband Support'

5) A pending alert causes db2start to fail, so the instance cannot start

$ db2start
05/14/2019 19:18:48 0 0 SQL6036N START or STOP DATABASE MANAGER command is already in progress.
SQL1032N No start database manager command was issued. SQLSTATE=57019
df -h shows that GPFS has started normally:
$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rhel-root 17G 16G 1.8G 90% /
devtmpfs 2.8G 0 2.8G 0% /dev
tmpfs 2.8G 4.0K 2.8G 1% /dev/shm
tmpfs 2.8G 9.0M 2.8G 1% /run
tmpfs 2.8G 0 2.8G 0% /sys/fs/cgroup
/dev/sda1 1014M 179M 836M 18% /boot
tmpfs 568M 24K 568M 1% /run/user/0
db2fs1 3.0G 1.1G 2.0G 36% /db2sd
[db2sdin1@node01 ~]$ db2instance -list
The output shows an alert:
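ID   TYPE    STATE       HOME_HOST  CURRENT_HOST  ALERT  PARTITION_NUMBER  LOGICAL_PORT  NETNAME
--   ----    -----       ---------  ------------  -----  ----------------  ------------  -------
0    MEMBER  ERROR       node01     node01        YES    0                 0             node01
128  CF      STOPPED     node01     node01        NO     -                 0             node01

HOSTNAME  STATE   INSTANCE_STOPPED  ALERT
--------  -----   ----------------  -----
node01    ACTIVE  NO                YES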
There is currently an alert for members, CFs, hosts, cluster file system or cluster configuration in the data-sharing instance. For more information on the alert, its impact, and how to clear it, run the following command: 'db2cluster -list -alert'.

[db2sdin1@node01 ~]$ db2cluster -list -alert
1.
Alert: DB2 member '0' failed to start on its home host 'node01'. The cluster manager will attempt to restart the DB2 member in restart light mode on another host. Check the db2diag.log for messages concerning failures on host 'node01' for member '0'.

Action: Check the member db2diag log files for messages about member failures and the cluster caching facility cfdiag log files for messages about CF failures on the host. If there are alerts about network adapters not responding, this alert cannot be cleared manually. It will be cleared when a network adapter becomes available. If it is not a problem with network adapters, this alert needs to be manually cleared after other alerts are handled. To clear this alert run the following command: 'db2cluster -cm -clear -alert -member 0'. For more information, see the 'Troubleshooting options for the db2cluster command' topic in the DB2 Information Center.

Impact: DB2 member '0' will not be able to service requests until this alert has been cleared and the DB2 member returns to its home host.

[db2sdin1@node01 ~]$ db2cluster -cm -clear -alert -member 0
The alerts have been successfully cleared.
[db2sdin1@node01 ~]$ db2cluster -list -alert
There are no alerts
[db2sdin1@node01 ~]$ db2instance -list
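ID   TYPE    STATE       HOME_HOST  CURRENT_HOST  ALERT  PARTITION_NUMBER  LOGICAL_PORT  NETNAME
--   ----    -----       ---------  ------------  -----  ----------------  ------------  -------
0    MEMBER  RESTARTING  node01     node01        NO     0                 0             node01
128  CF      STOPPED     node01     node01        NO     -                 0             node01

HOSTNAME  STATE   INSTANCE_STOPPED  ALERT
--------  -----   ----------------  -----
node01    ACTIVE  NO                NO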
[db2sdin1@node01 ~]$ db2start
05/14/2019 19:24:17 0 0 SQL1026N The database manager is already active.
SQL1026N The database manager is already active.
[db2sdin1@node01 ~]$ db2pd -

Database Member 0 -- Active -- Up 0 days 00:01:12 -- Date 2019-05-14-19.24.24.887219
[db2sdin1@node01 ~]$ db2instance -list
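ID   TYPE    STATE       HOME_HOST  CURRENT_HOST  ALERT  PARTITION_NUMBER  LOGICAL_PORT  NETNAME
--   ----    -----       ---------  ------------  -----  ----------------  ------------  -------
0    MEMBER  STARTED     node01     node01        NO     0                 0             node01
128  CF      PRIMARY     node01     node01        NO     -                 0             node01

HOSTNAME  STATE   INSTANCE_STOPPED  ALERT
--------  -----   ----------------  -----
node01    ACTIVE  NO                NO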


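Comment from tanwei (Technical Support, IBM)
2020-03-16 20:40
Thanks for sharing. I followed your document and got the single-machine installation working. Is there an installation document for two or more machines? Email:89238443@qq.com Thanks!

yhl71 @tanwei: I don't have one for two machines; that needs shared storage and a second machine.

2022-02-22 15:56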