Thursday, November 18, 2010

How to recover VxVM volumes after SAN erasures

This is a reminder of how I recovered the vxVM config of a re-initialized SAN filesystem without the original SAN config info.

This procedure works if the storage layout and architecture are managed at the SAN level and every vxVM volume is made of a single disk at OS level.

I didn’t know that when you export, modify and re-import the config of a StorageTek array, it reinitializes itself with the imported config :-(

Dump current config for each disk group in vxprint format

Copy /etc/vx/cbr/bk/ to a safe location. Then, for each disk group, cd into its config folder and print the config like this:

[root@otasrv1 dg_osglb.1258621615.102.otasrv1]# pwd

/product/MAINTENANCE/VXVM/etc/vx/cbr/bk/dg_osglb.1258621615.102.otasrv1

cat 1258621615.102.otasrv1.cfgrec|vxprint -ht -D -

The 1258621615.102.otasrv1 part will vary per disk group and per project.

Save each output to a text file named after its disk group (for example sg_db1.txt), then copy all the text files to the tmp/ folder.
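The dump step can be looped over every backed-up disk group. This is only a sketch: the function name dump_dg_configs and the VXPRINT override are mine, and it assumes every folder follows the naming seen above (folder "group.id", record "id.cfgrec").

```shell
# Sketch only: dump every backed-up disk group config to <group>.txt.
# Assumes folders are named <group>.<id> and contain <id>.cfgrec,
# as in the dg_osglb example above. VXPRINT can point to a stand-in command.
dump_dg_configs() {
    local bk_dir=$1 out_dir=$2 dg_dir base dg_name cfgrec
    mkdir -p "$out_dir"
    for dg_dir in "$bk_dir"/*/; do
        [ -d "$dg_dir" ] || continue
        dg_dir=${dg_dir%/}
        base=${dg_dir##*/}                   # dg_osglb.1258621615.102.otasrv1
        dg_name=${base%%.*}                  # dg_osglb
        cfgrec="$dg_dir/${base#*.}.cfgrec"   # 1258621615.102.otasrv1.cfgrec
        [ -f "$cfgrec" ] || continue
        "${VXPRINT:-vxprint}" -ht -D - < "$cfgrec" > "$out_dir/$dg_name.txt"
    done
}
```

Called as dump_dg_configs /product/MAINTENANCE/VXVM/etc/vx/cbr/bk tmp, it should leave one text file per disk group in tmp/.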

Here is my bash history:

Check that the disks are still seen by the OS at the same location as before the failure

fdisk -l >ha

[root@otasrv1 dg_osglb.1258621615.102.otasrv1]# more ha

Disk /dev/sda: 145.9 GB, 145999527936 bytes

255 heads, 63 sectors/track, 17750 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System

/dev/sda1 * 1 131 1052226 83 Linux

/dev/sda2 132 4308 33551752+ 82 Linux swap

/dev/sda3 4309 7572 26218080 83 Linux

Ignore the above Linux filesystems, which were never managed by vxVM. You will get a few unformatted drives like the ones below:

Disk /dev/sdb: 21.4 GB, 21474836480 bytes

64 heads, 32 sectors/track, 20480 cylinders

Units = cylinders of 2048 * 512 = 1048576 bytes

Disk /dev/sdc: 21.4 GB, 21474836480 bytes

64 heads, 32 sectors/track, 20480 cylinders

Units = cylinders of 2048 * 512 = 1048576 bytes

Disk /dev/sdd: 5368 MB, 5368709120 bytes

166 heads, 62 sectors/track, 1018 cylinders

Units = cylinders of 10292 * 512 = 5269504 bytes

Disk /dev/sde: 5368 MB, 5368709120 bytes

166 heads, 62 sectors/track, 1018 cylinders

Units = cylinders of 10292 * 512 = 5269504 bytes

… more disks

To find out whether sdb has changed location, I check whether its size has changed. I change to the tmp/ folder and grep for sdb:

grep ^dm *|grep sdb

sg_db1.txt:dm disk00 sdb auto 2048 41932544 -

Here I see that sdb used to be 41932544*512 = 21469462528 bytes, almost equal to the new size of 21474836480 bytes. The difference is 5373952 bytes, only about 5 MB, so it just reflects how vxVM and fdisk each report the size. This disk still has the same name.

Repeat this check for every disk in the list. If any disk has changed its name, you will need to manually adjust the generated sections that initialize the disks and add them to disk groups.
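To avoid doing the multiplication by hand for every disk, the comparison can be scripted. A small sketch follows; the helper names fdisk_bytes and size_gap are made up here, and it assumes the fdisk output format shown above and a 512-byte unit for the vxVM length.

```shell
# Sketch only: compare the size fdisk reports with the public length
# recorded by vxVM (taken as 512-byte sectors, as in the calculation above).

# fdisk_bytes <fdisk-output-file> <device>  -> size in bytes
fdisk_bytes() {
    awk -v dev="/dev/$2:" '$1 == "Disk" && $2 == dev { print $(NF-1) }' "$1"
}

# size_gap <fdisk-bytes> <publen>  -> difference in bytes
size_gap() {
    echo $(( $1 - $2 * 512 ))
}

# Example: size_gap "$(fdisk_bytes ha sdb)" 41932544
```

A gap of a few megabytes suggests the same disk; a gap of gigabytes suggests the device name now points to another LUN.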

Generate commands to initialize disk with vxvm cds

grep ^dm *|sort -k3|while read d_g d_sk_n d_sk au_t p_rv p_ub dash; do echo '/usr/lib/vxvm/bin/vxdisksetup -if '$d_sk" privlen="$p_rv" publen="$p_ub;done>ESD118_vxVM.sh

The shell script ESD118_vxVM.sh will now contain the commands to initialize all the disks previously used for Veritas shared filesystems.
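For reference, here is the same field mapping written with awk, which makes it easier to see which column ends up where. It is a sketch equivalent to the one-liner above, not a replacement for it; the function name gen_disksetup is mine.

```shell
# Sketch only: read "grep ^dm *" output on stdin and print one
# vxdisksetup command per disk.
# Columns: 1=file:dm  2=disk name  3=device  4=type  5=privlen  6=publen
gen_disksetup() {
    sort -k3 |
    awk '{ printf "/usr/lib/vxvm/bin/vxdisksetup -if %s privlen=%s publen=%s\n", $3, $5, $6 }'
}

# Example: grep ^dm * | gen_disksetup > ESD118_vxVM.sh
```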

Generate command to init each disk group and add its disks to it

grep ^dm *|sort -k3|while read d_g d_sk_n d_sk r_est p_rv size dash; do disk_g=`echo $d_g|cut -d\. -f1`; echo 'vxdg init '$disk_g" "$d_sk_n"="$d_sk;done>>ESD118_vxVM.sh

Generate commands to recreate the volumes

grep -e'^v ' *|while read dg_n vol_n dash state st size sel dosh fsg;do disk_g=`echo $dg_n|cut -d\. -f1`; echo 'vxassist -g '$disk_g" make "$vol_n" "$size; done>>ESD118_vxVM.sh

You should end up with 3 sets of configuration commands in the shell script.

Note that vxVM may now see the disks under different names than the ones recorded in the backup. I had to change the disk initialization section like this:

/usr/lib/vxvm/bin/vxdisksetup -if sdp privlen=2048 publen=398440192

/usr/lib/vxvm/bin/vxdisksetup -if sdq privlen=2048 publen=83875584

To

/usr/lib/vxvm/bin/vxdisksetup -if sdap privlen=2048 publen=398440192

/usr/lib/vxvm/bin/vxdisksetup -if sdaq privlen=2048 publen=83875584

I ended up with this one:

Now, for the disk group creation lines, you need to put all the disks of a group on the same line, since a disk group can only be initialized once:

vxdg init dg_backup disk14=sdap

vxdg init dg_backup disk15=sdaq

becomes

vxdg init dg_backup disk14=sdap disk15=sdaq
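The merge can also be done with awk. A sketch, with a made-up function name (merge_vxdg_init); it only looks at the vxdg init lines fed to it and ignores everything else, so apply it to the disk group section alone.

```shell
# Sketch only: collapse several "vxdg init <group> <disk>=<device>" lines
# into one line per group, keeping the order in which groups first appear.
merge_vxdg_init() {
    awk '$1 == "vxdg" && $2 == "init" {
             if (!($3 in disks)) order[++n] = $3
             disks[$3] = disks[$3] " " $4
         }
         END {
             for (i = 1; i <= n; i++)
                 print "vxdg init " order[i] disks[order[i]]
         }'
}

# Example: grep '^vxdg init' ESD118_vxVM.sh | merge_vxdg_init
```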

Also, I had 2 errors with disks that were too small to match requested sizes:

[root@otasrv1 dg_osglb.1258621615.102.otasrv1]# /usr/lib/vxvm/bin/vxdisksetup -if sdad privlen=2048 publen=10475264

VxVM vxdisksetup ERROR V-5-2-2480 Disk is too small for supplied parameters

[root@otasrv1 dg_osglb.1258621615.102.otasrv1]# /usr/lib/vxvm/bin/vxdisksetup -if sdae privlen=2048 publen=10475264

VxVM vxdisksetup ERROR V-5-2-2480 Disk is too small for supplied parameters

After lowering publen by 5264 sectors (a little over 2.5 MB), it works:

/usr/lib/vxvm/bin/vxdisksetup -if sdad privlen=2048 publen=10470000

For some reason, I also had to do the following renamings:

vxdg -n dg_db1 import sg_db1

vxedit -g dg_db2 rename vol_sgdb vol_sgbd

vxedit -g dg_db1 rename vol_sgdb vol_sgbd

The final shell script with good values is:

Final result

[root@otasrv1 ~]# vxdisk list

DEVICE TYPE DISK GROUP STATUS

sda auto:none - - online invalid

sdab auto:cdsdisk - - online

sdac auto:cdsdisk - - online

sdad auto:cdsdisk - - online

sdae auto:cdsdisk - - online

sdaf auto:cdsdisk - - online

sdag auto:cdsdisk - - online

sdah auto:cdsdisk - - online

sdai auto:cdsdisk - - online

sdaj auto:cdsdisk - - online

sdak auto:cdsdisk - - online

sdal auto:cdsdisk - - online

sdam auto:cdsdisk - - online

sdan auto:cdsdisk - - online

sdao auto:cdsdisk - - online

sdap auto:cdsdisk - - online

sdaq auto:cdsdisk - - online

sdar auto:cdsdisk - - online

sdas auto:cdsdisk - - online

sdat auto:cdsdisk - - online

sdau auto:cdsdisk - - online

sdav auto:cdsdisk - - online

sdaw auto:cdsdisk - - online

sdax auto:cdsdisk - - online

sday auto:cdsdisk - - online

sdaz auto:cdsdisk - - online

The commands in the shell script should run fine, but run them manually, one at a time, and use your integrator skills to troubleshoot.
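A quick way to spot a disk that was missed is to filter the vxdisk list output for anything that is not a cdsdisk. This is a sketch with a made-up helper name, list_non_cds; on my layout sda is expected to show up, since it is the local Linux disk.

```shell
# Sketch only: read "vxdisk list" output on stdin and print the devices
# whose TYPE column is not auto:cdsdisk (header and blank lines skipped).
list_non_cds() {
    awk 'NF > 0 && $1 != "DEVICE" && $2 != "auto:cdsdisk" { print $1 }'
}

# Example: vxdisk list | list_non_cds
```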

Format the vxVM filesystems for usage

[root@otasrv1 ~]# for f_s in `find /dev/vx/dsk/ -type b`; do mkfs.ext3 $f_s;done

Bring up the mount points and restore databases

for m_p in `hares -display -group sg_backup|grep backup_fs|awk '{print $1}'|sort -u`; do hares -online $m_p -sys otasrv4; done

for m_p in `hares -display -group sg_db2|grep db2_fs|awk '{print $1}'|sort -u`; do hares -online $m_p -sys otasrv3; done

for m_p in `hares -display -group sg_db1|grep db1_fs|awk '{print $1}'|sort -u`; do hares -online $m_p -sys otasrv4; done
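The three loops differ only by group, resource pattern and host, so they can be folded into one function. The sketch below prints the hares -online commands instead of running them, so they can be reviewed first. The function name online_group_fs and the HARES override are mine.

```shell
# Sketch only: print the "hares -online" command for every resource of a
# service group whose name matches a pattern. Nothing is brought online here.
# HARES can point to a stand-in command for a dry run without the cluster.
online_group_fs() {
    local group=$1 pattern=$2 sys=$3 res
    "${HARES:-hares}" -display -group "$group" |
    grep "$pattern" | awk '{ print $1 }' | sort -u |
    while read -r res; do
        echo "hares -online $res -sys $sys"
    done
}

# Example: online_group_fs sg_backup backup_fs otasrv4
```

Piping the output to sh runs the commands once they look right.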

After the disks come up on different hosts, freeze the associated groups before starting cold restore.

hagrp -freeze sg_backup; hagrp -freeze sg_db1; hagrp -freeze sg_db2

Troubleshoot.

[Oracle DBA] Kill all sessions for a specific user/application

Select them like:

COLUMN TOT FORMAT A60
COLUMN USERNAME FORMAT A10
select 'Alter system kill session '||CHR(39)||SID||','||SERIAL#||CHR(39) TOT, username, osuser, machine from v$session where username='PMSEEADMIN';

Then run all the resulting commands:

Alter system kill session '55,558' ;
Alter system kill session '66,3003' ;
Alter system kill session '68,3822' ;
Alter system kill session '69,1648' ;
Alter system kill session '73,1320' ;
Alter system kill session '97,3689' ;
Alter system kill session '120,2671';
