Disappearing DOs -- CaseID: v0149906


Status

Rational closed the case.

We have had no further occurrences since we upgraded to v3.1.1.


Description

The build seems to succeed. However, some derived objects (DOs) cannot be accessed from any host other than the one where the build was run. After some time, we notice that they have "disappeared".

Below are some transcripts (note the timestamps):


girod> cd /vob/nnmsc/release/1.57/bin
bin> ct lsdo artustopogen
29-Oct.16:00   "artustopogen@@29-Oct.16:00.10625"
29-Oct.15:21   "artustopogen@@29-Oct.15:21.10623"
28-Oct.14:50   "artustopogen@@28-Oct.14:50.10445"
bin> ct winkin artustopogen@@29-Oct.16:00.10625
Promoting unshared derived object "artustopogen@@29-Oct.16:00.10625".
cleartool: Error: Can't get cleartext pathname of DO b6e2559a.6f3811d2.a4fd.08:00:09:87:72:39 in view "jeeves:/lun0/viewstore/store3/armas.vws": ClearCase object not found.
cleartool: Warning: Unable to locate derived object data in view: ClearCase object not found
cleartool: Error: Unable to promote derived object "artustopogen@@29-Oct.16:00.10625": error detected by ClearCase subsystem.

From the view_log:

10/29/98 10:37:31 view_server(5110): Warning: Vob stale 0x800010fd: Purging jeeves:/lun0/vobs/nnmsc/release.vbs:0x80000FFB in kpicalc!
...
10/29/98 14:29:16 view_server(1577): Error: view_cr_rm_vob_ref of 5a3208f1.6f2b11d2.a18a.08:00:09:87:72:39 failed: Permission denied
10/29/98 15:21:29 view_server(5110): Warning: Cover object jeeves:/lun1/vobs/nnmsc/nms3000.vbs:44c24f82.6f3311d2.a4fd.08:00:09:87:72:39 for 0x800010dc not found in VOB
10/29/98 16:00:31 view_server(5110): Warning: Cover object jeeves:/lun1/vobs/nnmsc/nms3000.vbs:b6e2559a.6f3811d2.a4fd.08:00:09:87:72:39 for 0x800011e8 not found in VOB
...
10/30/98 09:22:20 view_server(3735): Error: PFM_ASSERT:(dbid != TBS_DBID_NULL) failed: '../locate.c':516
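
To pull these messages out of a long log, a filter along the following lines may help. It is only a sketch: scan_view_log is a hypothetical helper name, the log path is passed as an argument because its location depends on the installation, and the patterns are simply the four message types quoted above.

```shell
#!/bin/sh
# scan_view_log: hypothetical helper, not part of ClearCase.
# $1 is the path to a view_log file (location depends on the installation).
scan_view_log() {
    # Keep only the message types seen in this case, timestamps included.
    grep -E 'Vob stale|Cover object .* not found in VOB|view_cr_rm_vob_ref|PFM_ASSERT' "$1"
}
```

Since each log line starts with its timestamp, the output can be compared directly with the DO timestamps shown by lsdo.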

Also, tar reported strange errors:


bin> ll artustopogen
-rwxr-x---   1 girod    nms_data  3903296 Oct 30 10:08 artustopogen
bin> tar cf /tmp/topo.tar artustopogen
tar: artustopogen: file name too long
: No such device or address

Other symptoms:


bin> mvfsstorage artustopogen
mvfsstorage: artustopogen - can't fetch cleartext

...and


bin> ct dump artustopogen@@30-Oct.10:08.10626
cleartool: Error: Operation "view_gpath" failed: ClearCase object not found.

artustopogen@@30-Oct.10:08.10626
oid=bc95bd25.6fcf11d2.a3b7.08:00:09:87:72:39  dbid=10626 (0x2982)
mtype=derived object  
stored fstat:
ino: 0; type: 1; mode: 0750; uid: 1225; gid: 175
nlink: 1; size: 3903296; blsz: 0; blks: 0 fsid: 0; rdev 0
atime: Thu Jan  1 02:00:00 1970
mtime: Fri Oct 30 10:08:36 1998
ctime: Fri Oct 30 10:08:36 1998
returned fstat:
ino: 10626; type: 1; mode: 0750; uid: 1225; gid: 175
nlink: 1; size: 3903296; blsz: 1024; blks: 0 fsid: 0; rdev 0
atime: Fri Oct 30 10:08:36 1998
mtime: Fri Oct 30 10:08:36 1998
ctime: Fri Oct 30 10:08:36 1998
master replica dbid=1
nsdir_elem=581  name="1.57/bin/artustopogen"  idstr="30-Oct.10:08.10626"
view="jeeves:/lun0/viewstore/store3/armas.vws" (ed24f6d1.367711d2.a654.08:00:09:70:ae:85)
config rec=536888955
Predefined dependency hash=0
Build script hash=3576171723
ref count=1

We ran dbcheck on the VOB database; it found nothing.


Analysis

Because of what is diagnosed as an NFS stale file handle, the DO is no longer available via its initial path. ClearCase then moves the data to the view's lost+found directory (creating it if it does not exist): <view-storage-dir>/.s/lost+found
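
A quick way to check whether DO data has ended up there is to list that directory on the view server host. The sketch below assumes direct file system access to the view storage directory; list_lost_found is a hypothetical helper name, not a ClearCase command.

```shell
#!/bin/sh
# list_lost_found: hypothetical helper, not part of ClearCase.
# $1 is the view storage directory (the .vws directory).
list_lost_found() {
    lf="$1/.s/lost+found"
    if [ -d "$lf" ]; then
        # Show what has been moved there, with sizes and timestamps.
        ls -l "$lf"
    else
        echo "no lost+found under $1/.s" >&2
        return 1
    fi
}
```

For the first view in this case, the call would be list_lost_found /lun0/viewstore/store3/armas.vws, run on jeeves. Comparing the sizes and dates listed with those reported by ct dump (here, size 3903296) may help to identify the missing DO data.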


Workarounds

Carefully removing the DOs and rebuilding may help.

Switching to a different view as well may also help.


Log

This case was reported to Rational on October 30th, 1998.
It affected two different DOs, in two views (/lun0/viewstore/store3/armas.vws and /lun2/viewstore/store1/girod_beta.vws).
It stopped after we switched to a new view, and has not recurred since the upgrade to 3.1.1.


Marc Girod
Last modified: Thu Jun 17 09:25:45 EETDST 1999