Forum Discussion

Nitesh_Bhat
13 years ago

oprd returned abnormal status (96)

Hi,
  I am not able to bring up the Media Manager daemons after bouncing the NetBackup daemons,
and I get this error:

# ./vmoprcmd -d
oprd returned abnormal status (96)
IPC Error: Daemon may not be running

The daemon, reqlib, and ltid logs are below:

ltid:

01:21:23.746 [23332] <4> CheckShutdownWhileInit: Pid=1, Data.Pid=25157, Type=100, Param1=0, Param2=-5056, LongParam=-23490544
01:21:24.281 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-01 is ACTIVE
01:21:24.281 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-02 is ACTIVE
01:21:24.281 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-03 is ACTIVE
01:21:24.281 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-06 is ACTIVE
01:21:24.282 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-07 is ACTIVE
01:21:24.282 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-08 is ACTIVE
01:21:24.282 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-10 is ACTIVE
01:21:24.282 [23332] <16> InitLtidDeviceInfo: Drive DLT8000-12 is ACTIVE
01:21:24.282 [23332] <16> InitLtidDeviceInfo: Drive IBM_ULTRIUM2_Drv07 is ACTIVE

reqlib:

01:00:01.207 [23580] <2> EndpointSelector::select_endpoint: performing call with the only endpt available!(Endpoint_Selector.cpp:431)
01:00:01.220 [23580] <2> EndpointSelector::select_endpoint: performing call with the only endpt available!(Endpoint_Selector.cpp:431)
01:01:40.430 [23805] <4> vmoprcmd: INITIATING
01:01:40.488 [23805] <2> vmoprcmd: argv[0] = vmoprcmd
01:01:40.488 [23805] <2> vmoprcmd: argv[1] = -d
01:01:40.488 [23805] <2> vmdb_start_oprd: received request to start oprd, nosig = yes
01:01:40.527 [23805] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2046: VN_REQUEST_SERVICE_SOCKET: 6 0x00000006
01:01:40.527 [23805] <2> vnet_vnetd_service_socket: vnet_vnetd.c.2060: service: vmd
01:01:40.623 [23805] <2> getrequestack: server response to request:  REQUEST ACKNOWLEDGED 650
01:01:40.649 [23805] <2> getrequeststatus: server response:  EXIT_STATUS 0
01:01:40.649 [23805] <4> vmdb_start_oprd: vmdb_start_oprd request status:  successful (0)
01:02:57.220 [23805] <2> wait_oprd_ready: oprd response:  EXIT_STATUS 278
01:02:57.221 [23805] <2> put_string: cannot write data to network:  Broken pipe (32)
01:02:57.221 [23805] <16> send_string: unable to send data to socket:  Broken pipe (32), stat=-5

daemon:

01:07:42.537 [24144] <4> rdevmi: INITIATING
01:07:42.537 [24144] <2> mm_getnodename: cached_hostname tcppapp001, cached_method 3
01:07:42.583 [24144] <2> mm_ncbp_gethostname: GetNBUName <tcppapp001-bip>
01:07:42.583 [24144] <2> mm_getnodename:  (5) hostname tcppapp001-bip (from mm_ncbp_gethostname)
01:07:42.584 [24144] <2> rdevmi: got CONTINUE, connecting to ltid
01:08:02.755 [24098] <16> oprd: device management error: IPC Error: Daemon may not be running
01:08:03.241 [23355] <2> vmd: TCP_NODELAY
01:08:03.242 [23355] <4> peer_hostname: Connection from host tcppvmg265-bip, 172.16.6.7, port 6735
01:08:03.335 [23355] <4> process_request: client requested command=43, version=4
01:08:03.335 [23355] <4> process_request: START_OPRD requested
01:08:03.369 [23355] <4> start_oprd: /usr/openv/volmgr/bin/oprd, pid=24164
01:08:03.638 [23355] <2> vmd: TCP_NODELAY
01:08:03.638 [23355] <4> peer_hostname: Connection from host tcppvmg265-bip, 172.16.6.7, port 18394
01:08:03.685 [23355] <4> process_request: client requested command=43, version=4
01:08:03.685 [23355] <4> process_request: START_OPRD requested

Please help me resolve this issue and understand what causes it.
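
When excerpts like the ones above get long, it can help to pull out only the higher-severity entries. Below is a minimal Python sketch, assuming every line follows the `HH:MM:SS.mmm [pid] <severity> function: message` layout seen in these logs and that a larger number in angle brackets means a more serious entry. The threshold of 16 and the name `find_errors` are assumptions made for illustration, not part of NetBackup; note that the ltid excerpt also logs routine "Drive ... is ACTIVE" lines at 16, so the results still need reading.

```python
import re

# Assumed layout, taken from the excerpts in this thread:
#   01:02:57.221 [23805] <16> send_string: unable to send data ...
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<pid>\d+)\] "
    r"<(?P<sev>\d+)> "
    r"(?P<func>\S+?): "
    r"(?P<msg>.*)$"
)


def find_errors(lines, min_severity=16):
    """Return parsed entries whose severity is at least min_severity.

    Lines that do not match the assumed layout are skipped.
    """
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        severity = int(match.group("sev"))
        if severity >= min_severity:
            hits.append({
                "time": match.group("time"),
                "pid": int(match.group("pid")),
                "sev": severity,
                "func": match.group("func"),
                "msg": match.group("msg"),
            })
    return hits


if __name__ == "__main__":
    sample = [
        "01:02:57.220 [23805] <2> wait_oprd_ready: oprd response:  EXIT_STATUS 278",
        "01:02:57.221 [23805] <16> send_string: unable to send data to socket:  Broken pipe (32), stat=-5",
        "01:08:02.755 [24098] <16> oprd: device management error: IPC Error: Daemon may not be running",
    ]
    for entry in find_errors(sample):
        print(entry["time"], entry["func"], entry["msg"])
```

To scan a whole log file, pass the open file object as `lines`.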

  • Hi Nitesh,

    I have faced this error quite a few times in my environment, and mostly it happens when the load is high on the master or the media server.

    Running the command nbrbutil -resetMediaServer <mediaserver name> on the affected media server fixes the problem for me. If the command doesn't fix the problem, the next step I take is to recycle the NetBackup services on the affected media server. But most of the time nbrbutil does the trick.

    You might want to try the nbrbutil command. It resets all nbrb EMM and MDS allocations related to ltid on the media server. Backups running on that media server might be affected, though.

    If the command works for you, please look at your backup scheduling and consider load balancing.
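
The reset-then-verify sequence described above could be scripted roughly as follows. This is only a sketch under assumptions: it uses the two commands quoted in this thread (`nbrbutil -resetMediaServer` and `vmoprcmd -d`), assumes both are on the PATH of the host where it runs, and treats the text "oprd returned abnormal status" in the output as the sign of failure. The names `reset_and_check` and `run` are hypothetical helpers. As noted above, the reset may affect backups on that media server, so treat this as an illustration rather than something to run unattended.

```python
import subprocess

# Text from the error in the original post; assumed to indicate failure.
FAILURE_MARKER = "oprd returned abnormal status"


def run(cmd):
    """Run a command and return its stdout and stderr joined as text."""
    done = subprocess.run(cmd, capture_output=True, text=True)
    return done.stdout + done.stderr


def reset_and_check(media_server, runner=run):
    """Reset allocations for media_server, then re-check device status.

    Returns True if the follow-up 'vmoprcmd -d' output no longer contains
    the failure marker, False otherwise.
    """
    # Command as quoted in this thread; assumed to be on PATH.
    runner(["nbrbutil", "-resetMediaServer", media_server])
    output = runner(["vmoprcmd", "-d"])
    return FAILURE_MARKER not in output


# Example call (not executed here; "media_server_name" is a placeholder):
#   if not reset_and_check("media_server_name"):
#       print("Still failing; next step is to recycle the NetBackup services.")
```

The `runner` parameter exists only so the logic can be exercised without NetBackup installed, by passing a function that returns canned output.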

     

     

    • LostNeko
      Level 1

      THANK YOU! This solved my issue perfectly after everything else failed. You're amazing and a lifesaver, even 11 years later.

  • Master or media server?  OS?  NBU version?

    What processes are currently running? Please post output of 'bpps -x' from cmd on master.

  • Thanks, Kisad.

    We were getting the same error. We tried the solution provided by Kisad: we stopped the NBU processes on the media server and then reset the media server with the nbrbutil command:

    nbrbutil -resetMediaServer media_server_name

    and then started NetBackup again. This resolved the issue, and now everything is working fine.

    Please share your experience also.

  • Hi champ35,

    In my experience, this kind of situation occurs when a large number of backups are running in the environment. In other words, it would be worthwhile to review your backup policy schedules and balance the load. Hope it helps.

    Wish you a Happy and Prosperous New Year.

  • Thanks, Kisad,

    It helped me a lot.

    The nbrbutil -resetMediaServer media_server_name command has resolved the issue.

    Regards,

    Nitesh Bhat