e-Checklist nusoft works again

  • Backup e-Checklist: http://nusoft.fnal.gov/minerva/echecklist/mininfo.php 
  • Modified scripts 
    • setup_nearline_software.sh 
    • nearline_bluearc_copy.sh 
  • nusoft uses NEARLINE_BLUEARC_GMPLOTTER_AREA, which needs to be under “/minerva/app” NOT under “/minerva/data”
    • NEARLINE_BLUEARC_GMPLOTTER_AREA=/minerva/app/users/nearonline/gmbrowser
  • Changes committed to CVS.
  • Manually synchronized nearline1 with the mnvonlinelogger
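
The “/minerva/app, not /minerva/data” requirement can be checked before the variable is exported. The following is a minimal sketch, not part of the nearline scripts; the helper name `is_under` is an assumption made for illustration.

```python
from pathlib import PurePosixPath


def is_under(path, root):
    """Return True if `path` is `root` itself or lies somewhere below it."""
    p = PurePosixPath(path)
    r = PurePosixPath(root)
    return p == r or r in p.parents


# Value taken from the notes above.
area = "/minerva/app/users/nearonline/gmbrowser"
print(is_under(area, "/minerva/app"))   # True
print(is_under(area, "/minerva/data"))  # False
```

Comparing path components rather than string prefixes avoids treating a sibling such as /minerva/application as if it were under /minerva/app.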

Pontificia MINOS: “Permission Denied”

Problem

  • The Pontificia UROC gets a “Permission Denied” error when they run the commands
    • ~/opt/rms/rms service rc near
    • ~/opt/rms/rms service om near

Attempt 1

  • I suggested that they use the “kinit” command
    • ~/opt/rms/rms kinit
  • They had already tried kinit and it did not work

Solution

  • I connected to “minos@minos-gateway-nd.fnal.gov” and checked the “.k5login” file.
  • Their principal was listed in the file, but in the wrong format
    • Listed as: “Principal”
    • Correct Format: “Principal@FNAL.GOV”
  • I e-mailed Stefano and other MINOS people. Stefano modified the file.
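
A check like the one done by hand on “.k5login” can be sketched in Python. This is only an illustration: the function name `check_k5login` and the sample principals are made up, and it assumes every entry is expected to end in the FNAL.GOV realm.

```python
DEFAULT_REALM = "FNAL.GOV"  # realm from the notes above


def check_k5login(text, realm=DEFAULT_REALM):
    """Return (line_number, entry) pairs for entries that lack '@<realm>'.

    A .k5login file lists one principal per line; blank lines are skipped.
    """
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        entry = line.strip()
        if not entry:
            continue
        if not entry.endswith("@" + realm):
            bad.append((lineno, entry))
    return bad


# Hypothetical file contents: the second entry is missing its realm.
sample = "alice@FNAL.GOV\nbob\n"
print(check_k5login(sample))  # [(2, 'bob')]
```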

GMBrowser Update – Uses all gates now

  • Previously, the GMBrowser that shifters look at used only a fraction of the gates, because the early processing stages (particularly DecodeRawEvent) were slow.
  • Now that we have a faster version of DecodeRawEvent, we modified GMBrowser to use all gates
  • Modified the following parameters in NearlineCurrent.opts in Tools/DaqRecv/options on mnvonlinelogger, to be 100 percent:
    • PdstlPrescaler.PercentPass          = 25;
    • LinjcPrescaler.PercentPass          = 25;
    • NumibPrescaler.PercentPass       = 20;
  • Ran the “nearline_software_sync.sh” script on all nearline machines to get the update
    • mnvnearline1
    • mnvnearline2
    • mnvnearline3
    • mnvnearline4
  • Informed the current shifter about the update and started GMBrowser at the Tufts UROC
    • Will monitor the behavior for some time before we make this change permanent.
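
The parameter change could also be scripted rather than edited by hand. The sketch below is an assumption-laden illustration: the function name `set_percent_pass` is hypothetical, and it assumes each setting sits on its own line in the `Name.PercentPass = value;` form shown above. The sample input reuses the three lines listed in the notes.

```python
import re

# Matches lines such as "PdstlPrescaler.PercentPass          = 25;"
PATTERN = re.compile(r"^(\s*\w+Prescaler\.PercentPass\s*=\s*)\d+(\s*;.*)$")


def set_percent_pass(opts_text, percent=100):
    """Return opts_text with every *Prescaler.PercentPass value replaced."""
    out = []
    for line in opts_text.splitlines():
        m = PATTERN.match(line)
        if m:
            # Keep the original spacing around "=" and anything after ";".
            out.append(f"{m.group(1)}{percent}{m.group(2)}")
        else:
            out.append(line)
    return "\n".join(out)


sample = (
    "PdstlPrescaler.PercentPass          = 25;\n"
    "LinjcPrescaler.PercentPass          = 25;\n"
    "NumibPrescaler.PercentPass       = 20;"
)
print(set_percent_pass(sample))
```

Lines that do not match the pattern, including other options in the same file, are passed through unchanged.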

Nearline File Management Problems

We still have problems with nearline file management, and I have listed the ones I found. Here is the list of folders that need to be managed.

  1. Synchronize /scratch/nearonline/var/job_dump/ with /minerva/data/online_processing/swap_area/
  2. Synchronize /scratch/nearonline/var/gmplotter/plotter/ with /minerva/data/users/nearonline/gmbrowser/plotter/
  3. Synchronize /scratch/nearonline/var/gmplotter/www/ with /minerva/data/users/nearonline/gmbrowser/www/
  4. Copy files from /scratch/nearonline/var/gmplotter/www to minerva@minerva-wbm.fnal.gov:/opt/if-wbm/htdoc/minerva/echecklist/gmb_hists

Here is the status of each item:

1) I modified the script to use the rsync command to sync /scratch/nearonline/var/job_dump/ with /minerva/data/online_processing/swap_area/. For now, we have a stable synchronization between the two folders; however, this method also copies the .log files, which is unnecessary.

2,3) There is no “nearonline” user under /minerva/data/users, yet the setup script exports NEARLINE_BLUEARC_GMPLOTTER_AREA=/minerva/data/users/nearonline/gmbrowser. There is a “nearonline” user under /minerva/app; however, we should not copy any data files to /minerva/app.

4) e-Checklist works, so I conclude that this step works. I did not check the details.

We should make a plan to solve all the problems in nearline file management. I propose the following:

  • Let's use the rsync command for 1, 2 and 3
  • We need to create a folder “/minerva/data/users/nearonline” and let other systems know where we are copying the files.
  • If there is a folder I forgot to sync between nearline and bluearc, that folder also needs to be added to the script.
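
The rsync proposal for items 1 to 3 can be sketched as below. This only builds the command lines; it does not run them. The helper name `rsync_command` is hypothetical, and excluding `*.log` for all three pairs is an assumption (the notes only mention the unwanted .log files for item 1). The destination paths for items 2 and 3 also assume that /minerva/data/users/nearonline has been created as proposed.

```python
import shlex

# (source, destination) pairs from items 1-3 of the list above.
SYNC_PAIRS = [
    ("/scratch/nearonline/var/job_dump/",
     "/minerva/data/online_processing/swap_area/"),
    ("/scratch/nearonline/var/gmplotter/plotter/",
     "/minerva/data/users/nearonline/gmbrowser/plotter/"),
    ("/scratch/nearonline/var/gmplotter/www/",
     "/minerva/data/users/nearonline/gmbrowser/www/"),
]


def rsync_command(src, dst, exclude=("*.log",)):
    """Build an rsync argument list that mirrors src into dst.

    -a preserves timestamps and permissions; --exclude skips the patterns
    that do not need to reach the destination.
    """
    cmd = ["rsync", "-a"]
    for pattern in exclude:
        cmd.append(f"--exclude={pattern}")
    cmd += [src, dst]
    return cmd


for src, dst in SYNC_PAIRS:
    # Print a shell-quoted command for review before running anything.
    print(shlex.join(rsync_command(src, dst)))
```

Keeping the pairs in one list means a folder that was forgotten can be added in a single place.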

Control Room Computer Update to v10r9p1

  • Updated minerva-cr-01 and minerva-cr-02 to v10r9p1
  • Modified setupFiles for initiating setup with v10r9p1
    • tempSetup.sh
    • .bashrc
  • Modified setup.min.soft.sh for v10r9p1
    • Python was a problem for v10r9p1 setup script
    • Created an alias to use the local version of python instead of the Framework version
      • alias python="/usr/bin/python"
  • Tested and tagged ControlRoomTools under v10r9p1 as “stable_v10r9p1”
  • Updated the .bashrc for using the setup file under v10r9p1
  • Linked .k5login with “cmtuser/Minerva_v10r9p1/Tools/ControlRoomTools/authenticate/k5login-master”
  • Updated the documentation on wiki: “Control_Room_Setup_Manual”

Software Update

  • mnvonlinelogger updated
  • Slave nodes will receive the update automatically
    • Updated Packages under cmtuser area
      • Tools/DaqRecv [croce_v3]
        • cvs co -r croce_v3 Tools/DaqRecv
    • Installed Packages under cmtuser area
      • Event/MinervaKernel [croce_v3]
        • This package is required for Event/MinervaEvent
        • getpack -u Event/MinervaKernel
      • Event/MinervaEvent [croce_v3]
        • cvs co -r croce_v3 Event/MinervaEvent
    • Built All Packages in the following order
      1. Tools/DaqRecv
      2. Event/MinervaKernel
      3. Event/MinervaEvent
    • Building Commands:
      • cmt config
      • cmt make
      • source setup.sh

Problem after restarting Nearline Machines

Problem:

The automated “nearline_bluearc_copy.sh” script on mnvnearline1 fails to copy necessary files from local_dump_area to online_processing/swap_area
(from /scratch/nearonline/var/job_dump to /minerva/data/online_processing/swap_area)

Investigations:

  • Investigating mnvnearline1:scripts/nearline_bluearc_copy.sh
  • Script runs automatically every 5 minutes.
  • Log file for the script: /scratch/nearonline/var/logs
  • Local copy from HEAD to the following folders WORKS
    • NEARLINE_DUMP_AREA /scratch/nearonline/var/job_dump
    • NEARLINE_LOCAL_GMPLOTTER_LOCATION /scratch/nearonline/var/gmplotter
  • The problem is with the python script “filechecklist.py”
  • It does not generate the file list for files from the following folders:
    • $NEARLINE_DUMP_AREA
    • $NEARLINE_LOCAL_GMPLOTTER_LOCATION/plotter
  • It works for the following folder
    • $NEARLINE_LOCAL_GMPLOTTER_LOCATION/www
  • Since no file list is generated by the python script “filechecklist.py”, NO files are copied to the swap_area

Temporary Solution:

  • I modified the script to use the rsync command.
    • Now it synchronizes the local_dump_area and online_processing/swap_area
  • Inside the script, Jeremy notes that “using rsync for this stage incurs a lot of overhead on the BlueArc disk”; therefore, he wrote a more efficient script, “filechecklist.py”, for this task

Permanent Solution:

  • The problem is confirmed.
    • The .fileindex under /scratch/nearonline/var/job_dump was corrupted, causing “filechecklist.py” to crash for that folder
  • Running rsync manually fixed the .fileindex
  • The software sync between mnvonlinelogger and mnvnearline1 restores the nearline_bluearc_copy.sh script to the original version
  • Now everything works as before. The nearline_bluearc_copy.sh script copies the changed files to the bluearc area using “filechecklist.py”
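
To illustrate why a corrupted index can stop the copy, and one way to tolerate it, here is a minimal sketch of an index-based “changed files” check. It is not the actual “filechecklist.py”: the JSON index format, the function names, and the mtime comparison are all assumptions made for illustration. An unreadable index is treated as empty, so every file is reported as changed once instead of the script crashing.

```python
import json
import os


def load_index(index_path):
    """Load the saved {relative_path: mtime} index.

    A missing, unreadable, or malformed index is treated as empty.
    """
    try:
        with open(index_path) as f:
            data = json.load(f)
        if not isinstance(data, dict):
            raise ValueError("index is not a mapping")
        return data
    except (OSError, ValueError):
        return {}


def changed_files(root, index_name=".fileindex"):
    """Return files under root that are new or modified since the last call."""
    index_path = os.path.join(root, index_name)
    old = load_index(index_path)
    new = {}
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            if rel == index_name:
                continue  # never report the index file itself
            mtime = os.path.getmtime(full)
            new[rel] = mtime
            if old.get(rel) != mtime:
                changed.append(rel)
    # Rewrite the index so the next call only sees later changes.
    with open(index_path, "w") as f:
        json.dump(new, f)
    return sorted(changed)
```

Calling `changed_files("/scratch/nearonline/var/job_dump")` would return the list of files to copy; only that list, rather than the whole tree, then has to be compared against the BlueArc area.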

 

GMBrowser Problem

  • GMBrowser Live works, but shifters cannot access previous runs and subruns
    • GMBrowser -r xx -s xxx does not work
  • The files are not copied automatically to /minerva/data/online_processing/swap_area
  • I copied the files manually:
    • Connect to mnvnearline1 – it has the BlueArc /minerva/ mount
    • The necessary files are located in /scratch/nearonline/var/job_dump
  • This solved the issue for the files that had not been copied.
  • I checked the log file for nearline_bluearc_copy.sh script under /scratch/nearonline/var/logs
    • After some time, the auto-script seems to be working.
  • We are not currently taking any runs; I will check the status again tomorrow

Minerva Software Installed on new CR Machines (ROC-West)

MINERvA Software Installation on minerva-cr-01 and minerva-cr-02

  • Firefox configured for Shifter Bookmarks
  • Special Kerberos Principal Installed
  • ROOT 5.34/21 installed and tested
  • ControlRoomTools Installed and Configured
    • setup.sh file and .k5login file configured with ControlRoomTools
    • GMBrowser installed and tested
  • mnvdaqrunscripts installed and configured
    • registered the new hostnames (minerva-cr-01 and minerva-cr-02) in “mnvdaqrunscripts/install.sh”
    • committed and tagged changes as “oaltinok_2014_09_17”
  • mnvruncontrol installed and configured
    • I tried connecting using run control but could not connect. Will test again tomorrow.
  • MINOS Software is NOT installed

Major Update to RunControl Software (v6r1)

  1. Killed all processes
  2. Jeremy updated the following machines:
    • mnvonline0.fnal.gov
    • mnvonline1.fnal.gov
    • mnvonlinelogger.fnal.gov
    • minerva-rc.fnal.gov
  3. I updated the remaining Control Room Computers
    • minerva-evd.fnal.gov
    • minerva-bm.fnal.gov
    • minerva-om-02.fnal.gov
  4. Testing the Updates
    1. Successful Test on Control Room Computers
    2. Successful Test on Rochester UROC
    3. Successful Test on Tufts UROCs
  5. I updated the UROC_sw_manager.py script and notified UROC Users

Fermilab Power Failure

  • On Sunday at 03:30 am, there was a power failure affecting the MINOS and MINERvA underground machines
  • Control Room Computers lost network mount to /minerva/data/
    • GMBrowser needs /minerva/data mounted, so it was not working
    • minerva-evd is used by UROCs to mount /minerva/data, so they were also affected.
    • Carrie opened a service ticket to ask the Computer Division for help with the Control Room Computers
    • The Computer Division resolved the incident, and all machines and UROCs are working properly.
  • The mnvonlinebck1.fnal.gov machine is still down and we have no access to Veto Wall HV Monitoring.
  • e-Checklist can be used from either of the following servers: (minerva-wbm was down due to the power glitch)
    • http://minerva-wbm.fnal.gov/minerva/echecklist/mininfo.php
    • http://nusoft.fnal.gov/minerva/echecklist/mininfo.php
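
Choosing between the two e-Checklist servers can be automated with a simple fallback. This is a sketch only; the function names are hypothetical, and “reachable” is assumed to mean that an HTTP GET returns a non-error status within a few seconds.

```python
from urllib.request import urlopen

# Primary server first, backup second (URLs from the list above).
ECHECKLIST_URLS = [
    "http://minerva-wbm.fnal.gov/minerva/echecklist/mininfo.php",
    "http://nusoft.fnal.gov/minerva/echecklist/mininfo.php",
]


def is_reachable(url, timeout=5):
    """Return True if the URL answers with a 2xx/3xx status within timeout."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def first_reachable(urls, probe=is_reachable):
    """Return the first URL for which probe(url) is true, or None."""
    for url in urls:
        if probe(url):
            return url
    return None
```

`first_reachable(ECHECKLIST_URLS)` would return the minerva-wbm URL when that server is up and the nusoft URL otherwise. The `probe` argument exists so the selection logic can be exercised without network access.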

minerva-om Update

  • minerva-om no longer supports MINERvA Software
  • minerva-om has the latest MINOS RMS installed for the om near check
  • .bashrc script modified to prompt users to use the “start_MinosOm.sh” command to start the minos om GUI
  • Nothing was removed; every file and all software are recoverable.

Validation

  • The University of Minnesota Duluth group is updating a pretty old UROC
  • They installed all Minerva and MINOS Software and have some problems with the ValidationTools, GMBrowser and RunVetoHVMonitor
  • Other than these three, everything is working, and they have started their shadow shift.

ValidationTools

  • UROC_sw_manager.py fails to “make clean” and “make” the package
  • I removed the ValidationTools folder and ran UROC_sw_manager again – It worked!

GMBrowser

  • GMBrowser crashes (not responding)

Attempts

  • Update ControlRoomTools/gmbrowser folder to HEAD
    • Did not work!
  • Remove gmbrowser folder and run UROC_sw_manager.py
    • Did not work!

After these failed attempts, I concluded that the problem seems to be related to the ROOT version they are using.

  • Rick built and installed ROOT 5.34, then tried running GMBrowser
    • It worked!

RunVetoHVMonitor

  • The script fails to open the GUI
  • The problem is the “-x” option missing from the SSH command; they modified the script locally.
  • I need to modify the script and upload the correct version to CVS