We still have problems with nearline file management, and I have listed the ones I found. The following folders need to be managed:
- Synchronize /scratch/nearonline/var/job_dump/ with /minerva/data/online_processing/swap_area/
- Synchronize /scratch/nearonline/var/gmplotter/plotter/ with /minerva/data/users/nearonline/gmbrowser/plotter/
- Synchronize /scratch/nearonline/var/gmplotter/www/ with /minerva/data/users/nearonline/gmbrowser/www/
- Copy files from /scratch/nearonline/var/gmplotter/www to email@example.com:/opt/if-wbm/htdoc/minerva/echecklist/gmb_hists
Here is the status of each item:
1) I modified the script to use the rsync command to synchronize /scratch/nearonline/var/job_dump/ with /minerva/data/online_processing/swap_area/. For now the synchronization between the two folders is stable; however, this method also copies the .log files, which is unnecessary.
2,3) There is no user “nearonline” under /minerva/data/users, although the setup script assigns the following: export NEARLINE_BLUEARC_GMPLOTTER_AREA=/minerva/data/users/nearonline/gmbrowser. There is a “nearonline” user under /minerva/app; however, we should not copy any data files to /minerva/app.
4) The e-Checklist works, so I conclude that this item works. I did not check the details.
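The unnecessary .log copies noted in item 1 could be avoided with rsync's exclude option. The following is only a minimal sketch: the function name build_rsync_command and the choice of the -a flag are assumptions, since the notes do not say which options the modified script actually passes to rsync.

```python
import shlex


def build_rsync_command(source, destination, exclude_patterns=("*.log",)):
    """Build an rsync argument list that mirrors source into destination.

    Hypothetical helper: the flags used by the real script are not known.
    """
    cmd = ["rsync", "-a"]  # -a (archive) is an assumed choice
    for pattern in exclude_patterns:
        # --exclude=PATTERN makes rsync skip matching files, here the .log files
        cmd.append(f"--exclude={pattern}")
    # A trailing slash on the source copies the folder contents,
    # not the folder itself, into the destination.
    cmd.append(source.rstrip("/") + "/")
    cmd.append(destination.rstrip("/") + "/")
    return cmd


cmd = build_rsync_command(
    "/scratch/nearonline/var/job_dump/",
    "/minerva/data/online_processing/swap_area/",
)
# Print the command for review instead of running it.
print(shlex.join(cmd))
```

Printing the command first makes it easy to check by eye before anything is run against the BlueArc area.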
We should organize a plan to solve all the problems in nearline file management. I propose the following:
- Let's use the rsync command for items 1, 2, and 3.
- We need to create the folder “/minerva/data/users/nearonline” and let other systems know where we are copying the files.
- If there is a folder I forgot to sync between nearline and bluearc, that folder also needs to be added to the script.
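As a sketch of how the proposal could be checked before any sync runs, the folder pairs from the list at the top can be kept in one table and each destination tested for existence. The names SYNC_PAIRS and missing_destinations are hypothetical and do not come from the existing script; a folder that was forgotten would simply be added as another pair.

```python
import os

# (local source, BlueArc destination) pairs taken from the list above.
SYNC_PAIRS = [
    ("/scratch/nearonline/var/job_dump/",
     "/minerva/data/online_processing/swap_area/"),
    ("/scratch/nearonline/var/gmplotter/plotter/",
     "/minerva/data/users/nearonline/gmbrowser/plotter/"),
    ("/scratch/nearonline/var/gmplotter/www/",
     "/minerva/data/users/nearonline/gmbrowser/www/"),
]


def missing_destinations(pairs):
    """Return the destination folders that do not exist yet."""
    return [dst for _src, dst in pairs if not os.path.isdir(dst)]


if __name__ == "__main__":
    for dst in missing_destinations(SYNC_PAIRS):
        print("missing destination:", dst)
```

Run on a machine that mounts the BlueArc area, this would report the two gmbrowser destinations for as long as /minerva/data/users/nearonline does not exist.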
The automated “nearline_bluearc_copy.sh” script on mnvnearline1 fails to copy the necessary files from local_dump_area to online_processing/swap_area
(that is, from /scratch/nearonline/var/job_dump to /minerva/data/online_processing/swap_area).
- Investigating mnvnearline1:scripts/nearline_bluearc_copy.sh
- Script runs automatically every 5 minutes.
- Log file for the script: /scratch/var/nearonline/logs
- Local copy from HEAD to the following folders WORKS:
- NEARLINE_DUMP_AREA /scratch/nearonline/var/job_dump
- NEARLINE_LOCAL_GMPLOTTER_LOCATION /scratch/nearonline/var/gmplotter
- The problem is with the python script “filechecklist.py”
- It does not generate the file list for files from the following folders:
- It works for the following folder
- Since the python script “filechecklist.py” generates no file list, NO files are copied to the swap_area.
- I modified the script to use the rsync command.
- Now it synchronizes the local_dump_area and online_processing/swap_area.
- Inside the script, Jeremy notes that “using rsync for this stage incurs a lot of overhead on the BlueArc disk”; therefore, he wrote a more efficient script, “filechecklist.py”, for this task.
- The problem is confirmed.
- The .fileindex under /scratch/nearonline/var/job_dump was corrupted, causing “filechecklist.py” to crash for that folder.
- Running rsync manually fixed the .fileindex.
- The software sync between mnvonlinelogger and mnvnearline1 reverts the nearline_bluearc_copy.sh script to the original version.
- Now everything works as before. The nearline_bluearc_copy.sh script copies the changed files to the bluearc area using “filechecklist.py”.
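To illustrate the failure mode above, here is a minimal sketch of an index-based change detector that treats an unreadable index as empty instead of crashing. It is not the actual “filechecklist.py”: the real format of .fileindex is unknown to me, so the JSON layout and the function names load_index, scan, and changed_files are all assumptions.

```python
import json
import os
import tempfile

INDEX_NAME = ".fileindex"  # name from the notes; the JSON format is assumed


def load_index(folder):
    """Return {relative_path: [mtime, size]}; a corrupt index counts as empty."""
    try:
        with open(os.path.join(folder, INDEX_NAME), encoding="utf-8") as handle:
            index = json.load(handle)
    except (OSError, ValueError):  # missing, unreadable, or not valid JSON
        return {}
    return index if isinstance(index, dict) else {}


def scan(folder):
    """Record modification time and size for every file below folder."""
    current = {}
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.startswith(INDEX_NAME):  # skip the index and its temp file
                continue
            full = os.path.join(root, name)
            info = os.stat(full)
            current[os.path.relpath(full, folder)] = [info.st_mtime, info.st_size]
    return current


def changed_files(folder):
    """Return files that are new or modified since the last index was written."""
    old = load_index(folder)
    current = scan(folder)
    changed = sorted(path for path, sig in current.items() if old.get(path) != sig)
    # Write to a temp file and rename it, so an interrupted run cannot
    # leave a half-written index behind.
    tmp = os.path.join(folder, INDEX_NAME + ".tmp")
    with open(tmp, "w", encoding="utf-8") as handle:
        json.dump(current, handle)
    os.replace(tmp, os.path.join(folder, INDEX_NAME))
    return changed


if __name__ == "__main__":
    demo = tempfile.mkdtemp()
    with open(os.path.join(demo, "run.dat"), "w") as handle:
        handle.write("data")
    print(changed_files(demo))  # first pass: every file is new
    print(changed_files(demo))  # second pass: nothing has changed
```

With a corrupt index, this sketch lists every file again, so one oversized copy replaces the crash. A production script would probably update the index only after the copy to BlueArc has succeeded.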