Table of contents
- 1. Check-In Procedure
- 1.1. Wait for scheduled GLOW mode time
- 1.2. Log into the interface computer (glow-control)
- 1.3. Log into the gateway machine
- 1.4. Log into the DE601 LCU
- 1.5. Check that no-one else is on the station
- 1.6. Check that no more remote mounts are there
- 1.7. Check software level (should be 0)
- 1.8. Check the station configuration files in /opt/lofar/etc/
- 1.9. Start up the station to test LBA and HBA antennas
- 1.10. Do some simple tests
- 1.11. Start up the beam-server
- 1.12. Some more tests
- 1.13. Check that beamlet data are arriving
- 1.13.1. killpointing
- 1.14. Send the e-mail to ASTRON and our users that we have taken over the station
- 1.15. Turn the station over to the users
Check-In Procedure
Notes from 22.7.2013:
- run stuff in a VNC session
- have a local account "operator" on glow-control and glow60X
Wait for scheduled GLOW mode time
An e-mail from ILT operations should arrive, informing us that the GLOW stations have been set to local mode operations. Use the DE601 monitoring software at https://glow-control.mpifr-bonn.mpg.de/cacti/graph_view.php to confirm that the station is in swlevel 0. (The display for the other GLOW stations only updates once per hour, so there is little use in checking them for changes on short notice.)
Log into the interface computer (glow-control)
ssh -X observer@glow-control
If this asks for a password then there is something wrong. You can try running kinit on the machine from where you are trying to log into glow-control.
For handover we use a separate screen session. Connect to it with:
screen -r Handover
In there we have different terminals for the different stations (DE601 -> terminal 1, DE602 -> terminal 2 etc.)
Log into the gateway machine
(E.g. for DE601:)
ssh glow601
Log into the DE601 LCU
ssh de601c
Check that no-one else is on the station
w
If more than one login is active (i.e. anyone besides you), check who else is logged into the station. If people from ASTRON are still logged in, contact ASTRON. Their processes may need to be killed (sudo access required).
Check that no more remote mounts are there
With the new LCUs (as of December 2016) it is usually an automounter mount that causes problems. You can check for that with:
mount | grep autofs
On all stations there is one line starting with:
systemd-1 on /proc/sys/fs/binfmt_misc type autofs ...
This is not a problem. But if there is more than one line, usually one starting with:
/etc/auto.misc on /misc type autofs ...
then there is an external mount, which does not work in GLOW mode and will cause the control programs to stall. You'll have to ask ASTRON to remove that.
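The check above can be sketched as a small filter. This is only an illustration, not part of the station software: the function names are made up here, and the filter reads `mount` output on standard input so that it can be tried on saved output first.

```shell
# Print autofs mounts other than the harmless binfmt_misc entry.
# Hypothetical helper: reads `mount` output on stdin.
extra_autofs_mounts() {
    grep ' type autofs' | grep -v ' on /proc/sys/fs/binfmt_misc '
}

# Succeeds (exit status 0) when no unexpected automounter mount is listed.
autofs_clean() {
    [ -z "$(extra_autofs_mounts)" ]
}
```

On the LCU this would be used as "mount | extra_autofs_mounts". Any line it prints is a mount that ASTRON has to remove.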
Check software level (should be 0)
swlevel
If not in software level 0, bring it down.
Make sure that only 3 levels are listed. If more are available, the station hasn't been handed over correctly.
Verify that no station processes are running. A report of swlevel 0 from the swlevel program does not necessarily mean that everything is stopped.
( ps -AF )
(We need some scripts that generate easier to read output! For the time being, just use: )
ps -AF | grep lofarsys
This should print out nothing or one line containing "grep lofarsys"
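Until a proper script exists, a counting filter gives a quicker answer than reading the listing by eye. The following is a minimal sketch with a hypothetical function name. It reads `ps -AF` output on standard input and ignores the grep commands themselves.

```shell
# Count lines of `ps -AF` output that mention lofarsys,
# not counting the grep processes that do the searching.
count_lofarsys_processes() {
    grep 'lofarsys' | grep -vc 'grep .*lofarsys'
}
```

Usage: "ps -AF | count_lofarsys_processes". The station is clean when this prints 0.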
Check the station configuration files in /opt/lofar/etc/
less /opt/lofar/etc/RSPDriver.conf
Check that the RSP lanes are sending data to the correct recording computers. For recording beamlet data on lofarBN, this should have
RSPDriver.LANE_00_BLET_OUT= 0
RSPDriver.LANE_00_XLET_OUT= 5
RSPDriver.LANE_00_SRCMAC = 00:22:86:06:01:00
RSPDriver.LANE_00_SRCIP = 10.211.1.1
RSPDriver.LANE_00_DSTMAC = 00:19:99:ba:42:cb # lofarB1
RSPDriver.LANE_00_DSTIP = 10.211.1.3
RSPDriver.LANE_01_BLET_OUT=8
RSPDriver.LANE_01_XLET_OUT=5
RSPDriver.LANE_01_SRCMAC = 00:22:86:06:01:01
RSPDriver.LANE_01_SRCIP = 10.212.1.1
RSPDriver.LANE_01_DSTMAC = 00:19:99:e2:21:46 # lofarB2
RSPDriver.LANE_01_DSTIP = 10.212.1.3
RSPDriver.LANE_02_BLET_OUT=4
RSPDriver.LANE_02_XLET_OUT=5
RSPDriver.LANE_02_SRCMAC = 00:22:86:06:01:02
RSPDriver.LANE_02_SRCIP = 10.213.1.1
RSPDriver.LANE_02_DSTMAC = 00:19:99:e2:21:ba # lofarB3
RSPDriver.LANE_02_DSTIP = 10.213.1.3
RSPDriver.LANE_03_BLET_OUT=12
RSPDriver.LANE_03_XLET_OUT=5
RSPDriver.LANE_03_SRCMAC = 00:22:86:06:01:03
RSPDriver.LANE_03_SRCIP = 10.214.1.1
RSPDriver.LANE_03_DSTMAC = 00:19:99:e2:20:ce # lofarB4
RSPDriver.LANE_03_DSTIP = 10.214.1.3
When you are done type "q" to exit less.
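Instead of reading the whole file in less, the destination addresses can be pulled out with awk. This is a sketch only. The function name is an assumption, and it relies on the "RSPDriver.LANE_NN_DSTIP = address" layout shown above.

```shell
# Print "lane destination-IP" pairs from an RSPDriver.conf-style file.
# Hypothetical helper; $1 is the path of the configuration file.
lane_destinations() {
    awk -F'=' '/^RSPDriver\.LANE_[0-9]+_DSTIP/ {
        lane = $1
        sub(/^RSPDriver\.LANE_/, "", lane)   # strip the key prefix
        sub(/_DSTIP.*/, "", lane)            # keep only the lane number
        ip = $2
        sub(/#.*/, "", ip)                   # drop a trailing comment
        gsub(/[ \t]/, "", ip)                # drop surrounding blanks
        print lane, ip
    }' "$1"
}
```

Usage: "lane_destinations /opt/lofar/etc/RSPDriver.conf". For the lofarBN setup listed above, the four lanes should show 10.211.1.3, 10.212.1.3, 10.213.1.3 and 10.214.1.3.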
If the computers that the data are being sent to need to be changed (this depends on the wishes of the observer), you need to copy the correct version of the file. As of 2013 May 13, current versions of the RSPDriver.conf file can be found at
~user9/StationConfigs/RSPDriver.conf.local_lofarAN_20130513 for lofarAN, copy that with:
cp ~user9/StationConfigs/RSPDriver.conf.local_lofarAN_20130513 /opt/lofar/etc/RSPDriver.conf
~user9/StationConfigs/RSPDriver.conf.local_lofarBN_20130423 for lofarBN, copy that with:
cp ~user9/StationConfigs/RSPDriver.conf.local_lofarBN_20130423 /opt/lofar/etc/RSPDriver.conf
less /opt/lofar/etc/BeamServer.conf
Check that the HBA update interval is reasonable. For pulsar observations, where the noise from communicating with the HBA antennas is a bad thing, set this to a large value such as 300 seconds. For other observations, check that the value is a reasonable default.
BeamServer.HBAUpdateInterval= 300
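The current setting can also be read without opening the file. This is a minimal sketch, assuming the "BeamServer.HBAUpdateInterval= value" layout shown above. The function name is made up for this example.

```shell
# Print the HBA update interval (in seconds) from a BeamServer.conf-style file.
# Hypothetical helper; $1 is the path of the configuration file.
hba_update_interval() {
    awk -F'=' '/^BeamServer\.HBAUpdateInterval[ \t]*=/ {
        v = $2
        sub(/#.*/, "", v)       # drop a trailing comment
        gsub(/[ \t]/, "", v)    # drop surrounding blanks
        print v
    }' "$1"
}
```

Usage: "hba_update_interval /opt/lofar/etc/BeamServer.conf".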
Start up the station to test LBA and HBA antennas
swlevel 1
This starts up the ServiceBroker. Wait 10 seconds, then continue:
swlevel 2
This starts up the RSPDriver and TBBDriver. This takes quite a long time. Follow the instructions for the next command to see when this finishes.
tail -f /log/RSPDriver.log
This continually checks the last log entries for the RSPDriver.
Wait until no more lines of the form "port ??? has not yet completed sync or had errors, trying to continue..." have shown up for at least 5 seconds. This may take a minute or two. Then press <CTRL-C> to kill the tail program.
If the RSPDriver terminates at this point, then there is a problem with the hardware, please contact an expert. (Or go back to swlevel 0 and try again.)
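The "no new sync messages for 5 seconds" rule can be approximated by counting the messages twice with a pause in between. The sketch below is an illustration with hypothetical function names. It assumes the log file is not rotated while waiting, and it does not detect a terminated RSPDriver.

```shell
# Count the sync warnings currently in the log file $1.
sync_message_count() {
    grep -c 'has not yet completed sync' "$1"
}

# Return once no new sync warning appeared during one pause
# ($2 seconds, default 5).
wait_for_sync_quiet() {
    log=$1
    pause=${2:-5}
    prev=$(sync_message_count "$log")
    while :; do
        sleep "$pause"
        now=$(sync_message_count "$log")
        [ "$now" -eq "$prev" ] && break
        prev=$now
    done
}
```

Usage: "wait_for_sync_quiet /log/RSPDriver.log". It is still worth looking at the end of the log afterwards to make sure the driver is running.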
Do some simple tests
Allow for a few seconds after each rspctl command to allow the settings to be distributed to and accepted by the hardware!
rspctl --rcumode=3
rspctl --rcuenable=1
This switches on all LBAs.
rspctl --rcu
This gives a list of all rcus. They should all be in rcumode 3 (or whatever you set). If there are some stuck in rcumode -1 (i.e. fail), then try again with setting the rcumode. If that doesn't help go back to swlevel 0 and try again.
ASTRON now recommends that the HBAs are powered up using the following command:
poweruphba.sh -m 5
(This is only available from the 2.15 firmware http://www.lofar.org/operations/doku...tes_lofar_2.15 )
This ensures that the tiles are turned on in a sequence rather than all simultaneously and all the RCUs identified as broken are disabled. To power on all the RCUs use:
poweruphba.sh 5
Regardless of which of the above you used you can now check the status:
rspctl --rcu
After this is finished, disable the rcus again:
rspctl --rcumode=0
Start up the beam-server
rspctl --bitmode=8
This switches the station to the 8-bit mode.
swlevel 3
This starts up the CalServer and BeamServer software. Check if all services are running.
Some more tests
You can do some more tests, but they are not strictly needed.
beamctl --antennaset=LBA_OUTER --rcus=0:191 --rcumode=3 --beamlets=0:243 --subbands=100:343 --digdir=0,1.5708,J2000&
This starts up the beam software to make a beam pointing at zenith using RCUMODE 3 (LBA, 10--90 MHz). This will be used to test antennas. Wait until the message "All pointings sent and accepted" appears (< 3 seconds).
(Need to fix the "stopped in background" issue!)
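The beamlet and subband ranges in the beamctl command must have the same number of entries, and it helps to know which frequencies the chosen subbands cover. The sketch below does both calculations. It assumes the standard 200 MHz clock, for which 512 subbands span 100 MHz. The function names are made up for this example.

```shell
# Number of entries in an inclusive "first:last" range such as 0:243.
range_len() {
    echo $(( ${1#*:} - ${1%:*} + 1 ))
}

# Centre frequency in MHz of subband $1 for the 200 MHz clock.
# $2 is the start of the band in MHz: 0 for the LBA modes (default),
# 100 for RCU mode 5.
subband_mhz() {
    LC_ALL=C awk -v sb="$1" -v start="${2:-0}" \
        'BEGIN { printf "%.2f\n", start + sb * 100 / 512 }'
}
```

Both "range_len 0:243" and "range_len 100:343" give 244, so the ranges match. "subband_mhz 100" and "subband_mhz 343" give 19.53 and 66.99, which is the part of the LBA band covered by this test beam.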
rspctl --rcu
This checks that the RCU boards have been set in RCUMODE 3. Check that the RCUs are enabled (look for "=> ON" for all RCUs). Check that all RCUs are in RCUMODE 3 (look for "mode:3" for all RCUs).
If some RCUs are OFF or have the wrong RCUMODE, wait a few seconds and try again. If the RCUs still are not correct, something is wrong. If that is the case first try going to "swlevel 0" and back to this point. If it still doesn't work, call the Bonn people...
(We need to explain that better!)
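With 192 RCUs it is easy to overlook a single bad line, so a counting filter can help. This is a sketch only. It assumes that each RCU line of the "rspctl --rcu" output contains both "=> ON" and "mode:N" as described above. The exact line layout used to try it out was an assumption, and the function name is hypothetical.

```shell
# Count RCU lines that are not switched ON in the wanted RCU mode ($1).
# Reads `rspctl --rcu` output on stdin; only lines containing "mode:" count.
bad_rcus() {
    grep 'mode:' | grep -Evc "=> *ON.*mode:${1}([^0-9]|\$)"
}
```

Usage: "rspctl --rcu | bad_rcus 3". A result of 0 means every listed RCU is ON and in RCUMODE 3. Any other number means the full output needs to be checked.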
rspctl --statistics --integration=3
This checks the spectrum (0--100 MHz) for each RCU, with all RCU spectra plotted in a single gnuplot window.
When the program starts up, check the initial log report. It should print out:
Taking subscription on the clockvalue
Current clockvalue is 200 Mhz
Taking subscription on the state of the splitter
The splitter is currently OFF
Taking subscription on the bitmode
The bitmode is currently 16
(We need to put in a plot of the spectra!)
If the spectra look O.K., then end by typing <CTRL-C> in the (text-)terminal. Due to a "feature" in the software this might produce lots of lines of output in the terminal; ignore that!
rspctl --statistics=beamlet --integration=3
This checks the beamlet spectrum, using the specified subbands.
(We need to put in a plot of the spectra!)
If the spectra look O.K., then end by typing <CTRL-C> in the (text-)terminal. Due to a "feature" in the software this might produce lots of lines of output in the terminal; ignore that!
killpointing
This turns off the beam command. (When you hit <enter>, or when you run the next command, you'll get a line saying that beamctl was killed.)
beamctl --antennaset=HBA_JOINED --rcus=0:191 --rcumode=5 --beamlets=0:243 --subbands=100:343 --anadir=0,1.5708,J2000 --digdir=0,1.5708,J2000&
This checks out the HBA tiles, making a beam pointing at zenith using RCUMODE 5 (HBA, 110--190 MHz). This will be used to test tiles. Wait until the message "All pointings sent and accepted" appears (< 3 seconds).
rspctl --rcu
This checks that the RCU boards have been set in RCUMODE 5. Check that the RCUs are enabled (look for "=> ON" for all RCUs). Check that all RCUs are in RCUMODE 5 (look for "mode:5" for all RCUs).
If some RCUs are OFF or have the wrong RCUMODE, wait a few seconds and try again. If the RCUs still are not correct, something is wrong.
rspctl --statistics --integration=3
This checks the spectrum (100--200 MHz, but incorrectly plotted by the software as 0--100 MHz) for each RCU, with all RCU spectra plotted in a single gnuplot window.
(We need to put in a plot of the spectra!)
If the spectra look O.K., then end by typing <CTRL-C> in the (text-)terminal. Due to a "feature" in the software this might produce lots of lines of output in the terminal; ignore that!
rspctl --statistics=beamlet --integration=3
This checks the beamlet spectrum, using the specified subbands.
(We need to put in a plot of the spectra!)
If the spectra look O.K., then end by typing <CTRL-C> in the (text-)terminal. Due to a "feature" in the software this might produce lots of lines of output in the terminal; ignore that!
Check that beamlet data are arriving
Using another terminal, log into the recording computers and check that data are arriving. For example, do
(use a VNC for all of this?)
ssh observer@lofarb1
sudo tcpdump -pni eth1
stop it with <CTRL-C>
exit
ssh observer@lofarb2
sudo tcpdump -pni eth1
stop it with <CTRL-C>
exit
ssh observer@lofarb3
sudo tcpdump -pni eth1
stop it with <CTRL-C>
exit
ssh observer@lofarb4
sudo tcpdump -pni eth1
stop it with <CTRL-C>
exit
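The four checks above differ only in the host name, so they can be generated in a loop. The sketch below is a dry run: it only prints the commands and executes nothing. The function name is hypothetical. "-c" makes tcpdump stop after the given number of packets, and "ssh -t" requests a terminal so that sudo can prompt for a password if it needs to.

```shell
# Print one check command per recording computer (dry run only).
# $1 is the number of packets to capture before tcpdump exits (default 20).
tcpdump_check_commands() {
    count=${1:-20}
    for host in lofarb1 lofarb2 lofarb3 lofarb4; do
        echo "ssh -t observer@${host} sudo tcpdump -pni eth1 -c ${count}"
    done
}
```

Copy the printed lines into the terminal one at a time. If no data are arriving, tcpdump will not reach the packet count and still has to be stopped with <CTRL-C>.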
In the original terminal do:
killpointing
This turns off the beam command.
Log out of the LCU:
exit
Send the e-mail to ASTRON and our users that we have taken over the station
(We have to set up the script, so that the operators can run it.)
Turn the station over to the users
If a user is going to immediately use the station, you can leave it in swlevel 3 (nothing needs to be done in this case). Otherwise, put it in swlevel 0.
(Need to describe that better!)