[<< Home](/home#6-firmware-and-software-design-panel-charge-5)

[<< Section 6.1](./6.1)
## 6.2 Beamformer XB-Engine Software

The ALPACA XB-engine will be powered by 50 NVIDIA A10s evenly distributed over
25 rack-mount 2U servers. See [Section 5.3](../5-dbe/5.3) for the physical
hardware specifications of these servers. ALPACA will use software frameworks
and packages that are actively used at radio observatories around the world.
Some of the processing libraries were developed at BYU and have been used and
verified in past PAF radio astronomical observations as part of the 150 MHz
digital back end for the Focal L-band Array for the GBT (FLAG)[^1-flag-pulsar]
[^2-flag-hi].
### 6.2.1 Hashpipe

The High Availability Shared Pipeline Engine ([Hashpipe][hashpipe-git]) is an
open-source software framework for real-time data processing applications.
Hashpipe is an adaptation of software from the GUPPI (Green Bank Ultimate
Pulsar Processing Instrument) back end developed by NRAO, modified into a more
general pipeline tool used particularly for GPU-based correlators.

Hashpipe achieves its real-time processing capability by providing a framework
for dividing consecutive processing tasks into separate threads, and by
creating and managing the shared memory segments and semaphore arrays used for
data movement and communication between threads. Each shared memory segment is
implemented as a ring buffer and is conceptually placed between two consecutive
threads. This creates a shared-buffer relationship in which a thread's input
data buffer is the output data buffer of the preceding thread.

Each buffer consists of an application-defined number of blocks sized for the
data passed between threads. Each block has an associated semaphore that marks
it as free or filled. This provides data access and flow control: a free block
is available to accept more data, while a filled block holds complete data that
the downstream thread can proceed to process. Methods are available to block a
thread's execution until a free or filled event occurs.

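
Hashpipe implements this pattern in C with shared memory and semaphore arrays;
the free/filled handshake itself can be sketched with a toy two-thread Python
pipeline (the class and function names below are illustrative, not the Hashpipe
API):

```python
import threading

NUM_BLOCKS, BLOCK_LEN = 4, 8   # toy sizes; real buffers hold packet payloads

class ToyDatabuf:
    """Stand-in for a Hashpipe data buffer: a ring of fixed-size blocks
    guarded by counting semaphores for free/filled flow control."""
    def __init__(self):
        self.blocks = [[0] * BLOCK_LEN for _ in range(NUM_BLOCKS)]
        self.free = threading.Semaphore(NUM_BLOCKS)  # all blocks start free
        self.filled = threading.Semaphore(0)         # none start filled

def upstream(buf, n_iter):
    """Producer thread: wait for a free block, fill it, mark it filled."""
    idx = 0
    for i in range(n_iter):
        buf.free.acquire()                # blocks here if the ring is full
        buf.blocks[idx] = [i] * BLOCK_LEN
        buf.filled.release()              # hand the block downstream
        idx = (idx + 1) % NUM_BLOCKS

def downstream(buf, n_iter, out):
    """Consumer thread: wait for a filled block, process it, mark it free."""
    idx = 0
    for _ in range(n_iter):
        buf.filled.acquire()              # blocks here until data is present
        out.append(sum(buf.blocks[idx]))
        buf.free.release()                # return the block upstream
        idx = (idx + 1) % NUM_BLOCKS

buf, out = ToyDatabuf(), []
threads = [threading.Thread(target=upstream, args=(buf, 10)),
           threading.Thread(target=downstream, args=(buf, 10, out))]
for t in threads: t.start()
for t in threads: t.join()
print(out)  # [0, 8, 16, ..., 72]: blocks arrive in order, once each
```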
A collection of threads forms a pipeline and is compiled into a shared library
object called a "plugin". Hashpipe loads these plugins at run time, and each
plugin maps to one of the operational modes that the ALPACA digital back end
will deliver. The generalized Hashpipe plugin for an ALPACA operational mode is
shown in the following figure.

<div align="center">
<img src="../img/dbe/hashpipe-plugin.png" width="600">
</div>

Each mode will have an instance of a network thread that captures F-engine
UDP-formatted GbE packets from the NIC, parses them, and places the channelized
element data for a subset of frequencies into the data buffer for the GPU
accelerator thread to process. At startup, this accelerator thread initializes
the GPU; it then concurrently manages copying data from the input shared buffer
to the device and launching the kernel calls for the core computation of that
mode (e.g., beamformer, correlator). The resulting GPU output products are
placed in an output buffer to be received by a writer thread that performs
final data formatting and sends the data off to be stored to disk.

The ALPACA back end will deliver three principal operational modes:

* Coarse Spectrometer Mode
* Fine Spectrometer Mode
* Calibration Correlator Mode

The coarse and fine spectrometer modes will each use the beamformer to linearly
combine the 69 antenna elements of each polarization (X and Y) into 40
simultaneous beams for processing. The calibration correlator is used for
computing the beamformer weights for each of the 40 distinct beam pointings on
the sky and for each of the 1,250 coarse frequency channels.
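
A common way to turn calibration correlator output into beamformer weights for
a PAF is the maximum-sensitivity (max-SNR) solution: the dominant generalized
eigenvector of the on-source and off-source correlation matrices. The sketch
below assumes this approach with synthetic data; it is illustrative, not
necessarily the ALPACA calibration algorithm:

```python
import numpy as np

rng = np.random.default_rng(42)
N_EL = 69  # antenna elements per polarization (from the text)

# Toy correlation matrices: R_off is noise-only; R_on adds a rank-one
# term from a bright calibrator with (unknown) steering vector a.
a = rng.standard_normal(N_EL) + 1j * rng.standard_normal(N_EL)
G = rng.standard_normal((N_EL, N_EL)) + 1j * rng.standard_normal((N_EL, N_EL))
R_off = G @ G.conj().T / N_EL + np.eye(N_EL)   # Hermitian positive definite
R_on = R_off + 10.0 * np.outer(a, a.conj())

def snr(w):
    """Source-to-noise power ratio of the beam output for weights w."""
    sig = np.real(w.conj() @ (R_on - R_off) @ w)
    noise = np.real(w.conj() @ R_off @ w)
    return float(sig / noise)

# Max-SNR weights: dominant eigenvector of the pencil (R_on - R_off, R_off).
# Whiten with a Cholesky factor of R_off, then use a Hermitian eigensolver.
L = np.linalg.cholesky(R_off)
M = np.linalg.solve(L, np.linalg.solve(L, R_on - R_off).conj().T).conj().T
vals, vecs = np.linalg.eigh(M)
w = np.linalg.solve(L.conj().T, vecs[:, -1])   # un-whiten top eigenvector

snr_w, snr_a = snr(w), snr(a)
print(snr_w > snr_a)  # optimal weights beat naive steering toward a
```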
### 6.2.2 GPU Software

The beamformer, PFB, and correlator libraries are the three GPU codes used for
the signal processing done in the ALPACA operational modes, and are
collectively referred to as the "GPU Software". The following provides brief
descriptions of these libraries.
#### Beamformer

The beamformer linearly combines the complex antenna element data vector
$`\mathbf{x}`$, weighted by a precomputed vector of complex weights
$`\mathbf{w}`$, according to the following:

```math
P_{k,l,b,p,q} = \frac{1}{N} \sum_{n=0}^{N-1} \mathbf{w}_{k,b,p}^H\mathbf{x}_{k}[n+lN]\mathbf{x}^H_{k}[n+lN]\mathbf{w}_{k,b,q},
```

where $`k`$ is the coarse frequency channel index, $`b`$ the beam index, $`l`$
the short-time integration (STI) index, and $`(p,q)`$ the polarization
selection.

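
In other words, the beamformer forms beam voltages
$`y = \mathbf{w}^H\mathbf{x}`$ sample by sample and averages the products
$`y_p y_q^*`$ over $`N`$-sample STI windows. This can be sketched in NumPy;
the dimensions below are toy values, not the ALPACA configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
N_EL, N_BEAM, N_POL = 8, 3, 2   # toy sizes (ALPACA uses 69 elements, 40 beams)
N, N_STI = 16, 4                # STI window length and number of STI windows
n_samp = N * N_STI

# One coarse channel k: element voltages x[n] and weights w[b, p] (complex).
x = rng.standard_normal((N_EL, n_samp)) + 1j * rng.standard_normal((N_EL, n_samp))
w = rng.standard_normal((N_BEAM, N_POL, N_EL)) \
    + 1j * rng.standard_normal((N_BEAM, N_POL, N_EL))

# Beam voltages y[b, p, n] = w_{b,p}^H x[n]
y = np.einsum("bpe,en->bpn", w.conj(), x)

# STI power: P[l, b, p, q] = (1/N) sum_n y[b,p] y[b,q]^* over window l
y_win = y.reshape(N_BEAM, N_POL, N_STI, N)
P = np.einsum("bpln,bqln->lbpq", y_win, y_win.conj()) / N

# P is Hermitian in (p, q): XX and YY powers are real, XY = conj(YX)
assert np.allclose(P, P.conj().transpose(0, 1, 3, 2))
print(P.shape)  # (4, 3, 2, 2)
```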
The following figure presents example output of actual beam patterns for the
existing GBT FLAG receiver (BYU, WVU, NRAO, and GBO), produced using this
BYU-developed beamformer library. Seven beams are chosen from a grid scan of
3C48 in the $`x`$-polarization at 1404.74 MHz. The $`x`$ and $`y`$ axes are the
cross-elevation and elevation offsets, respectively.

<div align="center">
<img src="../img/dbe/beamformer-beams.png" width="600">
</div>

#### Correlator

Calibrating the amplitude and phase of beamformer coefficients for a phased
array using a bright astronomical source requires an array output voltage
correlator (X-engine). The ALPACA correlator is based on the open-source GPU
library [xGPU][xgpu-git], which is written specifically to exploit the
resources available on a GPU: it parallelizes the correlation by computing and
accumulating many two-by-two correlations. The output data structure is a
block lower-triangular matrix of two-by-two blocks in row-major order, with
the entries within each block ordered the same way. The following figure
depicts the correlator output format.

<div align="center">
<img src="../img/dbe/xgpu.png" width="600">
</div>

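
The ordering convention can be sketched as follows. The station count is a toy
value, and the code illustrates the block lower-triangular packing described
above, not the xGPU kernel itself:

```python
import numpy as np

rng = np.random.default_rng(1)
N_ST, N_POL, N_T = 4, 2, 32   # toy station/pol/time counts (illustrative)

# Dual-pol voltages x[station, pol, time]; full correlation R = sum_t x x^H
x = rng.standard_normal((N_ST, N_POL, N_T)) \
    + 1j * rng.standard_normal((N_ST, N_POL, N_T))
v = x.reshape(N_ST * N_POL, N_T)
R = v @ v.conj().T            # (N_ST*N_POL, N_ST*N_POL), Hermitian

# Keep only the block lower triangle: 2x2 dual-pol blocks (i, j) with i >= j,
# blocks in row-major order, and each block's four entries in row-major order.
packed = []
for i in range(N_ST):
    for j in range(i + 1):
        block = R[2 * i:2 * i + 2, 2 * j:2 * j + 2]
        packed.extend(block.ravel())   # row-major 2x2 entries
packed = np.array(packed)

n_blocks = N_ST * (N_ST + 1) // 2
assert packed.size == 4 * n_blocks    # 4 complex numbers per 2x2 block
assert np.allclose(R, R.conj().T)     # upper triangle is redundant
print(packed.size)  # 40
```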
#### Polyphase filter bank (PFB)

To provide the finer channel resolution needed for the detection and analysis
of narrowband emissions (e.g., HI observations), a subset of the first-stage
"coarse" channel outputs from the oversampled PFB of the F-engine is passed
through a second-stage, critically sampled PFB implemented on the GPU,
providing a "zoom" mode spectrometer. The library parallelizes the PFB
operation by collapsing what would be $`N`$ independent calls on separate data
streams into a single call to the PFB kernel, with all data streams processed
simultaneously.

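
One frame of a critically sampled PFB can be sketched in NumPy; the tap and
point counts match the measured design below, but the prototype filter here is
a generic windowed sinc, not the ALPACA LPF:

```python
import numpy as np

N_PT, N_TAP = 32, 8   # 32-point, 8-tap second-stage PFB (from the text)

# Prototype low-pass filter: windowed sinc with cutoff at one channel width.
n = np.arange(N_PT * N_TAP)
h = np.sinc((n - (N_PT * N_TAP - 1) / 2) / N_PT) * np.hamming(N_PT * N_TAP)

def pfb_channelize(x):
    """Critically sampled PFB: window N_TAP frames with the prototype,
    fold them to N_PT samples, then FFT. Returns [frame, channel]."""
    n_frames = len(x) // N_PT - (N_TAP - 1)
    out = np.empty((n_frames, N_PT), dtype=complex)
    for f in range(n_frames):
        seg = x[f * N_PT:(f + N_TAP) * N_PT]            # N_TAP x N_PT samples
        folded = (seg * h).reshape(N_TAP, N_PT).sum(axis=0)
        out[f] = np.fft.fft(folded)
    return out

# A tone centered on channel 3 lands almost entirely in that bin.
t = np.arange(N_PT * 64)
x = np.exp(2j * np.pi * (3 / N_PT) * t)
spec = np.abs(pfb_channelize(x)).mean(axis=0)
print(int(np.argmax(spec)))  # 3
```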
The following two figures show the measured single-bin and passband performance
profiles, respectively, for an 8-tap, 32-point second-stage GPU PFB. The
single-bin frequency response plot compares the performance of the window used
in the PFB to that of a traditional FFT, illustrating that the PFB response per
channel is narrower with lower sidelobes, as desired. The measured passband
response of the PFB shows that the prototype LPF used in the PFB is designed
such that there is no scalloping loss across the passband.

<div align="center">
<img src="../img/dbe/gpfb.png" width="600">
</div>

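
The single-bin comparison can be reproduced in miniature: a bare FFT bin has
the response of a rectangular window, while the PFB bin is shaped by the long
prototype filter. The sketch below uses a generic Hamming-windowed-sinc
prototype as a stand-in for the measured LPF:

```python
import numpy as np

N_PT, N_TAP = 32, 8

# A bare FFT bin is equivalent to a length-N_PT rectangular window;
# the PFB bin is shaped by the length N_PT*N_TAP prototype filter.
rect = np.ones(N_PT)
n = np.arange(N_PT * N_TAP)
proto = np.sinc((n - (N_PT * N_TAP - 1) / 2) / N_PT) * np.hamming(N_PT * N_TAP)

def bin_response_db(w, n_fft=4096):
    """Normalized magnitude response of window w, in dB."""
    H = np.abs(np.fft.rfft(w, n_fft))
    return 20 * np.log10(H / H.max() + 1e-12)

fft_db = bin_response_db(rect)
pfb_db = bin_response_db(proto)

# Compare worst sidelobe beyond two channel widths from the bin center.
freqs = np.arange(4096 // 2 + 1) / 4096       # cycles/sample
far = freqs > 2.0 / N_PT
print(fft_db[far].max(), pfb_db[far].max())   # PFB sidelobes are far lower
```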
[Section 6.3 >>](./6.3)
### Footnotes

[^1-flag-pulsar]: K. Rajwade et al., "A 21 cm pilot survey for pulsars and
transients using the Focal L-band Array for the Green Bank Telescope," Monthly
Notices of the Royal Astronomical Society, vol. 489, no. 2, pp. 1709–1718,
2019.

[^2-flag-hi]: N. M. Pingel et al., "Commissioning the HI Observing Mode of the
Beam Former for the Cryogenically Cooled Focal L-band Array for the GBT
(FLAG)," The Astronomical Journal, vol. 161, no. 4, Mar. 2021.

[hashpipe-git]: https://github.com/david-macmahon/hashpipe
[xgpu-git]: https://github.com/GPU-correlators/xGPU