This page is dedicated to the details of building an OFDM radar on a ZCU102 + AD9081 with GNURadio and IIO.
If you just want to get the software and hardware running, the following section covers the setup instructions:
To get started, in terms of hardware you will need:
Required software:
It is usually a good idea to start out by installing a recent image of Kuiper Linux onto the ZCU102's SD card, so you don't have to rebuild the rootfs.
Depending on the age of your release, you may need to build a more recent kernel:
    # First clone the repository
    git clone https://github.com/analogdevicesinc/linux.git
    cd linux

    # Make the Vitis arm64 gcc toolchain available and enable cross compilation
    export PATH=$PATH:/opt/Xilinx/Vitis/2020.2/gnu/aarch64/lin/aarch64-linux/bin/
    export ARCH=arm64
    export CROSS_COMPILE=aarch64-linux-gnu-

    # Then initialize the .config to something that enables most ADI drivers
    make adi_zynqmp_defconfig

    # Build the image
    make -j$(nproc) Image UIMAGE_LOADADDR=0x8000

    # Finally, copy arch/arm64/boot/Image into the boot directory of the ZCU102 sdcard
    cp arch/arm64/boot/Image /mnt/boot/
If you're having trouble building the Linux image, there are more detailed articles describing the process (WHERE!).
Finally you will have to build the correct device tree blob:
    # Build the device tree blob
    make xilinx/zynqmp-zcu102-rev10-ad9081-m8-l4-tdd.dtb

    # Copy to ZCU102 boot directory
    cp arch/arm64/boot/dts/xilinx/zynqmp-zcu102-rev10-ad9081-m8-l4-tdd.dtb /mnt/boot/system.dtb
    # Source your Vivado 2020.2 (or later, depends on the adi/hdl release) settings
    source /opt/Xilinx/Vivado/2020.2/settings64.sh

    # Clone the HDL and switch to the data_offload branch
    git clone https://github.com/Yamakaja/hdl.git
    cd hdl
    git switch data_offload

    # Navigate to the ad9081 / ZCU102 project
    cd projects/ad9081_fmca_ebz/zcu102/

    # Build the project with TDD support.
    # Note that enabling TDD support is only possible if you also enable shared
    # device clocks, which means that your IO rates will be symmetrical.
    make TDD_SUPPORT=1 SHARED_DEVCLK=1

The HDL build should take around 15 to 30 minutes, and leave you with a projects/ad9081_fmca_ebz/zcu102/ad9081_fmca_ebz_zcu102.sdk/system_top.xsa when it's done.
This guide describes how you can use the system_top.xsa to build the BOOT.BIN, which also needs to be copied into the sdcard's boot partition: build-the-zynqmp-boot-image
Once you've got an updated Linux Image, BOOT.BIN and system.dtb installed and the AD9081 eval board mounted on the ZCU102, you can start to hook up receive and transmit antennas or other RF components.
To use the gr-iio AD9081 and TDD blocks, you will have to build this GNU Radio fork/branch, which is fairly close to master: Yamakaja/GNURadio
    # Check out the code
    git clone https://github.com/Yamakaja/gnuradio.git
    cd gnuradio
    git switch feature/gr-iio-tdd

    # Note: you should adjust the cmake build command according to your local environment!
    # This one was created to work with my environment.
    mkdir -p build
    cmake -DCMAKE_INSTALL_PREFIX=/usr/local \
        -DPYTHON_EXECUTABLE=$(which python3) \
        -DPYTHON_INCLUDE_DIR=/usr/include/python3.9 \
        -DPYTHON_LIBRARY=/usr/lib/libpython3.9.so \
        -DGR_PYTHON_DIR=/usr/lib/python3.9/site-packages \
        -DENABLE_GRC=ON \
        -DENABLE_GR_QTGUI=ON \
        -DQWT_LIBRARIES=/usr/lib/libqwt.so \
        -DCMAKE_BUILD_TYPE=Debug \
        -B build \
        -S .
    make -C build -j$(nproc)

    # Install GNU Radio into system directories
    sudo make -C build install

    # Make sure the installation was successful by opening gnuradio-companion
    LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/gnuradio-companion
On account of being a GNU Radio module, the process to build gr-ofdmradar is quite similar:
    # Check out the code
    git clone https://github.com/Yamakaja/gr-ofdmradar.git
    cd gr-ofdmradar

    # Make sure this build command matches that of your GNU Radio installation!
    mkdir -p build
    cmake -DCMAKE_INSTALL_PREFIX=/usr/local \
        -DPYTHON_EXECUTABLE=$(which python3) \
        -DPYTHON_INCLUDE_DIR=/usr/include/python3.9 \
        -DPYTHON_LIBRARY=/usr/lib/libpython3.9.so \
        -DGR_PYTHON_DIR=/usr/lib/python3.9/site-packages \
        -DCMAKE_BUILD_TYPE=Debug \
        -B build \
        -S .
    make -C build
    sudo make -C build install

When you now start GNU Radio Companion once again, the gr-ofdmradar blocks should show up in the block list:
LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/gnuradio-companion
To validate that the ofdm radar module has been installed properly, you can launch the ofdmradar_test example in the examples directory of gr-ofdmradar:
Running the flowgraph should leave you with a radar screen simulating four targets:
To test the OFDM radar with real hardware and signals, open the ofdmradar_ad9081.grc flowgraph in the gr-ofdmradar example directory.
The iio_target variable must point to the IP address of your ZCU102 target!
The following video shows a test where we covered a distance of 30-40m:
For more details about the theoretical underpinnings of OFDM radar in general, please check out Martin Braun's dissertation: https://publikationen.bibliothek.kit.edu/1000038892
For more information about gr-ofdmradar system parameters check out the gr-ofdmradar/README.md
The system deep dive is meant to cover the details of the entire radar system from top to bottom. Unless you're trying to recreate a similar system from scratch or trying to debug an issue, this section may not be too interesting.
Subsystems which will be covered:
Some notes on the devices that were used:
The AD9081 is a 4-channel RF DAC/ADC in a single package, with multi-chip synchronization and various other interesting capabilities. Data is transferred between the AD9081 and the FPGA over up to 8+8 (RX+TX) SERDES lanes running JESD204B/C. Unfortunately, the transceivers of the ZU9EG on the ZCU102 only go up to ~15 Gbps, so only JESD204B operation is available with this hardware configuration.
Before getting started with the implementation details, we need to establish why hardware changes are necessary: To be able to implement a reliable and accurate radar system, our interfaces must provide certain guarantees. Take a look at this picture illustrating an ordinary pulse radar:
In this monostatic setup, the transmitter produces a short pulse and then listens for the return signal. To determine the distance to the target, the time of flight is calculated by subtracting the transmit time from the time at which the signal returned. More information, like the Doppler shift or an RCS estimate, can be extracted later on in the signal chain, but isn't relevant right now. In this situation only the time of flight is actually interesting to us, not the absolute time at which the signal was received. To determine the time of flight, we need to rely on a known, fixed timing relationship between the moment the signal was sent out and our input samples at the receiver. There exist a number of solutions to this problem; for example, a strongly attenuated version of the TX signal could be looped back to the receiver. In systems where the full data stream is available, and reliably so, this may be an adequate solution, but the iio link does not lend itself to such an approach - mostly because the data rates are much too high:
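As a quick aside before turning to those data rates: the time-of-flight relation described above can be made concrete with a few lines of Python. This is only a minimal sketch; the function name and the example numbers are made up for illustration, and it assumes free-space propagation with the delay already measured in whole samples.

```python
# Speed of light in vacuum (m/s); close enough for propagation in air.
C = 299_792_458.0


def range_from_delay_samples(delay_samples: float, sample_rate_hz: float) -> float:
    """Convert a round-trip delay, measured in samples, into a one-way range in metres."""
    time_of_flight_s = delay_samples / sample_rate_hz
    # The signal travels to the target and back, hence the division by two.
    return C * time_of_flight_s / 2.0


# Illustrative numbers only: a 60 sample round-trip delay at 250 MS/s
sample_rate_hz = 250e6
print(f"range: {range_from_delay_samples(60, sample_rate_hz):.2f} m")
print(f"range per sample of delay: {range_from_delay_samples(1, sample_rate_hz):.3f} m")
```

The second line of output is the interesting one: every sample of unknown delay between TX and RX translates directly into a range error, which is why the fixed timing relationship matters so much.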
For the remainder of this page, we will assume that the default ZCU102 / AD9081 JESD configuration has been selected:
Rate / direction = 4 channels * 250 MS/s * 32 bit/S = 32 Gbps
While the memory links and the FPGA can deal with these rates, the processing system and/or the Gigabit Ethernet link clearly cannot.
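The arithmetic above is easy to replay for other configurations. The following sketch simply multiplies the three factors and compares the result with a 1 Gbps link; the helper name is made up, and the Ethernet figure is the nominal line rate, not the achievable throughput.

```python
def stream_rate_bps(channels: int, sample_rate_hz: float, bits_per_sample: int) -> float:
    """Raw data rate of one direction (RX or TX) in bits per second."""
    return channels * sample_rate_hz * bits_per_sample


# Default configuration assumed on this page: 4 complex channels,
# 250 MS/s each, 2 * 16 bit per complex sample.
rate_bps = stream_rate_bps(channels=4, sample_rate_hz=250e6, bits_per_sample=32)

GIGABIT_ETHERNET_BPS = 1e9  # nominal line rate
print(f"rate per direction: {rate_bps / 1e9:.0f} Gbps")
print(f"factor above Gigabit Ethernet: {rate_bps / GIGABIT_ETHERNET_BPS:.0f}x")
```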
One issue that results from this bandwidth bottleneck should be fairly obvious: we cannot transmit or receive a continuous sample stream from GNU Radio. This problem isn't unique to the direct RF platform we're working with here, but also applies to many other devices like the Pluto SDR, for all sample rates exceeding the ~3-8 MS/s (?) supported by the USB 2.0 link.
So what are the guarantees provided by iio?
By default this will result in a situation like the following, where RX and TX buffers are sampled completely independently, and the distance between each buffer is also apparently random:
To summarize, the two issues which we need to address:
The first issue may be addressed fairly easily by increasing buffer sizes. On one hand this means that our entire transmit and receive waveforms need to fit into memory at once; on the other, we are guaranteed that these waveforms will be continuous!
The second problem is much trickier to solve, and requires modifications to the HDL. The basic idea is as follows: what if we don't stream continuously at the hardware level, but only in small bursts at predetermined times? This means that we're effectively using hardware to cut out small windows of the transmit and receive signals, and only allowing those to pass on to the DMA (or from the DMA to the signal chain). This results in greatly reduced data rates (in a controlled manner), and known timing relationships between RX and TX. The following picture illustrates what this system should do (which is very similar to the triggering mechanism in an oscilloscope):
On the HDL side we will be using a Time Division Duplexing (TDD) core, which was originally developed to control the AD9361 family of transceivers. See the reference_hdl for more information about what the TDD core can do.
Because the TDD engine was previously not available as a standalone IP core, I created a small wrapper which just references the existing tdd files from the util and/or common directory: ''axi_tdd''.
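To get a feeling for how much the windowing scheme described above reduces the data rate, here is a rough estimate. The burst and frame lengths are hypothetical values picked for illustration, not the settings used by the flowgraph, and the helper name is made up.

```python
def average_rate_bps(peak_rate_bps: float, burst_samples: int, frame_samples: int) -> float:
    """Average data rate when only burst_samples out of every frame_samples reach the DMA."""
    if not 0 < burst_samples <= frame_samples:
        raise ValueError("burst must be non-empty and fit into the frame")
    return peak_rate_bps * burst_samples / frame_samples


PEAK_RATE_BPS = 32e9       # continuous rate per direction from the estimate above
BURST_SAMPLES = 65_536     # hypothetical window length passed on by the data offload
FRAME_SAMPLES = 2_500_000  # hypothetical frame length: 10 ms at 250 MS/s

avg = average_rate_bps(PEAK_RATE_BPS, BURST_SAMPLES, FRAME_SAMPLES)
print(f"duty cycle: {BURST_SAMPLES / FRAME_SAMPLES:.2%}")
print(f"average rate: {avg / 1e6:.0f} Mbps")
```

With numbers like these the average rate drops below 1 Gbps, while every burst still starts at a time that is known relative to the TDD frame.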
Now we should take a look at the “data offload”, which is responsible for sampling the stream when triggered by the TDD engine (It's basically a glorified FIFO). The data offload is a rather complex block that offers a multitude of functions and configuration options, for more information see the readme: README.md
The interesting parts for us are the synchronization modes, which allow the data offload to remain in a waiting state until it is triggered, either by a write to a register or externally. The integration into the HDL is as follows:
Regarding the sync_ext signal: its value is only relevant in the initial waiting phase. Once the data offload has been triggered, the external sync signal may go LOW immediately. While the exact time that the TDD engine keeps that synchronization signal high isn't too important, keeping it high for too long may trigger a second buffer!
As you can tell, the tdd_tx_valid line is simply connected to the external synchronization input of the TX data offload. This means that we can precisely control the start of the sample replay using the TDD engine. Once it has been triggered, the data offload will fill up its internal buffer (whose size *must* be an integer multiple of the iio buffer size; if it's not, you can use the transfer length register of the data offload to make sure things line up), and then play it back to the upack core, which takes the packed sample stream and deinterleaves it into a parallel bus (2 * 16 bit per complex channel, see JESD's M parameter) of 128 bit @ 250 MHz.
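The size constraint mentioned above boils down to a simple rounding exercise. The sketch below only illustrates that arithmetic: the function name and the example sizes are hypothetical, and how the resulting value has to be encoded in the transfer length register is not covered here (see the data offload documentation for that).

```python
def aligned_transfer_length(storage_bytes: int, iio_buffer_bytes: int) -> int:
    """Largest transfer length that fits into the offload storage and is an
    integer multiple of the iio buffer size (both given in bytes)."""
    if iio_buffer_bytes <= 0:
        raise ValueError("iio buffer size must be positive")
    if storage_bytes < iio_buffer_bytes:
        raise ValueError("iio buffer does not fit into the offload storage")
    return (storage_bytes // iio_buffer_bytes) * iio_buffer_bytes


# Hypothetical sizes: 1 MiB of offload storage, 300000 byte iio buffers
storage = 1024 * 1024
iio_buffer = 300_000
length = aligned_transfer_length(storage, iio_buffer)
print(f"use {length} of {storage} bytes ({length // iio_buffer} iio buffers)")
```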
For a more detailed look at the datapath with the M=8, L=4 configuration, take a look at the following illustration. Note, however, that for our purposes the UTIL_DACFIFO and UTIL_ADCFIFO have both been replaced by data offloads.
Synchronization on the RX side is a little more involved. To explain why, we need to take a look at the data packing format and the cpack cores. Imagine the following situation (which will be quite common ;) ): you've got the default JESD configuration running (four complex channels), but only one of those channels is actually in use. It would obviously be a waste to transfer data which isn't used, which is why the [cu]pack cores take a parallel stream of samples (all four channels, no matter which are in use) and turn that into an interleaved stream of a lower rate. That is, if only a single channel out of four is active in the above situation, the output of the cpack core will be valid only once for every four samples, and that one valid 128-bit word will contain four consecutive complex samples of that single channel. Because the cpack core is located before the data offload in the sample stream, this means that we can store more samples when fewer channels are active.
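The packing behaviour described above can be mimicked with a small behavioural model. This is a simplified Python sketch and not a translation of the cpack HDL: the function name is made up, samples are represented by arbitrary Python objects, and one output word is assumed to hold four complex samples (128 bit).

```python
def cpack_model(cycles, enabled, slots_per_word=4):
    """Yield (valid, word) once per input clock cycle.

    cycles:  iterable of tuples holding one sample per channel and clock cycle
    enabled: one boolean per channel, True if the channel is in use
    """
    pending = []
    for samples in cycles:
        # Only samples of enabled channels are collected
        pending.extend(s for s, en in zip(samples, enabled) if en)
        if len(pending) >= slots_per_word:
            word = tuple(pending[:slots_per_word])
            del pending[:slots_per_word]
            yield True, word
        else:
            yield False, None


# Four channels a-d, eight clock cycles, only channel a enabled
cycles = [(f"a{n}", f"b{n}", f"c{n}", f"d{n}") for n in range(8)]
for n, (valid, word) in enumerate(cpack_model(cycles, enabled=(True, False, False, False))):
    print(n, valid, word)
```

With a single enabled channel the output is valid on every fourth cycle only, and each valid word carries four consecutive samples of that channel. With all four channels enabled, every cycle produces a valid word.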
Now, finally, the problem I've been trying to hint at: the TDD synchronization signal is not in any way related to the cpack's sample timer. That is, depending on where the cpack core is in its process of collecting four samples when the sync signal arrives, we may receive an apparently random shift of zero to three samples. To correct for this phenomenon, the cpack core is currently just reset with every sync signal. While this certainly isn't a great solution, it was the least invasive at the time …
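The effect can be illustrated with a tiny model, again under simplifying assumptions that are mine and not taken from the HDL: a single active channel, a free-running packer that groups samples 4k to 4k+3 into word k (valid in cycle 4k+3), and a data offload that stores the first word which becomes valid at or after the sync cycle. The function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def capture_shift(sync_cycle: int, samples_per_word: int = 4, reset_on_sync: bool = False) -> int:
    """Number of samples by which the first stored word starts before the sync cycle."""
    if reset_on_sync:
        # The packer starts collecting at the sync edge, so the word is aligned
        return 0
    # Free-running packer: the first word stored after sync starts at
    # sample (sync_cycle // samples_per_word) * samples_per_word
    return sync_cycle % samples_per_word


def range_error_m(shift_samples: int, sample_rate_hz: float) -> float:
    """One-way range error caused by a timing shift of shift_samples."""
    return C * shift_samples / (2.0 * sample_rate_hz)


for sync_cycle in range(8, 12):
    shift = capture_shift(sync_cycle)
    print(sync_cycle, shift, f"{range_error_m(shift, 250e6):.2f} m")
```

In this model the shift depends only on the phase of the packer at the moment of the sync, and at 250 MS/s a shift of three samples already corresponds to roughly 1.8 m of range. Resetting the packer on every sync removes that dependency.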