gr-ofdmradar - OFDM Radar on MxFE Platforms using IIO

This page is dedicated to the details of building an OFDM radar on a ZCU102 + AD9081 with GNURadio and IIO.

If you just want to get the software and hardware running, the following section covers the setup instructions:

Software / Hardware Quickstart

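To get started, you will need the following hardware:

  • A ZCU102 evaluation board
  • An AD9081 evaluation board (ad9081_fmca_ebz)
  • A receive and a transmit antenna, or other RF components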

Required software:

  • A development device running x86_64 Linux
  • Vivado 2020.2 (or whichever version the current hdl master branch requires), including the Vitis SDK and a license for MPSoC parts (included with the evaluation kit)
  • A recent software build toolchain, usually provided by your Linux distribution (build-essential on Debian stable or later, base-devel on Arch Linux, etc.)

Preparing the ZCU102 boot files

It is usually a good idea to start out by installing a recent image of Kuiper Linux onto the ZCU102's SD card, so you don't have to rebuild the rootfs.

Linux Kernel

Depending on the age of your release, you may need to build a more recent kernel:

# First clone the repository
git clone https://github.com/analogdevicesinc/linux.git
cd linux

# Next, make the Vitis arm64 gcc toolchain available and enable cross compilation
export PATH=$PATH:/opt/Xilinx/Vitis/2020.2/gnu/aarch64/lin/aarch64-linux/bin/
export ARCH=arm64
export CROSS_COMPILE=aarch64-linux-gnu-

# Then we can initialize the .config to something that enables most ADI drivers
make adi_zynqmp_defconfig

# Then build the image
make -j$(nproc) Image UIMAGE_LOADADDR=0x8000

# Finally, copy arch/arm64/boot/Image into the boot directory of the ZCU102 SD card
cp arch/arm64/boot/Image /mnt/boot/

If you're having trouble building the Linux image, there are more detailed articles describing the process (WHERE!).

Finally, you will have to build the correct device tree blob.

This step is always necessary if you installed the default Kuiper image, even if your kernel is up to date!

# Build the device tree blob
make xilinx/zynqmp-zcu102-rev10-ad9081-m8-l4-tdd.dtb

# Copy to ZCU102 boot directory
cp arch/arm64/boot/dts/xilinx/zynqmp-zcu102-rev10-ad9081-m8-l4-tdd.dtb /mnt/boot/system.dtb

The device tree blob must be renamed to system.dtb!

Building the HDL

# Source your Vivado 2020.2 (or later, depending on the adi/hdl release) settings
source /opt/Xilinx/Vivado/2020.2/settings64.sh

# Clone the HDL and check out the data_offload branch
git clone https://github.com/Yamakaja/hdl.git
cd hdl
git switch data_offload

# Navigate to the ad9081 / ZCU102 project
cd projects/ad9081_fmca_ebz/zcu102/

# Build the project with TDD support. Note that enabling TDD support is only possible if you also enable shared device clocks, which means that your IO rates will be symmetrical.
make TDD_SUPPORT=1 SHARED_DEVCLK=1

The HDL build should take around 15 to 30 minutes and leave you with projects/ad9081_fmca_ebz/zcu102/ad9081_fmca_ebz_zcu102.sdk/system_top.xsa when it's done.

This guide describes how to use the system_top.xsa to build the BOOT.BIN, which also needs to be copied into the SD card's boot partition: build-the-zynqmp-boot-image

Once you have an updated Linux Image, BOOT.BIN and system.dtb installed and the AD9081 eval board mounted on the ZCU102, you can start hooking up receive and transmit antennas or other RF components.
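Before powering up the board, it can help to double-check that all three boot files actually ended up on the SD card. The following is only a small sketch: the file names are the ones used in the steps above, while the helper name missing_boot_files and the /mnt/boot mount point in the usage note are assumptions that need to match your setup.

```python
from pathlib import Path

# File names as used in the steps above.
REQUIRED_BOOT_FILES = ("Image", "BOOT.BIN", "system.dtb")


def missing_boot_files(boot_dir):
    """Return the required boot files that are not present in boot_dir."""
    boot = Path(boot_dir)
    return [name for name in REQUIRED_BOOT_FILES if not (boot / name).is_file()]
```

Calling missing_boot_files("/mnt/boot") should return an empty list once everything has been copied; any names it returns still need to be built or copied.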

Building GNU Radio

To use the gr-iio AD9081 and TDD blocks, you will have to build this GNU Radio fork/branch, which is fairly close to master: Yamakaja/GNURadio

# Check out the code
git clone https://github.com/Yamakaja/gnuradio.git
cd gnuradio
git switch feature/gr-iio-tdd

# Note, you should adjust the cmake build command according to your local environment! This one was created to work with my environment
mkdir -p build
cmake -DCMAKE_INSTALL_PREFIX=/usr/local \
    -DPYTHON_EXECUTABLE=$(which python3) \
    -DPYTHON_INCLUDE_DIR=/usr/include/python3.9 \
    -DPYTHON_LIBRARY=/usr/lib/libpython3.9.so \
    -DGR_PYTHON_DIR=/usr/lib/python3.9/site-packages \
    -DENABLE_GRC=ON \
    -DENABLE_GR_QTGUI=ON \
    -DQWT_LIBRARIES=/usr/lib/libqwt.so \
    -DCMAKE_BUILD_TYPE=Debug \
    -B build \
    -S .

make -C build -j$(nproc)

# Install GNU Radio into system directories
sudo make -C build install

# Make sure the installation was successful by opening gnuradio-companion
LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/gnuradio-companion

Building gr-ofdmradar

On account of being a GNU Radio module, the process to build gr-ofdmradar is quite similar:

# Checkout code
git clone https://github.com/Yamakaja/gr-ofdmradar.git
cd gr-ofdmradar

# Make sure this build command matches that of your GNU Radio installation!
mkdir -p build
cmake -DCMAKE_INSTALL_PREFIX=/usr/local \
    -DPYTHON_EXECUTABLE=$(which python3) \
    -DPYTHON_INCLUDE_DIR=/usr/include/python3.9 \
    -DPYTHON_LIBRARY=/usr/lib/libpython3.9.so \
    -DGR_PYTHON_DIR=/usr/lib/python3.9/site-packages \
    -DCMAKE_BUILD_TYPE=Debug \
    -B build \
    -S .

make -C build
sudo make -C build install

When you now start GNU Radio Companion once again, the gr-ofdmradar blocks should show up in the block list:

LD_LIBRARY_PATH=/usr/local/lib /usr/local/bin/gnuradio-companion

Testing the OFDM Radar

Simulation

To validate that the OFDM radar module has been installed properly, you can launch the ofdmradar_test example in the examples directory of gr-ofdmradar:

Running the flowgraph should leave you with a radar screen simulating four targets:

On ZCU102 / AD9081

To test the OFDM radar with real hardware and signals, open the ofdmradar_ad9081.grc flowgraph in the gr-ofdmradar example directory.

The iio_target variable must point to the IP address of your ZCU102 target!

The default configuration of the example flowgraph may be in violation of your local regulations! Make sure not to transmit on bands which are not allocated to you, and keep power limits in mind!

The following video shows a test where we covered a distance of 30-40m:

Useful resources

For more details on the theoretical underpinnings of OFDM radar, please check out Martin Braun's dissertation: https://publikationen.bibliothek.kit.edu/1000038892

For more information about the gr-ofdmradar system parameters, check out gr-ofdmradar/README.md.


System deep dive

The system deep dive is meant to cover the details of the entire radar system from top to bottom. Unless you're trying to recreate a similar system from scratch or trying to debug an issue, this section may not be too interesting.

Subsystems which will be covered:

  • The Transceiver / RF ADC/DAC (AD9081)
  • Hardware, HDL
  • Linux drivers
  • gr-ofdmradar and its blocks

ZCU102 / AD9081

Some notes on the devices that were used:

The AD9081 is a 4-channel RF DAC/ADC in a single package, with multi-chip synchronization and various other interesting capabilities. Data is transferred between the converter and the FPGA over up to 8+8 (RX+TX) SERDES lanes running JESD204B/C. Unfortunately, the transceivers of the ZU9EG on the ZCU102 only go up to ~15 Gbps, so only JESD204B operation is available with this hardware configuration.

Problem statement

Before getting started with the implementation details, we need to establish why hardware changes are necessary: To be able to implement a reliable and accurate radar system, our interfaces must provide certain guarantees. Take a look at this picture illustrating an ordinary pulse radar:

In this monostatic setup, the transmitter produces a short pulse and then listens for the return signal. To determine the distance to the target, the time of flight is calculated by taking the time at which the signal returned and subtracting the transmit time from it. More information, such as the Doppler shift or an RCS estimate, can be extracted later in the signal chain, but it isn't relevant right now. In this situation only the time of flight is interesting to us, not the absolute time at which the signal was received.

To determine the time of flight, we need to rely on a known, fixed timing relationship between the moment the signal was sent out and our input samples at the receiver. There exist a number of solutions to this problem; for example, a strongly attenuated version of the TX signal could be looped back to the receiver. In systems where the full data stream is reliably available, this may be an adequate solution, but the iio link does not lend itself to such an approach, mostly because the data rates are much too high.
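To make the timing requirement more concrete, here is a minimal sketch of the time-of-flight arithmetic. The function names are made up for illustration, and the 250 MS/s figure in the usage note is the sample rate of the default configuration used on this page.

```python
C = 299_792_458.0  # speed of light in m/s


def time_of_flight(t_transmit, t_receive):
    """Round-trip delay: receive time minus transmit time, in seconds."""
    return t_receive - t_transmit


def target_range(tof):
    """Monostatic range in metres; the signal travels to the target and back."""
    return C * tof / 2.0


def range_per_sample(sample_rate):
    """Range error caused by a timing uncertainty of one sample."""
    return C / (2.0 * sample_rate)
```

At 250 MS/s, range_per_sample(250e6) comes out at roughly 0.6 m, so an unknown offset of even a few samples between TX and RX translates directly into a range error on the order of metres.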

For the remainder of this page, we will assume that the default ZCU102 / AD9081 JESD configuration has been selected:

  • 4 RX + 4 TX channels active @ 250 MS/s, 32-bit complex samples (16 + 16 bit)
  • Rate per direction = 4 * 250e6 S/s * 32 bit/S = 32 Gbps

While the memory links and the FPGA can deal with these rates, the processing system and/or the Gigabit ethernet link clearly cannot.
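The rate estimate above can be reproduced in a few lines. This is just the arithmetic from the list; the helper name is made up, and the Gigabit Ethernet figure is the nominal line rate without any protocol overhead.

```python
def stream_rate_bps(channels, sample_rate, bits_per_sample):
    """Raw data rate of one direction (RX or TX) in bits per second."""
    return channels * sample_rate * bits_per_sample


GIGABIT_ETHERNET_BPS = 1e9  # nominal line rate, ignoring protocol overhead

rate = stream_rate_bps(channels=4, sample_rate=250e6, bits_per_sample=32)
print(f"{rate / 1e9:.0f} Gbps per direction")
print(f"{rate / GIGABIT_ETHERNET_BPS:.0f}x the nominal Gigabit Ethernet rate")
```

Even with only a single channel enabled, the stream would still be 8 Gbps, which is why continuous streaming over the network link is out of the question.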

One issue that results from this bandwidth bottleneck should be fairly obvious: we cannot transmit or receive a continuous sample stream from GNU Radio. This problem isn't unique to the direct RF platform we're working with here; it also applies to many other devices like the Pluto SDR, for all sample rates exceeding the ~3-8 MS/s (?) supported by the USB 2.0 link.

The following section assumes you have a basic understanding of iio buffers. If you'd like a refresher, take a look at the IIO internals wiki page.

So what are the guarantees provided by iio?

  • All samples which are part of a single buffer will be played as a continuous stream.
  • Buffers will not be reordered.

By default this will result in a situation like the following, where RX and TX buffers are sampled completely independently, and the gap between consecutive buffers is also apparently random:

To summarize, there are two issues which we need to address:

  • The data rates supported by the link make continuous transmission impossible, so we need to work with individual buffers.
  • When not transmitting continuously, the timing relationship between RX and TX is completely unknown.

The first issue may be addressed fairly easily by increasing buffer sizes. On the one hand, this means that our entire transmit and receive waveforms need to fit into memory at once; on the other, we are guaranteed that each waveform will be continuous!

The second problem is much trickier to solve and requires modifications to the HDL. The basic idea is as follows: what if we don't stream continuously at the hardware level, but only in small bursts at predetermined times? This means that we're effectively using hardware to cut small windows out of the transmit and receive signals, and only allowing those to pass on to the DMA (or from the DMA to the signal chain). This results in greatly reduced data rates (in a controlled manner) and a known relationship between RX and TX. The following picture illustrates what this system should do (which is very similar to the triggering mechanism in an oscilloscope):
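The effect of this windowing can be sketched with a toy model. All names and numbers below (frame length, window offsets, burst length) are made-up illustration values, not actual TDD register settings: each frame starts one TX burst and one RX burst at fixed offsets, so the RX-to-TX delay is identical in every frame and the average data rate shrinks by the duty cycle.

```python
def burst_starts(frame_len, tx_offset, rx_offset, n_frames):
    """Absolute sample indices at which the TX and RX bursts of each frame begin."""
    return [(k * frame_len + tx_offset, k * frame_len + rx_offset)
            for k in range(n_frames)]


def gated_rate_bps(full_rate_bps, burst_len, frame_len):
    """Average data rate when only burst_len out of frame_len samples pass."""
    return full_rate_bps * burst_len / frame_len


# Hypothetical example: 4 ms frames at 250 MS/s, RX window opens 20 samples after TX
starts = burst_starts(frame_len=1_000_000, tx_offset=100, rx_offset=120, n_frames=3)
delays = {rx - tx for tx, rx in starts}
print("RX-TX delay per frame (samples):", delays)
print("Average rate (Mbps):", gated_rate_bps(32e9, 16_384, 1_000_000) / 1e6)
```

The set of delays contains a single value, which is exactly the fixed timing relationship the radar processing relies on.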

The HDL

At the time of writing, not all HDL changes have made it into the upstream repository yet. You may therefore either use my development branch, or make sure that the most recent commits from that branch have made it into master (though at that point this paragraph should be updated): https://github.com/Yamakaja/hdl/tree/data_offload

On the HDL side we will be using a Time Division Duplexing (TDD) core, which was originally developed to control the AD9361 family of transceivers. See the reference_hdl for more information about what the TDD core can do.

Because the TDD engine was previously not available as a standalone IP core, I created a small wrapper which just references the existing TDD files from the util and/or common directory: axi_tdd.

While the terminology of the TDD engine registers and signals is derived from their use with the AD9361, the different channels function completely independently, and may as well just have been numbered in this case.

The data offload engine

Now we should take a look at the “data offload”, which is responsible for sampling the stream when triggered by the TDD engine (it's basically a glorified FIFO). The data offload is a rather complex block that offers a multitude of functions and configuration options; for more information see the readme: README.md

The interesting part for us is the set of synchronization modes, which allow the data offload to remain in a waiting state until it is triggered, either by a write to a register or externally. The integration into the HDL is as follows:

While the data offload engine is triggered by the positive level of the sync_ext signal, the value of that signal is only relevant in the initial waiting phase. Once the data offload has been triggered, the external sync signal may go LOW immediately. While the exact time that the TDD engine keeps that synchronization signal high isn't too important, keeping it high for too long may trigger a second buffer!
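The level-triggered behaviour described above can be illustrated with a small behavioural model. This is a simplified sketch, not the actual HDL: the function name and the cycle-by-cycle timing are assumptions chosen purely to show why a sync pulse that is held high for too long starts a second transfer.

```python
def count_bursts(sync_ext, transfer_len):
    """Count the transfers started by a level-triggered offload.

    sync_ext is a sequence of booleans, one per clock cycle. The sync level
    is only evaluated while the offload is waiting; during an active
    transfer it is ignored.
    """
    bursts = 0
    remaining = 0  # cycles left in the current transfer, 0 means waiting
    for level in sync_ext:
        if remaining == 0:
            if level:
                bursts += 1
                remaining = transfer_len
        else:
            remaining -= 1
    return bursts


short_pulse = [True] * 4 + [False] * 16   # released before the transfer ends
long_pulse = [True] * 12 + [False] * 8    # still high when the transfer ends
print(count_bursts(short_pulse, transfer_len=8))
print(count_bursts(long_pulse, transfer_len=8))
```

With the short pulse the model starts one transfer; with the long pulse the sync level is still high when the offload returns to its waiting state, so a second transfer is started.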

As you can tell, the tdd_tx_valid line is simply connected to the external synchronization input of the TX data offload. This means that we can precisely control the start of the sample replay using the TDD engine. Once it has been triggered, the data offload fills up its internal buffer (whose size *must* be an integer multiple of the iio buffer size; if it is not, you can use the transfer length register of the data offload to make sure things line up), and then plays it back to the upack core. The upack core takes the packed sample stream and deinterleaves it into a parallel bus (2 * 16 bit per complex channel, see JESD's M parameter) that is 128 bits wide at 250 MHz.

For a more detailed look at the datapath with the M=8, L=4 configuration, take a look at the following illustration. Note, however, that for our purposes the UTIL_DACFIFO and UTIL_ADCFIFO have both been replaced by data offloads.

Synchronization on the RX side is a little more involved. To explain why, we need to take a look at the data packing format and the cpack cores. Imagine the following situation (which will be quite common ;) ): you've got the default JESD configuration running (four complex channels), but only one of those channels is actually in use. It would obviously be a waste to transfer data which isn't used, which is why the cpack core takes a parallel stream of samples (all four channels, no matter which are in use) and turns it into an interleaved stream of a lower rate; the upack core performs the reverse operation on the TX side. That is, if only a single channel out of four is active in the above situation, the output of the cpack core will be valid only once for every four samples, and that one valid 128-bit word will contain four consecutive complex samples of that single channel. Because the cpack core is located before the data offload in the sample stream, this means that we can store more samples when fewer channels are active.
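The packing behaviour can be illustrated with a small software model. It is only a sketch of the idea described above, not the cpack HDL: the function name is made up, and samples are represented as (channel, index) tuples instead of 32-bit words.

```python
def cpack(frames, enabled, word_len=4):
    """Pack the enabled channels of a parallel stream into fixed-size words.

    frames:  one tuple per clock cycle, holding one sample per channel
    enabled: indices of the channels that are in use
    Returns (cycle, word) pairs, one for every cycle the output is valid.
    """
    words = []
    pending = []
    for cycle, frame in enumerate(frames):
        pending.extend(frame[ch] for ch in enabled)
        while len(pending) >= word_len:
            words.append((cycle, tuple(pending[:word_len])))
            pending = pending[word_len:]
    return words


# Eight clock cycles of a four-channel stream, samples labelled (channel, index)
frames = [tuple((ch, n) for ch in range(4)) for n in range(8)]

for cycle, word in cpack(frames, enabled=[0]):
    print(cycle, word)
```

With a single enabled channel, the model produces a word only on every fourth cycle, and each word holds four consecutive samples of that channel. With all four channels enabled it produces one word per cycle.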

Now, finally, the problem I've been trying to hint at: the TDD synchronization signal is not in any way related to the cpack's sample timer. That is, depending on where the cpack core is in its process of collecting four samples when the sync signal arrives, we may receive an apparently random shift of zero to three samples. To correct for this, the cpack core is currently just reset with every sync signal. While this certainly isn't a great solution, it was the least invasive at the time …
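The sample shift, and why resetting the packer removes it, can be sketched as follows. This is a toy model under stated assumptions rather than a description of the real cores: it assumes one active channel, four samples per word, and that the offload stores the first word which begins at or after the sync. The function and parameter names are made up.

```python
def first_captured_sample(sync_cycle, packer_start, samples_per_word=4,
                          reset_on_sync=False):
    """Index of the first input sample in the first word stored after sync."""
    if reset_on_sync:
        # The packer restarts its word exactly at the sync signal
        return sync_cycle
    # Free-running packer: words begin at packer_start, packer_start + 4, ...
    phase = (sync_cycle - packer_start) % samples_per_word
    return sync_cycle + (-phase) % samples_per_word


sync = 100
free_running = {first_captured_sample(sync, p) - sync for p in range(4)}
with_reset = {first_captured_sample(sync, p, reset_on_sync=True) - sync
              for p in range(4)}
print("shift without reset:", sorted(free_running))
print("shift with reset:   ", sorted(with_reset))
```

In this model the free-running packer yields shifts between zero and three samples depending on its phase, while resetting it on every sync always yields a shift of zero.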

resources/eval/user-guides/ad9081_fmca_ebz/radar.1632213236.txt.gz · Last modified: 21 Sep 2021 10:33 by David Winter