mach.kernel


Python bindings and wrapper for the CUDA kernel.

Functions

beamform(channel_data, rx_coords_m, ...[, ...])

CUDA ultrasound beamforming with automatic GPU/CPU dispatch.

mach.kernel.beamform(
channel_data: Num[Array, 'n_rx n_samples n_frames'],
rx_coords_m: Real[Array, 'n_rx xyz=3'],
scan_coords_m: Real[Array, 'n_scan xyz=3'],
tx_wave_arrivals_s: Real[Array, 'n_scan'],
out: Num[Array, 'n_scan n_frames'] | None = None,
*,
rx_start_s: float,
sampling_freq_hz: float,
f_number: float,
sound_speed_m_s: float,
modulation_freq_hz: float | None = None,
tukey_alpha: float = 0.5,
) -> Array

CUDA ultrasound beamforming with automatic GPU/CPU dispatch.

This function implements delay-and-sum beamforming with the following features:

  • Dynamic aperture growth based on F-number

  • Tukey apodization with adjustable taper width

  • Support for both RF and IQ data

  • Multi-frame processing

  • Automatic array protocol detection and GPU/CPU dispatch

  • Mixed CPU/GPU array handling with automatic memory management

For theoretical background on delay-and-sum beamforming, see: Perrot et al., “So you think you can DAS? A viewpoint on delay-and-sum beamforming” https://www.biomecardio.com/publis/ultrasonics21.pdf
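As a rough illustration of the delay-and-sum idea, the following NumPy sketch sums linearly interpolated RF samples at the round-trip time of flight for each scan point. It is not the CUDA kernel: the name das_reference is hypothetical, and the sketch assumes RF data, linear interpolation, and no apodization or F-number masking. Only the argument names are taken from the signature above.

```python
import numpy as np

def das_reference(channel_data, rx_coords_m, scan_coords_m, tx_wave_arrivals_s,
                  *, rx_start_s, sampling_freq_hz, sound_speed_m_s):
    """Naive delay-and-sum for RF data (illustrative only, not mach's kernel)."""
    n_rx, n_samples, _ = channel_data.shape
    # Receive path length from every scan point to every element: (n_scan, n_rx)
    rx_dist_m = np.linalg.norm(
        scan_coords_m[:, None, :] - rx_coords_m[None, :, :], axis=-1
    )
    # Round-trip time of flight = transmit arrival + receive propagation
    tof_s = tx_wave_arrivals_s[:, None] + rx_dist_m / sound_speed_m_s
    # Fractional sample index relative to the first recorded sample
    idx = (tof_s - rx_start_s) * sampling_freq_hz
    i0 = np.floor(idx).astype(int)
    frac = idx - i0
    # Delays that fall outside the recording contribute zero
    valid = (i0 >= 0) & (i0 < n_samples - 1)
    i0 = np.clip(i0, 0, n_samples - 2)
    rx = np.arange(n_rx)[None, :]
    # Linear interpolation between neighbouring samples: (n_scan, n_rx, n_frames)
    s0 = channel_data[rx, i0]
    s1 = channel_data[rx, i0 + 1]
    interp = (1.0 - frac)[..., None] * s0 + frac[..., None] * s1
    interp = np.where(valid[..., None], interp, 0.0)
    # Sum over receive elements: (n_scan, n_frames)
    return interp.sum(axis=1)

# Tiny check: one element at the origin, one scan point 7.7 mm deep.
# The "signal" is a ramp whose value equals its sample index.
ramp = np.arange(1024, dtype=np.float32)[None, :, None]
out = das_reference(
    ramp,
    rx_coords_m=np.array([[0.0, 0.0, 0.0]]),
    scan_coords_m=np.array([[0.0, 0.0, 7.7e-3]]),
    tx_wave_arrivals_s=np.array([7.7e-3 / 1540.0]),
    rx_start_s=0.0,
    sampling_freq_hz=40e6,
    sound_speed_m_s=1540.0,
)
```

In the check, the round trip is 15.4 mm at 1540 m/s, i.e. 10 µs, which at 40 MHz corresponds to sample 400, so out[0, 0] is approximately 400.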

beamform is a wrapper around nb_beamform, a nanobind-generated Python/C++/CUDA function. beamform gives more helpful error messages than nb_beamform, but it adds about 0.1 ms of overhead (measured on an AMD Ryzen Threadripper Pro). If your inputs are already correctly shaped and typed, or you are comfortable reading nanobind's slightly confusing type-check error messages, you can call nb_beamform directly.

Parameters:
  • channel_data – RF/IQ data with shape (n_rx, n_samples, n_frames). For I/Q data: use complex64 dtype. For RF data: use float32 dtype. Note: this layout order improves memory-access patterns for the CUDA kernel.

  • rx_coords_m – Receive element positions with shape (n_rx, 3) where each row is [x, y, z] in meters. Each element represents the physical location of a transducer element on the probe.

  • scan_coords_m – Scan grid point coordinates with shape (n_scan, 3) where each row is [x, y, z] in meters. These are the spatial locations where beamformed values will be computed. Note: n_scan is the number of points in the imaging grid where you want beamformed output.

  • tx_wave_arrivals_s

    Transmit wave arrival times with shape (n_scan,) in seconds. This represents the time when the transmitted acoustic wave arrives at each scan grid point. For different transmit types:

      - Plane wave: arrivals computed from wave direction and grid positions
      - Focused/diverging wave: arrivals computed from focal point and grid positions

    To compute these values, divide the result of mach.wavefront.plane() or mach.wavefront.spherical() by sound_speed_m_s.

  • out – Optional output array with shape (n_scan, n_frames). Its dtype must match the input: complex64 for I/Q, float32 for RF.

  • rx_start_s – Receive start time offset in seconds. This corresponds to t0 in the literature (biomecardio.com/publis/ultrasonics21.pdf): the time when the 0th sample was recorded relative to the transmit event. When rx_start_s=0, the wave is assumed to pass through the coordinate origin at t=0.

  • sampling_freq_hz – Sampling frequency in Hz.

  • f_number – F-number for aperture calculations. Controls the size of the receive aperture based on depth. Typical values range from 1.0 to 3.0.

  • sound_speed_m_s – Speed of sound in m/s. Typical value for soft tissue is ~1540 m/s.

  • modulation_freq_hz – Center frequency in Hz (only used for I/Q data; ignored for RF data). For I/Q data: required parameter, set to 0 if no demodulation was used. For RF data: automatically defaults to 0.0 if not provided.

  • tukey_alpha – Tukey window alpha parameter for apodization. Range [0, 1]:

      - 0.0: no apodization (rectangular window)
      - 0.5: moderate apodization (default)
      - 1.0: maximum apodization (Hann window)
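To make the interaction between f_number and tukey_alpha concrete, here is a minimal sketch of depth-dependent receive apodization. It rests on assumptions that this page does not confirm: the aperture width at depth z is taken as z / f_number, elements lie in the z = 0 plane, and the window is evaluated on the element's lateral offset normalized by the half-aperture. The helper names tukey_weight and aperture_weights are hypothetical and are not part of mach.

```python
import numpy as np

def tukey_weight(u, alpha):
    """Tukey window evaluated at normalized aperture position u in [-1, 1]."""
    u = np.abs(np.atleast_1d(np.asarray(u, dtype=float)))
    w = np.ones_like(u)
    if alpha > 0:
        # Cosine taper over the outer `alpha` fraction of the half-aperture
        taper = u > (1.0 - alpha)
        w[taper] = 0.5 * (1.0 + np.cos(np.pi * (u[taper] - (1.0 - alpha)) / alpha))
    # Elements outside the aperture are switched off
    w[u > 1.0] = 0.0
    return w

def aperture_weights(rx_coords_m, scan_point_m, f_number, alpha):
    """Per-element weights for one scan point (assumes elements at z = 0)."""
    depth_m = scan_point_m[2]
    # Assumed definition: full aperture width = depth / f_number
    half_width_m = depth_m / (2.0 * f_number)
    lateral_m = np.linalg.norm(rx_coords_m[:, :2] - scan_point_m[:2], axis=-1)
    return tukey_weight(lateral_m / half_width_m, alpha)

# Five elements at x = -4, -2, 0, 2, 4 mm; scan point 12 mm deep on the axis.
rx = np.zeros((5, 3))
rx[:, 0] = np.array([-4.0, -2.0, 0.0, 2.0, 4.0]) * 1e-3
weights = aperture_weights(rx, np.array([0.0, 0.0, 12e-3]), f_number=1.5, alpha=0.5)
```

With f_number=1.5 the assumed aperture at 12 mm depth is 8 mm wide, so the outermost elements sit at the window edge and receive a weight of about 0, while the inner three receive a weight of about 1. Setting alpha to 0.0 yields a rectangular window and 1.0 yields a Hann window, matching the list above.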

Returns:

Beamformed data with shape (n_scan, n_frames). Will be out if provided, otherwise a new array will be created. Output dtype matches input dtype (complex64 or float32).

Notes

  • All spatial coordinates should be in meters.

  • All time values should be in seconds.

  • All frequencies should be in Hz.

  • For optimal performance, use contiguous arrays with appropriate dtypes.

  • If the input arrays are not contiguous, the function automatically handles memory layout conversion.

  • Arrays can be on different devices (CPU/GPU); automatic copying will be performed with performance warnings.
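One way to follow the contiguity and dtype advice before calling beamform is sketched below. The helper prepare_channel_data is hypothetical, and it assumes that "contiguous" means C-contiguous (row-major), which this page does not state explicitly. It uses only NumPy calls.

```python
import numpy as np

def prepare_channel_data(channel_data):
    """Return C-contiguous float32 (RF) or complex64 (I/Q) channel data."""
    # Complex input is treated as I/Q, everything else as RF
    target = np.complex64 if np.iscomplexobj(channel_data) else np.float32
    # No copy is made if the array already has this dtype and layout
    return np.ascontiguousarray(channel_data, dtype=target)

# A transposed float64 array is neither C-contiguous nor float32.
raw_rf = np.ones((4, 3, 2), dtype=np.float64).transpose(2, 1, 0)
rf = prepare_channel_data(raw_rf)

# Complex128 input is narrowed to complex64.
iq = prepare_channel_data(np.ones((2, 3, 4), dtype=np.complex128))
```

Converting once up front avoids repeating the layout conversion on every call when the same data is beamformed several times.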

Examples

Basic plane wave beamforming:

>>> import numpy as np
>>> from mach import beamform, wavefront
>>>
>>> # Set up geometry
>>> rx_positions = np.array([[0, 0, 0], [1e-3, 0, 0]])  # 2 elements, 1mm spacing
>>> scan_points = np.array([[0, 0, 10e-3], [0, 0, 20e-3]])  # 2 depths: 10mm, 20mm
>>>
>>> # Compute transmit arrivals for 0° plane wave
>>> arrivals_dist = wavefront.plane(
...     origin_m=np.array([0, 0, 0]),
...     points_m=scan_points,
...     direction=np.array([0, 0, 1])  # +z direction
... )
>>> tx_arrivals = arrivals_dist / 1540  # Convert to time (assuming 1540 m/s)
>>>
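>>> # Beamform (placeholder RF data; replace with your own acquisition)
>>> channel_data = np.zeros((2, 2048, 1), dtype=np.float32)  # (n_rx, n_samples, n_frames)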
>>> result = beamform(
...     channel_data=channel_data,  # shape: (2, n_samples, n_frames)
...     rx_coords_m=rx_positions,
...     scan_coords_m=scan_points,
...     tx_wave_arrivals_s=tx_arrivals,
...     rx_start_s=0.0,
...     sampling_freq_hz=40e6,
...     f_number=1.5,
...     sound_speed_m_s=1540
... )
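The transmit arrivals used in this example can be checked by hand. The sketch below assumes that a plane-wave arrival distance is the projection of each scan point, relative to the origin, onto the unit propagation direction. That is the standard plane-wave geometry, but it is an assumption about what wavefront.plane returns; plane_arrival_distance_m is a hypothetical stand-in, not the library function.

```python
import numpy as np

def plane_arrival_distance_m(origin_m, points_m, direction):
    """Distance a plane wave travels from origin_m to reach each point."""
    direction = np.asarray(direction, dtype=float)
    # Normalize so that the projection is a true distance in meters
    direction = direction / np.linalg.norm(direction)
    return (np.asarray(points_m, dtype=float) - origin_m) @ direction

scan_points = np.array([[0, 0, 10e-3], [0, 0, 20e-3]])  # 10 mm and 20 mm deep
dist_m = plane_arrival_distance_m(
    origin_m=np.array([0.0, 0.0, 0.0]),
    points_m=scan_points,
    direction=np.array([0, 0, 1]),  # +z direction, i.e. a 0° plane wave
)
tx_arrivals_s = dist_m / 1540.0  # divide by the sound speed to get seconds
```

For a 0° plane wave the distances are simply the depths, 10 mm and 20 mm, giving arrivals of roughly 6.5 µs and 13 µs at 1540 m/s.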