For more than 60 years, scientists have searched the cosmos for signs of radio transmissions that would indicate the existence of extraterrestrial intelligence (ETI). During this time, technology and methods have evolved considerably, but the greatest challenges remain. Not only has no radio signal of extraterrestrial origin ever been detected, but such a broadcast could also take a wide range of possible forms.
In short, SETI researchers have to guess what a signal would look like without the benefit of any known examples. Recently, an international team led by the University of California, Berkeley and the SETI Institute developed a new machine learning tool that simulates what a message from an extraterrestrial intelligence might look like. Known as Setigen, this open-source library could be a game-changer for future SETI research.
The research team was led by Bryan Brzycki, a graduate student in astronomy at UC Berkeley. He was joined by Andrew Siemion, director of the Berkeley SETI Research Center, and researchers from the SETI Institute, Breakthrough Listen, Dunlap Institute for Astronomy & Astrophysics, Institute of Space Sciences and Astronomy, International Center for Radio Astronomy Research (ICRAR) and the Goergen Institute for Data Science.
Since the 1960s, the most common method of SETI has been to search the cosmos for radio signals of artificial origin. The first such experiment was Project Ozma (April to July 1960), led by famed Cornell astrophysicist Frank Drake (creator of the Drake equation). This study used the 25-meter dish at the National Radio Astronomy Observatory in Green Bank, West Virginia to monitor Epsilon Eridani and Tau Ceti across a band roughly 400 kHz wide around 1.42 GHz.
These searches have since expanded to cover larger areas of the night sky, wider frequency ranges, and greater signal diversity. As Brzycki explained to Universe Today via email:
“In the 1960s, the idea was to focus on a region around a well-known frequency where neutral hydrogen emits radiation into interstellar space, namely 1.42 GHz. As this natural emission is widespread throughout the galaxy, the idea is that any intelligent civilization would know of it and might transmit at this frequency to maximize the chance of detection. Since then, thanks in particular to rapid advances in technology, radio SETI has expanded along every axis of measurement.
“We can now take measurements over a bandwidth of several GHz instantaneously. With improved storage, we can collect huge amounts of data, allowing higher-resolution observations in both time and frequency. Likewise, we have conducted surveys of nearby stars and other directions in the galaxy to maximize exposure to potentially interesting directions in the sky.”
Another major change was the incorporation of machine learning-based algorithms designed to find transmissions amid the radio noise floor of the cosmos and correct for radio frequency interference (RFI). The algorithms employed in SETI studies fall into two categories: those that operate on voltage time series data and those that operate on time-frequency spectrogram data.
Plots of radio spectrograms created from Setigen frames. Credit: Brzycki et al.
“The raw data collected by a radio antenna are voltage measurements; a radio wave induces a current in the antenna, which is read and recorded as a voltage,” Brzycki said. “A radio telescope is basically just an antenna augmented with a dish to focus light from a larger area, which increases resolution and brightness. It turns out that the intensity is proportional to the square of the voltage. Additionally, we are interested in intensity as a function of frequency and time (the when and where of a potential signal).”
To achieve this, Brzycki says, astronomers start by using algorithms that calculate the power at each observed frequency from the input time series data. In other words, the algorithm transforms the radio signal data from a function of space and/or time into a function of spatial or temporal frequency, i.e., a Fourier transform (FT). By squaring the magnitude of the result, astronomers can measure the intensity at each frequency over the data collection period.
“To get a complete spectrogram, a chart of intensity versus time and frequency, we take a section of the voltage time series, compute its Fourier transform, and then repeat this process across the observation, so we effectively stack a series of Fourier-transformed arrays on top of each other in the time direction,” Brzycki adds. “Once the time resolution is chosen, we determine the number of time samples needed and calculate the Fourier transform to see how much power is in each frequency bin.”
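The procedure Brzycki describes can be illustrated with a minimal Python sketch using NumPy. This is not code from Setigen or Breakthrough Listen; the function name `spectrogram`, the sample rate, the chunk length and the test tone are all assumptions chosen for illustration.

```python
import numpy as np

def spectrogram(voltages, n_fft):
    """Stack squared-magnitude Fourier transforms of consecutive voltage chunks.

    Rows are time steps and columns are frequency bins (hypothetical helper).
    """
    n_chunks = len(voltages) // n_fft
    # Cut the time series into equal sections, dropping any leftover samples
    chunks = np.reshape(voltages[: n_chunks * n_fft], (n_chunks, n_fft))
    # Fourier transform each section (real-valued input, so rfft suffices)
    spectra = np.fft.rfft(chunks, axis=1)
    # Intensity is proportional to the squared magnitude
    return np.abs(spectra) ** 2

# Illustrative values only: a 200 Hz tone buried in Gaussian noise
sample_rate = 1024.0  # samples per second (assumed)
n_fft = 256           # samples per section, which sets the time resolution
t = np.arange(8 * n_fft) / sample_rate
rng = np.random.default_rng(0)
voltages = np.sin(2 * np.pi * 200.0 * t) + 0.5 * rng.standard_normal(t.size)

power = spectrogram(voltages, n_fft)
freqs = np.fft.rfftfreq(n_fft, d=1 / sample_rate)
peak_hz = freqs[np.argmax(power.mean(axis=0))]
print(power.shape, peak_hz)
```

Choosing a longer section gives finer frequency bins but fewer time steps, which is the trade-off behind picking the time resolution first.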
The main search algorithm used by SETI researchers is known as the “incoherent tree deDoppler” algorithm, which shifts radio spectra to correct for frequency drift and maximize a signal’s signal-to-noise ratio. The most comprehensive SETI search program to date, Breakthrough Listen, uses an open-source implementation of this algorithm, known as TurboSETI, which has served as the basis for many searches for “technosignatures” (i.e., signs of technological activity). As Brzycki explains, this method has a few drawbacks: “The algorithm assumes that a potential SETI signal is continuous with a high duty cycle (meaning it is almost always on).
“Because TurboSETI is targeted at straight-line signals that are always ‘on’, it may struggle to detect other morphologies, such as broadband and pulsed signals. Other algorithms are being developed to attempt to detect these other types of signals, but as always, the effectiveness of our algorithms depends on the assumptions we make about the signals they are targeting.”
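To make the drift-correction idea concrete, here is a rough Python sketch of a brute-force "shift and sum" search over a spectrogram. It is not TurboSETI's code and does not use the tree structure that makes the real algorithm efficient; the function name `dedoppler_search`, the drift unit (frequency bins per time step) and the synthetic test data are assumptions for illustration.

```python
import numpy as np

def dedoppler_search(spec, max_drift_bins):
    """Try each linear drift rate, undo it row by row, and sum along time.

    Returns (summed power, drift in bins per time step, starting bin).
    Hypothetical helper; np.roll wraps around at the band edges, which is
    a simplification a real search would avoid.
    """
    n_time, n_freq = spec.shape
    best = (-np.inf, 0, 0)
    for drift in range(-max_drift_bins, max_drift_bins + 1):
        total = np.zeros(n_freq)
        for step in range(n_time):
            # Shift each row back by the drift accumulated up to this step
            total += np.roll(spec[step], -drift * step)
        idx = int(np.argmax(total))
        if total[idx] > best[0]:
            best = (float(total[idx]), drift, idx)
    return best

# Synthetic test frame: chi-squared noise plus a straight-line signal
rng = np.random.default_rng(1)
spec = rng.chisquare(2, size=(16, 128))
for step in range(16):
    spec[step, 40 + 2 * step] += 20.0  # starts at bin 40, drifts 2 bins/step

summed_power, found_drift, found_start = dedoppler_search(spec, max_drift_bins=3)
print(found_drift, found_start)
```

Because the sum only lines up for a signal that is present in every time step along a straight line, the sketch also shows why pulsed or broadband signals would score poorly under this kind of search.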
For SETI researchers, machine learning is a way to identify transmissions in raw radio frequency data and classify multiple types of signals. The main problem, according to Brzycki, is that the astronomical community lacks a data set of ET signals, which makes supervised learning in the traditional sense difficult. To that end, Brzycki and his colleagues have developed an open-source Python-based library called Setigen that makes it easy to produce synthetic radio observations.
“Setigen facilitates the production of synthetic SETI signals, which can be used in fully synthetic data or added to real observational data to provide more realistic background noise and radio interference,” Brzycki said. “In this way, we can produce large datasets of synthetic signals to analyze the sensitivity of existing algorithms or to serve as a basis for machine learning.”
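The general idea of signal injection can be sketched in a few lines of Python. The example below deliberately uses plain NumPy rather than Setigen's own API, so every function name and parameter here (`make_noise_frame`, `inject_drifting_signal`, the noise model and the signal level) is a hypothetical stand-in, not the library's interface.

```python
import numpy as np

def make_noise_frame(n_time, n_freq, rng):
    """Fully synthetic background: chi-squared noise (an assumed noise model)."""
    return rng.chisquare(df=2, size=(n_time, n_freq))

def inject_drifting_signal(frame, start_bin, drift_bins_per_step, level):
    """Add a narrowband, linearly drifting signal to a copy of the frame.

    Also returns a boolean mask marking the injected pixels, which could
    serve as a label for supervised learning.
    """
    out = frame.copy()
    n_time, n_freq = out.shape
    mask = np.zeros_like(out, dtype=bool)
    for step in range(n_time):
        f = int(round(start_bin + drift_bins_per_step * step))
        if 0 <= f < n_freq:  # skip steps where the signal leaves the band
            out[step, f] += level
            mask[step, f] = True
    return out, mask

rng = np.random.default_rng(42)
noise = make_noise_frame(16, 64, rng)
injected, mask = inject_drifting_signal(
    noise, start_bin=10, drift_bins_per_step=1.5, level=15.0
)
print(injected.shape, int(mask.sum()))
```

The same injection function could be pointed at a real observation instead of `noise`, which corresponds to the second use Brzycki mentions: adding synthetic signals to real data so that the background and interference are realistic.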
This library standardizes synthesis methods for analyzing search algorithms, especially for existing radio observation data products like those used by Breakthrough Listen. “These come in both spectrograms and complex voltages (time series), so having a method of producing mock data can be really useful for testing production code and developing new procedures,” Brzycki added.
One of 42 antennas in the Allen Telescope Array that searches for signals from space. Credit: Seth Shostak / SETI Institute.
Currently, algorithms for multibeam observations are being developed using Setigen to produce mock signals. The library is also constantly updated and improved as SETI research progresses. Brzycki and his colleagues also hope to add support for wideband signal synthesis to aid search algorithms that target non-narrowband signals. More robust SETI surveys will be possible in the near future, when next-generation radio telescopes are operational.
Among the efforts that stand to benefit is Breakthrough Listen, which will incorporate data from the MeerKAT array in South Africa. There is also the Square Kilometre Array (SKA), a massive radio telescope project that will combine data from observatories in South Africa and Australia. These include MeerKAT and the Hydrogen Epoch of Reionization Array (HERA) in South Africa, as well as the Australian SKA Pathfinder (ASKAP) and the Murchison Widefield Array (MWA) in Australia.
Alas, the most limiting factor in SETI remains: our extremely limited frame of reference. Ultimately, astronomers have no idea what an extraterrestrial signal would look like because we’ve never seen one before. This makes it harder to pick out technosignatures from the background noise of the cosmos. Astronomers are therefore forced to adopt the “low-hanging fruit” approach, that is, to look for technological activity as we know it.
However, by setting parameters based on what is theoretically possible, scientists can narrow the search and increase the chances of finding something one day. As Brzycki summarizes:
“The only potential solution to this problem is some sort of unsupervised machine learning survey that minimizes our assumptions; work is underway on this front. Setigen certainly relies on this assumption – the synthetic signals one can produce are heuristic in nature, in that it is the user who decides what they should look like.
“Ultimately, the library provides a way to evaluate our existing algorithms and create datasets of potential signals to develop new research methods, but the fundamental questions of where and when will always remain relevant.”
At times like this, it’s good to remember that the Fermi Paradox only needs to be solved once. As soon as we detect a radio transmission in the cosmos, we will know for sure that we are not alone in the Universe, that intelligent life can and does exist beyond Earth, and that it communicates using technologies that we can detect.