Researchers at Google AI have described a way to digitize odors by building molecular maps. They presented the Principal Odor Map (POM), which represents each odorous molecule as a vector, that is, as a point in a shared space.
Left: an example of a color map, in which coordinates translate directly into hue and saturation values. Right: the odor map, where each point is an individual molecule and the location of the point reflects the nature of its odor.
Odors are produced by molecules that spread through the air. Potentially billions of molecules can produce an odor, so working out which ones are responsible for a given scent is difficult. Molecular maps can help solve this problem, but building them is complicated by the lack of good olfactory "cameras" and olfactory "monitors."
In 2019, Google AI developed a graph neural network (GNN) model and trained it on thousands of examples of molecules paired with the names of the smells they evoke, such as "meaty," "floral," or "minty." The goal was to learn the relationship between the structure of a molecule and the probability that it carries a particular odor label. The embedding space of this model represents each molecule as a fixed-length vector that describes its smell, much as an RGB value describes the color of a visual stimulus.
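To make the idea of a fixed-length odor vector more concrete, here is a minimal sketch of how two such vectors could be compared. The article does not say which distance measure the researchers use; cosine similarity is an assumption chosen for illustration, and the molecule names and 4-dimensional vectors below are invented, not real model output.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: the names and numbers are made up for this example.
embeddings = {
    "molecule_a": [0.9, 0.1, 0.0, 0.2],
    "molecule_b": [0.8, 0.2, 0.1, 0.3],
    "molecule_c": [0.0, 0.9, 0.7, 0.1],
}

def nearest_neighbor(name, table):
    """Return the other molecule whose vector is most similar to `name`."""
    query = table[name]
    others = (key for key in table if key != name)
    return max(others, key=lambda key: cosine_similarity(query, table[key]))

print(nearest_neighbor("molecule_a", embeddings))
```

In this toy table, molecule_a and molecule_b point in almost the same direction, so they would be treated as smelling alike, while molecule_c lies far from both.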
The POM represents pairs of similarly perceived odors as nearby points of similar hue. The researchers show that the map can be used to predict the odor properties of molecules prospectively, to understand these properties in terms of fundamental biology, and to address pressing global health issues. The map has already undergone a number of tests.
Test 1: Testing the model on molecules it had never seen
The researchers first checked whether the model could correctly predict the odors of new molecules that had not been used in its development.
Predictions made by two models, the GNN model (orange) and a baseline model (blue), compared with the average ratings given by humans (green). Each bar corresponds to one odor label. The top five labels are indicated by color: the GNN model correctly identifies four of the top five with high confidence, while the baseline model identifies only three of the five, and with low confidence.
To test this, they collected the largest dataset of odor descriptions for new molecules. Researchers at the Monell Center asked people to rate the smell of each of 400 molecules using 55 different labels (such as "minty"), which were chosen to cover the space of possible smells without being either redundant or too rare. The participants often described the same molecule differently. Even so, the POM's predictions came closest to the consensus within the group. The model also performed well on other human olfaction tasks, such as judging the strength of an odor or the similarity of different smells. This suggests that it should be possible to predict the odor qualities of any of the billions of as yet unknown odorant molecules.
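One way to picture a comparison against the panel consensus is sketched below. It assumes, purely for illustration, that the consensus is the mean rating per label and that agreement is measured with Pearson correlation; the article does not state the metric actually used. The ratings and the model prediction are invented toy numbers for a single molecule and five labels.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists (undefined if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def mean_profile(profiles):
    """Average several rating profiles label by label."""
    return [sum(column) / len(column) for column in zip(*profiles)]

def panelist_vs_rest(ratings):
    """Correlate each panelist with the mean of all the other panelists."""
    scores = []
    for i, own in enumerate(ratings):
        rest = ratings[:i] + ratings[i + 1:]
        scores.append(pearson(own, mean_profile(rest)))
    return scores

# Toy data: three panelists rating one molecule on five labels (invented values).
panel = [
    [4, 1, 0, 2, 0],
    [3, 0, 1, 3, 0],
    [5, 2, 0, 0, 1],
]
model_prediction = [4.1, 0.9, 0.4, 1.8, 0.2]  # hypothetical model output

model_score = pearson(model_prediction, mean_profile(panel))
human_scores = panelist_vs_rest(panel)
print(model_score, human_scores)
```

Comparing `model_score` with the values in `human_scores` shows whether the model agrees with the group at least as well as an individual rater does.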
Test 2: Linking odor quality to fundamental biology
The researchers then tested whether the odor map could also predict odor perception in animals and the underlying brain activity. They found that it successfully predicted sensory receptor activity, neuronal activity and behavior in most animals studied by neurobiologists, including mice and insects.
The researchers collected data on metabolic reactions in dozens of species and found that the map closely matches metabolism itself. When two molecules are far apart in smell according to the map, it takes a long series of metabolic reactions to turn one into the other. Even long reaction pathways with many steps trace smooth paths on the map. And molecules that occur in the same natural substance (an orange, for example) are often very tightly clustered on the map. The POM shows that the sense of smell is connected to the natural world through the structure of metabolism and, surprisingly, reflects fundamental principles of biology.
Left: Combined metabolic reactions found in 17 species in 4 kingdoms. Each circle represents an individual metabolite molecule, and the arrow indicates the presence of a metabolic reaction that turns one molecule into another. Some metabolites are odorous (color) while others are not (gray), and the metabolic distance between two odorous metabolites is the minimum number of reactions needed to turn one into the other. In the pathway in bold, the distance is 3. Right: metabolic distance was strongly correlated with distance in POM, an estimate of perceived odor difference
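The metabolic distance defined in the caption, the minimum number of reactions needed to turn one molecule into another, is a shortest-path length in a directed graph. A minimal sketch using breadth-first search follows; the metabolite names and the reaction graph are invented for illustration and do not come from the study's data.

```python
from collections import deque

def metabolic_distance(reactions, source, target):
    """Minimum number of reactions from source to target, or None if unreachable.

    `reactions` maps each metabolite to the metabolites it can be turned into
    by a single reaction (a directed graph).
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in reactions.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Hypothetical reaction graph: m1 reaches m4 by a 3-step and a 4-step route.
reactions = {
    "m1": ["m2", "m5"],
    "m2": ["m3"],
    "m3": ["m4"],
    "m5": ["m6"],
    "m6": ["m7"],
    "m7": ["m4"],
}

print(metabolic_distance(reactions, "m1", "m4"))  # 3
print(metabolic_distance(reactions, "m4", "m1"))  # None: no reverse reactions here
```

Breadth-first search visits metabolites in order of increasing reaction count, so the first time it reaches the target it has found the shortest pathway.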
Test 3: Extending the model to address a global health problem
The odor map, closely tied to perception and biology across the animal kingdom, opens up new possibilities. Because the POM can be used to predict animal olfaction in general, the researchers decided to retrain it to address one of humanity's biggest problems: the spread of mosquito- and tick-borne diseases.
The model was trained on digitized U.S. Department of Agriculture data on the repellency of thousands of molecules. It was trained to predict repellency and so improve the accuracy of computational screening for potential repellents.
The original model was improved by "feeding it" a set of experiments conducted by the U.S. Department of Agriculture on human volunteers, as well as a new data set collected at TropIQ using a high-throughput laboratory mosquito assay. Both data sets measure how well a particular molecule repels mosquitoes. Trained on both, the resulting model can predict the repellency of almost any molecule against mosquitoes.
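The article does not describe how the retraining was carried out. As a rough illustration of the general idea, the sketch below fits a small logistic regression on top of fixed embedding vectors and then ranks unseen candidates by predicted repellency. Logistic regression is a stand-in technique, not the researchers' method, and all vectors, labels and candidate names are invented.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit logistic regression weights with plain batch gradient descent."""
    n_dims = len(features[0])
    n = len(features)
    weights = [0.0] * n_dims
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_dims
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict(weights, bias, x):
    """Predicted probability that a molecule is repellent."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

# Hypothetical 2-dimensional embeddings with repellency labels (1 = repels).
train_x = [[0.9, 0.1], [0.8, 0.3], [0.1, 0.9], [0.2, 0.7]]
train_y = [1, 1, 0, 0]
weights, bias = train_logistic(train_x, train_y)

# Computational screening: rank unseen candidates by predicted repellency.
candidates = {"candidate_1": [0.85, 0.2], "candidate_2": [0.15, 0.8]}
ranked = sorted(
    candidates,
    key=lambda name: predict(weights, bias, candidates[name]),
    reverse=True,
)
print(ranked)
```

The highest-ranked candidates from a screen like this are the ones worth testing in the laboratory.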
Many of the molecules that repelled mosquitoes in laboratory studies also showed repellency when applied to humans. Some were more effective than the most common repellents in use today (DEET and picaridin).
The researchers tested the model's screening results experimentally on brand new molecules and found more than a dozen with repellency at least as good as that of DEET, the active ingredient in most insect repellents. Less expensive, longer-lasting and safer repellents could help reduce the incidence of diseases such as malaria.