Unlocking the Genetic Secrets of Durum Wheat
Cattivelli directs the Genomics Research Center in Fiorenzuola, part of the Italian government’s Council for Agricultural Research and Economics (CREA). With his colleagues and teams of crop geneticists from around the world, he is using high-performance computing in the Microsoft Azure cloud to unravel the genetic mysteries of durum and other varieties of wheat. In the Pangenome Project, they are sifting through the genomes of approximately 40 varieties of wheat and its ancient ancestors to identify traits that would enable the crop to thrive in extreme conditions, use natural resources more efficiently, and resist disease and pests, thereby reducing the need for fertilizers and pesticides.
The Urgent Quest for Sustainable Food Production
This is not just a matter of pasta for Italians. It is an urgent quest, because growing enough staples like wheat, rice, and corn is essential to human survival. Wheat supplies about 20% of the calories humans consume worldwide. At the same time, climate change poses a direct threat to crop production everywhere, through drought, heat, torrential rains, and other extreme weather events such as the recent floods in eastern Spain.
Collaboration and Cloud-Based Computing
Working with Microsoft, CREA built a framework in the Azure cloud that could eventually house and analyze multiple petabytes of genetic data from the genomes of many wheat varieties, drawn from multiple sources. To put this into perspective, one petabyte could hold up to 2,000 years’ worth of digital music, if played continuously. Curtis Pozniak, a geneticist who directs the Crop Development Center at the University of Saskatchewan, Canada, is among the founders of the Pangenome Project.
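The petabyte comparison above can be checked with a few lines of arithmetic. This is a rough sketch, not a figure from the project: it assumes a decimal petabyte (10^15 bytes) and music encoded at 128 kilobits per second, a common MP3 bitrate that the article does not specify.

```python
# Back-of-the-envelope check of "one petabyte ~ 2,000 years of music".
# Both constants below are assumptions, not values given in the article.
PETABYTE_BYTES = 10**15             # decimal petabyte (assumption)
BITRATE_BITS_PER_SECOND = 128_000   # 128 kbit/s audio (assumption)
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_of_music(storage_bytes: int, bitrate_bps: int) -> float:
    """Return how many years of continuous playback fit in storage_bytes."""
    bytes_per_second = bitrate_bps / 8
    playback_seconds = storage_bytes / bytes_per_second
    return playback_seconds / SECONDS_PER_YEAR


years = years_of_music(PETABYTE_BYTES, BITRATE_BITS_PER_SECOND)
print(f"About {years:.0f} years of continuous music per petabyte")
```

Under these assumptions the result comes out just under 2,000 years, in line with the article's "up to 2,000 years". A higher bitrate would shrink the figure proportionally.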
Filtering Down the Data
"We’re generating petabytes of information that we need to filter down into something meaningful," Pozniak says. "The only efficient way to do that is through cloud-based platforms where the same data can be shared with a whole range of experts at the same time." The data, stored in Microsoft’s Northern Italy Data Center Region, is processed and analyzed in what is known as a "pipeline," which is also housed in Azure. A pipeline is a series of data processing stages, each one feeding its output to the next; this one was built with open-source code. It is designed to handle billions of short sequences that must be put in the right order to reconstruct the 14 chromosomes of the durum wheat genome.
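The idea of a pipeline as a chain of stages can be illustrated with a toy sketch. This is not the project's actual pipeline: the `Read` type, the stage functions, and the assumption that each short sequence already carries a chromosome label and position are all hypothetical simplifications, and real genome assembly is far more involved. The chromosome names follow the standard durum wheat labeling (seven chromosomes in each of the A and B subgenomes).

```python
from functools import reduce
from typing import Callable, Iterable, List, NamedTuple


class Read(NamedTuple):
    """A short sequence with a hypothetical, pre-assigned location."""
    chromosome: str  # e.g. "1A"
    position: int    # start coordinate on that chromosome
    sequence: str


Stage = Callable[[Iterable[Read]], Iterable[Read]]

# The 14 durum wheat chromosomes: 1A..7A followed by 1B..7B.
CHROMOSOMES = [f"{n}{genome}" for genome in "AB" for n in range(1, 8)]


def drop_ambiguous(reads: Iterable[Read]) -> Iterable[Read]:
    """Stage 1: discard reads containing the unknown base 'N'."""
    return (r for r in reads if "N" not in r.sequence)


def keep_known_chromosomes(reads: Iterable[Read]) -> Iterable[Read]:
    """Stage 2: discard reads not placed on one of the 14 chromosomes."""
    return (r for r in reads if r.chromosome in CHROMOSOMES)


def order_along_chromosomes(reads: Iterable[Read]) -> Iterable[Read]:
    """Stage 3: sort reads by chromosome, then by position."""
    return sorted(
        reads, key=lambda r: (CHROMOSOMES.index(r.chromosome), r.position)
    )


def run_pipeline(reads: Iterable[Read], stages: List[Stage]) -> List[Read]:
    """Feed the output of each stage into the next one."""
    return list(reduce(lambda data, stage: stage(data), stages, reads))


reads = [
    Read("2B", 40, "ACGT"),
    Read("1A", 90, "GGCA"),
    Read("1A", 10, "TTNA"),  # dropped: contains 'N'
    Read("1A", 5, "CCGT"),
    Read("Un", 1, "ACGT"),   # dropped: not on a known chromosome
]
ordered = run_pipeline(
    reads, [drop_ambiguous, keep_known_chromosomes, order_along_chromosomes]
)
for read in ordered:
    print(read.chromosome, read.position, read.sequence)
```

Writing each stage as a small function that takes and returns a collection of reads means stages can be added, removed, or reordered without changing the rest of the chain, which is one reason pipelines are a common way to structure large data-processing jobs.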
Piecing Together the Genomic Puzzle
Teams of scientists worldwide can view and work on the genomic puzzle. The knowledge extracted from it will be incorporated into new varieties that will be made available to farmers in the coming years.