Navigating a sea of data
“Right now, we have hundreds of grapevine genomes that have been sequenced, including many wild species,” says Matus. “This is an enormous resource for all of us because here we have the genetic source of the traits we’re looking for.”
At present, genome sequences can be accessed through sites such as grapegenomics and Gramene, but the problem with genomes is that they’re like books written in a foreign language – you might be able to read the letters, but that doesn’t tell you what the words mean. Knowing a genome sequence similarly doesn’t tell you the function of its genes.
“We now need to integrate literature and functional data of genes into genomes,” comments Matus.
Other parts of this puzzle reside in databases within various institutions. Matus’ group houses several omics data-visualisation tools on their VitViz platform. Examples include Gene Cards, where users can search by gene name to discover the gene’s known function and see a graphic of the tissues in which it’s been found to be most active. Another example is an app that constructs a map of genes that tend to be expressed together.
“The problem is that all the available resources aren’t connected,” explains Matus. “So, you need to learn how each database is constructed so you can navigate it, and then you can’t jump from one to another.”
He also highlights that there are no databases for some types of information. “You can look at publications and see tonnes of data, but we need to find a place to put all these data to extract the most information from it. And right now, there are also no databases regarding differences in grapevine responses to the environment and climate.”
Omics in action
Grapevine breeders want to know which genes affect specific characteristics so they can tell which plants have genes for desirable or undesirable traits. Knowing the path from a gene to a trait also enables biologists to manipulate plant behaviour and solve problems.
Dr Justin Lashbrooke of the Department of Genetics at Stellenbosch University draws on apps developed by Matus’ group to study grapevine genetics for breeding purposes. “Our lab quantifies metabolites linked to enzymes to identify genetic variants linked to traits such as flavour and aroma, disease resistance, and postharvest quality,” he says.
What does this mean? Lashbrooke explains using the example of cuticular waxes. These waxes cover the surface of the berry and provide the first line of defence against environmental and microbial threats. Research from Lashbrooke’s laboratory also indicates that cuticular waxes may be involved in berry cracking.
To uncover the genes involved in forming these waxes, Lashbrooke worked with crosses between vines with waxy berries and vines with polished berries. Crosses like these can help geneticists figure out roughly where on the grapevine’s 19 chromosomes the genes for a trait might be found. Cuticular waxes are associated with 31 areas on different chromosomes.
By analysing the waxes in the grapevines with waxy berry coatings, Lashbrooke learned that they were mostly made of compounds called triterpenes. One of these, oleanolic acid, occurs in many plants and has been widely studied due to its antitumour and antiviral effects. The precursors and enzymes involved in oleanolic acid synthesis are therefore known in some plants, but the pathways in grapes are less well characterised.
Lashbrooke analysed existing data and utilised apps on the VitViz platform to track down the triterpene synthase gene that makes oleanolic acid in his grapevines. This turned out to be a previously unknown gene. His laboratory can now investigate variations of this gene in different cultivars to see whether some variants are linked to desirable – or undesirable – traits.
Grapedia
The gene involved in producing triterpene cuticular waxes in Lashbrooke’s experimental vines could have been any one of 12 genes hiding among 3 500 genes in 31 chromosomal areas. So, tracking it down was quite an achievement, facilitated by apps developed by Matus’ and other groups.
“In total, this was an amazingly quick route to narrow hundreds of potential genes down to one gene without having to do our own experiments,” says Lashbrooke. “We could rely on the masses of publicly available data that’s out there.”
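The narrowing-down Lashbrooke describes can be pictured as two filters applied one after the other: keep only the genes that lie inside the chromosomal areas linked to the trait, then keep only those whose known or predicted function fits the chemistry. The Python sketch below illustrates that idea only. It is not the method used by Lashbrooke’s laboratory or by the VitViz apps, and every gene name, coordinate and annotation in it is invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A chromosomal area linked to the trait (hypothetical coordinates)."""
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Gene:
    """A gene with a position and a free-text functional annotation."""
    gene_id: str
    chrom: str
    start: int
    end: int
    annotation: str


def overlaps(gene: Gene, region: Region) -> bool:
    """True if the gene shares at least one position with the region."""
    return (
        gene.chrom == region.chrom
        and gene.start <= region.end
        and gene.end >= region.start
    )


def shortlist(genes, regions, keyword):
    """Keep genes inside any trait-linked region whose annotation mentions keyword."""
    in_regions = [g for g in genes if any(overlaps(g, r) for r in regions)]
    return [g for g in in_regions if keyword in g.annotation.lower()]


# Invented example data: two trait-linked areas and four genes.
regions = [Region("chr1", 1000, 5000), Region("chr2", 200, 900)]
genes = [
    Gene("geneA", "chr1", 1200, 1800, "triterpene synthase"),
    Gene("geneB", "chr1", 6000, 6500, "triterpene synthase"),   # outside every area
    Gene("geneC", "chr2", 300, 700, "transcription factor"),    # wrong function
    Gene("geneD", "chr2", 850, 1100, "putative triterpene synthase"),
]

candidates = shortlist(genes, regions, "triterpene synthase")
print([g.gene_id for g in candidates])  # ['geneA', 'geneD']
```

In practice the final step, going from a shortlist to a single gene, would draw on further evidence such as the expression data the article mentions; the sketch stops at the shortlist.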
Matus wants to make this even easier. He believes that users should be able to access raw data in a form that’s FAIR: findable, accessible, interoperable, and reusable. His group has followed this approach to develop a new platform called Grapedia – short for The Grapevine Genomics Encyclopaedia.
Grapedia is a portal that integrates existing and new data in a cloud-based database, which users can access through different interfaces. The first version is expected to be available for testing in early 2024.
“We are in the era of big data and databases,” says Matus. “There’s an exponential growth in the data being published by the grapevine community worldwide. We need to standardise and unify our resources – we could use all this data to generate models of grapevine behaviour and design new breeding strategies.”