Genomics data is the combined product of an expanding network of biologists, geneticists, and data scientists all over the world. These scientists and professionals are seeking answers about the structure, function, evolution, mapping, and editing of DNA, genes, and the human genome.
The discoveries they make are leading to breakthroughs in medicine, disease control, and data storage. Genomics data is helping the larger scientific community understand and unravel the chemical structures of what makes life possible.
What Does Genomics Mean?
In short, genomics is a relatively recent term that refers to the study of a genome. But to fully describe genomics and genomics data, there are four key terms we must define first:
What are cells?
Cells are the building blocks of life. They’re microscopic units that contain all the tools necessary to power themselves and reproduce. They can also work with other cells to build bones, blood, muscle, skin, and everything else in our body. Cells build themselves and act according to instructions from DNA.
What is DNA?
DNA is a chemical compound that contains all the instructions a living organism needs to develop. Different living things have different DNA, but the fundamental blocks of DNA are four chemicals represented by four single letters:
- Adenine (A)
- Cytosine (C)
- Guanine (G)
- Thymine (T)
The structure of DNA is incredibly complex. It’s similar to a book, which has millions of letters combined into words, sentences, and paragraphs. One copy of human DNA is a sequence of about 3 billion letters, each of them one of the four building blocks of DNA. Just about every cell in your body contains a copy of your DNA, and its genes provide tens of thousands of instructions for how to build you.
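Because DNA can be written as a string of four letters, even simple text tools can describe it. The following is a minimal sketch, not a real analysis pipeline: the function name `count_bases` and the short example sequence are made up for illustration.

```python
from collections import Counter

# The four letters that represent the building blocks of DNA.
DNA_LETTERS = {"A", "C", "G", "T"}


def count_bases(sequence: str) -> dict[str, int]:
    """Count how often each of the four DNA letters appears in a sequence."""
    sequence = sequence.upper()
    # Reject anything that is not one of the four letters.
    unexpected = set(sequence) - DNA_LETTERS
    if unexpected:
        raise ValueError(f"Unexpected characters in sequence: {sorted(unexpected)}")
    counts = Counter(sequence)
    # Always report all four letters, even if one never appears.
    return {letter: counts[letter] for letter in "ACGT"}


# Hypothetical 15-letter sequence; a real human genome has about 3 billion.
example = "ATGCGATACGCTTGA"
print(count_bases(example))
```

A real genome would be read from a file rather than typed in, but the idea is the same: the data is a very long string over a four-letter alphabet.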
What is a gene?
A gene is a specific, unique sequence of the long DNA strand that creates traits you might inherit, like eye color. In biology, the gene is considered the basic unit of heredity.
If DNA is like all the letters the body has to write with, a gene might be a word. Some are shorter than others and some have more complicated effects on the body. Some genes have a few hundred DNA letters and others have more than 2 million.
The human body is estimated to have between 20,000 and 25,000 genes. The study of genes is called genetics.
What is genomics?
Genomics and genetics are different, but they’re related. Genetics focuses on individual genes and the inheritance of traits, while genomics tries to characterize all of an organism’s genes, how they relate to one another, and how they influence the body.
A genome is an organism’s complete set of DNA, including all of its genes. If we’re thinking of genes as words, genomics is looking at the entire book and understanding how the chapters, characters, and themes all relate to one another.
What Is Genomics Data Analysis?
Genomics data analysis is a field of study that relies on computational technologies to analyze and help visualize the genome and information about it. Genomics data analysis includes processing huge amounts of data in the search for relationships between genes and then storing not only all the raw data but also those relationships and context.
Understanding human genetics and genomics requires much more than discovering that human DNA is made of about 3 billion chemical letters in a specific, ordered sequence (that happened in 2003). Genomics is also about discovering each of the nearly 25,000 genes written in those 3 billion letters, what they do, how they relate to one another, and how they interact with the environment.
One of the exciting things about genomics data analysis is that our ability to visualize and sequence the letters in DNA has developed faster than our ability to decipher and understand what the letters actually do. So, genomics data analysis is an attempt to take that wealth of information we have about the language of our genes and translate it into medicine and more.
Genomics data analysis is a massive undertaking in more ways than one. It is one of the fastest-growing big data domains in the world and could generate as much as 40 exabytes of data by 2025. To put the size of genomic data into perspective, one exabyte is one billion gigabytes, so 40 exabytes is 40 billion gigabytes.
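The scale of that figure is easier to see as a unit conversion. The snippet below is a small sketch that assumes decimal (SI) units, where one gigabyte is 10^9 bytes and one exabyte is 10^18 bytes. The function name is made up for illustration.

```python
# Assumption: decimal (SI) units, so 1 EB = 10**18 bytes and 1 GB = 10**9 bytes.
GIGABYTES_PER_EXABYTE = 10**9


def exabytes_to_gigabytes(exabytes: int) -> int:
    """Convert a size in exabytes to gigabytes."""
    return exabytes * GIGABYTES_PER_EXABYTE


# The article's projection for genomics data by 2025.
projected_exabytes = 40
print(f"{exabytes_to_gigabytes(projected_exabytes):,} GB")
```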
Why Do We Study Genomics?
Genetics plays a crucial role in many of the leading causes of death in the United States. Heart disease, diabetes, and cancer are all influenced by different genes in our bodies and the relationships between them. We study genomics to try to prevent some of the most common forms of illness, disease, and death.
Genomics helps us better understand how our bodies function, which can hopefully help more people keep them functioning better for longer. Genetically speaking, humans are 99.9% identical. That means that a genomic breakthrough for one person is likely to be relevant to many other people as well.
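As rough arithmetic, 99.9% identity still leaves room for a lot of variation. The sketch below assumes the round figure of 3 billion letters per genome used earlier in this article, and treats the 99.9% figure as exact, which is a simplification. The names are illustrative only.

```python
# Assumption: the article's round figure of 3 billion letters per genome.
GENOME_LENGTH = 3_000_000_000
# 99.9% written as parts per thousand, so the math stays in whole numbers.
IDENTICAL_PER_THOUSAND = 999


def differing_letters(genome_length: int, identical_per_thousand: int) -> int:
    """Estimate how many letters differ between two genomes."""
    return genome_length * (1000 - identical_per_thousand) // 1000


print(f"{differing_letters(GENOME_LENGTH, IDENTICAL_PER_THOUSAND):,} letters")
```

Under those assumptions, two people would differ at roughly 3 million positions, which is why individual variation still matters in genomic research.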
Why Is It Important That Genomic Databases Are Shared Among Researchers?
Sharing genomic data and genomic databases among researchers helps the field reach more accurate results more quickly. As more and more researchers explore the relationships between genes and diseases, sharing those findings and inviting others to collaborate on them can lead to better knowledge, products, and procedures that improve human health.