How Phase Genomics' DNA Sequencing Technology is Transforming Modern Genomics

Guru Singh
May 23

Genomics will transform how we live, from preventing diseases to improving food security and fighting climate change for a better world. This bold vision, articulated by Ivan Liachko, Founder and CEO of Phase Genomics, frames the promise of modern genomics. In a recent episode of the "talk is biotech!" podcast, Guru Singh, Founder and CEO of Scispot, spoke with Liachko about how Phase Genomics' DNA sequencing technology is driving this transformation. Scispot provides an AI-driven technology stack for life science laboratories, supporting biotech research and development with digital tools.

The interview delved into Phase Genomics' approach of treating genomics as a large-scale systems biology problem rather than a single-gene endeavor, and explored the far-reaching implications of their advanced sequencing methods in healthcare, agriculture, environmental science, and beyond.

Genomics as a Large-Scale Systems Biology Problem

Genomics today goes far beyond the old one-gene-at-a-time paradigm. As Liachko explained in the interview, the field is fundamentally about extracting as much information as possible from biological systems by looking at all genes together, rather than isolating single genes. Biology is exceedingly complex; in most cases there isn't just "one gene" that single-handedly causes a trait or disease. Instead, multiple genetic factors and their interactions influence outcomes. Viewing genomes as information systems, researchers aim to uncover the full story hidden in DNA sequences. This systems biology perspective acknowledges that a gene is just one piece of a much larger puzzle of interacting elements. By analyzing entire genomes (and even multi-organism community genomes), scientists can understand how genes work in concert, which is crucial for tackling complex traits and diseases. Liachko emphasized that the more genomic information we gather, the more we can achieve in terms of discovery and innovation. Modern genomics therefore involves high-throughput technologies and large-scale data integration, combining DNA sequences with other data types (RNA, proteins, epigenetic marks, etc.) to build a comprehensive picture of living systems. In fact, integrating diverse molecular datasets has become an essential strategy in biomedical research, since no single data type captures the whole story. Studies note that understanding gene and protein functions often "requires more information than provided by each dataset" alone, making data integration crucial. This holistic, data-driven approach defines the current era of genomics and is enabling breakthroughs that were impossible when scientists focused only on one gene or one data point at a time.

Phase Genomics' Innovative Sequencing Approach

Phase Genomics uses proximity ligation (Hi-C) technology to add a new dimension to DNA sequencing data, treating the genome as a 3D information system rather than a linear code. Phase Genomics was founded in 2015 with a mission to maximize the impact of genomics on society by exploring the vast swathes of biological information still left untapped. The company's cornerstone technology is based on proximity ligation (Hi-C), an advanced form of DNA sequencing that captures which pieces of the genome are near each other inside the cell's nucleus. By leveraging this ultra-long-range sequencing method, Phase Genomics can detect how DNA fragments are physically connected or arranged, adding context that traditional sequencing misses. In essence, their approach combines DNA sequencing with 3D structural information about the genome, unlocking a "new dimension" of insight from genetic material.
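To make the idea of proximity data concrete, here is a minimal Python sketch of how Hi-C read pairs could be tallied into a contact map. It is an illustration only, not Phase Genomics' pipeline: the function name `contact_matrix`, the bin size, and the toy coordinates are all assumptions made for this example.

```python
from collections import Counter


def contact_matrix(read_pairs, bin_size):
    """Count Hi-C read pairs per pair of genomic bins.

    read_pairs: iterable of (pos_a, pos_b) coordinates on one chromosome.
    Returns a Counter keyed by (bin_i, bin_j) with bin_i <= bin_j.
    """
    counts = Counter()
    for pos_a, pos_b in read_pairs:
        # Integer division maps each coordinate to its bin index.
        i, j = sorted((pos_a // bin_size, pos_b // bin_size))
        counts[(i, j)] += 1
    return counts


# Toy data (made up): positions in base pairs on a single chromosome.
pairs = [(1_200, 1_900), (1_500, 48_000), (2_100, 47_500), (30_000, 31_000)]
matrix = contact_matrix(pairs, bin_size=10_000)
```

Bins that sit far apart in the linear sequence but share several read pairs (bins 0 and 4 in this toy data) are the kind of signal that suggests two regions are physically close inside the nucleus.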

This innovative sequencing strategy was born from an unlikely starting point, a berry. "It all started with a berry genome," Liachko has noted. Phase Genomics first demonstrated their Hi-C based assembly by putting together the genome of a berry (strawberry), proving that proximity-guided sequencing could achieve what short-read methods alone could not. As Liachko vividly described, DNA may be just "a sequence of letters, but it makes shapes, colors—it makes a strawberry look like a strawberry." In other words, the genome sequence encodes the richness of life's form and function, and Phase Genomics set out to capture that full complexity.

Phase Genomics' Proximo™ Hi-C kits and analytics software bring this capability to researchers worldwide. These kits use the Hi-C technique to map DNA proximity and have been used to create reference-quality, chromosome-scale genome assemblies for all kinds of organisms. The company's platforms (such as Proximo for eukaryotic genomes, ProxiMeta™ for metagenomes, and new cytogenomics tools like CytoTerra and OncoTerra) pair lab reagents with AI-powered computational pipelines to assemble genomes and detect structural variants at unprecedented resolution. By building on proximity data, Phase Genomics can "unlock new types of information from plant, animal and human genomes." Researchers have used these methods to gain world-changing genomic insights across domains, from plants and animals to entire microbial communities and even cancer biology. Crucially, this technology treats genomics as a large-scale information problem. As Liachko explained, Phase Genomics' special genome sequencing method allows scientists to "do things you just couldn't do before." By capturing which DNA segments are connected inside cells, their approach enables a suite of breakthrough applications: assembling complete genomes more easily, discovering previously unknown species in microbiome samples, and detecting complex chromosomal rearrangements in diseases like cancer.

Breakthrough #1: Improved Genome Assembly (Chromosome-Scale Genomes)

Assembling a genome (putting millions or billions of DNA letters in the correct order) is like solving a massive jigsaw puzzle. Traditional short-read sequencing often produces fragmented assemblies, especially for large genomes with many repetitive sequences. Phase Genomics' proximity-guided sequencing addresses this problem. By using Hi-C link information, scientists can scaffold genome pieces into complete chromosomes, achieving near-reference-quality assemblies even for "non-model" organisms that previously lacked good genomes. In fact, Phase Genomics first drew attention by delivering the first chromosome-scale assemblies for certain non-model species, effectively providing end-to-end genomic maps where none existed.
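The scaffolding idea can be sketched in a few lines of Python. This is a deliberately simplified greedy heuristic, not the algorithm used in Proximo or any production scaffolder: real tools also normalize link counts by contig length, orient each contig, and correct misjoins. The function name `greedy_scaffold` and the link counts below are hypothetical.

```python
def greedy_scaffold(contigs, links):
    """Order contigs into one chain using Hi-C link counts.

    links: dict mapping frozenset({a, b}) -> number of Hi-C read pairs
    connecting contigs a and b. Requires at least two contigs.
    """
    contigs = sorted(contigs)

    def n_links(a, b):
        return links.get(frozenset((a, b)), 0)

    # Seed the chain with the most strongly linked pair of contigs.
    pairs = [(a, b) for i, a in enumerate(contigs) for b in contigs[i + 1:]]
    first, second = max(pairs, key=lambda p: n_links(*p))
    path = [first, second]
    unplaced = set(contigs) - {first, second}

    # Repeatedly attach the unplaced contig with the strongest link
    # to either end of the growing chain.
    while unplaced:
        candidates = [
            (n_links(end, c), side, c)
            for c in sorted(unplaced)
            for side, end in (("left", path[0]), ("right", path[-1]))
        ]
        _, side, best = max(candidates)
        if side == "left":
            path.insert(0, best)
        else:
            path.append(best)
        unplaced.remove(best)
    return path


# Toy link counts (made up): neighbours on the chromosome share more links.
toy_links = {
    frozenset(("A", "B")): 90, frozenset(("B", "C")): 80,
    frozenset(("C", "D")): 85, frozenset(("A", "C")): 20,
    frozenset(("B", "D")): 25, frozenset(("A", "D")): 5,
}
order = greedy_scaffold(["C", "A", "D", "B"], toy_links)
```

The sketch relies on one property of Hi-C data: contact frequency falls off with genomic distance, so adjacent contigs tend to share the most links.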

The benefit of high-resolution sequencing here is profound: researchers can now capture structural complexities that were previously missed. Long-range techniques and long-read sequencing have largely solved the challenge of assembling repetitive or complex regions. For example, the last remaining gaps in the human genome were finally filled by 2021 (the Telomere-to-Telomere consortium's effort), thanks to a combination of advanced long-read sequencing and novel assembly methods. The cost of sequencing has also plummeted, from roughly $95 million per genome in 2001 to around $600 in 2023. This means generating the data needed for high-quality assemblies is easier and cheaper than ever.
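As a quick back-of-the-envelope check on the cost figures quoted above, the following Python sketch computes how often the per-genome cost halved between 2001 and 2023. The comparison assumes a common formulation of Moore's Law (a doubling every two years); that benchmark and the helper name `halving_time_years` are assumptions for this example.

```python
import math


def halving_time_years(start_cost, end_cost, years):
    """Average time for the cost to halve, assuming exponential decline."""
    halvings = math.log2(start_cost / end_cost)
    return years / halvings


span = 2023 - 2001                      # years between the two estimates
fold_reduction = 95_000_000 / 600       # figures quoted in the text
seq_halving = halving_time_years(95_000_000, 600, span)

# Benchmark (assumption): Moore's Law as one doubling every two years.
moore_fold = 2 ** (span / 2)

print(f"Sequencing cost fell about {fold_reduction:,.0f}-fold")
print(f"Average halving time: {seq_halving:.2f} years")
print(f"A two-year doubling would give only {moore_fold:,.0f}-fold")
```

Under these assumptions the cost halved roughly every 15 months on average, compared with the 24 months of the Moore's Law benchmark.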

Using proximity ligation data, Phase Genomics and others can join scaffolds correctly into chromosomes and even phase haplotypes (distinguish maternal from paternal copies). The result is chromosome-scale assemblies for virtually any organism of interest. Such complete genomes are invaluable: they reveal structural variations, gene arrangements, and regulatory elements in full context. Researchers in agriculture and ecology, for instance, have used these methods to sequence plants, animals, and even fungi that are important for crop production. In one notable project, scientists working with Phase Genomics sequenced the genomes of four strains of a symbiotic fungus critical for plant health, uncovering surprising simplicity in its genome structure. This knowledge paves the way for engineering better crop-fungi partnerships for improved nutrient uptake and soil remediation. Better genome assembly is accelerating biological discovery in fields from conservation genomics to synthetic biology by providing the complete "blueprints" of life forms.

Breakthrough #2: Discovery of New Microbes in Microbiomes

Perhaps one of the most exciting applications of Phase Genomics' technology is in metagenomics, the study of genetic material recovered directly from environmental or clinical samples containing many organisms. Microbiomes (like the community of microbes in your gut or in soil) often contain hundreds or thousands of species, many of which are unknown or cannot be grown in the lab. Traditional shotgun sequencing can read pieces of many genomes but struggles to assemble them into individual organisms, leaving a tangled mess of data. Here, Phase Genomics' proximity sequencing provides a revolutionary advantage: it links DNA fragments that originated from the same cell. This allows researchers to sort metagenomic sequences by organism, effectively reconstructing individual microbial genomes out of the mixed soup.
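The sorting step described above can be illustrated with a small Python sketch that groups contigs into genome bins whenever enough Hi-C read pairs connect them. This is a simplified stand-in (connected components over a thresholded link graph), not the ProxiMeta algorithm; the function name `bin_contigs`, the `min_links` threshold, and the toy counts are assumptions.

```python
def bin_contigs(contigs, hic_links, min_links=5):
    """Group contigs that are connected by at least min_links Hi-C pairs.

    hic_links: dict mapping (contig_a, contig_b) -> Hi-C read-pair count.
    Returns a sorted list of bins, each a sorted list of contig names.
    """
    parent = {c: c for c in contigs}

    def find(c):
        # Union-find lookup with path compression.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), count in hic_links.items():
        if count >= min_links:          # ignore weak, likely spurious links
            parent[find(a)] = find(b)

    bins = {}
    for c in contigs:
        bins.setdefault(find(c), set()).add(c)
    return sorted(sorted(members) for members in bins.values())


# Toy metagenome (made up): two organisms plus one spurious weak link.
toy_links = {("c1", "c2"): 40, ("c2", "c3"): 22, ("c4", "c5"): 31, ("c3", "c4"): 2}
genome_bins = bin_contigs(["c1", "c2", "c3", "c4", "c5"], toy_links)
```

Because proximity ligation happens inside intact cells, strong links between two contigs are evidence that they came from the same organism, which is what the threshold is meant to capture.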

The impact is dramatic. Unknown microbes are being discovered routinely using these methods. For example, a Phase Genomics team applied their ProxiMeta™ Hi-C kit to a single human gut sample and managed to isolate 252 distinct microbial genomes, including 50 that were near-complete. Among these were 14 novel species, near-complete genomes of microbes that were never seen before in any database. This demonstrates that even well-studied environments like the human gut still hide countless new organisms, which can now be identified with advanced sequencing approaches. The Hi-C-based method vastly outperformed conventional binning algorithms in the study, recovering more high-quality genomes (including many the conventional method missed) and with far less contamination. In short, by adding proximity information, scientists can assemble high-quality microbial genomes from complex samples that stump other techniques.

The discovery of new microbes isn't just an academic exercise; it has practical implications. Newly identified bacteria, viruses, and plasmids can reveal novel biochemical pathways (for drug discovery or industrial enzymes), shed light on disease mechanisms, or help improve environmental management. For instance, researchers have used metagenomic Hi-C to uncover virus-host interactions in ecosystems, revealing how bacteriophages (viruses that infect bacteria) shape microbial populations. That knowledge is relevant to both environmental science and the development of phage therapy in medicine. By mastering the microbiome with such tools, Phase Genomics' technology is enabling a more complete understanding of microbial ecology. We can now tap into the "dark matter" of biology (those uncharted organisms), which is yielding everything from new antibiotics to probiotics. As Liachko noted, with new kinds of sequencing data "you could do things you just couldn't do before", such as discovering new microbes that standard methods would overlook.

Breakthrough #3: Detection of Chromosomal Rearrangements in Cancer

Cancer is fundamentally a disease of the genome, and not just of point mutations in single genes. Often, large-scale chromosomal rearrangements, such as translocations (chunks of DNA swapping between chromosomes), inversions, duplications, or deletions, drive cancer development. However, detecting these structural variants can be challenging, especially in solid tumors where DNA is highly rearranged and samples are often limited. Traditional cytogenetic methods (karyotyping, FISH, microarrays) are limited in resolution or throughput, or require living, dividing cells. Phase Genomics' approach offers a powerful solution: by sequencing proximity-ligated DNA, one can capture structural information about the genome and identify rearrangements even in mixed-cell populations or in difficult sample types like preserved tumor biopsies.
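The intuition behind reading rearrangements from proximity data can be shown with a toy Python sketch: regions on different chromosomes normally share few Hi-C contacts, so a bin pair with far more contacts than the background is a candidate translocation. This is not how CytoTerra or OncoTerra work; real callers also correct for coverage, copy number, and mappability. The function name `flag_translocations`, the `fold` cutoff, and all counts and bin indices are made up.

```python
from statistics import median


def flag_translocations(inter_contacts, fold=10):
    """Flag inter-chromosomal bin pairs with unusually many contacts.

    inter_contacts: dict mapping ((chrom_a, bin_a), (chrom_b, bin_b)) ->
    Hi-C read-pair count between bins on different chromosomes.
    """
    # Use the median count as a crude estimate of background contact level.
    background = median(list(inter_contacts.values()))
    threshold = fold * max(background, 1)
    return sorted(pair for pair, n in inter_contacts.items() if n >= threshold)


# Toy counts (made up); bin indices are arbitrary.
toy_contacts = {
    (("chr9", 133), ("chr22", 23)): 240,   # strong signal between two bins
    (("chr1", 10), ("chr2", 50)): 3,
    (("chr1", 11), ("chr5", 7)): 4,
    (("chr3", 90), ("chr8", 12)): 2,
    (("chr4", 40), ("chr11", 66)): 5,
    (("chr6", 5), ("chr17", 30)): 3,
}
candidates = flag_translocations(toy_contacts)
```

Because the approach scans every bin pair rather than probing a known breakpoint, it needs no prior assumption about which rearrangement to look for.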

Phase Genomics has developed dedicated cytogenomics platforms (CytoTerra and OncoTerra) that apply its ultra-long-range sequencing and AI analysis to oncology samples. This approach can map the breadth of chromosome rearrangements in a single assay, without needing any prior assumption of what to look for. Impressively, it works on formalin-fixed paraffin-embedded (FFPE) tissue, the common way biopsies are stored, which previously was very hard to sequence due to DNA damage. By removing the need for sequential single-gene tests and culturing cells, this method provides a comprehensive, genome-wide view of a cancer's structural variations in one go.

The importance of this capability is underscored by recent research. Large-scale analyses using Hi-C have shown that chromosomal rearrangements occur throughout cancer evolution and characterizing them can reveal key drivers of tumor progression. For example, Hi-C-based analysis in melanoma was able to identify copy-number-neutral translocations disrupting tumor suppressor genes and whole-arm exchanges causing loss of vital genes, events that would be missed by looking only for point mutations. By painting a complete picture of the cancer genome's architecture, these techniques yield insights into how the cancer developed and where it might be vulnerable.

In clinical terms, this means better diagnostics and the potential to guide targeted therapies. Structural variations like gene fusions are often targets of precision drugs (e.g., the famous Philadelphia chromosome translocation in leukemia is targeted by Gleevec). Hi-C has proven effective at detecting gene fusions and other rearrangements with high precision, even when other methods fail. Phase Genomics reported that its platform has already uncovered novel, clinically relevant chromosomal aberrations in oncology research, discoveries that are critical for patient care. As structural variant detection becomes more routine, we can expect more cancers to be characterized by their unique genomic "fingerprints" of rearrangements. In short, high-resolution sequencing of the 3D genome is helping scientists and clinicians see the structural mutations in cancer with greater clarity. This could lead to earlier detection of aggressive variants, more informed treatment decisions, and new therapeutic targets based on the rearrangement patterns specific to a patient's tumor.

Market Trends in Genomics and DNA Sequencing

The technological breakthroughs pioneered by companies like Phase Genomics are unfolding amid a period of rapid growth and change in the genomics industry. As of 2025, the global DNA sequencing market is experiencing robust expansion, driven by declining costs and an expanding range of applications in both medicine and industry. Sequencing has become a cornerstone of precision medicine, creating surging demand in areas like oncology, rare disease diagnosis, and even consumer genomics. Government investments are pouring in as well. For example, the U.S. NIH's All of Us program aims to sequence over 1 million genomes to fuel research, and national initiatives in the UK, Australia, and other countries are integrating genomics into healthcare systems.

One major trend is the plummeting cost of sequencing. The first human genome (completed in 2003) cost roughly $2.7 billion and took 13 years; today, a human genome can be sequenced for a few hundred dollars in a day or two. Illumina, the industry leader in sequencing technology, announced in 2022 that its new NovaSeq X instruments can deliver genomes for approximately $200 each. Meanwhile, competitors are driving costs down even further: in 2023, BGI/MGI introduced a high-throughput sequencer claiming a sub-$100 genome at scale. This precipitous cost decline (far outpacing Moore's Law) has been a game-changer, enabling huge projects like the UK Biobank's 500,000-genome initiative. It also means smaller labs and startups can afford to sequence and analyze genomes as a routine part of research.

Another key trend is the shift toward long-read sequencing technologies. Traditional short-read sequencers (exemplified by Illumina) produce reads ~150-300 bases long, which are highly accurate but often insufficient to resolve complex regions of genomes. Long-read platforms such as Pacific Biosciences' HiFi sequencing and Oxford Nanopore's MinION/GridION devices generate reads in the tens of kilobases or even megabases. These longer reads vastly improve the ability to detect structural variants and to assemble genomes without gaps. Companies like PacBio and Oxford Nanopore have been at the forefront of this shift, delivering new platforms that rival short-read systems in accuracy and throughput. The result is that researchers can now choose the right tool for the job: short reads for cost-effective variant screening, long reads for tackling structural complexity, and even "third-generation" real-time sequencers for niche applications. Indeed, long-read and linked-read approaches were crucial to achieving the first complete, gapless human genome sequence and continue to be vital for assembling complex plant and animal genomes.

In addition to hardware advances, the genomics sector is seeing a boom in AI and bioinformatics solutions. The deluge of data from modern sequencers (a single genome is ~100 gigabytes of raw data) requires sophisticated software and machine learning to interpret. AI-powered bioinformatics tools are now pivotal in variant calling, genome annotation, and identifying patterns in genomic data. Machine learning algorithms can scan entire genomes for disease-causing variants faster and more accurately than manual methods, and they help integrate genomic data with clinical information. This is very much aligned with Scispot's focus, providing an AI-driven platform for labs to automate data handling and analysis. (Notably, Phase Genomics' own pipelines incorporate AI for tasks like identifying chromosomal abnormalities from Hi-C data.) The convergence of ML and genomics is reducing analysis times and enhancing the reliability of results, which is critical as genomics moves into clinics where quick turnaround and accuracy are paramount.

Lastly, the market landscape is teeming with new entrants and competitive innovation. Illumina still holds a significant share of the sequencing market, but companies like BGI (with its DNBSEQ technology) and Ultima Genomics (with bold claims of ultra-cheap sequencing) are challenging the status quo. Meanwhile, an ecosystem of startups is emerging around specific applications, including 10x Genomics (single-cell and spatial genomics), Element Biosciences (benchtop sequencers), and informatics companies like Gencove or Seven Bridges for cloud genomic analysis. This vibrant environment is fueling collaboration and driving costs down further, benefiting end-users. It is also catalyzing integration: clinical labs, for example, are combining genomics with other diagnostics (like imaging and electronic health records) to build comprehensive patient profiles. All these trends point the same way: genomics is becoming pervasive across many sectors, backed by a rapidly maturing industry.

The Importance of Data Integration and Systems Biology in Research

With great data comes great responsibility (and opportunity). The flood of genomic data now available would be overwhelming without a systems-level approach to make sense of it. This is where the principles of systems biology and large-scale data integration come into play. Modern researchers are increasingly tasked with stitching together data from genomes, transcriptomes (RNA), proteomes (proteins), epigenomes, and even environmental and clinical datasets to get a holistic view of biology. The goal is to move from data to knowledge, understanding not just single gene effects, but how networks of genes and molecular players interact to yield phenotypes (traits, diseases, etc.).

Integrative analysis has proven powerful. For instance, combining genetic variation data with gene expression and protein interaction networks can pinpoint critical disease pathways that might be invisible to genome data alone. In cancer, integrating DNA sequencing with data on tumor gene expression and immune cell markers gives a much clearer picture of what drives a tumor and how to treat it. Large-scale projects are reflecting this ethos: the NIH's recent programs emphasize multi-omics and longitudinal data (following many data points from the same individuals over time) to understand complex conditions like Alzheimer's or diabetes. Coordinated efforts to collect large-scale datasets are providing a basis for system-level understanding of complex diseases, underscoring that siloed analysis of one gene or one omics layer is often insufficient.

Phase Genomics' philosophy of treating genomics as an information problem fits squarely into this trend. By adding proximity ligation data, they integrate a spatial dimension with sequence data, effectively merging genomics with 3D structural biology. This approach exemplifies how new data layers can be combined to yield insights that neither layer could provide alone (e.g., sequence + 3D contact = structural variant detection). Moreover, the company recognizes the need for robust computational infrastructure to handle such integration. It's telling that Guru Singh noted Phase Genomics "sees the value in Scispot", a nod to the importance of having advanced data management and AI tools in place. Guru Singh, as the founder of Scispot, leads a company that provides AI-driven lab automation platforms designed to streamline data handling and analysis in biotech research. Platforms like Scispot's lab operating system help researchers standardize and connect their data, making large-scale analysis feasible in everyday lab workflows.

Another facet of systems biology in genomics is population-scale analysis. Instead of studying one genome, projects now analyze tens of thousands or even millions of genomes together. This yields insight into human genetic diversity and the system-level factors behind health and disease. For example, population genomics initiatives are identifying polygenic risk scores, where the combined effect of many genes (each with small impact) can predict risk for diseases like heart disease or diabetes better than any single gene test. These advances rely on integrating data across individuals and across data types (genetic, environmental, clinical). The payoff is personalized medicine: the ability to tailor prevention and treatment to an individual's unique genetic makeup and life context.
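A polygenic risk score is, at its core, a weighted sum: each variant's effect size multiplied by the number of risk alleles a person carries. The Python sketch below shows that calculation under simplifying assumptions (additive effects, no ancestry adjustment or normalization). The variant IDs and effect sizes are hypothetical placeholders, not values from any real study.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of risk-allele dosages.

    dosages: dict variant_id -> number of risk alleles carried (0, 1, or 2).
    effect_sizes: dict variant_id -> per-allele effect (e.g. a log odds ratio).
    Variants missing from dosages are treated as dosage 0.
    """
    return sum(beta * dosages.get(variant, 0) for variant, beta in effect_sizes.items())


# Hypothetical variants and weights, for illustration only.
effects = {"variant_a": 0.12, "variant_b": -0.05, "variant_c": 0.30}
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(person, effects)
```

No single term dominates the sum, which is the point made above: many small effects combined can be more informative than any one gene test.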

In summary, large-scale data integration is not just a buzzword; it is now a prerequisite for cutting-edge research. The complexity of biology demands a systems approach. Genomics provides the foundational code, but one must layer on other data and use smart algorithms to interpret that code. This is why life science companies and research institutions are heavily investing in data infrastructure and why AI-driven lab platforms (like Scispot) are gaining traction. The future of biotech will belong to those who can harness diverse datasets to see the bigger picture, just as Phase Genomics is doing by uniting genome sequencing with physical genomic context. It's an exciting convergence of disciplines, effectively blending computer science, biology, and engineering into a new paradigm for discovery.

Applications of Modern Genomics Across Sectors

The reach of advanced genomics extends into virtually every domain of the life sciences. Below is a summary of some main applications of modern genomics and the benefits that high-resolution sequencing technologies (such as long-read sequencing, Hi-C proximity ligation, and other next-gen methods) bring to each:

Application	Benefits of High-Resolution Sequencing Technology
Genome Assembly (new species & reference genomes)	Enables chromosome-scale assembly with near-complete coverage, even for complex or repeat-rich genomes. Captures structural variants and large gene families that short reads miss, providing a more accurate reference genome for research and breeding.
Cancer Genomics (tumor DNA analysis)	Allows detection of large chromosomal rearrangements (translocations, inversions, copy-neutral events) crucial for understanding tumor progression. Improves identification of gene fusions and structural mutations for targeted therapies; works on challenging samples (e.g. FFPE tumors) in a single sequencing assay.
Microbiome Discovery (metagenomics)	Enables assembly of complete microbial genomes from mixed samples by linking reads from the same organism. Facilitates discovery of novel species and plasmids, and maps virus-bacteria interactions, by preserving in vivo DNA proximity information.
Clinical Genomics / Precision Medicine (human healthcare)	Provides comprehensive genomic profiling of patients: can detect rare variants, structural changes, and polygenic risk factors in one test. Helps identify gene variants that predispose to diseases or affect drug response, enabling personalized prevention and therapy.
Agricultural Genomics (crop and livestock improvement)	Accelerates breeding by pinpointing genomic markers for yield, disease resistance, and stress tolerance; high-res sequencing finds key trait loci even in large, complex plant genomes. Supports genome editing and engineering (e.g. CRISPR) by providing complete genome maps of crops, and helps harness beneficial microbes (soil microbiome, plant symbionts) for sustainable agriculture.
Environmental Genomics (ecology & conservation)	Uses eDNA and metagenomics to monitor biodiversity and track invasive or endangered species through genetic traces in water/soil. Discovers enzymes and metabolic pathways in extremophiles or soil microbes that could be used for biofuel production or pollution mitigation, and helps study how species adapt genomically to climate change to inform conservation strategies.

A Healthier, Greener Future Powered by Genomics

From the conversation between Guru Singh and Ivan Liachko on the "talk is biotech!" podcast, one takeaway is clear: genomics is no longer just about reading DNA; it is about understanding systems. Phase Genomics' work exemplifies how innovative sequencing technologies can elevate genomics into a true information science, revealing connections and complexities that were previously invisible. By approaching genomes as interconnected networks (whether within a single organism or an entire microbial community), we unlock powerful capabilities: we can assemble genomes to completion, unveil new life forms in our environments and bodies, and decode the large-scale mutations that underlie diseases like cancer.

The broader genomics landscape supports this vision. Costs are dropping, data is exploding, and advanced tools from sequencing machines to AI software are converging to make genomic insights more accessible. Importantly, these developments are not happening in isolation. They are feeding into real-world applications, from clinics that deliver precision cancer treatments based on a patient's unique genomic alterations, to farms that are breeding climate-resilient crops with the help of genomic selection, to global initiatives tracking pathogens and preserving biodiversity using DNA sequencing. The systems biology mindset ensures that we leverage all this information together, rather than in silos.

Liachko's assertion that genomics will shape a "healthier, greener future" now seems more tangible than ever. Healthcare stands to be revolutionized through early genetic diagnosis and personalized medicine. Our food supply and environment will benefit from crops and microbes engineered for sustainability. And countless scientific questions, from evolution to epidemiology, are being answered by sifting through genomic data. Phase Genomics is a shining example of a startup translating lab research into impactful biotech innovation, as highlighted in the "talk is biotech!" podcast. By building on foundational science (like the Hi-C method) and pushing it into new frontiers, they have shown how to turn genomic data into actionable knowledge.

In the years ahead, success in biotech will require this blend of technical innovation and systems thinking. Platforms like Scispot provide the digital backbone for labs to manage the complexity, while companies like Phase Genomics provide the tools to generate and interpret the data at scale. As these pieces come together, we edge closer to a future where genomics truly does transform how we live, preventing diseases before they start, enhancing the food we grow, and helping heal the planet we share. The genomic revolution is well underway, and its momentum suggests that the most profound changes are yet to come. Each new genome sequenced and each new connection made in the data brings us a step closer to that future.