Open-source program identifies synthetic, naturally occurring gene sequences


Computer scientists at Rice University and their colleagues have developed SeqScreen, a program that screens short DNA sequences, whether synthetic or natural, to determine their toxicity. Credit: Treangen Lab / Rice University

Of course, some bacteria and viruses can cause disease and illness, but the real culprits are the sequences of concern that lurk in the genomes of these microbes.

Identifying them is about to become easier.

Years of work by computer scientists at Rice University and their colleagues have led to an improved platform for DNA screening and characterization of pathogenic sequences, whether natural or synthetic, before they have a chance to affect public health.

Computer scientist Todd Treangen of Rice’s George R. Brown School of Engineering and genome specialist Krista Ternus of Signature Science LLC led the study that produced SeqScreen, a program to accurately characterize short DNA sequences, often called oligonucleotides.

Treangen said SeqScreen aims to improve the detection and tracking of a wide range of pathogenic sequences.

“SeqScreen is the first open-source software toolkit available for synthetic DNA screening,” Treangen said. “Our program improves the state of the art for companies, individuals and government agencies for their DNA screening practices.”

The study, which began as a high-risk, high-reward research project, appears in the journal Genome Biology.

SeqScreen draws on the work of partners at Austin, Texas-based Signature Science, who curated a database of thousands of gene sequences representing 32 types of virulence functions. “The development of this curated database took years of biocuration and review and is the basis for the training data of the SeqScreen machine learning algorithm,” said Treangen.

The company partnered with Treangen last year to detect mutations in SARS-CoV-2 that may have made the Omicron variant more resistant to antibodies, including those from vaccinations. “SeqScreen came first and some of its ideas were transferred to the COVID project,” he said. “But SeqScreen is much wider in scope.”

“We are focusing on identifying the functions of sequences of concern, which we call FunSoCs, while previous screening approaches were more concerned with asking ‘are you this bacterium?’ or ‘are you this virus?’” Treangen said. “SeqScreen does not focus on the names of bacteria or viruses in your sample. Rather, we want to know if there are sequences in that sample that could be harmful, such as toxins that can destroy human cells.”

Focusing on the functions of concern is important, he said, because bacteria easily exchange DNA through horizontal gene transfer.

“We have cited examples in the publication of bacteria whose genomes are essentially identical, except that one has a sequence of concern, such as a toxin, that the other does not,” said Treangen. “SeqScreen really focuses on the presence or absence of functions that are virulence factors.”
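
The contrast Treangen describes can be shown with a minimal toy sketch. This is not SeqScreen's implementation or API. It assumes two hypothetical isolates that share a species label and differ by a single gene. The gene names, the `FUNCTIONS_OF_CONCERN` table and both helper functions are invented for illustration.

```python
# Toy illustration only: the gene names, labels and functions below are
# hypothetical and do not reflect SeqScreen's database or code.

# Hypothetical lookup from gene name to a function of concern.
FUNCTIONS_OF_CONCERN = {
    "toxA": "cytotoxicity",
}

# Two made-up isolates with the same species label and nearly identical genes.
isolate_a = {"species": "Example bacterium", "genes": {"gyrB", "rpoB", "recA"}}
isolate_b = {"species": "Example bacterium", "genes": {"gyrB", "rpoB", "recA", "toxA"}}


def screen_by_name(sample, watchlist):
    """Name-based screening: flag the sample only if its species is listed."""
    return sample["species"] in watchlist


def screen_by_function(sample):
    """Function-based screening: report functions of concern among the genes."""
    return sorted(
        FUNCTIONS_OF_CONCERN[g] for g in sample["genes"] if g in FUNCTIONS_OF_CONCERN
    )


watchlist = set()  # in this toy case the species is on no watchlist
for label, sample in [("A", isolate_a), ("B", isolate_b)]:
    print(label, screen_by_name(sample, watchlist), screen_by_function(sample))
```

Both isolates look the same to the name-based check, while the function-based check separates them by the one gene that differs.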

He said SeqScreen would also help detect new or emerging environmental pathogens.




More info:
Advait Balaji et al, SeqScreen: accurate and sensitive functional screening of pathogenic sequences through ensemble training, Genome Biology (2022). DOI: 10.1186/s13059-022-02695-x

provided by
Rice University


Citation: Open-source program identifies synthetic, naturally occurring gene sequences (2022, June 21), retrieved June 22, 2022 from https://phys.org/news/2022-06-open-source-ids-synthetic-naturally-gene.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for informational purposes only.
