Publications

Results 1–50 of 72

Evaluating causal-based feature selection for fuel property prediction models

Statistical Analysis and Data Mining

Nguyen, Bernard; Whitmore, Leanne S.; George, Anthe G.; Hudson, Corey H.

In-silico screening of novel biofuel molecules based on chemical and fuel properties is a critical first step in the biofuel evaluation process due to the significant volumes of samples required for experimental testing, the destructive nature of engine tests, and the costs associated with bench-scale synthesis of novel fuels. Predictive models are limited by training sets of few existing measurements, often containing similar classes of molecules that represent just a subset of the potential molecular fuel space. Software tools can be used to generate every possible molecular descriptor for use as input features, but most of these features are largely irrelevant and training models on datasets with higher dimensionality than size tends to yield poor predictive performance. Feature selection has been shown to improve machine learning models, but correlation-based feature selection fails to provide scientific insight into the underlying mechanisms that determine structure–property relationships. The implementation of causal discovery in feature selection could potentially inform the biofuel design process while also improving model prediction accuracy and robustness to new data. In this study, we investigate the benefits causal-based feature selection might have on both model performance and identification of key molecular substructures. We found that causal-based feature selection performed on par with alternative filtration methods, and that a structural causal model provides valuable scientific insights into the relationships between molecular substructures and fuel properties.

TYPE Conference Poster YEAR 2021

Scopus OSTI

Benchmarking blockchain-based gene-drug interaction data sharing methods: A case study from the iDASH 2019 secure genome analysis competition blockchain track

International Journal of Medical Informatics

Kuo, Tsung T.; Bath, Tyler; Ma, Shuaicheng; Pattengale, Nicholas D.; Yang, Meng; Cao, Yang; Hudson, Corey H.; Kim, Jihoon; Post, Kai; Xiong, Li; Ohno-Machado, Lucila

TYPE Journal Article YEAR 2021

Scopus OSTI DOI

Emulytics in Genome Security: Use Cases

Hudson, Corey H.

Abstract not provided.

TYPE Presentation YEAR 2021

OSTI

RetSynth aids in the identification of model platform organisms for the biochemical synthesis of MCCI and SI fuels

Nguyen, Bernard; Davis, Ryan D.; George, Anthe G.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

Causal inference modeling for feature selection of QSAR machine learning models

Nguyen, Bernard; Whitmore, Leanne S.; George, Anthe G.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

From Buffer-Overflowing Genomic Tools to Securing Biomedical File Formats

Hudson, Corey H.

Abstract not provided.

TYPE Presentation YEAR 2020

OSTI

Biodefense at Sandia National Laboratories

Hudson, Corey H.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

iDASH 2019 – Track 1 Methods

Pattengale, Nicholas D.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

RetSynth: Determining all optimal and sub-optimal synthetic pathways that facilitate synthesis of target compounds in chassis organisms

BMC Bioinformatics

Whitmore, Leanne S.; Nguyen, Bernard; Pinar, Ali P.; George, Anthe G.; Hudson, Corey H.

Background: The efficient biological production of industrially and economically important compounds is a challenging problem. Brute-force determination of the optimal pathways to efficient production of a target chemical in a chassis organism is computationally intractable. Many current methods provide a single solution to this problem, but fail to provide all optimal pathways, optional sub-optimal solutions or hybrid biological/non-biological solutions. Results: Here we present RetSynth, software with a novel algorithm for determining all optimal biological pathways given a starting biological chassis and target chemical. By dynamically selecting constraints, the number of potential pathways scales by the number of fully independent pathways and not by the number of overall reactions or size of the metabolic network. This feature allows all optimal pathways to be determined for a large number of chemicals and for a large corpus of potential chassis organisms. Additionally, this software contains other features including the ability to collect data from metabolic repositories, perform flux balance analysis, and view optimal pathways identified by our algorithm using a built-in visualization module. This software also identifies sub-optimal pathways and allows incorporation of non-biological chemical reactions, which may be performed after metabolic production of precursor molecules. Conclusions: The novel algorithm designed for RetSynth streamlines an arduous and complex process in metabolic engineering. Our stand-alone software allows the identification of candidate optimal and additional sub-optimal pathways, and provides the user with necessary ranking criteria such as target yield to decide which route to select for target production. Furthermore, the ability to incorporate non-biological reactions into the final steps allows determination of pathways to production for targets that cannot be solely produced biologically. With this comprehensive suite of features, RetSynth exceeds any open-source software or webservice currently available for identifying optimal pathways for target production.

TYPE Journal Article YEAR 2019

Scopus OSTI DOI

Blue Ribbon Panel Remarks

Hudson, Corey H.; Oehman, Chris O.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Cybersecurity in DNA Design and Verification Tools: Risks and Solutions

Hudson, Corey H.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

From Buffer-Overflowing Genomic Tools to Securing Biomedical File Formats

Hudson, Corey H.; Fracchia, Charles F.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Packet2Vec: Utilizing Word2Vec for Feature Extraction in Packet Data

Goodman, Eric G.; Zimmerman, Chase P.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Retrosynthesis of All Available Pathways to Microbial Production of Precursors to Target Chemicals Based on Chemical Separation Characteristics

Whitmore, Leanne S.; Hudson, Corey H.; George, Anthe G.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Computational tools for advancing discovery of bio-fuels

Landera, Alexander L.; Whitmore, Leanne S.; George, Anthe G.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Optimizing genetic manipulation of microbial organisms for production of multiple target chemical compounds

Whitmore, Leanne S.; Hudson, Corey H.; George, Anthe G.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

BioRetroSynthesis: A tool for identifying optimal metabolic pathways for production of a target compound

Whitmore, Leanne S.; George, Anthe G.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Genomic and Synthetic Biology Cybersecurity

Hudson, Corey H.

Abstract not provided.

TYPE Presentation YEAR 2019

OSTI

Packet2Vec: Utilizing Word2Vec for Feature Extraction in Packet Data

Goodman, Eric G.; Zimmerman, Chase P.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Targeted Assembly of cas Gene Clusters from High Throughput Sequencing Data

Hudson, Corey H.; Podlevsky, Joshua P.; Williams, Kelly P.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Genomic Privacy and Security

Pattengale, Nicholas D.; Ting, Christina T.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Modeling realistic genomic and synthetic biology facilities at scale

Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Packet2Vec: Utilizing Word2Vec for Feature Extraction in Packet Data

Goodman, Eric G.; Zimmerman, Chase P.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2019

OSTI

Computational tools for advancing discovery of biofuels

Whitmore, Leanne S.; Landera, Alexander L.; Hudson, Corey H.; George, Anthe G.

Abstract not provided.

TYPE Presentation YEAR 2018

OSTI

Designing for Interpretability and Adaptability by Using Weighted Averages

Agarwal, Sapan A.; De La Cruz, Andrew F.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Discovering Novel Biofuels Using Machine Learning Software BioCompoundML

Tse, Michelle T.; Whitmore, Leanne S.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Exploiting Time and Subject Locality for Fast, Efficient, and Understandable Alert Triage

2018 International Conference on Computing, Networking and Communications, ICNC 2018

Kavaler, David; Hudson, Corey H.; Bierma, Michael B.

In many organizations, intrusion detection and other related systems are tuned to generate security alerts, which are then manually inspected by cyber-security analysts. These analysts often devote a large portion of time to inspecting these alerts, most of which are innocuous. Thus, it would be greatly beneficial to reduce the number of innocuous alerts, allowing analysts to utilize their time and skills for other aspects of cyber defense. In this work, we devise several simple, fast, and easily understood models to cut back this manual inspection workload, while maintaining high true positive and true negative rates. We demonstrate their effectiveness on real data, and discuss their potential utility in application by others.

TYPE Conference Poster YEAR 2018

Scopus OSTI

CasAnn: Targeted Identification of CRISPR Associated Gene Operons from Next Generation Sequences

Hudson, Corey H.; Williams, Kelly P.; Timlin, Jerilyn A.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Designing for Interpretability by Using Weighted Averages

Agarwal, Sapan A.; De La Cruz, Andrew F.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Explicating feature contribution using Random Forest proximity distances

Whitmore, Leanne S.; George, Anthe G.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

CasANN: Targeted Identification of CRISPR Associated Gene Operons from Next Generation Sequences

Hudson, Corey H.; Timlin, Jerilyn A.; Williams, Kelly P.

Abstract not provided.

TYPE Conference Poster YEAR 2018

OSTI

Discovering and converting temperate phages for therapy

Lau, Britney L.; Krishnakumar, Raga K.; Wagner, Julian W.; Sinha, Anupama S.; Hudson, Corey H.; Schoeniger, Joseph S.; Branda, Steven B.; Williams, Kelly P.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Probability Series Expansion Classifier that is Interpretable by Design

Agarwal, Sapan A.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Probability Series Expansion Classifier that is Interpretable by Design

Agarwal, Sapan A.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

BioRetroSynth: Identifying All Optimal Routes for Synthetic Biological and Hybrid Synthetic Biological/Chemical Production

Hudson, Corey H.; Whitmore, Leanne S.; Pinar, Ali P.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

Emulating Genome Security Risks in Realistic Genomics Data Ecosystems

Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Emulating Genome Security Risks in Realistic Genomics Data Ecosystems

Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Active learning in cybersecurity

Hudson, Corey H.

Abstract not provided.

TYPE Presentation YEAR 2017

OSTI

Developing a high- and low-cetane classifier for biologically produced chemicals using variable quality training data

Hudson, Corey H.; Whitmore, Leanne S.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Reconsidering temperate phages for therapy

Williams, Kelly P.; Lau, Britney Y.; Krishnakumar, Raga K.; Wagner, Julian W.; Hudson, Corey H.; Schoeniger, Joseph S.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Building cyberthreat models around genomic security

Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

BioRetroSynthesis: A tool for identifying the best metabolic routes for production of a target compound

Hudson, Corey H.; Whitmore, Leanne S.; George, Anthe G.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

The Bacterial and Archaeal Pan-Mobilome

Williams, Kelly P.; Krishnakumar, Raga K.; Wagner, Julian W.; Hudson, Corey H.; Schoeniger, Joseph S.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI

Exploiting Time and Subject Locality for Fast, Efficient, and Understandable Alert Triage

Kavaler, David; Hudson, Corey H.; Bierma, Michael B.

Abstract not provided.

TYPE Conference Poster YEAR 2017

OSTI DOI

SNL Capabilities for IV&V - DARPA BTO

Ruffing, Anne R.; Bachand, George B.; Timlin, Jerilyn A.; Manginell, Ronald P.; Hudson, Corey H.; Williams, Kelly P.; Rempe, Susan R.; Brinker, C.J.; Olszewska-Wasiolek, Maryla A.; Hanson, Donald J.; VanderNoot, Victoria A.

Abstract not provided.

TYPE Presentation YEAR 2016

OSTI

Mapping chemical performance on molecular structures using locally interpretable explanations

Whitmore, Leanne S.; George, Anthe G.; Hudson, Corey H.

Abstract not provided.

TYPE Conference Poster YEAR 2016

OSTI

Cheap and Fast Multiplexed Bacterial Genomic Sequencing: Enabling Comparative Genomic Analysis of Antibiotic Resistance in BWA and Hospital Acquired Infections

Schoeniger, Joseph S.; Ray, Debjit R.; Branda, Steven B.; Williams, Kelly P.; Hudson, Corey H.; Polage, Christopher P.

Abstract not provided.

TYPE Conference Poster YEAR 2016

OSTI

BioCompoundML: a general screening tool for biological compound property prediction using machine learning

Hudson, Corey H.; Whitmore, Leanne S.

Abstract not provided.

TYPE Presentation YEAR 2016

OSTI

Experimental single-strain mobilomics reveals events that shape pathogen emergence

Nucleic Acids Research

Schoeniger, Joseph S.; Hudson, Corey H.; Bent, Zachary W.; Sinha, Anupama S.; Williams, Kelly P.

Virulence genes on mobile DNAs such as genomic islands (GIs) and plasmids promote bacterial pathogen emergence. Excision is an early step in GI mobilization, producing a circular GI and a deletion site in the chromosome; circular forms are also known for some bacterial insertion sequences (ISs). The recombinant sequence at the junctions of such circles and deletions can be detected sensitively in high-throughput sequencing data, using new computational methods that enable empirical discovery of mobile DNAs. For the rich mobilome of a hospital Klebsiella pneumoniae strain, circularization junctions (CJs) were detected for six GIs and seven IS types. Our methods revealed differential biology of multiple mobile DNAs, imprecision of integrases and transposases, and differential activity among identical IS copies for IS26, ISKpn18 and ISKpn21. Using the resistance of circular dsDNA molecules to exonuclease, internally calibrated with the native plasmids, we showed that not all molecules bearing GI CJs were circular. Transpositions were also detected, revealing replicon preference (ISKpn18 prefers a conjugative IncA/C2 plasmid), local action (IS26), regional preferences, selection (against capsule synthesis) and IS polarity inversion. Efficient discovery and global characterization of numerous mobile elements per experiment improves accounting for the new gene combinations that arise in emerging pathogens.

TYPE Journal Article YEAR 2016

Scopus OSTI DOI

BioCompoundML: a general biofuel property screening tool for biological molecules using Random Forest Classifiers

Whitmore, Leanne S.; Davis, Ryan W.; McCormick, R. L.; Hudson, Corey H.

Abstract not provided.

TYPE Presentation YEAR 2016

OSTI DOI
