I've spent my career at the intersection of cutting-edge data science and life-saving cancer research. My passion lies in transforming complex, high-dimensional biological datasets into actionable insights that accelerate the fight against cancer. Currently at the Huntsman Cancer Institute, I architect robust ETL pipelines and data infrastructure that empower researchers across the institute to unlock the potential hidden within their datasets.
My technical expertise spans the full data science pipeline, from managing and processing over 15 terabytes of genomics data using parallel computing across 60+ processors to developing innovative machine learning models and topological data analysis techniques that reveal patterns invisible to traditional methods. I've conceptualized and published four groundbreaking software packages, including TECAT (patent pending), minimapR, and RPointCloud, which leverage advanced statistical modeling and parallel processing to solve complex biological problems at scale.
What drives me is the potential to revolutionize cancer diagnostics and treatment. My work with single-cell RNA sequencing, chronic lymphocytic leukemia datasets, and telomere dynamics analysis isn't just about processing data; it's about identifying the next generation of cancer biomarkers that could save lives. Every algorithm I optimize, every ETL workflow I design, and every visualization I create brings us one step closer to personalized medicine that can outsmart cancer.
My approach combines military precision with scientific innovation. Drawing on my experience in the 82nd Airborne, I build data solutions that are robust, scalable, and ready for mission-critical use. I believe the most powerful algorithms are worthless if they can't be understood and implemented by the researchers who need them most.
When I'm not transforming terabytes of data into breakthrough insights, I enjoy beach vacations and spending time with my son, always returning refreshed and ready to tackle the next impossible dataset.
I am currently a Data Scientist at the Huntsman Cancer Institute, where I lead the development of advanced data science tools to streamline data management for researchers. My primary focus is designing and implementing automated ETL workflows into cBioPortal, enhancing research data accessibility and analysis capabilities across the institute. Prior to this role, I worked for two and a half years as a postdoctoral fellow at the Georgia Cancer Center, where I analyzed high-dimensional datasets (single-cell RNA sequencing) using machine learning methods and topological data analysis, and developed biological data science tools, including TECAT, to help other researchers study telomere lengths in humans. This builds on my graduate work at Cornell University, where I explored the effects of radiation on telomere lengths and recapitulated results from my mentor Dr. Chris Mason's NASA Twins Study. My approach to research is informed by my experience in the 82nd Airborne Division, where I served two tours of duty and managed a 240B gun team. I take pride in producing work that is easily communicated and executable, whether by experts or high-school students. I make my datasets public and annotate my code so that my work is accessible to everyone as science moves into the realm of big data analysis. I write code for humans, not machines.
Feb 2025 - Present
Jul 2022 - Feb 2025
Jul 2004 - Jul 2008
2021
Weill Cornell
2013
Augusta State University