It’s all about DNA… decoding
My journey into the study of DNA began when the first human genome was sequenced. Burgers were much larger back then. Since then, I’ve had the true privilege of being a part of many incredibly interesting projects. Here are links to some of my works from the last decade.
Rust Turakulov
+61 O4O51 747O4
rust.turakulov@gmail.com
Current
Dates | Responsibilities |
---|---|
2023 – current | Senior developer, Australian Genome Research Facility (AGRF), Melbourne, Australia. |
2020 – 2023 | Bioinformatician / LIS developer, Laboratory of Pathology, NCI, NIH, Bethesda, USA. |
2019 – 2020 | Bioinformatician, ANU RSB Division of Ecology and Evolution (E&E), Centre for Biodiversity Analysis, Canberra, Australia. |
2016 – 2019 | Bioinformatician / Data Scientist, smartDNA Pty Ltd, Melbourne, Australia. |
2004 – 2016 | Senior Scientist / Bioinformatician, Australian Genome Research Facility (AGRF), The Walter and Eliza Hall Institute, Melbourne, Australia. |
2001 – 2003 | Postdoctoral fellow, Centre for Bioinformation Science (CBiS), Australian National University, Canberra, Australia. |
1999 – 2001 | Postdoctoral fellow, Martyn Smith’s lab, School of Public Health, University of California at Berkeley, USA. |

Dates | Responsibilities |
---|---|
2001 – 2003 | As a postdoctoral fellow at the Centre for Bioinformation Science (CBiS) at the Australian National University in Canberra, I led a research project on unsupervised and supervised clustering algorithms for genetic profile classification. The work used open datasets obtained directly from the public domain or collected with my own website scraper. I reshaped the acquired data and stored it in a local MySQL database, using the Perl::ODBC module for ETL (extract, transform, load) operations. Some of the results were published in Human Heredity (Turakulov R, Easteal S. ‘Number of SNPs loci needed to detect population structure.’ Hum Hered. 2003;55(1):37-45. PubMed PMID: 12890924). This was my first exposure to Perl and SQL coding, and I have maintained and improved these skills since. |
2004 – 2016 | After my tenure in Canberra, I moved to the Australian Genome Research Facility, a non-profit organization and a leading provider of genomics research services in Australia. In this role, I worked extensively on experiment design, statistical testing, and client support, managing data transfer and analysis for routine and customized projects. Over time, I took on the development of a Perl-coded variant calling pipeline. I also focused on data integration, linking the Laboratory Information System (an in-house sample tracking database) to the analysis pipeline under my supervision. My contributions extended beyond technical tasks: I authored comprehensive documentation for clients and internal use, participated in accreditation processes, delivered numerous presentations for company training, and taught sessions at workshops. As a result, we achieved a high rate of returning clients (over 70%), enhanced the LIMS by incorporating graphical quality control metrics, and made significant contributions to several research publications. I also led the establishment or improvement of various company services, such as the 16S quality control monitoring system, MLST testing lab protocol and analysis script, BVDV test, GoldenGate assay, GBS pipeline, and R code for automating microarray expression data analysis. |
2016 – 2019 | In 2016, I joined smartDNA, a start-up, where I established and managed an in-house Perl-coded pipeline for 16S sequence analysis and for building the human microbiome database, smartHIT™. This automated workflow merges patient data with reference groups, predicts outcomes using a randomForest model, and generates PDF reports for medical practitioners. I built it with Perl, R, PDF::API2, and Linux bash commands. I also developed scripts for data transformation and contributed to a patented method for diagnosing medical conditions based on bacterial profiles (AU Patent number: AU 2019203763). |
2019 – 2020 | Just before the COVID era, I joined the Moritz Lab at ANU’s RSB Division of Ecology and Evolution (E&E), Centre for Biodiversity Analysis, as a Bioinformatician. My role involved developing analytical pipelines for sequencing data on various in-house servers and clusters, including those at the Australian Supercomputer Facility, PAWSEY, and NECTAR cloud machines. Specifically, I created a containerized GATK best-practice variant calling pipeline for non-model organisms and a phylogenetic pipeline for exon capture data. These pipelines were used by research groups across Australia. |
2020 – 2023 | In late 2020, I joined the Laboratory of Pathology at the National Cancer Institute (NCI) at the NIH in the USA. In this role, I focused on harmonizing historical data generated over the years and integrating reports and results with the Palantir enterprise system. I also collaborated with other groups supporting alternative platforms such as R-Shiny, specialized in-house systems, and analytical workflows. Specifically, I managed a normalization and dimensionality reduction pipeline for methylation data on the NIH HPC cluster (Biowulf). My research centered on classifying brain, hematological, and other types of tumors: I tested various machine learning algorithms, optimized datasets, and conducted exploratory analyses of new clusters formed by methylation profiling. One notable project was the development and maintenance of the entire backend of the publicly available Methylation Analysis Portal for the Laboratory of Pathology. This backend includes implementations for quality controls, data management, and metadata maintenance for an in-house collection of over 60,000 samples. Each sample is associated with multiple clinical records and genomics tests (DNA/RNA), along with analysis product files, reports, and metrics linked on the Methylscape portal, LIMS, Clinical database, and backup systems. |
2023 – current | In September 2023, I returned to AGRF in Melbourne as a Senior Applications Developer. My focus shifted to bioinformatics analysis of large soil bacterial datasets and to building interactive graphical user interfaces as Shiny applications hosted on Amazon Web Services. |