

It’s all about DNA… decoding


My journey into the study of DNA began when the first human genome was sequenced. Burgers were much larger back then. Since then, I’ve had the privilege of being part of many incredibly interesting projects. Here are links to some of my work from the last decade.


Rust Turakulov
+61 O4O51 747O4


Historical

Methylscape (NIH)

Methylscape screenshot

Phylogenetic pipeline (ANU)

Docker repo and CBA logo screenshot

GATK Variant Calling Box (ANU)

GATK Docker repo and CBA logo


Current

Find My SNP


Microbial Land Scape

Prediction Models for soil microbial profiles

CRC TiME :: Data Browser

CRC TiME UMAP application

AWS-based; not ready yet

CV

Rust Turakulov employment history
Dates Responsibilities
2023 – current Senior developer, Australian Genome Research Facility (AGRF), Melbourne, Australia.

  • Applications development for soil bacterial profiles and in-house data assets.
  • Large bacterial datasets and metadata integrations.
  • Predictive modelling and classification of soil bacterial profiles.
2020 – 2023 Bioinformatician / LIS developer, Laboratory of Pathology, NCI, NIH, Bethesda, USA.
  • Methylation analysis and cancer classification based on Illumina methylation data.
  • Historical data harmonization and preparation of structured input for the Palantir database.
  • RNAseq, MethylationArray and Clinical Sequencing panel report customization.
  • Tumor classifier containerization and workflow optimization.
2019 – 2020 Bioinformatician, ANU RSB Division of Ecology and Evolution (E&E), Centre for Biodiversity Analysis, Canberra, Australia.
  • Development, optimization and maintenance of cloud systems for high-throughput analysis of sequencing data (variant calling, de novo, phylogenetic).
  • Research database (MySQL) development and integration with historical records.
  • Data quality control metrics extraction and automatic reporting system.
  • Research and analysis for custom and ad hoc projects.
  • Development and optimization of data flow and data management protocols: acquisition, transfer, generation, storage, removal and tracking.
2016 – 2019 Bioinformatician / Data Scientist, smartDNA Pty Ltd, Melbourne, Australia.
  • Computational technology development for bacteria identification.
  • Automatic reporting system and company products development (smartGUT™ test, smartHIT™ database, IBS GutDetector™, Spectrum Detector™).
  • In-house bacterial sequences database maintenance and versioning documentation.
  • Statistical analysis and experiment design for internal and third-party projects, with report writing and follow-up discussions.
  • Patent development and assistance with patent application lodgement.
2004 – 2016 Senior Scientist / Bioinformatician, Australian Genome Research Facility (AGRF), The Walter and Eliza Hall Institute, Melbourne, Australia.
  • Development and maintenance of core production analysis pipelines: whole-genome and exome sequencing variant calling.
  • Quality control metrics development and LIMS integration.
  • Statistical analysis for various in-house and client experiments.
  • Routine paid bioinformatics services and collaboration support.
  • Development and validation of new products and technologies.
2001 – 2003 Postdoctoral fellow, Centre for Bioinformation Science (CBiS), Australian National University, Canberra, Australia.
  • Research project with data mining from the public domain.
  • Establishing a local MySQL database and developing applications.
  • Statistical analysis and methods comparisons for data clustering.
  • Writing research papers and grant proposals.
1999 – 2001 Postdoctoral fellow, Martyn Smith’s lab, School of Public Health, University of California at Berkeley, USA.
  • Developing laboratory and statistical methods for cancer detection.
  • Preparation of reports and publications.
Experience

    Rust Turakulov projects history
    Dates Responsibilities
    2001 – 2003 As a postdoctoral fellow at the Centre for Bioinformation Science (CBiS) at the Australian National University in Canberra, I led a research project on unsupervised and supervised clustering algorithms for genetic profile classification. The work used open datasets obtained directly from the public domain or collected with my own web scraper. The acquired data were reshaped and stored in a local MySQL database, and I used the Perl::ODBC module to perform ETL (extract, transform, load) operations. Some of the results were published in the journal Human Heredity (Turakulov R, Easteal S. ‘Number of SNPs loci needed to detect population structure.’ Hum Hered. 2003;55(1):37-45. PubMed PMID: 12890924). This was my first exposure to Perl and SQL coding, and I have worked to maintain and improve those skills ever since.
    2004 – 2016 After my tenure in Canberra, I moved to the Australian Genome Research Facility, a non-profit organization and a leading provider of genomics research services in Australia. In this role, I worked extensively on experiment design, statistical testing, and client support, managing data transfer and analysis for routine and customized projects. Over time, I took on responsibility for developing a Perl-coded variant calling pipeline. I also focused on data integration, linking the Laboratory Information System (an in-house sample tracking database) to the analysis pipeline under my supervision. My contributions extended beyond technical tasks: I wrote comprehensive documentation for clients and internal use, took part in accreditation processes, delivered numerous presentations for company training, and taught sessions at workshops. As outcomes of this work, we achieved a high rate of returning clients (over 70%), enhanced the LIMS with graphical quality control metrics, and contributed to several research publications. I also led the establishment or improvement of various company services, such as the 16S quality control monitoring system, the MLST testing lab protocol and analysis script, the BVDV test, the GoldenGate assay, the GBS pipeline, and R code for automating microarray expression data analysis.
    2016 – 2019 In 2016, I joined smartDNA, a start-up, where I established and managed an in-house Perl-coded pipeline for 16S sequence analysis and for building the human microbiome database, smartHIT™. This automated workflow merges patient data with reference groups, predicts outcomes using a randomForest model, and generates PDF reports for medical practitioners. I used Perl, R, PDF::API2, and Linux bash commands. Additionally, I developed scripts for data transformation and contributed to a patented method for diagnosing medical conditions based on bacterial profiles (AU Patent number: AU 2019203763).
    2019 – 2020 Just before the COVID era, I joined the Moritz Lab at ANU’s RSB Division of Ecology and Evolution (E&E), Centre for Biodiversity Analysis, as a Bioinformatician. My role involved developing analytical pipelines for sequencing data on various in-house servers and clusters, including those at the Australian Supercomputer Facility, PAWSEY, and NECTAR cloud machines. Specifically, I created a containerized GATK best-practice variant calling pipeline for non-model organisms and a phylogenetic pipeline for exon capture data. These pipelines were used by different research groups across Australia.
    2020 – 2023 In late 2020, I joined the Laboratory of Pathology at the National Cancer Institute (NCI) at the NIH in the USA. In this role, my focus was on harmonizing historical data generated over the years and integrating reports and results with the Palantir enterprise system. I also collaborated with other groups supporting alternative platforms such as R-Shiny, as well as specialized in-house systems and analytical workflows. Specifically, I managed a normalization and dimensionality-reduction pipeline for methylation data on the NIH HPC cluster (Biowulf). My research focused on classifying brain, hematological, and other types of tumors: I tested various machine learning algorithms, optimized datasets, and conducted exploratory analyses of new clusters formed by methylation profiling. One notable project was the development and maintenance of the entire backend of the publicly available Methylation Analysis Portal for the Laboratory of Pathology. This backend includes implementations for quality controls, data management, and metadata maintenance for an in-house collection of over 60,000 samples. Each sample is associated with multiple clinical records and genomics tests (DNA/RNA), along with analysis product files, reports, and metrics linked on the Methylscape portal, LIMS, clinical database, and backup systems.
    2023 – current In September 2023, I returned to AGRF in Melbourne as a Senior Applications Developer. My focus shifted to bioinformatics analysis of large soil bacterial datasets and the creation of interactive graphical user interfaces as Shiny applications hosted on Amazon Web Services.

    Publications

    1. Schreck KC, Strowd RE, Nabors LB, Ellingson BM, Chang M, Tan SK, Abdullaev Z, Turakulov R, Aldape K, Danda N, Desideri S, Fisher J, Iacoboni M, Surakus T, Rudek MA, Bettegowda C, Grossman SA, Ye X. Response rate and molecular correlates to encorafenib and binimetinib in BRAF-V600E mutant high-grade glioma. Clin Cancer Res. 2024 Mar 6. doi: 10.1158/1078-0432.CCR-23-3241. PMID: 38446982
    2. Miettinen M, Abdullaev Z, Turakulov R, Quezado M, Contreras AL, Curcio CA, Rys J, Chlopek M, Lasota J, Aldape KD. Assessment of The Utility of The Sarcoma DNA Methylation Classifier In Surgical Pathology. Am J Surg Pathol. 2024 Jan 1;48(1):112-122. doi: 10.1097/PAS.0000000000002138.
    3. Cimino PJ, Ketchum C, Turakulov R, Singh O, Abdullaev Z, Giannini C, Pytel P, Lopez GY, Colman H, Nasrallah MP, Santi M, Fernandes IL, Nirschl J, Dahiya S, Neill S, Solomon D, Perez E, Capper D, Mani H, Caccamo D, Ball M, Badruddoja M, Chkheidze R, Camelo-Piragua S, Fullmer J, Alexandrescu S, Yeaney G, Eberhart C, Martinez-Lage M, Chen J, Zach L, Kleinschmidt-DeMasters BK, Hefti M, Lopes MB, Nuechterlein N, Horbinski C, Rodriguez FJ, Quezado M, Pratt D, Aldape K. Expanded analysis of high-grade astrocytoma with piloid features identifies an epigenetically and clinically distinct subtype associated with neurofibromatosis type 1. Acta Neuropathol. 2023 Jan;145(1):71-82. doi: 10.1007/s00401-022-02513-5. Epub 2022 Oct 22. PMID: 36271929
    4. Wu Z, Lopes Abath Neto O, Bale TA, Benhamida J, Mata D, Turakulov R, Abdullaev Z, Marker D, Ketchum C, Chung HJ, Giannini C, Quezado M, Pratt D, Aldape K. DNA methylation analysis of glioblastomas harboring FGFR3-TACC3 fusions identifies a methylation subclass with better patient survival. Acta Neuropathol. 2022 Jul;144(1):155-157. doi: 10.1007/s00401-022-02430-7. Epub 2022 May 14. PMID: 35567606
    5. Wu Z, Abdullaev Z, Pratt D, Chung HJ, Skarshaug S, Zgonc V, Perry C, Pack S, Saidkhodjaeva L, Nagaraj S, Tyagi M, Gangalapudi V, Valdez K, Turakulov R, Xi L, Raffeld M, Papanicolau-Sengos A, O’Donnell K, Newford M, Gilbert MR, Sahm F, Suwala AK, von Deimling A, Mamatjan Y, Karimi S, Nassiri F, Zadeh G, Ruppin E, Quezado M, Aldape K. Impact of the methylation classifier and ancillary methods on CNS tumor diagnostics. Neuro Oncol. 2022 Apr 1;24(4):571-581. doi: 10.1093/neuonc/noab227. PMID: 34555175
    6. Pavlova A, Harrisson KA, Turakulov R, Lee YP, Ingram BA, Gilligan D, Sunnucks P, Gan HM. Labile sex chromosomes in the Australian freshwater fish family Percichthyidae. Mol Ecol Resour. 2021 Dec 4. doi: 10.1111/1755-0998.13569. Epub ahead of print. PMID: 34863023.
    7. Potter S, Bragg JG, Turakulov R, Eldridge MDB, Deakin J, Kirkpatrick M, Edwards RJ, Moritz C. Limited introgression between rock-wallabies with extensive chromosomal rearrangements. Mol Biol Evol. 2021 Dec 3:msab333. doi: 10.1093/molbev/msab333. Epub ahead of print. PMID: 34865126.
    8. Ivan J, Moritz C, Potter S, Bragg J, Turakulov R, Hua X. Temperature predicts the rate of molecular evolution in Australian Eugongylinae skinks. Evolution. 2021 Sep 5. doi: 10.1111/evo.14342. Epub ahead of print. PMID: 34486736.
    9. Nelson MN, Jabbari JS, Turakulov R, Pradhan A, Pazos-Navarro M, Stai JS, Cannon SB, Real D. The First Genetic Map for a Psoraleoid Legume (Bituminaria bituminosa) Reveals Highly Conserved Synteny with Phaseoloid Legumes. Plants (Basel). 2020 Jul 31;9(8):973. doi: 10.3390/plants9080973. PMID: 32752081; PMCID: PMC7463921.
    10. Stevenson WS, Morel-Kopp MC, Chen Q, Liang HP, Bromhead CJ, Wright S, Turakulov R, Ng AP, Roberts AW, Bahlo M, Ward CM. GFI1B mutation causes a bleeding disorder with abnormal platelet function. J Thromb Haemost. 2013 Nov;11(11):2039-47. doi: 10.1111/jth.12368. PMID: 23927492.
    11. Taylor D, Nagle N, Ballantyne KN, van Oorschot RA, Wilcox S, Henry J, Turakulov R, Mitchell RJ. An investigation of admixture in an Australian Aboriginal Y-chromosome STR database. Forensic Sci Int Genet. 2012 Sep;6(5):532-8. doi: 10.1016/j.fsigen.2012.01.001. Epub 2012 Jan 30. PMID: 22297081.
    12. Chistiakov DA, Chistiakova EI, Voronova NV, Turakulov RI, Savost’anov KV. A variant of the Il2ra / Cd25 gene predisposing to graves’ disease is associated with increased levels of soluble interleukin-2 receptor. Scand J Immunol. 2011 Nov;74(5):496-501. doi: 10.1111/j.1365-3083.2011.02608.x. PMID: 21815908.
    13. Chistiakov DA, Voronova NV, Turakulov RI, Savost’anov KV. The -112G>A polymorphism of the secretoglobin 3A2 (SCGB3A2) gene encoding uteroglobin-related protein 1 (UGRP1) increases risk for the development of Graves’ disease in subsets of patients with elevated levels of immunoglobulin E. J Appl Genet. 2011 May;52(2):201-7. doi: 10.1007/s13353-010-0022-0. Epub 2010 Dec 18. PMID: 21170691.
    14. Chistiakov DA, Voronova NV, Savost’Anov KV, Turakulov RI. Loss-of-function mutations E6 27X and I923V of IFIH1 are associated with lower poly(I:C)-induced interferon-β production in peripheral blood mononuclear cells of type 1 diabetes patients. Hum Immunol. 2010 Nov;71(11):1128-34. doi: 10.1016/j.humimm.2010.08.005. Epub 2010 Aug 22. PMID: 20736039.
    15. McHale CM, Lan Q, Corso C, Li G, Zhang L, Vermeulen R, Curry JD, Shen M, Turakulov R, Higuchi R, Germer S, Yin S, Rothman N, Smith MT. Chromosome translocations in workers exposed to benzene. J Natl Cancer Inst Monogr. 2008;(39):74-7. doi: 10.1093/jncimonographs/lgn010. PMID: 18648008.
    16. Turakulov R, Nontachaiyapoom S, Mitchelson KR, Gresshoff PM, Men AE. Ultrasensitive determination of absolute mRNA amounts at attomole levels of nearly identical plant genes with high-throughput mass spectrometry (MassARRAY). Plant Cell Physiol. 2007 Sep;48(9):1379-84. doi: 10.1093/pcp/pcm103. Epub 2007 Aug 8. PMID: 17686807.
    17. Symons RC, Turakulov R, Foote SJ, Craig JE, McCartney PJ, Mackey DA. No maternally inherited diabetes and deafness mutations in a sample of 193 Tasmanian diabetics with glaucoma. Ophthalmic Genet. 2007 Mar;28(1):39-41. doi: 10.1080/13816810701201971. PMID: 17454746.
    18. Chistiakov DA, Savost’anov KV, Turakulov RI, Efremov IA, Demurov LM. Genetic analysis and functional evaluation of the C/T(-318) and A/G(-1661) polymorphisms of the CTLA-4 gene in patients affected with Graves’ disease. Clin Immunol. 2006 Feb-Mar;118(2-3):233-42. doi: 10.1016/j.clim.2005.09.017. Epub 2005 Nov 16. PMID: 16297665.
    19. Chistiakov DA, Chernisheva A, Savost’anov KV, Turakulov RI, Kuraeva TL, Dedov II, Nosikov VV. Lack of association between genetic markers on chromosome 16q22-Q24 and type 1 diabetes in Russian affected families. Croat Med J. 2005 Aug;46(4):670-7. PMID: 16100772.
    20. Chistiakov DA, Chernisheva A, Savost’anov KV, Turakulov RI, Kuraeva TL, Dedov II, Nosikov VV. The TAF5L gene on chromosome 1q42 is associated with type 1 diabetes in Russian affected patients. Autoimmunity. 2005 Jun;38(4):283-93. doi: 10.1080/08916930500128594. PMID: 16206511.
    21. Chistiakov DA, Seryogin YA, Turakulov RI, Savost’anov KV, Titovich EV, Zilberman LI, Kuraeva TL, Dedov II, Nosikov VV. Evaluation of IDDM8 susceptibility locus in a Russian simplex family data set. J Autoimmun. 2005 May;24(3):243-50. doi: 10.1016/j.jaut.2005.01.017. PMID: 15848047.
    22. Ivanov PL, Zemskova EIu, Turakulov RI, Efremov IA. [Research of potentially linked variation of polymorphism of chromosome DNA in aspect of forensic expertise using molecular-genetic individualizing systems CD4, vWA and vWFII]. Sud Med Ekspert. 2005 Mar-Apr;48(2):29-34. Russian. PMID: 15881140.
    23. Jorm AF, Butterworth P, Anstey KJ, Christensen H, Easteal S, Maller J, Mather KA, Turakulov RI, Wen W, Sachdev P. Memory complaints in a community sample aged 60-64 years: associations with cognitive functioning, psychiatric symptoms, medical conditions, APOE genotype, hippocampus and amygdala volumes, and white-matter hyperintensities. Psychol Med. 2004 Nov;34(8):1495-506. doi: 10.1017/s0033291704003162. Erratum in: Psychol Med. 2007 May;37(5):763. PMID: 15724880.
    24. Chistiakov DA, Savost’anov KV, Turakulov RI. Screening of SNPs at 18 positional candidate genes, located within the GD-1 locus on chromosome 14q23-q32, for susceptibility to Graves’ disease: a TDT study. Mol Genet Metab. 2004 Nov;83(3):264-70. doi: 10.1016/j.ymgme.2004.07.011. PMID: 15542398.
    25. Chistiakov DA, Savost’anov KV, Turakulov RI, Titovich EV, Zilberman LI, Kuraeva TL, Dedov II, Nosikov VV. A new type 1 diabetes susceptibility locus containing the catalase gene (chromosome 11p13) in a Russian population. Diabetes Metab Res Rev. 2004 May-Jun;20(3):219-24. doi: 10.1002/dmrr.442. PMID: 15133753.
    26. Chistiakov DA, Turakulov RI. CTLA-4 and its role in autoimmune thyroid disease. J Mol Endocrinol. 2003 Aug;31(1):21-36. doi: 10.1677/jme.0.0310021. PMID: 12914522.
    27. Turakulov R, Easteal S. Number of SNPS loci needed to detect population structure. Hum Hered. 2003;55(1):37-45. doi: 10.1159/000071808. PMID: 12890924.
    28. Chistiakov DA, Savost’anov KV, Turakulov RI, Petunina N, Balabolkin MI, Nosikov VV. Further studies of genetic susceptibility to Graves’ disease in a Russian population. Med Sci Monit. 2002 Mar;8(3):CR180-4. PMID: 11887032.
    29. Chistiakov DA, Savost’ianov KV, Turakulov RI, Shcherbacheva LN, Mamaeva GG, Balabolkin MI, Nosikov VV. Nukleotidnaia zamena C1167T v gene katalazy i raspolozhennye nepodaleku polimorfnye markery D11S907 i D11S2008 sviazany s razvitiem sakharnogo diabeta tipa 2 [Nucleotide substitution C1167T in the catalase gene and position of nearby polymorphic markers DS11S907 and D11S2008 are connected with development of diabetes mellitus type 2]. Mol Biol (Mosk). 2000 Sep-Oct;34(5):863-7. Russian. PMID: 11033813.
    30. Chistyakov DA, Savost’anov KV, Turakulov RI, Nosikov VV. Genetic determinants of Graves disease. Mol Genet Metab. 2000 Sep-Oct;71(1-2):66-9. doi: 10.1006/mgme.2000.3042. PMID: 11001797.
    31. Chistyakov DA, Savost’anov KV, Turakulov RI, Petunina NA, Trukhina LV, Kudinova AV, Balabolkin MI, Nosikov VV. Complex association analysis of graves disease using a set of polymorphic markers. Mol Genet Metab. 2000 Jul;70(3):214-8. doi: 10.1006/mgme.2000.3007. PMID: 10924276.
    32. Chistiakov DA, Turakulov RI, Shcherbacheva LN, Mamaeva GG, Galabolkin MI, Nosikov VV. Analiz polimorfizma lokusa D11S2008 riadom s genom katalazy u bol’nykh gipertonieĭ i ishemicheskoĭ bolezn’iu serdtsa pri insulinnezavisimom sakharnom diabete v Moskovskoĭ populiatsii [Analysis of polymorphism of the D11S2008 locus of the catalase gene in patients with hypertension and ischemic heart disease in non-insulin-dependent diabetes mellitus in the Muscovite population]. Genetika. 2000 Mar;36(3):423-6. Russian. PMID: 10779920.
    33. Chistiakov DA, Savost’ianov KV, Turakulov RA, Petunina NA, Trukhina LV, Kudinova AV, Balabolkin MI, Nosikov VV. Polimorfizm markera D6S2414, raspolozhennogo v lokuse HLA (6p21.31), i geneticheskaia predraspolozhennost’ k bolezni Greĭvsa v moskovskoĭ populatsii [Polymorphism of D6S2414 marker located in HLA (6p21.31) locus and genetic predisposition to Grave’s disease in Moscow population]. Mol Biol (Mosk). 2000 Jan-Feb;34(1):24-7. Russian. PMID: 10732335.
    34. Chistiakov DA, Turakulov RI, Moiseev VS, Nosikov VV. Polimorfizm T174M gena angiotenzinogena i serdechno-sosudistye bolezni v Moskovskoĭ populiatsii [Polymorphism of angiotensinogen T174M gene and cardiovascular diseases in the Moscow population]. Genetika. 1999 Aug;35(8):1160-4. Russian. PMID: 10546120.
    35. Chistiakov DA, Turakulov RI. Geneticheskie markery gipertonicheskoĭ bolezni [Genetic markers of hypertension]. Genetika. 1999 May;35(5):565-73. Russian. PMID: 10495944.
    36. Chistiakov DA, Turakulov RI, Kondrat’ev IaIu, Shestakova MV, Nosikov VV, Dedov II, Debabov VG. Polimorfizm gena angiotenzinogena i geneticheskaia predraspolozhennost’ k diabeticheskoĭ nefropatii pri sakharnom diabete tipa 1 [Polymorphism of the angiotensinogen gene and genetic predisposition to diabetic nephropathy in type I diabetes mellitus]. Mol Biol (Mosk). 1999 Mar-Apr;33(2):211-3. Russian. PMID: 10377565.
    37. Chistiakov DA, Turakulov RI, Girashko NM, Demurov LM, Rachiba IuM, Kondrat’ev IaIu, Milen’kaia TM, Shestakova MV, Dedov II, Nosikov VV. Issledovanie polimorfizma dinukleotidnogo povtora vnutri gena al’dozoreduktazy v norme i sredi bol’nykh insulin-zavisimym sakharnym diabetom s sosudistymi oslozhneniiami [Polymorphism of the dinucleotide repeat inside the aldose reductase gene in normal states and in patients with insulin-dependent diabetes mellitus with vascular complications]. Mol Biol (Mosk). 1997 Sep-Oct;31(5):778-83. Russian. PMID: 9454059.
    38. Turakulov RI, Chistiakov DA, Odinokova ON, Nosikov VV. Allel’nyĭ polimorfizm korotkikh tandemno povtoriaiushchikhsia posledovatel’nosteĭ lokusov HUMF13A01 i HUMCD4 v Russkikh populiatsiiakh Moskvy i Tomska [Allelic polymorphism of short tandemly repeating sequences of the HUMF13A01 and HUMCD4 loci in Russian populations from Moscow and Tomsk]. Genetika. 1997 Jul;33(7):979-85. Russian. PMID: 9378293.