Sohrab Salehi

Memorial Sloan Kettering Cancer Center
321 East 61st Street
New York, NY 10065
I am a postdoctoral research fellow at Memorial Sloan Kettering Cancer Center in the Department of Epidemiology and Biostatistics under Drs. Sohrab P. Shah and Charles M. Rudin, and a postdoctoral research scientist at the Irving Institute for Cancer Dynamics at Columbia University under Dr. David M. Blei. My current research focuses on developing causal inference methods to understand mechanisms of drug resistance and metastasis in human cancers at the single-cell level. I did my PhD in Bioinformatics at the University of British Columbia (UBC) under Dr. Alexandre Bouchard-Côté. My PhD research focused on developing Bayesian models to quantify the evolutionary dynamics and fitness of human cancers from single-cell whole-genome sequencing data.
Selected Publications
- Cancer phylogenetic tree inference at scale from 1000s of single cell genomes. Sohrab Salehi, Fatemeh Dorri, Kevin Chern, Farhia Kabeer, and 11 more authors. 2023
A new generation of scalable single cell whole genome sequencing (scWGS) methods allows unprecedented high resolution measurement of the evolutionary dynamics of cancer cell populations. Phylogenetic reconstruction is central to identifying sub-populations and distinguishing the mutational processes that gave rise to them. Existing phylogenetic tree building models do not scale to the tens of thousands of high resolution genomes achievable with current scWGS methods. We constructed a phylogenetic model and associated Bayesian inference procedure, sitka, specifically for scWGS data. The method is based on a novel phylogenetic encoding of copy number (CN) data, the sitka transformation, that simplifies the site dependencies induced by rearrangements while still forming a sound foundation to phylogenetic inference. The sitka transformation allows us to design novel scalable Markov chain Monte Carlo (MCMC) algorithms. Moreover, we introduce a novel point mutation calling method that incorporates the CN data and the underlying phylogenetic tree to overcome the low per-cell coverage of scWGS. We demonstrate our method on three single cell datasets, including a novel PDX series, and analyse the topological properties of the inferred trees. Sitka is freely available at https://github.com/UBC-Stat-ML/sitkatree.git.
@article{10_24072_pcjournal_292, author = {Salehi, Sohrab and Dorri, Fatemeh and Chern, Kevin and Kabeer, Farhia and Rusk, Nicole and Funnell, Tyler and Williams, Marc J. and Lai, Daniel and Andronescu, Mirela and Campbell, Kieran R. and McPherson, Andrew and Aparicio, Samuel and Roth, Andrew and Shah, Sohrab P. and Bouchard-C{\^o}t{\'e}, Alexandre}, title = {Cancer phylogenetic tree inference at scale from 1000s of single cell genomes}, journal = {Peer Community Journal}, eid = {e63}, publisher = {Peer Community In}, volume = {3}, year = {2023}, doi = {10.24072/pcjournal.292}, language = {en}, url = {https://peercommunityjournal.org/articles/10.24072/pcjournal.292/}, type = {first_author} }
- Single-cell genomic variation induced by mutational processes in cancer. Tyler Funnell, Ciara H. O’Flanagan, Marc J. Williams, Andrew McPherson, and 116 more authors. Nature, 2022
How cell-to-cell copy number alterations that underpin genomic instability in human cancers drive genomic and phenotypic variation, and consequently the evolution of cancer, remains understudied. Here, by applying scaled single-cell whole-genome sequencing to wild-type, TP53-deficient and TP53-deficient;BRCA1-deficient or TP53-deficient;BRCA2-deficient mammary epithelial cells (13,818 genomes), and to primary triple-negative breast cancer (TNBC) and high-grade serous ovarian cancer (HGSC) cells (22,057 genomes), we identify three distinct ‘foreground’ mutational patterns that are defined by cell-to-cell structural variation. Cell- and clone-specific high-level amplifications, parallel haplotype-specific copy number alterations and copy number segment length variation (serrate structural variations) had measurable phenotypic and evolutionary consequences. In TNBC and HGSC, clone-specific high-level amplifications in known oncogenes were highly prevalent in tumours bearing fold-back inversions, relative to tumours with homologous recombination deficiency, and were associated with increased clone-to-clone phenotypic variation. Parallel haplotype-specific alterations were also commonly observed, leading to phylogenetic evolutionary diversity and clone-specific mono-allelic expression. Serrate variants were increased in tumours with fold-back inversions and were highly correlated with increased genomic diversity of cellular populations. Together, our findings show that cell-to-cell structural variation contributes to the origins of phenotypic and evolutionary diversity in TNBC and HGSC, and provide insight into the genomic and mutational states of individual cancer cells.
@article{genomic_variation_funnell, author = {Funnell, Tyler and O'Flanagan, Ciara H. and Williams, Marc J. and McPherson, Andrew and McKinney, Steven and Kabeer, Farhia and Lee, Hakwoo and Salehi, Sohrab and V{\'a}zquez-Garc{\'\i}a, Ignacio and Shi, Hongyu and Leventhal, Emily and Masud, Tehmina and Eirew, Peter and Yap, Damian and Zhang, Allen W. and Lim, Jamie L. P. and Wang, Beixi and Brimhall, Jazmine and Biele, Justina and Ting, Jerome and Au, Vinci and Van Vliet, Michael and Liu, Yi Fei and Beatty, Sean and Lai, Daniel and Pham, Jenifer and Grewal, Diljot and Abrams, Douglas and Havasov, Eliyahu and Leung, Samantha and Bojilova, Viktoria and Moore, Richard A. and Rusk, Nicole and Uhlitz, Florian and Ceglia, Nicholas and Weiner, Adam C. and Zaikova, Elena and Douglas, J. Maxwell and Zamarin, Dmitriy and Weigelt, Britta and Kim, Sarah H. and Da Cruz Paula, Arnaud and Reis-Filho, Jorge S. and Martin, Spencer D. and Li, Yangguang and Xu, Hong and de Algara, Teresa Ruiz and Lee, So Ra and Llanos, Viviana Cerda and Huntsman, David G. and McAlpine, Jessica N. and Hannon, Gregory J. and Battistoni, Georgia and Bressan, Dario and Cannell, Ian G. and Casbolt, Hannah and Jauset, Cristina and Kova{\v c}evi{\'c}, Tatjana and Mulvey, Claire M. and Nugent, Fiona and Ribes, Marta Paez and Pearson, Isabella and Qosaj, Fatime and Sawicka, Kirsty and Wild, Sophia A. and Williams, Elena and Laks, Emma and Smith, Austin and Roth, Andrew and Balasubramanian, Shankar and Lee, Maximilian and Bodenmiller, Bernd and Burger, Marcel and Kuett, Laura and Tietscher, Sandra and Windhager, Jonas and Boyden, Edward S. and Alon, Shahar and Cui, Yi and Emenari, Amauche and Goodwin, Daniel R. and Karagiannis, Emmanouil D. and Sinha, Anubhav and Wassie, Asmamaw T. and Caldas, Carlos and Bruna, Alejandra and Callari, Maurizio and Greenwood, Wendy and Lerda, Giulia and Eyal-Lubling, Yaniv and Rueda, Oscar M. 
and Shea, Abigail and Harris, Owen and Becker, Robby and Grimaldo, Flaminia and Harris, Suvi and Vogl, Sara Lisa and Joyce, Johanna A. and Watson, Spencer S. and Tavare, Simon and Dinh, Khanh N. and Fisher, Eyal and Kunes, Russell and Walton, Nicholas A. and Al Sa'd, Mohammed and Chornay, Nick and Dariush, Ali and Gonz{\'a}lez-Solares, Eduardo A. and Gonz{\'a}lez-Fern{\'a}ndez, Carlos and Yolda{\c s}, Ayb{\"u}ke K{\"u}pc{\"u} and Miller, Neil and Zhuang, Xiaowei and Fan, Jean and Lee, Hsuan and Sep{\'u}lveda, Leonardo A. and Xia, Chenglong and Zheng, Pu and Shah, Sohrab P. and Aparicio, Samuel and Consortium, IMAXT}, date = {2022/12/01}, date-added = {2024-09-02 18:18:42 -0400}, date-modified = {2024-09-02 18:18:42 -0400}, doi = {10.1038/s41586-022-05249-0}, id = {Funnell2022}, isbn = {1476-4687}, journal = {Nature}, number = {7938}, pages = {106--115}, title = {Single-cell genomic variation induced by mutational processes in cancer}, url = {https://doi.org/10.1038/s41586-022-05249-0}, volume = {612}, year = {2022}, bdsk-url-1 = {https://doi.org/10.1038/s41586-022-05249-0} }
- Clonal fitness inferred from time-series modelling of single-cell cancer genomes. Sohrab Salehi, Farhia Kabeer, Nicholas Ceglia, Mirela Andronescu, and 104 more authors. Jul 2021
Progress in defining genomic fitness landscapes in cancer, especially those defined by copy number alterations (CNAs), has been impeded by lack of time-series single-cell sampling of polyclonal populations and temporal statistical models. Here we generated 42,000 genomes from multi-year time-series single-cell whole-genome sequencing of breast epithelium and primary triple-negative breast cancer (TNBC) patient-derived xenografts (PDXs), revealing the nature of CNA-defined clonal fitness dynamics induced by TP53 mutation and cisplatin chemotherapy. Using a new Wright–Fisher population genetics model to infer clonal fitness, we found that TP53 mutation alters the fitness landscape, reproducibly distributing fitness over a larger number of clones associated with distinct CNAs. Furthermore, in TNBC PDX models with mutated TP53, inferred fitness coefficients from CNA-based genotypes accurately forecast experimentally enforced clonal competition dynamics. Drug treatment in three long-term serially passaged TNBC PDXs resulted in cisplatin-resistant clones emerging from low-fitness phylogenetic lineages in the untreated setting. Conversely, high-fitness clones from treatment-naive controls were eradicated, signalling an inversion of the fitness landscape. Finally, upon release of drug, selection pressure dynamics were reversed, indicating a fitness cost of treatment resistance. Together, our findings define clonal fitness linked to both CNA and therapeutic resistance in polyclonal tumours.
@article{salehi_clonal_2021, title = {Clonal fitness inferred from time-series modelling of single-cell cancer genomes}, volume = {595}, issn = {1476-4687}, url = {https://doi.org/10.1038/s41586-021-03648-3}, doi = {10.1038/s41586-021-03648-3}, number = {7868}, journal = {Nature}, author = {Salehi, Sohrab and Kabeer, Farhia and Ceglia, Nicholas and Andronescu, Mirela and Williams, Marc J. and Campbell, Kieran R. and Masud, Tehmina and Wang, Beixi and Biele, Justina and Brimhall, Jazmine and Gee, David and Lee, Hakwoo and Ting, Jerome and Zhang, Allen W. and Tran, Hoa and O’Flanagan, Ciara and Dorri, Fatemeh and Rusk, Nicole and de Algara, Teresa Ruiz and Lee, So Ra and Cheng, Brian Yu Chieh and Eirew, Peter and Kono, Takako and Pham, Jenifer and Grewal, Diljot and Lai, Daniel and Moore, Richard and Mungall, Andrew J. and Marra, Marco A. and Hannon, Gregory J. and Battistoni, Giorgia and Bressan, Dario and Cannell, Ian Gordon and Casbolt, Hannah and Fatemi, Atefeh and Jauset, Cristina and Kovačević, Tatjana and Mulvey, Claire M. and Nugent, Fiona and Ribes, Marta Paez and Pearsall, Isabella and Qosaj, Fatime and Sawicka, Kirsty and Wild, Sophia A. and Williams, Elena and Laks, Emma and Li, Yangguang and O’Flanagan, Ciara H. and Smith, Austin and Ruiz, Teresa and Lai, Daniel and Roth, Andrew and Balasubramanian, Shankar and Lee, Maximillian and Bodenmiller, Bernd and Burger, Marcel and Kuett, Laura and Tietscher, Sandra and Windhager, Jonas and Boyden, Edward S. and Alon, Shahar and Cui, Yi and Emenari, Amauche and Goodwin, Dan and Karagiannis, Emmanouil D. and Sinha, Anubhav and Wassie, Asmamaw T. and Caldas, Carlos and Bruna, Alejandra and Callari, Maurizio and Greenwood, Wendy and Lerda, Giulia and Eyal-Lubling, Yaniv and Rueda, Oscar M. and Shea, Abigail and Harris, Owen and Becker, Robby and Grimaldi, Flaminia and Harris, Suvi and Vogl, Sara Lisa and Weselak, Joanna and Joyce, Johanna A. and Watson, Spencer S. 
and Vázquez-García, Ignacio and Tavaré, Simon and Dinh, Khanh N. and Fisher, Eyal and Kunes, Russell and Walton, Nicholas A. and Sa’d, Mohammad Al and Chornay, Nick and Dariush, Ali and González-Solares, Eduardo A. and González-Fernández, Carlos and Yoldas, Aybüke Küpcü and Millar, Neil and Whitmarsh, Tristan and Zhuang, Xiaowei and Fan, Jean and Lee, Hsuan and Sepúlveda, Leonardo A. and Xia, Chenglong and Zheng, Pu and McPherson, Andrew and Bouchard-Côté, Alexandre and Aparicio, Samuel and Shah, Sohrab P. and {IMAXT Consortium}}, month = jul, year = {2021}, pages = {585--590}, type = {first_author} }
- ddClone: joint statistical inference of clonal populations from single cell and bulk tumour sequencing data. Sohrab Salehi, Adi Steif, Andrew Roth, Samuel Aparicio, and 2 more authors. Mar 2017
Next-generation sequencing (NGS) of bulk tumour tissue can identify constituent cell populations in cancers and measure their abundance. This requires computational deconvolution of allelic counts from somatic mutations, which may be incapable of fully resolving the underlying population structure. Single cell sequencing (SCS) is a more direct method, although its replacement of NGS is impeded by technical noise and sampling limitations. We propose ddClone, which analytically integrates NGS and SCS data, leveraging their complementary attributes through joint statistical inference. We show on real and simulated datasets that ddClone produces more accurate results than can be achieved by either method alone.
@article{salehi_ddclone_2017, title = {{ddClone}: joint statistical inference of clonal populations from single cell and bulk tumour sequencing data}, volume = {18}, issn = {1474-760X}, url = {https://doi.org/10.1186/s13059-017-1169-3}, doi = {10.1186/s13059-017-1169-3}, number = {1}, journal = {Genome Biology}, author = {Salehi, Sohrab and Steif, Adi and Roth, Andrew and Aparicio, Samuel and Bouchard-Côté, Alexandre and Shah, Sohrab P.}, month = mar, year = {2017}, pages = {44}, type = {first_author} }