Daniel Puthawala graduated from the department in 2023 with a doctorate. He is currently working as a post-doctoral scientist in bioinformatics at Nationwide Children’s Hospital. Daniel agreed to share some of what he’s been up to since graduation and to give advice to current students.
Tell us about your job: Where are you working, and what is your current job title?
I work in the Institute for Genomic Medicine (IGM) at Nationwide Children’s Hospital. I’m currently a Post-Doctoral Scientist in bioinformatics. I recently was also appointed to lead the Categorical Variation (CatVar) Study Group, part of the Genomic Knowledge Standards (GKS) Working Group of the Global Alliance for Genomics and Health (GA4GH, which is kind of like the LSA, but international, in genomics, and with much closer involvement with government agencies around the world). (They sure like their acronyms!)
My work in both capacities (under IGM and for GA4GH) is pretty similar: I work on a large open-source project to develop and implement global standards for how genomics data is stored, shared, verified, accessed, and searched, so that all the different hospitals and research groups can share their data.
How did you get your current job?
It’s kind of a funny story! Just before the pandemic, I saw a video that had a three-second bit related to gene regulation networks, and had the idea that it looked similar in some ways to how the type-logic of a categorial grammar works to generate certain parses or prevent them from obtaining. Later, during the pandemic, I was reminded of this thought, and realized I knew someone who worked in genomics. So, I asked them if they knew anyone who could tell me if there was any resemblance between the formal methods we use to study the syntax-semantics interface and what is used in genomic medicine to study gene expression. That led me to chat with a PI at IGM who knew nothing about what we do in theoretical linguistics, but was happy to talk about what they knew about formal methods in genomics. A year and change later, around the time I was taking my candidacy exam and Linguistics faculty jobs were still few and far between, I got a message from the PI. We chatted again, this time about a paper the lab had just published, and a nagging problem in their field. We talked over the problem and I spitballed some approaches for how it could be solved. Three months later, the PI reached out again to ask if I was looking for a Post-Doc position, and if I’d consider working to help kids with cancer. I said I was interested, and a couple more interviews, a dissertation, and a job talk later… here I am!
What do you do during a typical work week?
The lab I work in is primarily geared towards developing genomic knowledge standards for GA4GH. This entails everything from creating formal data models to releasing open-source datasets, Python packages, and reference implementations of tools that comply with those standards. So, my job relates to this in four main ways:
First, I’m one of the few formally-minded people around, so I spend a decent chunk of my week doing what is, essentially, formal semantics, just with genomic variation data instead of linguistic data. The problem I am working on is called “categorical variation”, and is essentially this: researchers and labs often come up with names on the fly for natural classes of genomic variants, like “BRAF Gene Mutations”, “Nucleotide Insertion Variants”, or “Nucleotide Sequence Variants”. Clinical findings in papers are often attached to these variant classes instead of the specific member variants. And to make it more complicated, these classes have complicated relationships with each other. An insertion variant is a type of sequence variant. And a BRAF Gene Mutation might involve an insertion or sequence change, but it also might not. Because of this complexity, finding the knowledge associated with a particular detected genomic variant (as an oncologist might want to do for a cancer patient) currently requires the manual search, curation, and collation of many different databases, and requires the searcher to already know what they’re looking for; otherwise they may very well not find it. The goal is to make these classes and class relationships formally understood and efficiently computable, so we can develop tools to connect those dots for clinicians and make the process of interpreting the (in)significance of an observed genomic variant much faster and more reliable.
An important part of this job, and something that I learned from teaching classes and teaching my very interdisciplinary dissertation committee, is making visual aids to make salient the findings of my formal analyses. Most people here have never heard of the lambda calculus, let alone type theories and monads, so for all intents and purposes, what I do is completely incomprehensible magic to most of the folks in the institute. So I need to be able to communicate the main intuitive takeaway and the most important technical details of my conclusions in as approachable a manner as possible. And in that regard, flow charts and graphs can go a long way.
Second, I write and review code. Like I said, we develop technical tools for other folks in genomic medicine to use, so there’s quite a bit of computational analysis and Python software development that occurs. And with that, there are all sorts of small places where I need to code something up to add to an existing project, or where people submit code to me for review to make sure it functions as my arcane divinations recommended. One thing I’ve had to (re)learn and start getting good at is software development workstreams in GitHub. The ability to split off and merge branches of a project is really important for allowing a large team to work on the same project in parallel.
Third, as one of the PhDs and project leads around the lab, part of my job is mentoring the graduate students and developers in the lab. This feels very similar to what I did as a lab admin in Linguistics: keeping people organized, assigning work, teaching people the skills they need to do their job, and looking for ways to let my lab-mates step outside their comfort zone in a productive and supported way.
Finally, I’m the CatVar Study Group lead for GA4GH. This is not unlike teaching a seminar, just with a very targeted topic, the expectation of a single completed group project at the end, and with a mishmash of academic and clinical experts, industry partners, and government agency representatives weighing in. Part of this is doing the formal modelling to solve aspects of the categorical variation problem discussed above, but a lot of this part of my job involves running and administrating the group: relaying communications between experts on various sub-problems, scheduling meetings across lots of time zones, and paperwork. There’s already lots of paperwork involved with getting medical innovation cleared by the relevant US Federal agencies. It’s a whole other can of worms when there are international governmental partners involved.
So, in sum, I do a lot of formal modelling, coding, mentoring, and administrating!
Of the skills you acquired in graduate school, which skill is the most important or helpful for your job?
That’s a toughie! I learned so much, and use so much of it. But I think the answer is the ability to teach myself dense technical skills to the point of competence (even if not mastery). As a doctoral student, I worked on a lot of different projects, and I frequently had to be taught (or teach myself) lots of new skills, from using new formal frameworks, to experimental design, to coding in new languages (my dissertation ended up needing 7 different programming languages!), to lab administration, navigating IRB, and more, not to mention mastering different literatures well enough to formulate my ideas and defend them before my committee. This has left me with a skill that I now frequently recognize in PhDs, and which is far rarer among my junior colleagues: a sort of measured confidence in their ability to buckle down and teach themselves new technical skills and literatures as required, instead of throwing up their hands and giving up. That skill for self-directed study, and being ok with being disoriented, is really important for my job.
What do you enjoy about your job?
There’s a lot, but I’ll mention four particular things, in no particular order of importance. First, in contrast to life in graduate school, I have regular work hours. I have to get my work done during those hours, but that also allows me to push back if time-tables are too tight. And I still get my stuff done, so I don’t have to worry about my work most evenings and weekends, and I don’t feel guilty about not working on a weekend, so that’s been wonderful for managing burnout and for my mental health. Second, and crucial to my reasons for starting graduate school in the first place, I’m learning all sorts of new stuff, which keeps me stimulated and prevents me from getting bored and complacent. I’m really curious by nature and can’t abide intellectual boredom, so that’s a plus! Third, as in academia, I’m working on really substantive problems. I have found a niche where I am better equipped to make progress than most other people in this field on problems that really matter to the advancement of the field. Finally, I’m working to develop tools that will be force-multipliers for researchers and clinicians in oncology and genetic disease. It makes me feel really good knowing that if I do a good job, my work will improve the lives of sick children everywhere, including potentially my own children or future grandchildren. Using the power of theoretical linguistics to save lives really helps me get out of bed in the morning and take pride in what I do.
Do you have any advice for students who may be thinking about taking a similar career path?
Don’t be afraid to talk to people and network! Do informational interviews! And be open to opportunities in new and unexpected directions. You can always go back, and you’ll learn a ton in the process!