My Ramblings on Various Topics

2022

2020

2019

Reflection: Mastering the science and engineering of the self

3 minute read

Published:

For most of my life, I’ve had clear goals that I’ve been working towards, whether it was getting into boarding school, winning a debate tournament, reaching a math level, getting into college, getting certain grades, getting funding for my project, building a product, getting a job, or getting into climate. In this process, I’ve found that there are 2 factors that drive my success in achieving these goals: clarity of purpose, and structure of action. To the extent that I have a meaningful job that benefits people and provides me financial security, loving relationships, and relative health, they derive from the times I’ve exhibited the strongest aptitude in those 2 factors.

On Data and Being Honest

4 minute read

Published:

As I stared at my 300th co-occurrence number, I let out a groan. Why was I spending hours late into the night staring at countless static numbers? I could’ve been building something, or writing something, or brainstorming. Instead I was doing what felt like grunt work. Double-checking number after number. My co-worker was asked to independently reproduce my model results, and we had to make sure that the thousands of metrics and results we produced for the model were all exactly the same. I felt bad for the guy, having to spend hours into the night writing code that I had already written that did boring, predictable things, just to make sure we got the same numbers. This wasn’t a chance to be creative. It wasn’t even really a chance for reward, because the better we did our job, the more likely we were to have to deliver bad news and stress everyone out. I felt guilty. Were we doing this because I made evaluation mistakes earlier? Did my friend have to spend a late night just because the noob new grad’s code couldn’t be trusted? It isn’t easy being positive when you’re tired and staring at a monolithic table of numbers.

As an ML engineer, your success is driven by the success of the model you build. But in most cases, especially when you’re not at a big company, the same person who builds the model is also the person who writes the code to evaluate its success. People don’t need to be malicious and actively lie about their results to make mistakes in their process that lead to a misunderstanding of the data. They just have to be a little less skeptical, a little less curious about the ways in which what they’ve built might fall apart, and by the time it’s clear that the analysis was misleading and something was wrong, it’s often way too late. The reality as a data scientist is that you know your data better than pretty much anyone else, and the questions you ask about the data, and especially the ones you don’t ask, drive the story that other people hear about this data. With the wrong incentives and attitude, this can be dangerous.

2018

2016

The Heights of Ignorance

5 minute read

Published:

When my professor told us this week that for extra credit we could write about the Great Wall in relation to China as a whole, I groaned. I’ve read countless instances of misty-eyed tourists traveling to foreign lands writing of the “profound” metaphors they see around them (myself being no exception). It’s insufferable. Of course, the travel blogger writing about the romance of the Taj Mahal might be well-intentioned. But the idea of going off to some Asian or African country and masturbating to the otherness of it all in the form of quasi-sophisticated rambling on the abstract beauty of “the East” is nauseating. So I’m going to try and do something different.