Thoughts on data science, statistics and machine learning.
Effective Train/Test Stratification for Object Detection
TL;DR: Here’s a talk based on this post. There’s an unavoidable, inherent difficulty in fine-tuning deep neural networks for specific tasks, which stems primarily from the lack of training data. It would seem ridiculous to a layperson that a pretrained vision model (containing millions of parameters, trained on millions of images) could learn to solve highly specific problems. So when fine-tuned models do perform well, they seem all the more miraculous.
A Process for Readable Code
I took a course on data structures and algorithms over the last few months. It is offered as part of IIT Madras’ Online Degree Program in Data Science and Programming, taught by Prof Madhavan Mukund. The program is a MOOC in the true sense, with tens of thousands of students enrolling each year. The DSA course itself runs every trimester and sees an average of ~700 enrollments each time.
Book Review: A World Without Email - Cal Newport
This book is a good refresher on Cal Newport’s central thesis, which shows up in both Deep Work and Digital Minimalism, but with email as the central device. The same essential theorem, but a lot of new stories to go with it as corollaries. Of course, it’s not email technology itself that the book contests, but the hyperactive hive mind that is enabled by people’s email habits. But here’s the only thing I want to leave a note of: I was mildly annoyed by Newport’s invocation (or perhaps misappropriation) of Claude Shannon’s information theory.
Book Review: Travels with Charley - John Steinbeck
It’s the end of 2020, and when you’ve been stuck at home for a year, with only your dog as your constant companion, Travels with Charley is a good book to read. But this book is a lot more about Steinbeck’s road trip than about the dog. Steinbeck romanticises everything. If so much as a tree sheds a leaf in front of him, he bursts forth with pages of ideas, thoughts, and memories.