The Bridge of Asses: Learning Coding with Novices

Over the last few years, I have been deeply involved with the IIT-M Programme in Data Science & Applications - as a student, a mentor and an analytics consultant. The programme provides diplomas and bachelor’s degrees in data science and applications. I’m often asked why I’m so invested in the programme - especially since I’m already an experienced data scientist.

At least three people are mad at me for being in the programme. One of them thinks that I’ve unfairly claimed a higher education seat I don’t need - which would be true if this were a conventional programme with limited seats. But it’s a MOOC.

Another person thinks that professionals like myself (especially those who’ve spent a fair amount of time in data science and software development) are pulling the grading curves to the right. I promise you - we’re not. First of all, kids these days are a lot smarter than I am. Secondly, practising data science is one thing, but getting good grades and writing exams is a different game. Third, the programme doesn’t grade us on a curve anyway. A CGPA of 9.0¹ means that I consistently demonstrated mastery of 90% of the material. It does not mean that I am 90% as smart as the smartest student.

The third person who thinks I have no business being here is my wife. She insists that the right time for studies was ten years ago (when I actually went to college but never studied, not for exams anyway), and the time I spend studying now is time stolen from her. She’s right on both counts. Me not studying hard enough back then, and studying too hard now, have both taken something from her.

It’s true that I have no business being in this programme. But I’ve rarely been motivated by reasonable things.

I didn’t come for the degree. I came because I teach machine learning courses myself, and I wanted to see what IIT Madras was really up to. I was shocked that IIT Madras was offering degrees without a JEE score. I went in only to audit some of the courses, and see if they really knew what they were doing. In fact, I once asked Prof Andrew Thangaraj, the coordinator of the programme, what he and his team were thinking when they decided to open the floodgates. He laughed and said, “We weren’t thinking.”

So I came in as a spectator, but then I started seeing this programme as a massive social experiment. Since I’d been teaching this stuff myself, I was curious about what happens when you take high-quality educational content and make it available en masse: anyone can apply as long as they’ve finished high school, from anywhere, no age limits either. What you get is the largest, most diverse MOOC in the world.

I’m sorry that I can’t furnish the exact numbers here, but suffice it to say that we’ve got people from their late teens to their early 80s, from various socio-economic backgrounds. The programme also boasts a better sex ratio than most other engineering or science curricula anywhere else in the country. People can come from any or no professional background. There are people like myself, experienced programmers and data scientists who are here just for fun - without the pressure of having to worry about getting jobs. On the other hand there are people who’ve never seen a computer. There are seniors and retired professionals who are familiar with computers and programming, but not quite in the same way that would be suitable for data science. There are also people who are here primarily to upskill and get jobs.

The difficulty of the content, too, spans the whole spectrum - courses range from trivial to backbreaking. There are courses in English, mathematics, statistics, databases, programming languages, shallow and deep ML, and specialized topics like reinforcement learning. There were courses where I scored a top grade without lifting a finger. There were also courses where the entire class got slaughtered.

What we have here are two singularities - the varying complexity of the content and the varying backgrounds, skills, hopes and aspirations of the people consuming that content.

The exchanges and discussions which emerge from the collision of these two singularities are nothing short of mind-boggling. These discussions, at the very least, are useful to those involved. And they can be truly enlightening if they’re summarized, critiqued and documented properly.

What you are reading is a small attempt at doing just that.


We use a Discourse instance as our discussion forum. It’s open to everyone who enrols in the programme, for discussing everything from homework to operations and logistics. For as long as I was in the programme, I was addicted to the forum - it was the first tab I opened every morning, and the last I closed before I left my desk. In the beginning, when we onboarded some of the first batches onto Discourse, most of us were being exposed to programming and data science for the first time. This meant newcomers encountering many different technologies at once. For instance, numpy, pandas, sklearn and bash form the core of some of the courses offered in the diploma year. Many of these tools are my very livelihood. I’ve been using them for a decade - I’ve done my 10,000 hours². So I was able to go through these courses with relatively little effort. The time that I saved went into writing long posts and replies on Discourse.

It turns out that I’ve written about a thousand posts on Discourse, nearly one for every day that I was enrolled in the programme. They span lots of topics, but three overarching themes emerged from them - code, tools and libraries, and pedagogy³.

This post focuses on the first item - code. I’ve learnt a lot about how people code - not about coding itself, nor even about the act of writing code, but about how people (novices and experts alike) react to code, interpret it, and develop habits around it.

The Bridge of Asses

Medieval students called the moment at which casual learners fail the pons asinorum, or “bridge of asses.” The term was inspired by Proposition 5 of Euclid’s Elements I, the first truly difficult idea in the book. Those who crossed the bridge would go on to master geometry; those who didn’t would remain dabblers. Section 4.3 of “Beginning Visual C++,” on “Dynamic Memory Allocation,” was my bridge of asses. I did not cross.

- James Somers, A Coder Considers the Waning Days of the Craft

Anyone who doesn’t want to remain a ‘dabbler’ must cross their own pons asinorum. I crossed mine a bit late, in the last year of my bachelor’s course in electronics engineering. I realized that if I were to do machine learning instead of electronics, I’d have to write MATLAB instead of HDL and embedded C. And if I were to do it really well, I’d have to use a real programming language, like Python⁴. That was my bridge of asses. Crossing it consisted of understanding that there exists a whole world outside the paradigm of array-oriented computing.

The Pons Asinorum of the foundation level of the BS Degree Programme is also a course on Python programming. Rude as it may sound, there’s no hope in data science and applications for those who can’t write Python code. The Python course occurs within the first year of the programme, and acts as a ruthless filter. It’s one of the most difficult courses in the entire programme. At any given time there are a few thousand students enrolled in that course, and many of them take five or six attempts to pass it. For comparison, the second most difficult course is Maths 2 (linear algebra and multivariate calculus), where the slowest students take up to four attempts. At some level, this appears very strange - why would Python - one of the simplest programming languages - be more difficult to master than advanced mathematics? And isn’t mathematics a language in itself, too?

But it is what it is. You can’t hope to get much done in data science and applications without some aptitude in Python. So it’s safe to say that despite its ruthlessness, the bridge of asses is placed correctly. As to why it is so narrow, I think the answer lies in how novices react to computer code.

Scalpels, not Sledgehammers

The one thing that characterized my own first struggle with general-purpose programming - 15 years ago - was frequent rage-quitting. Code looked horribly obscure to me. Reading pages and pages of it gave me a frightful nausea. If the sight of the Borland C++ GUI makes you nostalgic, I envy you. In me, it still invokes fear. But I did have grades riding on it, so I had no choice but to keep reading and trying to comprehend it in vain.

Code looked like a dark, viscous, looming mass of tar, hovering ominously above my desk. It felt large enough to swallow you whole. You’re backed up against a wall. It slowly closes in on you. It will eventually drown you. It also seems to have magical powers - when you swing a bat at it, it hardens like stone. The bat breaks against it. But when you step back and stop exerting force, it starts dripping once again with tar, giving you the fleeting impression that it may perhaps be pliable after all, giving you hope that you may yet conquer this monstrosity.

So you lash out again, and once again the mass hardens and you break yet another weapon. You realize that the amorphous, intangible mass is impenetrable to blunt force. But maybe, just maybe, it’ll disintegrate if you poke it at just the right spot. You realize that what you needed, really, was a scalpel, not a sledgehammer. But now it’s too late. You’ve been badly tarred and feathered. And when, weeks later, you face another inscrutable block of code-tar, you might not even remember what worked or failed the last time. You are condemned to eternally swing bats at something that is invulnerable to force.

… at least that’s what I wrote in my journal back then. But there are others who have had similar experiences. No wonder Fred Brooks wrote about the Tar Pit - there’s something black and viscous and dirty about incomprehensible code. Anuvrat Parashar compares incomprehensible code to an almost-finished sweater - one that can be unravelled if only you could find the right thread to pull.

Now, after fifteen years, code certainly doesn’t scare me. And only sometimes nauseates me.

I suspect that most people - whether experts or novices - have similar reactions when they start reading code. At first glance, a block of code looks as indecipherable and as arcane to an expert as it does to a novice. The difference is that the expert does not feel intimidated. The expert’s furrowed brow reflects an attempt at comprehension (and sometimes, disgust). Nobody, no matter how experienced, is fully comfortable with code at first glance.

To make matters worse for novices, the hieroglyphs of variables, functions and loops scare them further away from reading the code. It becomes even more magical, even more forbidding. Not to mention the pressing assignment deadlines, if you’re in the middle of a programming course. Ultimately, the novice’s reaction to code becomes haphazard instead of comprehensive. As a result, most questions on Discourse come in the form of a screenshot of code followed by a short question like “why isn’t this working?” or “what’s wrong with this code?”.

If you’re reading code on GitHub or in your IDE, the environment has already primed you for reading code. But when code appears interspersed with natural language, as in textbooks, blogs, or API documentation, it takes a few seconds for the brain to switch from comprehending natural language to comprehending code. If you’re reading English, your eyes are confidently moving from left to right, and then from top to bottom. But if you’re reading code, is this left-right-top-bottom movement of the eyes always the most efficient? Maybe sometimes, if you are presented with a simple function or a short script. But if you’re presented with a whole Python module or a Java class, your eyes might jump around all over the screen until you find the entry point - which could be a class constructor, a command line argument parser, a function, etc.

The point is that the most important part of code could appear anywhere in it. On the other hand, in a paragraph of English text (assuming it’s not written by James Joyce or William Faulkner), we know where the most important bits are likely to be. Moreover, natural language doesn’t keep you guessing. You can keep following the left-right-top-bottom movement of your eyes until you find what you’re looking for. In code, there’s no such sure-shot method for finding things.
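To make that concrete, here is a small, made-up script - every name in it is invented purely for illustration. Read it strictly top to bottom and you wade through a helper before discovering that execution actually begins at the very end:

```python
# stale.py - a hypothetical example; reading top-to-bottom buries the lede.
import argparse
import time
from pathlib import Path


def find_stale_files(root: Path, days: int) -> list[Path]:
    """Return files under `root` that haven't been modified in `days` days."""
    cutoff = time.time() - days * 86400
    return [p for p in root.rglob("*") if p.is_file() and p.stat().st_mtime < cutoff]


def main() -> None:
    # The actual entry point - the place your eyes need to find first.
    parser = argparse.ArgumentParser(description="List files nobody has touched lately")
    parser.add_argument("root", type=Path)
    parser.add_argument("--days", type=int, default=30)
    args = parser.parse_args()
    for path in find_stale_files(args.root, args.days):
        print(path)


if __name__ == "__main__":
    main()
```

An experienced reader jumps straight to the `if __name__ == "__main__":` block and works backwards; a novice has no such reflex yet.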

That’s why I suspect that the visceral reaction of most people - whether experts or novices - to code is some form of discomfort. It is certainly not comprehension. I ran a few polls to see how many people agree with me. It turns out that hell is indeed other people’s code. Some respondents are optimistic about foreign code - their instant reaction is excitement or curiosity. Nerds! I tell you…

But those who said that their reaction is comprehension are straight up lying.

Everyone takes a second or two to get accustomed to reading code. It is during this short time that the difference between the thoughts of experts and novices is most stark.

After the initial glance at a snippet, the expert starts behaving like a trained bloodhound. They shut off all external noise and focus rapidly. They then proceed to look for the entry point, scrolling up and down through the code, until they find their quarry. Only when the entry point is isolated (by making a note of which function, statement, or line of code holds the solution) do they allow themselves the fleeting pleasure of comprehension. But their work isn’t done yet. The entry point is but the root of the tree. They have yet to parse the whole tree to figure out how stuff works. After a brief pause, the bloodhound is at work again.

All this is incredibly difficult for a novice - not least because it cannot be taught. That is why novices react with discomfort and stay in that state until the discomfort compounds into panic and they (at least temporarily) give up.

This code-induced disorientation is further amplified when students are chasing deadlines or appearing for exams. Most of us are chronic procrastinators (myself included - I say this without judgement, there’s no shame in it). So when we read code that really needs to be understood quickly, it’s almost always too late. If you’ve got only a few hours left to finish a daunting assignment, even a relatively small block of code can look very scary. Fear gets the best of even the best of us.

So, ultimately, it’s not surprising that an overwhelming number of Discourse threads are about what some piece of code does, and why it may not work. The antidote to obscure code is what I’d call human-centric code.

Human-Centric Code

The Internship (2013) is a movie about what happens when a couple of men with zero technical skills end up in an internship at Google. Much of it is a caricature of how hard it is for a couple of out-of-work, middle-aged, technologically-impaired and somewhat obnoxious salesmen to find good jobs, and how they save their careers with little more than their street-smarts and life experience.

At one point in the movie, teams of interns have been tasked with debugging an app which has 2 million lines of code (perhaps a cinematic exaggeration - who knows?). As soon as the problem is stated, the experts in the team - young students, coders and a Google PM - launch into a variety of strategies to solve the problem. You can hear them say various plausible phrases like “scan the logs”, “see what exceptions were thrown”, “read the user reports”, etc. The protagonists, however, are left mumbling nonsense to themselves. They are so clueless, they don’t know how clueless they are.

But even though the rest of their team ignores them, they do make one very good point. They keep insisting that “Someone wrote that code. All we have to do is find them.” They end up being hilariously pranked, but recognising that there is a human behind a given piece of code can be a very useful insight.

I’m not even talking about knowing exactly who wrote a particular piece of code. The first step is to simply acknowledge that a piece of code was written by one or more humans. In the post-genAI world, recognizing the touch of a human is going to be all the more insightful. The next step, then, is to start building a persona of the authors. The people who design coding problems in exams do not have the same persona as the people who write Jupyter notebooks on Kaggle. Knowing the persona even vaguely can help us find the motivations behind code. A programming exam is designed to be tricky to read and debug. A pandas tutorial is not.

The realization that humans are behind code is the cornerstone of the entire edifice of literate programming. It is also why communities of programmers who maintain software projects have coding standards. Recognizing these standards in new, unseen code can very often lead to an easy bugfix, even if one doesn’t understand every single expression. In fact, complete and total comprehension of code (that is to say, a deep understanding of every line, expression and statement) is not only unnecessary when debugging, it can actually be distracting. When you’re asked to fix a piece of code or predict its output, the solution is rarely found at the lowest level of abstraction. Developer communities use coding standards precisely to help themselves maintain consistency throughout the codebase, to debug programs faster, and, most crucially, to make the code easier for newcomers.

This is why libraries in the scientific Python stack (like scikit-learn) are beautiful. Let me tell you how.

In a previous life, I used to work in Enthought’s buildsystem team. Python package management back then wasn’t nearly as mature as it is today. Tools like pip were a distant dream, and people used to joke that we ought to have a package manager named “maybe” - “maybe install”, “maybe upgrade” or “maybe remove” a package. Our job in the buildsystem team was to build and package various Python libraries so they could be sold to customers. These packages were usually built from source. Python packaging being what it was, almost everything we built would break once in a while, on some OS or platform.

So we internalized a rule - upstream is always broken. Since we couldn’t sell broken packages, we had to patch them ourselves. We had patches for almost every Python package that existed. In the two years that I spent in the buildsystem team, I remember only three Python packages that never had to be patched - requests, ipython and scikit-learn.

Like the developers of many popular open-source libraries, the developers of sklearn have been highly dedicated to coding standards. In 2011, at the SciPy India conference, I had the good fortune of showing some of my badly written code to Gael Varoquaux. He said a lot of things about my code, most of them good. But he also emphatically kept pointing out that my code had no docstrings. I said “Really? You think that’s the biggest problem with this code?” And sure enough, scikit-learn’s contribution guide emphasizes coding standards.
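To see what was missing, here is a hypothetical function written the way I should have written mine back then - with the numpydoc-style docstring that scikit-learn (like much of the scientific Python stack) expects from contributors:

```python
import numpy as np


def minmax_scale(x):
    """Scale values to the range [0, 1].

    Parameters
    ----------
    x : array-like of shape (n_samples,)
        The values to rescale. Must contain at least two distinct values.

    Returns
    -------
    scaled : ndarray of shape (n_samples,)
        The rescaled values, with the minimum mapped to 0 and the maximum to 1.
    """
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())
```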

The effects of this dedication are obvious to anyone who has spent a reasonable amount of time using the sklearn API. This is why I can recognize sklearn code from a mile away. I know how sklearn developers divide their API neatly into only a few types of classes and functions (their use of abstractions and interfaces would make Java programmers jealous), and that piece of knowledge makes it easy for me to identify when a transformer is being mistaken for an estimator.
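If you haven’t used sklearn, here is a minimal sketch of the convention I mean (the data is made up): transformers expose fit and transform, predictors expose fit and predict, and the naming alone is often enough to spot the mix-up without reading a single line of the implementation.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Toy data, purely for illustration
X = [[1.0, 200.0], [2.0, 180.0], [3.0, 240.0], [4.0, 210.0]]
y = [0, 0, 1, 1]

scaler = StandardScaler().fit(X)              # a transformer: fit, then transform
X_scaled = scaler.transform(X)

clf = LogisticRegression().fit(X_scaled, y)   # a predictor: fit, then predict
print(clf.predict(scaler.transform([[2.5, 220.0]])))

# scaler.predict(X)   # AttributeError - a transformer mistaken for a predictor
```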

Similar praise is due to the other no-patch libraries too - requests is an example of how good Python code is supposed to be written, and IPython represents an entire ecosystem built out of Python’s metaprogramming capabilities. Any Pythonista’s (and especially, any data scientist’s) very life is shaped by these libraries.

And none of that is possible without code that is, as Abelson and Sussman wrote, “written for people to read, and only incidentally for machines to execute.”

Good Code = Good Communication

Until now, I thought that the quote above was from Donald Knuth. But since I needed to paraphrase it here, I went around looking for its origin. All along, I had it right here in my room, albeit in one of the least read books in one of the dustier corners of my bookshelf. The quote is from the preface to the first edition of SICP.

Reading the preface again after all these years is a revelation. For ages, I had thought that SICP was a very… serious and hardcore programming textbook, and that its authors would not have bothered with something as peripheral as communication. That outlook was perhaps one of my own first mistakes as a novice programmer - the very mistake that I’m now cribbing about. That incorrect view of programming - that it’s all about clever algorithms, efficiency and managing complexity - must be why I rushed to acquire a copy of SICP, barely touched it afterwards, and only came around to the idea of code as communication years later.

The authors of SICP mention that most of their students “have had little or no prior formal training in computation, although many have played with computers a bit and a few have had extensive programming or hardware-design experience.” This is very similar to the situation I described earlier about my classmates at IIT-M. The authors further write,

Our goal is that students who complete this subject should have a good feel for the elements of style and the aesthetics of programming. They should have command of the major techniques for controlling complexity in a large system. They should be capable of reading a 50-page-long program, if it is written in an exemplary style. They should know what not to read, and what they need not understand at any moment. They should feel secure about modifying a program, retaining the spirit and style of the original author.

So here’s one of the greatest programming textbooks of all time, emphasizing style, aesthetics and abstraction. I don’t know of any other programming 101 content that takes this stand.

Now, I’ve had my share of being in software development and data science teams which thought that clean code was a luxury they couldn’t afford. My experience has taught me that clean code helps even novices, and the lack of it hurts even experts. What makes clean code hard to sell is that clean code is more about the long-term health and stability of a project than about the short-term skill of an individual programmer. So, as vindicated as I feel about this idea, it’s not exactly something I can preach to a bunch of exasperated kids who are chasing homework deadlines.

Perhaps, we all need to take our time. Let’s just hope that it’s not so long that our textbooks turn to dust.

Acknowledgements

Thanks are due to Tushar Sharma, Shivani Bhardwaj, Abhiram R and Sai Rahul Poruri for reviewing early drafts of this post.

Further Reading & Bibliography

  1. Abelson, H., Sussman, G. J., & Sussman, J. (2002). Structure and Interpretation of Computer Programs (2nd ed.). Universities Press.

  2. Guzdial, M. (2010). Why is it so hard to learn to program? In A. Oram & G. Wilson (Eds.), Making Software: What Really Works, and Why We Believe It (pp. 111–124). O’Reilly Media.

  3. Dawkins, R. (2021). Tutorial-driven Teaching. In Books Do Furnish a Life (pp. 405–412). Penguin Books.

  4. Martin, R. C. (2011). The Clean Coder: A Code of Conduct for Professional Programmers. Pearson Education.

  5. Brooks Jr., F. P. (1995). The Mythical Man-Month: Essays on Software Engineering (Anniversary ed.). Pearson Education.


  1. Well, 8.99, to be exact. Yes, you read that right. I missed a merit award by a hundredth of a point. ↩︎

  2. I may have spent those many hours with Python (and its ecosystem of data science libraries), bash, SQL and even HTML + JavaScript - but it’s precisely this expertise that makes you jaded. I realized early on that I was somewhat blind to the problems that novices first face with these technologies. There were questions about things I’ve taken for granted all these years - and the best students will always ask more than the best mentors can answer. If someone asks why iteration ought to be avoided in numpy, one can certainly do better than mumble something about vectorization. If someone asks why list comprehensions might be preferred over for loops², one can’t just cling to one’s influences and wave the question off. When you try to explain why the choice of a web framework isn’t nearly as important as learning how to build a RESTful API, you have to remember to set evangelism aside. So I’ve had to think long and hard about what I know and what I’ve taught - often from first principles. ↩︎ ↩︎
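A minimal sketch of the kind of answer that question deserves - the same computation written both ways (the array and its size are invented for illustration):

```python
import numpy as np

x = np.arange(1_000_000, dtype=float)

# Element-by-element iteration: the interpreter crosses the Python/C
# boundary once per element.
squares_loop = np.empty_like(x)
for i in range(len(x)):
    squares_loop[i] = x[i] ** 2

# Vectorized: a single expression; the loop runs inside numpy's compiled code.
squares_vec = x ** 2

assert np.allclose(squares_loop, squares_vec)

# The list-comprehension question is similar in spirit: the comprehension says
# *what* is being built, not the bookkeeping of how to build it.
evens = [n for n in range(20) if n % 2 == 0]
```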

  3. Discourse has a fairly extensive API - writing scripts to extract stuff is quite doable. The rest of the analysis is thanks to a topic modelling recipe I learned from S Anand: get embeddings from an LLM for pieces of text, and cluster them until the clusters represent meaningful, hopefully non-overlapping topics. ↩︎
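A rough sketch of that recipe, with one substitution labelled loudly: the real pipeline used LLM embeddings, whereas here a TF-IDF matrix stands in for them so the example runs offline, and the posts themselves are invented:

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

posts = [
    "why isn't this working?",
    "what's wrong with this code?",
    "pandas groupby gives a KeyError",
    "how do I install numpy on windows?",
    "when will the quiz hall tickets be released?",
    "is the end-term exam in person?",
]

X = TfidfVectorizer().fit_transform(posts)   # stand-in for per-post embeddings
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Inspect the clusters and decide whether they look like coherent topics;
# if not, change the number of clusters and repeat.
for label, post in sorted(zip(labels, posts)):
    print(label, post)
```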

  4. Stephen Boyd of Stanford, in his lectures on Convex Optimization, encouraged his students to try out assignments in Python. That was the first time I heard of the language. ↩︎
