Thoughts on data science, statistics and machine learning.

Understanding Allen Downey's Solution to the M&M Problem

Allen Downey makes a very good case for learning advanced mathematics through programming (Check the first section of the preface of Think Bayes, titled “My theory, which is mine”). But before the reader can hit paydirt with using the Bayes theorem in programming, Downey makes you go through some elementary problems in probability, which have to be solved by hand first, if you expect to have a clear enough understanding of the concept. I can vouch for this way of learning complex concepts. The way I learnt the backpropagation algorithm (and its derivation), was with a pen, paper and a calculator.

Read more...

Evangelism in Foss

One of the most dangerous things that can affect any FOSS community is the tendency of evangelism for the sake of evangelism. Promoting the Python stack, expanding the userbase, etc, should come only as a consequence of the content we produce as developers. If evangelism even remotely becomes one of your goals, your quality is sure to suffer. And it’s not just the empirical evidence that prompts me to say this. It even makes logical sense. If we want to “promote” Python and related tech, our best market would be the young and unexperienced (and therefore non-opinionated) minds. But note that such audiences are also very fickle. They may not return for the next conference. And since they don’t, we have to count on more fresh entries each year. And in the conference itself, since we’re all acutely aware of the demographic, we spend too many talks pandering to this part of the audience.

Read more...

Organizing a Bookshelf with Multivariate Analysis

I have recently moved from Pune to Delhi. I had spent only a year in Pune, having moved there from Mumbai, where I lived for three years. Whenever I move, the bulk of my luggage consists of books and clothes. My stay in Pune was just a transition, so I never bothered to unpack and store all my stuff too carefully. Thus, a corner of my bedroom in my Pune apartment always looked like this: My first image

Read more...

Limitations of the Fourier Transform

My implementation of the Hilbert Huang transform (PyHHT) is quite close to a beta release. After nearly three years of inactivity, I’ve found some time to develop the PyHHT library in the last few weeks. In all this time many people have written to me about a lot of things - from the inability to use the module because of the lack of documentation, to comparison between the results of HHT and conventional time series analysis techniques. And now the time has come when I can no longer avoid writing the documentation. But as I write the documentation, I’ve found that I need to re-learn the concepts which I learned back when I started writing this module. It is out of this learning cycle that came my last blog post.

Read more...

Page 8 of 9