Thoughts on data science, statistics and machine learning.
Understanding Allen Downey's Solution to the M&M Problem
Allen Downey makes a very good case for learning advanced mathematics through programming (see the first section of the preface of Think Bayes, titled “My theory, which is mine”). But before the reader can hit paydirt using Bayes' theorem in programming, Downey makes them work through some elementary probability problems, which have to be solved by hand first to get a clear enough understanding of the concept.
Evangelism in FOSS
One of the most dangerous things that can affect any FOSS community is the tendency toward evangelism for the sake of evangelism. Promoting the Python stack, expanding the userbase, etc., should come only as a consequence of the content we produce as developers. If evangelism even remotely becomes one of your goals, your quality is sure to suffer. And it’s not just the empirical evidence that prompts me to say this.
Organizing a Bookshelf with Multivariate Analysis
I have recently moved from Pune to Delhi. I had spent only a year in Pune, having moved there from Mumbai, where I lived for three years. Whenever I move, the bulk of my luggage consists of books and clothes. My stay in Pune was just a transition, so I never bothered to unpack and store all my stuff very carefully. Thus, a corner of my bedroom in my Pune apartment always looked like this: The actual volume of books that I carried from Pune to Delhi is about twice what is seen in the picture.
Limitations of the Fourier Transform
My implementation of the Hilbert-Huang transform (PyHHT) is quite close to a beta release. After nearly three years of inactivity, I’ve found some time to develop the PyHHT library in the last few weeks. In all this time many people have written to me about a lot of things - from being unable to use the module because of the lack of documentation, to comparisons between the results of HHT and conventional time series analysis techniques.