Random Number Puzzle

Suppose that you have a function that generates uniformly distributed random numbers between \(1\) and \(5\). How can you use this function to generate uniformly distributed random numbers between \(1\) and \(7\)?
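
One well-known way to approach this puzzle is rejection sampling. The sketch below assumes the function returns integers from 1 to 5; `rand5` is a hypothetical stand-in for it and `rand7` is an illustrative name, so treat this as one possible answer rather than the only one.

```python
import random
from collections import Counter

def rand5():
    """Hypothetical stand-in for the given function: uniform integer in 1..5."""
    return random.randint(1, 5)

def rand7():
    """Uniform integer in 1..7 built only from rand5(), via rejection sampling."""
    while True:
        # Two draws give 25 equally likely values in 0..24.
        value = 5 * (rand5() - 1) + (rand5() - 1)
        # Keep 0..20 (21 values = 7 groups of 3); reject 21..24 and redraw.
        if value < 21:
            return value % 7 + 1

# Empirical check: each outcome should appear roughly 1/7 of the time.
counts = Counter(rand7() for _ in range(70_000))
for outcome in sorted(counts):
    print(outcome, counts[outcome])
```

Rejecting four of the 25 combinations costs a few extra draws on average, but it keeps the seven outcomes exactly equally likely, which simply adding or scaling two draws would not.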

Why do we need to omit a dummy variable when estimating the impact of a categorical variable on a dependent variable?

The usual explanation for why we need to omit one of the levels of a categorical variable when using dummy coding relies on the mathematics of linear regression: including all the levels of the categorical variable along with the intercept results in perfect multicollinearity and thus there is no […]
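
The multicollinearity can be seen directly in a small design matrix. The sketch below uses made-up data (the `colors` list and the helper `matrix_rank` are assumptions for illustration) and compares the rank of the matrix with all dummy levels against the one with a level omitted.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank via Gaussian elimination, using exact Fraction arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

colors = ["red", "green", "blue", "green", "red", "blue"]  # hypothetical data
levels = sorted(set(colors))

# Intercept plus one dummy per level: the dummies sum to the intercept column.
full = [[1] + [int(c == lvl) for lvl in levels] for c in colors]
# Intercept plus dummies for all levels except the first (the reference level).
reduced = [[1] + [int(c == lvl) for lvl in levels[1:]] for c in colors]

print(matrix_rank(full), "of", len(full[0]), "columns")        # 3 of 4 columns
print(matrix_rank(reduced), "of", len(reduced[0]), "columns")  # 3 of 3 columns
```

With every level included, the matrix has four columns but rank three, because the dummy columns add up to the intercept column. Dropping one level restores full column rank.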

What you don’t learn in grad school.

Developing robust, error-free software is one of the most important skills that everyone should learn in grad school but, for a variety of reasons, doesn’t. At least, that is my experience; yours may be different. I will illustrate one context where a simple software engineering idea is very useful in eliminating hard […]

Deriving the Posterior Distribution for Population Variance

In an earlier post, we discussed how to derive the posterior distribution for the population mean. In this post, we will focus on deriving the posterior distribution for the variance parameter, which is used in different types of Bayesian inference. Background Context: In many Bayesian contexts (e.g., hierarchical Bayesian linear regression, hierarchical […]

Deriving the Posterior Distribution for Population Mean

Background Context: In many Bayesian contexts (e.g., hierarchical Bayesian linear regression, hierarchical Bayesian estimation of discrete choice models, etc.), the following situation arises: there are \(n\) respondents whose responses can be modeled by a set of independent variables and associated parameters. We will denote these parameters by \(\beta_i\). We assume […]

Sentiment Analysis Using Python

Processing open-ended consumer responses used to be time-consuming. The usual process involved sorting the responses into several categories and then summarizing the themes that emerged. Natural language processing (NLP) libraries can help with the task of analyzing open-ended responses to assess consumer sentiment, aspects of the experience consumers are talking about the […]

What are confidence intervals?

Confidence intervals are one way to determine whether the proposed version in an A/B test is achieving business objectives relative to the baseline. This post discusses what confidence intervals are, how to interpret them, and the impact of sample size on them.
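
To make the effect of sample size concrete, here is a minimal sketch that computes a normal-approximation (Wald) interval for a conversion rate at three sample sizes. The 10% conversion rate, the sample sizes, and the function name `proportion_ci` are assumptions chosen for illustration and are not taken from the post.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical variant converting at 10%, observed at growing sample sizes.
for n in (100, 1_000, 10_000):
    low, high = proportion_ci(successes=n // 10, n=n)
    print(f"n={n:>6}: [{low:.4f}, {high:.4f}]  width={high - low:.4f}")
```

The interval width shrinks in proportion to \(1/\sqrt{n}\), so a hundredfold increase in sample size makes the interval about ten times narrower.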