In this post I will provide a gentle introduction to the theory of martingales (also called “fair games”) by way of a beautiful proof, due to Johan Wästlund, that there are precisely $n^{n-2}$ labeled trees on $n$ vertices. Apertif: a true story In my early twenties, I appeared on the TV show Jeopardy! That’s not what this […]| Matt Baker's Math Blog
For over 50 years, it has been known that people tend to overestimate the likelihood of uncommon events/items occurring, and underestimate the likelihood of common events/items. This behavior has replicated in many experiments and is sometimes listed as a so-called cognitive bias. Cognitive bias has become the term used to describe the situation where the […]| The Shape of Code
h/t Benny Sudakov The Ramsey number R(ℓ,k) is the smallest integer n such that in any two-coloring of the edges of the complete graph on n vertices, K_n, by red and blue, there is either a red K_ℓ (a complete graph …| Combinatorics and more
A recent question illustrates well the different ways to solve problems in combinatorics. We’ll see an easy way, another easy way, and a … less suitable … way to solve a set of problems. Alphabetical order, and beyond It came from Joe, in mid-May: Background: I’m a year 11 student doing extension 1 maths (Sydney) …| The Math Doctors
I don’t know much about this variant of Bayes, but the central idea is that we consider Bayes updating as a coherent betting rule and back everything else out from that. This gets us something like classic Bayes but with an even more austere approach to what probability is. I am interested in this because, following an insight of Susan Wei’s, I note that it might be an interesting way of understanding when foundation models do optimal inference, since most neural networks are best underst...| The Dan MacKinlay stable of variably-well-consider’d enterprises
I like looking a little deeper into problems; here we’ll find that although the problem is simple if you take it on its own terms, those terms are actually impossible. Does it matter? P(A) = 0.47 and P(A∩B) = 0.34 … but can they? The question came from Louis, a teacher, in mid-May: Refer to …| The Math Doctors
The Liberty Bell 3-reel slot machine, made by Charles August Fey, San Francisco, 1899 In the days of my youth, slot machines were part of the furniture of life. Essential scenery in bars, cafés, arcades, everywhere we chilled out. I have an analytical turn of mind but cannot recall ever applying it to these devices. […]| carnotcycle
Well, we are coming up on Easter. And while Christians will be celebrating the resurrection, others will doubt it. Our world’s skepticism over miracles is nothing new. Ever since David Hume, philosophers and scholars have been making the case against the possibility of miracles. But, now things have shifted. Hume has been roundly (and decisively) […]| Canon Fodder
We want to optimize the expected value of some random function. This is the problem we solved with Stochastic Gradient Descent. However, we assume that we no longer have access to an unbiased estimate of the gradient. We can only obtain estimates of the function itself. In this case we can apply the Kiefer-Wolfowitz procedure. …| Applied Probability Notes
Let’s explain why the normal distribution is so important. (This is a section in the notes here.)| Applied Probability Notes
An important part of probability theory concerns sums of independent random variables. In many situations, the number of terms in the independent sum is itself a random variable. Introduct…| A Blog on Probability and Statistics
Neural denoising diffusion models of language| The Dan MacKinlay stable of variably-well-consider’d enterprises
If I told you to count how many different users have visited a high traffic website, how would you do it? It’s not trivial, is it? Probably, the first idea that would come to every programmer’s mind would be to use some kind of set with the user id in it. That’s not bad at all: each access to the hash set is constant. However, having to keep each element in memory doesn’t seem very optimal, we’d be wasting a lot of memory… What if we were counting something with a very high cardin...| Adri’s Blog
This week’s Fiddler is about hopping back and forth. You are a frog in a pond with an infinite number of lily pads in a line, marked “1,” “2,” “3,” etc. You are currently on pad 2, and your goal is to make it to pad 1. From any given pad, there are specific probabilities …| Book Proofs
This is an addendum to my post about typicality, where I try to quantify flawed intuitions about high-dimensional distributions.| Sander Dieleman
A summary of my current thoughts on typicality, and its relevance to likelihood-based generative models.| Sander Dieleman
This week's Fiddler is about rounding! You are presented with a bag of treats, which contains $n \geq 3$ peanut butter cups and some unknown quantity of candy corn kernels (with any amount being equally likely). You reach into the bag $k$ times, with $3 \leq k \leq n$, and pull out a candy at| Book Proofs
This week’s Fiddler is about rounding! Let $\text{round}(x)$ be the value of $x$ rounded to the nearest integer. Suppose $x_1,\dots,x_n$ are independent uniformly distributed random variables in $[0,1]$. Find the probability that \[ \text{round}(x_1+\cdots+x_n) = \text{round}(x_1)+\cdots+\text{round}(x_n) \] My solution: Let’s call the probability we seek $p(n)$. The values of the $x_i$ determine what …| Book Proofs
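The rounding question above can be explored numerically before attempting an exact answer. The following is a minimal Monte Carlo sketch, not the post's own solution; the function names `round_half_up` and `estimate_p` are hypothetical, and rounding ties are resolved upward on the assumption that they occur with probability zero for continuous uniform draws.

```python
import math
import random


def round_half_up(x: float) -> int:
    """Round to the nearest integer; ties (probability zero here) go up."""
    return math.floor(x + 0.5)


def estimate_p(n: int, trials: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(round(x_1+...+x_n) == round(x_1)+...+round(x_n))
    for independent x_i uniform on [0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        if round_half_up(sum(xs)) == sum(round_half_up(x) for x in xs):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Print estimates for small n to see how the probability behaves.
    for n in range(1, 6):
        print(n, estimate_p(n))
```

As a sanity check, the case $n=1$ holds trivially, and for $n=2$ the sum of the two rounding errors has a triangular distribution on $(-1,1)$, which gives an exact value of $3/4$ that the estimate should approach.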
We consider distributions that have a continuous range of values. Discrete probability distributions were defined by a probability mass function. Analogously continuous probability distributions a…| Applied Probability Notes
There are some probability distributions that occur frequently. This is because they either have a particularly natural or simple construction. Or they arise as the limit of some simpler distributi…| Applied Probability Notes
Often we are interested in the magnitude of an outcome as well as its probability. E.g. in a gambling game, the amount you win or lose is as important as the probability of each outcome. (This is a section…| Applied Probability Notes
(This is a section in the notes here.) Conditional probabilities are probabilities where we have assumed that another event has occurred.| Applied Probability Notes
(This is a section in the notes here.) Counting in Probability. If each outcome is equally likely, i.e. $\mathbb P( \omega ) = p$ for all $\omega \in \Omega$, then since (where $|…| Applied Probability Notes
(This is a section in the notes here.) We want to calculate probabilities for different events. Events are sets of outcomes, and we recall that there are various ways of combining sets. The current…| Applied Probability Notes
(This is a section in the notes here.) I throw a coin $100$ times. I got $52$ heads.| Applied Probability Notes
This is the appendix in the notes here.| Applied Probability Notes
Quantifying how likely each birthday is present (covered) in some large group of people.| liorsinai.github.io
This week’s Fiddler is based on “Showcase Showdown” on the game show “The Price is Right”. Suppose we have some number of players. Player A is the first to spin a giant wheel, which spits out a real number chosen randomly and uniformly between 0 and 1. All spins are independent of each other. After …| Book Proofs
Background Consider the problem of comparing two treatments using sequential analyses, so as to avoid putting too much faith in a fixed sample size design. As shown here the lowest expected sample size will result from looking at the developing data as often as possible in a Bayesian design. The Bayesian approach computes probabilities about unknowns, e.g., the treatment effect, and one can update the current evidence base as often as desired, knowing that the current information has made pre...| Statistical Thinking
A little over one year ago, this blog marked the beginning of the year of the rooster in this blog post. Now it’s time to celebrate the year of the dog. The Lunar New Year actually falls on F…| All Math Considered
This post works with 5-card Poker hands drawn from a standard deck of 52 cards. The discussion is mostly mathematical, using the Poker hands to illustrate counting techniques and calculation of pro…| All Math Considered
This post highlights certain basic probability problems that are quite easy to do using the concept of Markov chains. Some of these problems are easy to state but may be calculation intensive (if n…| A Blog on Probability and Statistics
Suppose that a patient is to be screened for a certain disease or medical condition. There are two important questions at the outset. How accurate is the screen or test? For example, at the outset,…| A Blog on Probability and Statistics
A student in a probability course may have evaluated an integral such as the following: $\displaystyle \int_0^\infty t^{x-1} e^{-t} \, dt$ Plug in a value for $x$ and eva…| A Blog on Probability and Statistics
The previous post is on the Monty Hall Problem. This post adds to the discussion by looking at three pieces from the New York Times. Anyone who is not familiar with the problem should read the previous…| A Blog on Probability and Statistics
The post discusses the Monty Hall problem, a brain teaser and a classic problem in probability. The following 5 pictures describe the problem. Several simple solutions are presented. …| A Blog on Probability and Statistics
What better way to celebrate Pi Day than to have a blog post on the digits of $\pi$! Giving pi as tips The number $\pi$ is the ratio of the circumference of a circle to its diameter. It…| A Blog on Probability and Statistics
Consider this random experiment. You ask people (one at a time) for their birthdays (month and day only). The process continues until there is a repeat in the series of birthdays, in other words, un…| A Blog on Probability and Statistics
This post discusses the coupon collector problem, a classical problem in probability. The Coupon Collector Problem The pr…| A Blog on Probability and Statistics
In this post, we discuss an example in which you are given a password (every character of it) and yet it is still very hard (or even impossible) to crack. Anyone who understands this example has …| A Blog on Probability and Statistics
What is the distribution of the maximum of $n$ random variables? What started out as a utilitarian question in my exploration of some generalized versions of the secretary problem turns out to b…| Christopher Olah's Blog
The p-value is the most commonly used statistic in scientific papers and applied statistical analyses. Learn what its definition is, how to interpret it and how to calculate statistical significance if you are performing statistical tests of hypotheses. The utility, interpretation, and common misinterpretations of observed p-values and significance levels are illustrated with examples.| GIGAcalculator Articles
The reader might wonder about the relation between the previous post and my discussion of Arman Razaali. If I could say it is more likely that he was lying than that the thing happened as stated, w…| Entirely Useless
Why are they all blurry? In a recent article, Michael Shermer says about UFOs: UFOlogists claim that extraordinary evidence exists in the form of tens of thousands of UFO sightings. But SETI scient…| Entirely Useless
During my studies I got to see a lot of programming and a lot of mathematics.| hillman.dev
かな (Kana), かも (Kamo), 気がする (Kigasuru), 思う (Omou) and more. Explore a variety of Japanese expressions that show different levels of uncertainty.| Tofugu
How to calculate the statistical distance between two 2D distributions of points. But first a lesson in bad statistics, the pitfalls of visual solutions and ...| liorsinai.github.io
You drew 40 random cells from a sample and found that a new drug affected 16 of them. An online calculator told you: “With 90% confidence, the true fraction is between 26.9% and 54.2%.”…| Justin Domke
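The excerpt above does not say which method the online calculator used. As one hedged illustration, the sketch below computes an exact (Clopper-Pearson) binomial interval for 16 successes out of 40 at 90% confidence using only the standard library; the choice of Clopper-Pearson and the helper names `binom_cdf`, `bisect_root`, and `clopper_pearson` are assumptions made for this example, and the result can be compared with the quoted 26.9% and 54.2%.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_root(f, lo: float = 0.0, hi: float = 1.0, iters: int = 60) -> float:
    """Root of a decreasing function f on [lo, hi], found by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, confidence: float = 0.90):
    """Exact two-sided interval for a binomial proportion (assumed method)."""
    tail = (1 - confidence) / 2
    # Lower bound: the p at which P(X >= k) equals the tail probability.
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: tail - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: the p at which P(X <= k) equals the tail probability.
    upper = 1.0 if k == n else bisect_root(
        lambda p: binom_cdf(k, n, p) - tail)
    return lower, upper


if __name__ == "__main__":
    low, high = clopper_pearson(16, 40, confidence=0.90)
    print(f"{low:.1%} to {high:.1%}")
```

Other common constructions (Wald, Wilson) give somewhat different endpoints for the same data, so a mismatch with a given calculator would most likely reflect a different method rather than an error.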
What is the likelihood of a gambler winning all the cash in a casino? A better question is: what is the likelihood of a gambler losing all the money he or she brings into the casino? There had been…| Introductory Statistics