I recently bought Bryan Johnson’s Super Veggie T-Shirt, in order to fully immerse myself in his protocol.
import matplotlib.pyplot as plt
The Sharpe Ratio measures the quality of an equity or hedge fund by showing the return per unit of risk, calculated as $$ S = \frac{R_p - R_f}{\sigma_p} $$ where $R_p$ is the portfolio return, $R_f$ is the risk-free rate, and $\sigma_p$ is the standard deviation of the portfolio’s excess return.
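For illustration, a minimal sketch of the computation in Python (my code, not from the post; the function name and the assumptions of periodic returns and a per-period risk-free rate are mine):

import numpy as np

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of periodic returns."""
    excess = np.asarray(returns) - risk_free   # per-period excess returns
    return excess.mean() / excess.std(ddof=1) * np.sqrt(periods_per_year)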
I find entropy to be extremely fascinating. But matching the formula …
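For concreteness (my sketch; the excerpt is cut off, so I am assuming the post refers to Shannon entropy $H(X) = -\sum_x p(x) \log_2 p(x)$):

import numpy as np

def shannon_entropy(p):
    """Entropy in bits of a discrete distribution p (entries sum to 1)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # treat 0 * log 0 as 0
    return -np.sum(p * np.log2(p))

print(shannon_entropy([0.5, 0.5]))   # fair coin: 1.0 bit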
I’m doing AoC in Haskell to learn the language. These are my solutions.

Day 1

import Data.List
import qualified Data.Map as Map

f xs =
  let x1s = sort $ map fst xs
      x2s = sort $ map snd xs
      diff x y = abs (x - y)
   in sum $ zipWith diff x1s x2s

counter = Map.fromListWith (+) . map (,1)

sim xs =
  let c = counter (map snd xs)
   in sum [x * Map.findWithDefault 0 x c | x <- map fst xs]

main = do
  l <- readFile "data1.txt"
  let xs = [(read x, read y) | [x, y] <- map words (lines l)]
  print (f xs)
  print (sim xs)

Pre...
This entire site is static. All the visualizations run completely in the browser. I use Hugo to build the site. It’s pretty neat, since its template language lets me program a lot of features statically, without any JavaScript. Even the LaTeX on this site is statically rendered! The theme is based on Typo by tomfran, but I’ve made a bunch of UI tweaks to my liking (like the slick Table of Contents on widescreen).
Goal Suppose we have a dataset of features, but no labels. If we know (or guess) that there are $K$ classes in the dataset, we could model the dataset as the weighted average of $K$ class-conditional Gaussians. This is what Gaussian Mixture Models do. We assume that the model is parameterized by $\boldsymbol \Theta = \{ \pi_k, \mu_k, \sigma^2_k \}_{k=1}^K$, where $\pi_k$ determines the weight of the $k$th Gaussian in the model.
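A minimal sketch of the resulting mixture density (my code, not the post’s; I assume 1-D features to match the scalar variances $\sigma^2_k$ above):

import numpy as np

def gmm_pdf(x, pis, mus, sigma2s):
    """p(x) = sum_k pi_k * N(x; mu_k, sigma2_k) for 1-D features."""
    x = np.asarray(x, dtype=float)[..., None]    # broadcast over the K components
    sigma2s = np.asarray(sigma2s, dtype=float)
    comps = np.exp(-((x - mus) ** 2) / (2 * sigma2s)) / np.sqrt(2 * np.pi * sigma2s)
    return comps @ np.asarray(pis)               # weights pi_k sum to 1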
I am a Neovim diehard, but it is impossible to use over SSH. Since I do ML research, all my code runs on a remote server with high-power GPUs. Reluctantly, I have been using VSCode for its excellent remote-ssh plugin. But even with its half-baked Vim mode, it is still the same sluggish Electron app. Zed may be the editor that changes this game. It is extremely fast, supports LSP and Treesitter natively, has some pretty nifty AI features, and has native Vim bindings.
Bayesian Parameter Estimation (BPE) is fundamentally different from MLE or MAP. Whereas the latter two solve for an optimal set of parameters $\hat{\boldsymbol{\theta}}$ for the model, BPE treats $\boldsymbol{\theta}$ as a random variable with a distribution $p(\boldsymbol{\theta})$. Setup We are given a dataset $\mathcal{D}$, which contains $n$ i.i.d. features $\mathbf{x}_j$. Given a new feature vector $\mathbf{x}$, we want to classify it to some class $\omega$. One way to do this is ...
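The quantity that replaces the point estimate is the posterior predictive density; a standard formulation (the excerpt cuts off before the post’s own version): $$ p(\mathbf{x} \mid \mathcal{D}) = \int p(\mathbf{x} \mid \boldsymbol{\theta}) \, p(\boldsymbol{\theta} \mid \mathcal{D}) \, d\boldsymbol{\theta} $$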
This is a collection of V60 recipes that I have used. Emi Fukahori (1 cup) Source video. This recipe is specific to the Hario Switch, my current brewer. It gives a consistent and bright cup. Filtered water: 200g Coffee: 14g Grind: medium-coarse, 7.5 on Fellow Ode 2 Ratio: 14.28 Water temp: 95 °C Close the switch (no flow), insert the filter, and preheat the brewer with hot water. After some time, open the switch and toss the water.
This is a method of evaluating strategies for the multi-armed bandit problem [1]. The testbed works as follows: generate $10$ reward means $\mu_i$, one for each of $10$ actions $a_i$; on each iteration, allow the agent to take some action $a_j$ and receive a reward $r_t \sim \mathcal N(\mu_j, 1)$. We repeat this for $100$ randomly sampled sets of $\mu_i$. The agent’s goal is to maximize average reward. Ideally, it learns which action has the highest mean and samples from that.
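A compact sketch of that testbed in Python (my code; the choice of an epsilon-greedy, sample-average agent is mine, not necessarily the post’s):

import numpy as np

def run_testbed(eps=0.1, n_actions=10, n_steps=1000, n_runs=100, seed=0):
    """Average reward per step of an epsilon-greedy agent on the testbed."""
    rng = np.random.default_rng(seed)
    avg_rewards = np.zeros(n_steps)
    for _ in range(n_runs):
        mus = rng.normal(0.0, 1.0, n_actions)   # a fresh random bandit per run
        q = np.zeros(n_actions)                 # sample-average value estimates
        n = np.zeros(n_actions)
        for t in range(n_steps):
            a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(q))
            r = rng.normal(mus[a], 1.0)         # reward ~ N(mu_a, 1)
            n[a] += 1
            q[a] += (r - q[a]) / n[a]           # incremental mean update
            avg_rewards[t] += r
    return avg_rewards / n_runs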
The goal is essentially the same as MLE. We have an assumed model for $p(\mathbf{x}_j | \omega_j)$ parameterized by $\boldsymbol{\theta}$. We want to classify a feature $\mathbf{x}$ into some class $\omega_j$ based on a labeled dataset $\mathcal{D}$. In MLE, we were trying to maximize the likelihood: $$ \hat{\boldsymbol{\theta}}_{\text{MLE}} = \arg \max_{\boldsymbol{\theta}} p(\mathcal{D} | \boldsymbol{\theta}) $$ In MAP, we instead maximize the a posteriori probability: $$ \begin{align*} \hat{\boldsymbol{\theta}}_{\text{MAP}} &= \arg \max_{\boldsymbol{\theta}} p(\boldsymbol{\theta} \mid \mathcal{D}) \\ &= \arg \max_{\boldsymbol{\theta}} p(\mathcal{D} \mid \boldsymbol{\theta}) \, p(\boldsymbol{\theta}) \end{align*} $$
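As a concrete, textbook instance (not from the post): the MAP estimate of a Gaussian mean under a Gaussian prior has a closed form, since the posterior precision is the sum of the prior and data precisions:

import numpy as np

def map_gaussian_mean(x, sigma2, mu0, tau2):
    """MAP estimate of mu for x_i ~ N(mu, sigma2) with prior mu ~ N(mu0, tau2)."""
    x = np.asarray(x, dtype=float)
    precision = len(x) / sigma2 + 1.0 / tau2
    return (x.sum() / sigma2 + mu0 / tau2) / precision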
Goal We are given a dataset $\mathcal{D}$, which contains feature vectors $\mathbf{x}_k$ and class labels $\omega_k$. Denote $\mathcal{D}_i$ as the set of features of class $\omega_i$. We assume the following: That $p(\mathbf{x} \mid \omega_j) \sim \mathcal{N}(\boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)$. That is, given a class label, the distribution of features belonging to that class forms a Gaussian with mean $\boldsymbol{\mu}_j$ and covariance $\boldsymbol{\Sigma}_j$. The samples $\mathbf...
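Under these assumptions, the MLE fit is just per-class sample means and covariances; a minimal sketch (my code, with hypothetical names):

import numpy as np

def fit_class_gaussians(X, y):
    """MLE of (mu_j, Sigma_j) for each class in a labeled dataset."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)                        # sample mean
        diff = Xc - mu
        params[c] = (mu, diff.T @ diff / len(Xc))   # MLE covariance (divide by n)
    return params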
We’re going to go through a minimal example that will let you run Rust code on the client side of a Hugo site. We are going to compile the Rust code into WebAssembly (wasm), which will give us near-native performance in the browser!
Interactively explore different methods to win at 2048 and beat expert human players!
One of the most striking elements of Silicon Valley to outsiders is productivity culture. Whereas most people are perfectly content doing their jobs as they always have, Silicon Valley people won’t find peace until they have optimized their every habit and system to extract that extra iota of productivity per unit time. I am one of those people, and this article is about how I revolutionized my productivity by switching from Neovim org-mode to Obsidian.
If you’re a nerd, and you’ve been around Macs for a while, you might remember AppleScript. It was a language developed by Apple to allow intermediate-to-advanced users to write simple scripts that could control Mac applications. It was actually created to resemble the English language, so accessing a pixel would be written as pixel 7 of row 3 of TIFF image "my bitmap" or even TIFF image "my bitmap"'s 3rd row's 7th pixel Needless to say, there’s a good reason modern programming langu...
Me on top of Mount Pilatus, Switzerland. I’m a CS nerd who loves all things programming and math. I got started in open source software in high school, and now do ML research. I completed my B.S. in Computer Engineering from UCSD, and am now doing an M.S. in Intelligent Systems, Robotics, and Controls. I have interned at Stanford AI Lab, Keysight Technologies, San Diego Supercomputer Center, and Yahoo. I am currently doing Reinforcement Learning research, advised by Professor Xiaolong Wang.
My digital bookshelf, in no particular order.
Training a deep neural network is essentially a compression task. We want to represent our training data distribution as a function parameterized by a bunch of matrices. The more complex the distribution, the more parameters we need. The rationale for approximating the entire distribution is that we can then forward any valid point at inference using the same model, with the same weights. But what if our model were trained on-the-fly, at inference?
Here are some code snippets in various languages that compute $N$ terms of the Basel problem sum $\sum_{n=1}^{N} 1/n^2$:
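The snippets themselves are cut from this excerpt; for reference, a minimal Python version of the computation (the series converges to $\pi^2/6 \approx 1.6449$):

def basel(n: int) -> float:
    """Sum the first n terms of the Basel series."""
    return sum(1.0 / (k * k) for k in range(1, n + 1))

print(basel(1_000_000))   # approaches pi^2 / 6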
Draw digits on the canvas and watch an AI guess what they are!
My previous post (which was honestly created to test out the theme for this site) provided a few code snippets that computed $N$ terms of the sum of inverse squares. I wrote the code in my 4 favorite languages—Python, C, Rust, and Haskell—but when I ran the Python code, it was embarrassingly slow. Compared to the $\approx 950$ ms it took sequential Rust, Python took 70 seconds! So, in this post, we’re going to attempt to get Python down to more reasonable numbers.
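One obvious first step (my sketch; I don’t know which optimizations the post actually lands on) is to replace the pure-Python loop with a vectorized NumPy reduction:

import numpy as np

def basel_numpy(n: int) -> float:
    """Sum of 1/k^2 for k = 1..n, vectorized."""
    k = np.arange(1, n + 1, dtype=np.float64)
    return float(np.sum(1.0 / (k * k)))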