We'd be Better Off with 9-bit Bytes| pavpanchekha.com
This was my first time attending EGRAPHS. Isn't that kind of crazy?| pavpanchekha.com
Ok, here's the story so far. Way back when I wanted to test whether sort + FastTwoSum is faster than TwoSum. But as I started optimizing it, I kept getting frustrated with how hard it is to benchmark low-level code, and eventually I wrote a simulator for my CPU, the Apple M1 chip. At this point, I was in too deep and wanted to keep improving the simulator. Compiling to Assembly As of the last blog post, I'd added a little DAG-building library so that I could easily write little assembly snipp...| Pavel Panchekha’s Blog
I just read David's paper on floating-point accumulation networks. It's fantastic work: he's developed a new abstraction of floating-point numbers that allows him to verify error bounds for sequences of TwoSum-like operations, and then used that abstraction to design new, faster double-double addition algorithms. Since I already had a Apple M1 simulator on hand, I figured I'd give his algorithms a spin. About my simulator I guess I was a big glib when describing my simulator earlier. Let me g...| Pavel Panchekha’s Blog
I've been thinking about quantifier elimination, and one thing I have gained a new appreciation for is how much of our ordinary mathematical syntax can be thought of as the "consequence" of quantifier elimination. This page lists a few examples. Modular arithmetic The mod operation, as in x % 3, is necessary to eliminate existentials like: ∃ x, y = 3 x + 2 Here the quantifier elimination results in y % 3 = 2. Set operations The theory of sets has just one primitive relation, x ∈ S. But if...| Pavel Panchekha’s Blog
I've recently been on a tear writing about Egraphs modulo T, first writing about polynomials specifically and later about the general case. My main insight is that pattern-matching, specifically relational e-matching, is related to quantifier elimination and the related notion of cell decomposition, which in fact lots of theories have. So I wanted to try to make this explicit in one of the easiest possible cases: linear equations. LIN-graphs A LIN-graph has terms which are linear combinations...| Pavel Panchekha’s Blog
Phil Zucker has been writing about Egraphs modulo T for a while now; see his latest blog post and latest paper for up to date efforts. I've been thinking about this too, and this post is my attempt to summarize. I'll be referencing Phil's ideas repeatedly, since all of my thoughts trace back to his. What is an E-graph (abstractly) An e-graph is for conjunctive reasoning with equality. Conjunctive means we don't do any DPLL-style guesswork and we therefore don't need to backtrack; it's an inte...| Pavel Panchekha’s Blog
I have a dream of "safe" numerical programming, where any time you do| pavpanchekha.com
One thing Herbie does is simplify stuff, and one that I am personally interested in is simplifying trigonometric expressions. For example, did you know that sin(x) / (1 + cos(x)) and (1 - sin(x)) / cos(x) are the same? Or that both are equal to tan(x / 2)? But how can we do this automatically? Before I move on, a thank you to ChatGPT, mostly o3 and 4.1, for generative discussions. The Weirstrass Substitution One key to any discussion of trigonometric stuff is the Weirstrass substitution.1 [1 ...| Pavel Panchekha’s Blog
I have recently become a big fan of the condition number formalism for floating-point error; it is a simple mental model that explains a lot about floating point. In this blog post I want to use that mental model to investigate "operator fusion": basically, why library functions like log1p, cospi, or hypot improve error.1 [1 Actually, one of the big effects of hypot is to avoid overflow, which condition numbers don't address, so, uhh, it's only part of the story.] The condition number formali...| Pavel Panchekha’s Blog
In Herbie we use e-graphs to do algebraic rewriting of mathematical expressions. For example, in the Rust standard library's acosh function, e-graphs rewrite log(sqrt(x^2 + 1))… into log(1 + sqrt(x^2 + 1) - 1) into log1p(sqrt(x^2 + 1) - 1) into log1p(x^2 / (sqrt(x^2 + 1) + 1)) into log1p(x / (sqrt(1 + (1/x)^2) + (1/x))), Still, they're bad at some things, and one that really stands out is basic sums and products. Canceling like terms, distributing products over sums, and factoring things is...| Pavel Panchekha’s Blog
In my last post, I pointed out that FastTwoSum plus a sort is actually faster than normal TwoSum on modern hardware, and suggested that TwoSum is therefore obsolete. This post expands the analysis and tests futher optimizations on the Apple M1 chip. General approach The general FastTwoSum algorithm is the following: s = a + b bb = s - a e = b - bb This assumes that |a| > |b|, or perhaps a slightly weaker condition. To satisfy this precondition, we need to sort a and b by absolute value, kind ...| Pavel Panchekha’s Blog
FastTwoSum is Faster Than TwoSum| pavpanchekha.com
In the Herbie project we have some old bits of code that work, but are| pavpanchekha.com
I've been practicing using LLMs (mostly OpenAI o3-mini-high) for coding. It's good at quick prototypes, and I've recently been thinking about fast set membership operations (specifically related to CSS matching). So I've had it implement a number of different binary search algorithms. The upshot is that modern CPUs are amazingly fast. Here's the fastest version I ended up with: intsSearch(intarr[], intsize, inttarget) { __builtin_assume(size > 1); int *begin = arr; inttwo_k = 1 << (sizeof(int...| Pavel Panchekha’s Blog
Floating-point code uses various functions on floating-point numbers, like arithmetic and elementary functions. When I got started working on floating-point error, I thought that was the ontology: arithmetic vs elementary. However, I now realize that it's more complex than that: it's useful to distinguish between a few different levels of floating-point operations. Bitwise operations (Group 0) These operations include comparisons, min/max, absolute value, copysign, and some more obscure optio...| Pavel Panchekha’s Blog
I have long been interested in CSS selector matchingalgorithms. In short: existing selector matching algorithms are \(O(d^2)\) at worst, where \(d\) is the tree depth, and there are better algorithms that are \(O(d)\). But of course the constants matter a lot. I previously described both algorithms, and even showed that the bad cases can occur in practice. Recently I dove into the Stylo CSS engine, used by Firefox, to better understand exactly how it works and better understand the constraint...| Pavel Panchekha’s Blog
Selector Matching is Quadratic| pavpanchekha.com
Alright, I'm back at it: comparing Herbie against LLMs to see who is best at numerical rewriting. If you're new here, Herbie is a research compiler I've been working on for about a decade that helps compile mathematical expressions to fast and accurate floating-point code. More recently, I've been comparing new OpenAI models against Herbie to see who is the top numerical analyst. So far, Herbie's been winning all of the match-ups… but with O3-mini's release can the LLMs finally dethrone the...| Pavel Panchekha’s Blog
This proposal suggests a radically different way Herbie could work, making use of high-quality oracle-free methods of evaluating error. This new method would make Herbie simpler, faster, and possibly better, while (I hope) maintaining all the things we currently like about Herbie. This new Herbie would use egg/egglog much more heavily than today, and the greedy improvement loop would be gone. Background Herbie currently works by: Take a starting expression or expressions Grow an egraph contai...| Pavel Panchekha’s Blog
First, users should be able to specify a platform and have Herbie load and run using that platform. The platform should affect:| pavpanchekha.com
This post is the third in a series of me trying to work out batches, a new data structure I am testing for Herbie and more generally for programming with recursive data structures. In my first post I laid out a type theory for batches, and in my second post I discussed how we can check that programs that use batches are safe. But I'm still after a "holy grail": how to making programming with batches "normal". Specifically, most programming with batches involves lots of indices, which need to ...| Pavel Panchekha’s Blog
A year and a half ago, I wrote a blog post comparing Herbie to the first ChatGPT (which we now call, I think, GPT 3.5). I chose 11 floating-point repair benchmarks, and fed all of them to Herbie and ChatGPT. Herbie is a tool my students and I develop to do exactly this work, and I wanted to know if AI tools had obsoleted it. The conclusion was that Herbie was still much better, winning 6/11 and tying two others. Moreover, the cases where Herbie lost ChatGPT's response was usually not actually...| Pavel Panchekha’s Blog
Doing normal programming with batches has a common pattern. For| pavpanchekha.com
In MemBalancer, my student Marisa and I came up with a novel correctness property for garbage collection heap limits, which we called "compositionality", derived compositional heap limits for a basic one-generation garbage collector, and showed that this new heap limit lead to big reductions in memory usage for Chrome's V8 JavaScript runtime. But actually V8 uses a generational GC, and compositional heap limits for that are much harder to derive. Compositional heap limits Any time you have a ...| Pavel Panchekha’s Blog
I was recently talking to Hans Boehm about floating point rounding modes. They are a mess, aren't they? It's a piece of global application state that very subtly changes the meaning of every floating-point operation in the whole program. It causes bugs in compilers, confusion for programmers, and all sorts of pain for library developers. So what would be better? Can we get rid of rounding modes? Well, for one, we could just get rid of rounding modes entirely. I don't mean remove them from the...| Pavel Panchekha’s Blog
In research it's common to compare your solution to some simple approach as a baseline, to prove that whatever complex algorithm you've developed is actually worth using. Sometimes, coming up with this baseline is hard. But now we have ChatGPT, so let's plug a hole in my 2015 research paper on floating-point error improvement by comparing my tool, Herbie, with ChatGPT. Methodology I've chosen a representative sample of 12 benchmarks from the Herbie benchmark suite. Iv'e biased it a bit toward...| Pavel Panchekha’s Blog
Now that I have advised a few research projects, I've begun to notice similarities, common errors, and inflection points. For the benefit of my students, here's a quick summary of how I approach projects. Naturally, this is specific to my research, and might not apply to any other advisor. Projects I divide my students' research into "projects". Usually a project begins with some vague idea, like "balance multiple garbage collectors" or "synthesize math function implementations" or "optimize ...| Pavel Panchekha’s Blog
This semester, I am teaching Compilers, and I'm quite happy with how I covered parsing this year. Specifically, I taught students to build a top-down LR parser and it seemed to go relatively well. Update: Laurence asks me to note that he's not as much of a parser theorist as this post makes him out to be. Update: There's an ongoing discussion on Hacker News about this post. I learned a lot from it! Update: Gábor Lehel pointed me to Russell Johnston's post on recursive ascent parsing. It's no...| Pavel Panchekha’s Blog
About two years ago, I wrote about some ideas for using a typed DSL for implementing math libraries. I wanted to give a bit of a status update on that project: MegaLibm. All the work described here was done by the extremely talented Ian Briggs—if you're looking to hire a verification and optimization wizard, email me. Not everything described below currently works, but it's all pretty close. We hope to have prototypes to show off in a few months. Math Library Implementation When you import ...| Pavel Panchekha’s Blog
Last week I had a fun experience using Herbie to actually improve the numerical accuracy of real code. Discovering a Rust problem It all started when Finn (an undergraduate from UW) submitted a PR to FPBench to compile FPBench benchmarks to Rust. Besides being great work and a valuable contribution, he also pointed out that our test framework was failing on his FPBench to Rust compiler "due to precision issues with Rust's implementation" of the asinh and acosh functions. This intrigued me, be...| Pavel Panchekha’s Blog
I've been thinking recently about the differences between affine arithmetic and error Taylor series. In both, an expression is represented as 1) the answer, plus 2) a number of error terms. But in my experience, tools based on affine arithmetic are less precise than those based on error Taylor series. What gives? And how are the two analyses related? In affine arithmetic, the true answer is modeled as the computed answer plus a series of errors, each of which are a symbolic epsilon times a re...| Pavel Panchekha’s Blog
Earlier today, I merged one of the biggest improvements to Herbie ever. Let me take a moment to celebrate. Why cost-accuracy trade-offs? About a year ago, the Herbie team, lead by Brett and Oliver, published Combining Precision Tuning and Rewriting at ARITH'21. While ostensibly that paper was about incorporating precision tuning into Herbie, in reality it was about slightly more than that—it was about teaching Herbie about program cost, and having Herbie find not just the single most accura...| Pavel Panchekha’s Blog
I've been tackling a lot of old issues in Herbie this summer, and in the process I found out that Herbie's pruning algorithm is incredibly slow. This is the story of how I sped it up, demonstrating both some good optimization tricks and also some nice tooling that we've built in Herbie. The Backstory Pruning is an essential part of Herbie. At a high level, Herbie has three phases: first, it samples a bunch of points it can use to evaluate how accurate a program is; then, it generates a bunch ...| Pavel Panchekha’s Blog
Skia is a fast and widely used 2D graphics library for tasks like rasterization. it's used in Google Chrome and Android, so speeding it up would have large impacts, and I think e-graphs have a lot of potential to drive an interesting research project in that direction. A bit of Skia background I don't know much about Skia, but I learned a lot from Chris Harrelson helping him edit a chapter about visual effects for the book on browsers we are writing together. Here's a quick summary of mental ...| Pavel Panchekha’s Blog
CSS selectors are part of the CSS language used for styling web pages on the web. CSS stylesheets are made up of rules, where each rule has a selector that defines which elements on the page the rule applies to. When the browser loads a stylesheet, it needs to match those selectors to the elements on the page, to determine which rules apply to which element. Here's my question: what is the computational complexity of this matching step? Preliminaries Here I'm considering the matching step in ...| Pavel Panchekha’s Blog
Error models play a central role in automated numerical analysis. Basically, when you have a floating-point operation x + y, it is usually easier to analyze it in terms of non-deterministic real operations. Epsilon and Delta The most common error model is that an operation x * y (taking * to be addition, subtraction, multiplication, or division, and sometimes square root or fused multiply-add) can be modeled with: \[ x \hat{*} y = (x * y) (1 + \varepsilon) \] Here, \(\hat{*}\) represents floa...| Pavel Panchekha’s Blog
E-graphs are good. They are a powerful, general-purpose mechanism for representing sets of programs and reasoning about those sets of programs via equational rules. My colleagues at the University of Washington who work on the Egg library have been developing a "cookbook" of techniques for common subexpression elimination, synthesis, constant folding, and binding in e-graphs. This post is a new take on represent binding in an e-graph. The problem of binding The basic problem is that. Suppose ...| Pavel Panchekha’s Blog
As part of Ian and my work on synthesizing math libraries, I've been thinking about synthesizing range reductions. I think we've found a beautiful and efficient way to do this tricky bit of synthesis using e-graphs. This idea is also related to recent Herbie work on detecting symmetric expressions. What is range reduction The core of modern library function implementation is the Remez algorithm, which approximates a function f over an interval [a, b] with polynomials.1 [1 When certain conditi...| Pavel Panchekha’s Blog
As maintainer of Racket's math library, I sometimes put my numerical methods knowledge to a practical test. For example, a few days ago, I took a look at Racket's quadratic-solutions function, which returns the roots of a quadratic formula. Humorously, I've used the quadratic formula as an example in dozens of talks and in the Herbie paper. Yet it was more complex than I expected! The quadratic formula The quadratic formula returns all roots \(x\) of the equation \(a x^2 + b x + c = 0\). You ...| Pavel Panchekha’s Blog
For NSV 2020, Zach and I were invited to give a keynote talk about numerica accuracy, and we set out to write down some lessons learned from our five years of work on Herbie. One theme of our talk, Toward Numerical Assistants, was on user trust: how we can help users trust Herbie. A recent interaction on the Herbie mailing list underscored this issue. The Problem About a month ago, Steve Hoelzer wrote in with a fun Herbie bug. You can see the bug for yourself, but in short Herbie is proposing...| Pavel Panchekha’s Blog
One of my favorite browser components is the layout engine, which computes the size and position of everything on the page. It's full of algorithmic challenges, but done right it's marvelous how the whole thing comes together. My upcoming book about web browser engineering devotes two full chapters to layout. But algorithmic challenges and tricky algorithms also mean bugs. So over the last year, my students William, Nathan, and I have been working with the Google Chrome team to build a fuzzer...| Pavel Panchekha’s Blog
FPTaylor is a big advance in sound floating-point error estimation;1 [1 Alexey Solovyev, the primary author of FPTaylor, notes that the inspiration for FPTaylor came from earlier projects like Rosa and Daisy, and the Taylor series technique itself goes back further.] but it is little-known even in the field. I've been working with error Taylor series recently and think they're the bees knees, so I want them to be more broadly understood. Error at a point Suppose you have a floating-point expr...| Pavel Panchekha’s Blog
Implementing a mathematical function like sin or exp is tricky! Yet if you want to trade off accuracy and input range, for speed or vectorization, you'll need to do that, and likely once for each architecture you need an implementation for. In the process, you'll need to decide on range reduction strategies, working precisions, polynomial orders, and function-specific tricks. So my student Ian Briggs and I are working to make it easier. I don't want to get deep into the weeds here, so I'm jus...| Pavel Panchekha’s Blog
I've been working with my student Marisa Kirisame on optimizing that space-time tradeoff, because different programs and different choices trade space for time at different efficiencies. Garbage collectors are an important way most programs trade space for time. So here I want to show a quick derivation that demonstrates how that optimization can occur. A Toy Model Consider a program that runs continuously with \(W\) bytes of live memory, and also generates garbage at a constant rate of \(g\)...| Pavel Panchekha’s Blog
As I work on Herbie, my numerical analysis assistant, I'm always thinking about how I could teach it classic numerical analysis tricks. Recently, my student Oliver Flatt and I have been thinking about tricks for symmetric expressions. Symmetry for Accuracy Consider the following computation, which is important in statistics: \[ \log(e^a + e^b) \] The numerical analysis wisdom is that this expression is best evaluated via the expression max(a, b) + log1p(exp(min(a, b) - max(a, b))) There's a l...| Pavel Panchekha’s Blog
I just finished teaching Utah's Software Verification course, and I've been thinking how to improve the final project to be more straightforward and informative. This draft is something I might assign in the future; it is intended to be completed over multiple weeks, with regular check-ins to make sure students are on track. Overview This assignment involves writing a verifier for a simple programming language. The verifier converts code to verification conditions, uses Z3 to test those verif...| Pavel Panchekha’s Blog
Over the last semester of 2019, Oliver Flatt rewrote Herbie's simplifier—its most important and most expensive component—to use Max Willsey's Egg library. There were quite some difficulties involved, but this change speeds up Herbie by 2.8× and opens up opportunities for further improvements in the future. E-graphs Herbie's simplification component takes an algebraic expression and simplifies it. The main intended benefit is to remove algebraic cancellations, which would otherwise introd...| Pavel Panchekha’s Blog
As part of my usual quality assurance routine, I did a crawl through the latest Herbie reports, looking for timeouts, long runtimes, and sampling errors. In summary, Herbie is now really fast! I found only a few benchmarks that run for more than 20s, with even quite large benchmarks running fast. What makes Herbie slow? Taking a look at a few timelines, it seems like Herbie now spends most of its time, especially for the slower benchmarks, in series expansion and pruning. That said, it seems ...| Pavel Panchekha’s Blog
As I am teaching Software Verification, I am leaning on an analogy that symbolic execution corresponds to expressions in a given programming language while Hoare logic corresponds to statements. It's vague but to the extent that your language has "pure" and "imperative" bits to it, symbolic execution and Hoare logic are frameworks for reasoning about them. But I've long felt that the theory beyond that level is poorly settled. The way I see it, most programs have structure at roughly four lev...| Pavel Panchekha’s Blog
Shell scripts are bug prone, unmaintainable, and inscrutible. But what in particular makes them bad? This is a good question without answers. I have four independent theories.1 [1 Naturally all probably contribute, but assigning blame, or resolving different kinds of errors to different causes, would be valuable. "All of the above" is usually true but without more details also usually useless.] Shell is a bad programming language Data structures beyond text are necessary Processes are a baroq...| Pavel Panchekha’s Blog
I was discussing parallel web page layout in class and mentioned that drawing text in parallel is quite difficult. That made me think: what about other text-related algorithms? So in this post I'll describe my five-hours' amusement: a parallel line-breaking algorithm. Defining the problem By line breaking I mean the task of figuring out which words each line of text contains. In a web browser, this is always done greedily. That is, let's suppose we've already measured each word in the line of...| Pavel Panchekha’s Blog
I met with a student recently who was a little confused by the relationship between first-order logic, ZFC, dependent types, Martin-Löf, and so on. Recalling S. Aaronson, I explained it by analogy to computers. Proof systems, like first-order logic1 [1 Here I'll be specifically refering to classical first-order logic with equality.] or dependent types, are like the operating system for your computer. By themselves, they do not do much; their main goal is to run particular programs. These pro...| Pavel Panchekha’s Blog
About a year ago, I did a little survey of PL blogs from the first half of 2018: who blogs (199 people), how much (3.8 posts every half-year, on average), and what about (mostly technical, PL-related content). I even gave a breakdown of the 199 non-technical blog posts by topic. In that post I promised to do a closer look at the technical, PL-related blog posts, since they were by far the biggest category, and worth breaking down further. Well, here we are, a year later… Turns out reading a...| Pavel Panchekha’s Blog
Haoyi Li recently wrote an excellent blog post on the Visitor pattern, specifically on how Visitors make fusion and streaming really easy. I loved the example of JSON parsing and processing. It is an excellent example of a fundamental duality in PL theory: intrinsic and extrinsic data. Constructing extrinsic data Intrinsic data is what we normally think of as data, a number, say, or an array. Extrinsic data is data described by its functionality. For example, instead of five being then number...| Pavel Panchekha’s Blog
Applications often implement one of a small number of architecture. These core architectures are like "design patterns", except they are patterns in dataflow and computation structure, not in class design and structure. I've documented some patterns documented below, along with links for common problems those architectures face. Event loop Many applications have a "main loop", which processes events and fires off callbacks as a result of these events. For example, the read-eval-print loop in ...| Pavel Panchekha’s Blog
One of the most rewarding non-research tasks I've done in grad school is to maintain Herbie, a tool for automatically improving the accuracy of floating-point programs. I published Herbie at PLDI’15, which means that Herbie was largely done (as a research project) in 2014. But in the five years since, Herbie has had several major changes that have significantly improved it. Making Herbie more maintainable Since I maintain Herbie in my free time (I have other research projects), making Herbi...| Pavel Panchekha’s Blog
My work on formally specifying web page layouts often has people asking how web page layouts—which after all are esthetic as much as they are functional—could possibly follow a specification in pure logic. I understand the confusion: we too often think of a program as having an end-to-end specification, and that the point of verification is to allow us to treat the program as a black box, secure in the understanding that it follows the spec. That is an unusual special case. Most programs ...| Pavel Panchekha’s Blog
In an interview with Devon Zeugel, Stu Card mentions that lower case had to be invented, and that it had the benefit of increasing reading speed: The Carolingian Renaissance invented small letters. Those small letters put together with the capitals allowed words to have shapes. They could be then read in faster speed. Now, I think the usability theory here is disputed,1 [1 See Kevin Larson via Susan Weinschenk.] and the history is not quite right2 [2 Stu seems to be referring to Carolingian m...| Pavel Panchekha’s Blog
In a previous post, I discussed a twist on the Prisoner's Dilemma, wherein it is rational for agents to “feel shame”: to modify their utility function to decrease the utility of defecting against cooperators. What's neat is that it is a fully rational micro-foundation for getting cooperation in the PD. Today, I want to look whether this behavior can be learned. Learning to play One line of research in game theory is in determining how game-theoretic agents learn to play the game they are ...| Pavel Panchekha’s Blog
New academic languages are near-universally variations on a theme: first-class closures, mostly functional, with inductive data types and an algebraic type system. So has programming language design as a field ended? I think yes, but there's a chance to apply the same approaches and techniques to the design of standard libraries. In this post, I demonstrate that comparing even the most basic libraries across languages demonstrates many degrees of freedom for a designer to weigh. All serious p...| Pavel Panchekha’s Blog
None of the web technologies are great, but CSS is probably the least-complained-about.1 [1 CSS does not feature in the wat talk.] But that doesn't mean people don't complain! CSS is confusing, from how floats work to problems with interfering selectors. I've spent years doing academic CSS research, so I have some thoughts on what's wrong with CSS. But first—let's survey what others think. If you're not interested in a dissection of the standard and just want to know why CSS is so weird, I ...| Pavel Panchekha’s Blog
Functional reactive programming is a niche but interesting way of writing stateful programs, especially games and user interfaces. The basic idea is to implement the user interface as a purely functional program from the history of user inputs to a UI. That UI is recomputed every time user input occurs, automatically using caching to make things efficient. Modern formulations1 [1 I like the FRP Refactored paper at Haskell'16.] generally use some sort of type-theory machinery to ensure that yo...| Pavel Panchekha’s Blog
Series This is post 4 of the Crossword Science series. < In my previous post on crossword science, I looked at whether Joe got better at crosswords by doing several hundred crosswords in one weekend. Today, I want to look not at getting better at crosswords, but at pretending to be better at crosswords. The model If you recall, my model of crossword performance contains four factors: The difficulty of that day's crossword The skill of the solver Crosswords on Saturdays are harder Beginners un...| Pavel Panchekha’s Blog
Series This is post 3 of the Crossword Science series. < > In my last post on crossword science, I looked at whether people get better at crosswords over time. They don't.1 [1 Or, they do, for the first few months; but not after that, on average.] But that just looked at doing one crossword every day. What if you did dedicated, sustained crossword practice? That's today's post. Joe's weekend Joe has always talked of getting good at crosswords. When my model provided him with a single numeric ...| Pavel Panchekha’s Blog
Series This is post 2 of the Crossword Science series. < > People want to get better at crosswords. There's something about recording a number every day that makes you invested in it, you know? In my last post on crossword science, I talked about a competitive NYT Minicrossword club I'm a part of, and how I used Stan to learn a statistical model of different peoples' skill at the crossword. However, that model viewed skill as a single, unchangeable attribute. In this post, I want to fix that....| Pavel Panchekha’s Blog
Series This is post 1 of the Crossword Science series. > One of the biggest Slack channels at UW is called #minicrossword: it is a channel for fans of the NYT Minicrossword to post their times. Times are recorded by a Slack bot that Max wrote, so that we can crown a daily winner, improve over time, or try to beat our nemeses. But one scary truth is that some people solve crosswords better than others; so some people win almost every day while others are nearly always at the bottom of the stan...| Pavel Panchekha’s Blog
I have a To-do list that tracks everything I ought to do: blog posts to write, bugs to fix, emails to send, and side projects to continue. This is not a list of things I will do today;1 [1 I sort items I plan to do today to the top of the list] the purpose of the list is really to avoid forgetting stuff, to get it out of my head so that I can stop thinking about it. After using the list for a while, I noticed a problem. I felt good when the list was short, and bad when it was long. So, I avoi...| Pavel Panchekha’s Blog
Herbie is a tool for automatically transforming floating-point formulas to be more accurate, and it's pretty good at this job. And one of Herbie's best features is that it does not require any user input; numerical methods expertise is rare. But while fully automated operation is convenient, it doesn't help the user understand what Herbie is doing, and while Herbie is great at finding clever transformations, it's not very good at making its output simple, digestible, and clear. With these iss...| Pavel Panchekha’s Blog
It seems every programming language now has its own package manager. These package managers solve both technical and social problems. But they solve the social problems poorly; version customs are a focal example. Technical and social problems All package managers download packages, install their recursive dependencies, unpack packages in the right directories, and record their location so they can be found by the language runtime. Some also go further, building C modules or documentation. Au...| Pavel Panchekha’s Blog
One trend I'm seeing in PL research is “programs as patchwork”, where different aspects of one program are written in different DSLs. Halide The best example of this kind of work is Halide, a language for writing fast matrix operations (for graphical filters). Halide is originally a product of academia but is now widely used in industry. For example, a 3×3 blur in Halide would be described by first defining the computation: blur_x(x, y) = (input(x-1, y) + input(x, y) + input(x+1, y))/3; ...| Pavel Panchekha’s Blog
I had a blast attending RacketCon 2018. Besides some great talks (check out Nadeem Abdul Hamid's talk on Sinbad), I also learned about a lot of neat stuff going on in the Racket world (for example, I discovered both the macro and feature profilers). One particularly enjoyable bit was attempting to port a part of Herbie to Typed Racket, under Sam Tobin-Hochstadt's supervision. Typed Racket is Racket plus sound gradual typing. This means that all functions get types, and those types are checked...| Pavel Panchekha’s Blog
Resistance is a game, much like the math-camp-beloved Mafia, where one group of players tries to coordinate against traitors in their midst. Unlike Mafia, people aren't actively removed from the game, which makes it more fun at parties. I am not fun at parties, so I decided to ruin everyone else's fun by solving the game. Simplifying the game To solve a game is to learn how to play it best. Resistance has two teams (the good guys and the traitors), so each team's best strategy must be the bes...| Pavel Panchekha’s Blog
In my last post, I proved a neat little theorem in category theory: that if \(C\) and \(D\) are categories with pushouts, then the category \(C \to D\) of pushout-preserving functors from \(C\) to \(D\) itself has pushouts. Here, I'm going to develop a construct similar to pushouts, but a little more general. Generalized pushouts A pushout is a construction you might be able to perform in a category on two arrows \(f\) and \(g\) that have a common source. If \(f : A \to B\) and \(g : A \to C\...| Pavel Panchekha’s Blog
I've been thinking about pushouts in category theory recently, and found a quick theorem about categories with pushouts that I haven't seen anywhere. Cartesian closure of pushouts Theorem: Let C be a category, and D a category with pushouts. Then the functor category C → D also has pushouts. Proof: Let us consider the construction of an arbitrary pushout in C → D. We have three functors F, G, H : C → D, and two natural transformations α : F → G and β : F → H. We must define a new ...| Pavel Panchekha’s Blog
On May 5–6th, I took part in Microsoft's Puzzlehunt. I love these events (as you can imagine, from pastpuzzles I've written), but this year's had a puzzle I also managed to apply some of my Z3 skills to. You see, in a puzzle hunt, you're allowed to use any tools you wish,1 [1 Except for people not on your team…] including programming and Z3. But it's rare that I find the opportunity… How the Puzzle Worked This year, one of the puzzles was a cryptic mastermind game. In the usual mastermi...| Pavel Panchekha’s Blog
As a long-time blogger,1 [1 I guess…] I find the idea of blogging, especially blogging about technical topics, fascinating. So I've recently been attempting a survey of PL blogging. This part talks about overall statistics. My aim was to do something like a comprehensive survey. Of course, it is impossible to be truly comprehensive, so I'll try to indicate when I am selecting against something or other, but I also hope that I have captured, say, a decent percentage of the PL research commun...| Pavel Panchekha’s Blog
A standard caricature in game theory is that voting gives politicians incentives for bland, middle-of-the-road appeal. That's not true in a system with primaries. Without primaries The usual setup is, suppose political philosophies are entirely one-dimensional, so that each voter can be identified by a real number \(x\) (say, less than zero for left-wing and more than zero for right-wing, with opinion on all issues always in perfect correlation), and let the cumulative density function of vot...| Pavel Panchekha’s Blog
Update: M. Arntzenius gave a great talk a few years ago about voice programming using the Talon voice typing system. I don't know how it is that I didn't think of voice typing systems but they're the well-developed answer here. Interestingly Arntzenius confirms a lot of points that I guessed at here (nominalism, integration with language) but also a lot of new ones (like the challenges of resolving ambiguity and handling errors). Two years ago at PNW PLSE, I had a wonderful conversation with ...| Pavel Panchekha’s Blog
A conversation a month ago with James reminded me of two blog posts I'd written a while ago: on interpreting first-order logic as a modal logic and on equality of functions in ITT. Combining the two posts, I've started wondering whether it's possible to define a type system that lets you prove facts about open terms. Why open terms In that second post, I pointed out that ITT allows you to prove ∀n, n + 0 = 0 + n, but does not allow you to use this fact to prove (n ⇒ n + 0) = (n ⇒ 0 + n)...| Pavel Panchekha’s Blog
I recently had an interesting conversation with Xin Yi, of NUDT, about accurately evaluating the polynomial \(P(x) = 5x^2 - 3\) near its root \(x = \sqrt{3/5}\). He wanted to know: what's the most accurate way to evaluate this polynomial, and why doesn't Herbie find it? Error in evaluating polynomials Evaluating a polynomial P(x) at one of its roots is bound to be inaccurate. If you pick a floating point input x, think of it as representing a range (x-ε, x+ε). and very close to a zero, P(x-...| Pavel Panchekha’s Blog
A conversation with Jeffrey Sarnoff on the Herbie mailing list raised a very interesting small example of how floating-point error can occur. This example is extreme, but I thought it would be interesting to work through. The program is this: sqrt(x^2). Non-compositional error Run it through Herbie and Herbie will tell you that this expression is fairly inaccurate: approximately 29.1 bits of error, with the problems occurring for very large and very small inputs.1 [1 Herbie suggests the more ...| Pavel Panchekha’s Blog
In a recent conversation, James enlightened me about the connection between functions and relations in first order logic. In this short post I'd like to share the insight. Functions, of course, are a certain kind of relation—a total, functional relation. In set theory we directly define functions that way, and in any first order logic we think of a function f(x) and the relation f(x) = y as representing the same thing. Yet there is a definite difference between the two: syntactically, they ...| Pavel Panchekha’s Blog
In Cassius, I am making it possible to mechanically reason about the layout of a web page. It's been a lot of work! The hardest part has been writing a logical specification for most of the CSS rendering algorithm; in other words, the hardest part is the part where I write a web browser. But there are some core ideas in Cassius separate from the CSS specification, and to make those clear in this post I'll write a toy version of Cassius from scratch. Toy CSS To make toy Cassius somewhat shorte...| Pavel Panchekha’s Blog
Often when I switch between Racket, my home language, and Coq, a research interest, I am struck by a minor but meaningful difference: Racket allows functions that vary in how many arguments they take, and Coq does not. It's minor, but it means that, for example, I cannot use this idiom: (map + '(1 2 3) '(4 5 6) '(7 8 9)) ⇝ '(12 15 18) This works because in Racket, + takes any number of arguments and adds them all, and map takes any number of lists as arguments (it picks one element from eac...| Pavel Panchekha’s Blog
Cindy Rubio Gonzáles recently visited UW PLSE and gave a fantastic lecture on her work—both on Precimonious and on her ICSE'16 paper on blame analysis. I spent some time this weekend reading that paper carefully and thinking about how it relates to my floating-point work. I think blame analysis is related to a key technique that appears in both Herbie and Herbgrind: local error. Local error To identify the operations in a floating-point computation most responsible for error, local error l...| Pavel Panchekha’s Blog
Note This article describes my attempt, in the summer of 2015, to develop a modular version of Verdi, a framework for verified distributed systems in Coq. The dream of a modular Verdi has now been achieved, in a more complete and interesting way than in this work, by my friends Ilya Sergey, James R. Wilcox, and Zachary Tatlock in their DiSeL system (first described at SNAPL’17 and then published in full detail at POPL’18). This article demonstrates one alternate idea, and highlights the d...| Pavel Panchekha’s Blog
Some of my friends at the University of Washington have been working with computational geometry and running into quite a few headaches because of floating-point error. Of course, they were expecting my work on Herbie to have taught me how to help, so I've been fielding a few floating-point questions. When they got desperate enough to start using MPFR and talking about computable reals, I had an idea. My friends are working on a programming language for describing shapes in two and three dime...| Pavel Panchekha’s Blog
I use Emacs Org-mode to maintain this blog, and I keep those Org-mode source files in git (with Gitolite). I love this setup, because to me it is transparent, nearly fool-proof, and still gives me many of the benefits of a modern publishing system. In particular, I can write draft posts by just committing them to git without linking them from the main page. These get rendered and published, so I can share them with friends and ask for feedback; yet, since they're not linked, no one is going t...| Pavel Panchekha’s Blog
When I started studying dependent types a few years ago, one question I never got a good answer to was: why is equality such a big deal? Equality types are, of course, practically important. They're practically important because we want to prove stuff equal to other stuff a lot! But why does adding equality types necessitate so much deep theory about how a dependent type system works?1 [1 After all, integers, also a complicated concept, can be added cleanly and independently, even though indu...| Pavel Panchekha’s Blog
A few years I did a comprehensive breakdown of how much writing the various sections of the HOOT paper required. What surprised me was how many more times some sections, like the introduction, were rewritten, compared to the technical sections, which only had to be written and edited a few times. This post is a sequel: I want to look at how hard each section was to write, not how much editing it took. Gathering the data To answer this question, I'm going to use TimeTracker, a project Cathy an...| Pavel Panchekha’s Blog
The group web page I made for the research group I'm in has a neat design detail. Our group has a presence on various sites, and to link to those, we use those sites' icons. The icons are normally flat and colored the same light blue that we use as a color throughout the design. But when you hover over an icon, it changes color, to be the site's familiar colored logo: Figure 1: Hovering over a social site's icon makes it colorful It took a few hours, but I managed to make this effect work tot...| Pavel Panchekha’s Blog
In Cassius, I am formalizing the CSS standard; in other words, I am implementing my own web browser, in logic. I've been reading the standard, trying out examples in browsers, and condensing the standard into something easy for a computer to reason about. Recently I've been working on floats, and I've come up with a quite simple mental model for thinking about float layout. Since I don't see it described anywhere else, I want to explain. Table of Contents Summary How are floats positioned CSS...| Pavel Panchekha’s Blog
Cassius, my tool for formal reasoning about web page layout, can be used in a lot of ways. You can use it to lay out web pages, much like a browser; you can use it to find bugs like overlapping text; you can use it to debug web pages; or you can use it to synthesize CSS files from examples. For each of these uses, Cassius provides a convenient command-line tool. Since most of these tools use the same back-end, these command-line tools share many options and flags. The old way: multiple files ...| Pavel Panchekha’s Blog
I've been working on a little side-project called Fine to explore some of the foundations of type theory. I haven't had much time recently to pursue it, so I'm writing up the work I've already done. A preliminary implementation is on GitHub, or you can look at the rudimentary language tutorial using RawGit. Function intensionality The main goal of Fine is to explore what a world without function extensionality would be like. I've written before that basic type theories usually leave open the ...| Pavel Panchekha’s Blog
I do most of my research in Racket (Herbie, Cassius). I work in programming languages, so most of my code manipulates programs as data. Racket is a wonderful language for this thanks to its extensive support for S-expression program representations: pattern matching, quasi-quoting, and a bevy of list manipulation functions all help manipulate programs. However, writing down type signatures or contracts for S-expression based data structures is a pain. This post is about a helpful macro, defin...| Pavel Panchekha’s Blog
Over winter break, I decided to dig into my notes and try to sketch out the history of the Herbie project. Besides a fun exploration of our git history, I was hoping to crystallize some of the lessons I learned from the project. Besides being my first successful research project in general, I think Herbie taught me a few particular lessons: Tackle ambitious problems I don't work well alone Good benchmarks guide research Generating reports from search processes is great for debugging You never...| Pavel Panchekha’s Blog
I am working on a side-project to understand game theory where the utility functions can change, using modal logic. In the course of doing this, I came up against the question: is there any use for a modality that takes two arguments? Modal logic is a logic with an operator of type Prop → Prop. There are, of course, many modal logics that take an additional argument; for example, the epistemic modalities are used to describe when various agents know various things, and usually are given as ...| Pavel Panchekha’s Blog
Game theorists love studying rational agents, yet humans are obviously not rational agents.1 [1 No, not even rational agents with very complex utility functions.] Why study rational agents? In a slogan: it's rational to be rational. Even in a world with irrational agents, it is individually advantageous to be rational. It never hurts to play the rational strategy—that defines the rational strategy. So it is individually advantageous to become a rational actor. The only equilibrium is for ev...| Pavel Panchekha’s Blog
In a previous post, I introduced modal logic as a common way of adding an operation from propositions to propositions to ordinary logic. In that post, I noted that you want to add the operator, and then axiomatize some properties of it. Now I want to examine the model of modal logic, to shed more light on what the axioms mean. The particular type of model used for modal logic is called a Kripke structure.1 [1 More precisely, only “normal” modal logics have a Kripke structure as their sema...| Pavel Panchekha’s Blog