tl;dr https://gist.github.com/MaskRay/74cdaa83c1f4| MaskRay
Alignment refers to the practice of placing data or code at memory addresses that are multiples of a specific value, usually a power of 2. This is typically done to meet the requirements of the prog| MaskRay
My previous post, LLVM integrated assembler: Improving expressions and relocations, delved into enhancements made to LLVM's expression resolving and relocation generation. This post covers recent re| MaskRay
In my previous assembler posts, I've discussed improvements on expression resolving and relocation generation. Now, let's turn our attention to recent refinements within section fragments. Understandi| MaskRay
For years, I've been involved in updating LLVM's MC layer. A recent journey led me to eliminate the FK_PCRel_ fixup kinds: MCFixup: Remove FK_PCRel_ The generic FK_Data_ fixup kinds handle both absolute and PC-relative fixups. ELFObjectWriter sets IsPCRel to true for `.long foo-.`, so the backend has to handle PC-relative FK_Data_. However, the existence of FK_PCRel_ encouraged backends to implement it as a separate fixup type, leading to redundant and error-prone code. Removing FK_PCRel_ sim...| MaskRay
On https://x.com/settings/, click More -> Settings and privacy -> Download| MaskRay
Both compiler developers and security researchers have built disassemblers. They often prioritize different aspects. Compiler toolchains, benefiting from direct contributions from CPU vendors, tend to offer more accurate and robust decoding. Security-focused tools, on the other hand, often excel in user interface design. For quick disassembly tasks, rizin provides a convenient command-line interface.| MaskRay
The Google C++ Style is widely adopted by projects. It contains a brace omission guideline in Looping and branching statements: For historical reasons, we allow one exception to the above rules: the curly braces for the controlled statement or the line breaks inside the curly braces may be omitted if as a result the entire statement appears on either a single line (in which case there is a space between the closing parenthesis and the controlled statement) or on two lines (in which case there...| MaskRay
LLD, the LLVM linker, is a mature and fast linker supporting multiple binary formats (ELF, Mach-O, PE/COFF, WebAssembly). Designed as a standalone program, the code base relies heavily on global state, making it less than ideal for library integration. As outlined in RFC: Revisiting LLD-as-a-library design, two main hurdles exist: Fatal errors: they exit the process without returning control to the caller. This was actually addressed for most scenarios in 2020 by utilizing llvm::sys::Process:...| MaskRay
LLVM's C++ API doesn't offer a stability guarantee. This means function signatures can change or be removed between versions, forcing projects to adapt. On the other hand, LLVM has an extensive API surface. When a library like llvm/lib/Y relies on functionality from another library, the API is often exported in header files under llvm/include/llvm/X/, even if it is not intended to be user-facing. To be compatible with multiple LLVM versions, many projects rely on #if directives based on the LLVM...| MaskRay
LLVM 19 will be released. As usual, I maintain lld/ELF and have added some notes to https://github.com/llvm/llvm-project/blob/release/19.x/lld/docs/ReleaseNotes.rst. I've meticulously reviewed nearly| MaskRay
My previous post, LLVM integrated assembler: Improving MCExpr and MCValue, delved into enhancements made to LLVM's internal MCExpr and MCValue representations. This post covers recent refinements to| MaskRay
A dominator tree can be used to compute natural loops. For every node H in a post-order traversal of the dominator tree (or the original CFG), find all predecessors that are dominated by H. This identifies all back edges. Each back edge T->H identifies a natural loop with H as the header. Perform a flood fill starting from T in the reversed dominator tree (from exiting block to header). All visited nodes reachable from the root belong to the natural loop associated with the back edge. These no...| MaskRay
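The back-edge idea in the excerpt above can be illustrated with a minimal Python sketch. It is not the post's implementation: it uses the textbook formulation, computing dominator sets with a simple iterative data-flow pass and then walking CFG predecessor edges backwards from T until the header H is reached. The function names `dominators` and `natural_loops` and the dictionary-based CFG encoding are assumptions made for illustration, and every node is assumed to be reachable from the entry.

```python
from collections import defaultdict

def dominators(cfg, entry):
    """Iterative dominator sets: dom[n] = {n} | intersection of dom[p] over preds p.

    cfg maps each node to a list of successors (hypothetical encoding).
    Assumes every node is reachable from entry.
    """
    nodes = set(cfg) | {s for succs in cfg.values() for s in succs}
    preds = defaultdict(set)
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds

def natural_loops(cfg, entry):
    """Return {header: set of loop nodes}; back edges sharing a header are merged."""
    dom, preds = dominators(cfg, entry)
    loops = {}
    for t, succs in cfg.items():
        for h in succs:
            if h in dom[t]:  # t -> h is a back edge because h dominates t
                body = loops.setdefault(h, {h})
                stack = [t]
                while stack:  # flood fill backwards from t, stopping at h
                    n = stack.pop()
                    if n not in body:
                        body.add(n)
                        stack.extend(preds[n])
    return loops

# A tiny CFG: entry -> h -> a -> b -> h, with h also exiting the loop.
example_cfg = {"entry": ["h"], "h": ["a", "exit"], "a": ["b"], "b": ["h"], "exit": []}
example_loops = natural_loops(example_cfg, "entry")
```

Seeding the visited set with the header is what stops the backward walk, so nodes outside the loop (such as `entry` here) are never added. A production implementation would normally use a dominator tree rather than full dominator sets, which this sketch trades away for brevity.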
In my previous post, Relocation Generation in Assemblers, I explored some key concepts behind LLVM’s integrated assemblers. This post dives into recent improvements I’ve made to refine that system. Th| MaskRay
This post explores how GNU Assembler and LLVM integrated assembler generate relocations, an important step to generate a relocatable file. Relocations identify parts of instructions or data that canno| MaskRay
This post describes how to compile a single C++ source file to an object file with the Clang API. Here is the code. It behaves like a simplified clang executable that handles -c and -S.| MaskRay
Followed this guide: https://www.patrickthurmond.com/blog/2023/12/11/commenting-is-available-now-thanks-to-giscus Add the following to layout/_partial/article.ejs: <% if| MaskRay
This article describes ELF interposition, the linker option -Bsymbolic, and its friends. In the end, it will discuss an ambitious plan which I dubbed "the Last Alliance of ELF and Men". Motivated by a| MaskRay
Updated in 2022-09. Background: -fno-pic can only be used by executables. On most platforms and architectures, direct access relocations are used to reference external data symbols. -fpic can be used| MaskRay
Updated in 2024-01. Branch target Many architectures encode a branch/jump/call instruction with PC-relative addressing, i.e. the distance to the target is encoded in the instruction. In an executable| MaskRay
Updated in 2025-05. Symbol address In an executable or shared object (called a component in ELF), a text section may need the absolute virtual address of a symbol (e.g. a function or a variable). The| MaskRay
Updated in 2025-02. Thread-local storage (TLS) provides a mechanism for allocating distinct objects for different threads. It is the usual implementation for the GCC extension __thread, C11 _Thread_local, and| MaskRay
LLVM 20 will be released. As usual, I maintain lld/ELF and have added some notes to https://github.com/llvm/llvm-project/blob/release/20.x/lld/docs/ReleaseNotes.rst. I've meticulously reviewed nearly all the patches that are not authored by me. I'll delve into some of the key changes.| MaskRay
tl;dr https://gist.github.com/MaskRay/74cdaa83c1f44ee105fcebcdff0ba9a7| MaskRay
Clang provides a few options to generate timing reports. Among them, -ftime-report and -ftime-trace can be used to analyze the performance of Clang's internal passes. -fproc-stat-report records time a| MaskRay
As always, I worked mainly in the toolchain field. Blogging I have been busy creating posts, authoring a total of 31 blog posts (including this one). 7 posts resonated on Hacker News, garnering over 50 points. (https://news.ycombinato| MaskRay
In debuggers, stepping into a function with arguments that involve function calls may step into the nested function calls, even if they are simple and uninteresting, such as those found in the C++ STL| MaskRay
GNU ld's output section layout is determined by a linker script, which can be either internal (default) or external (specified with -T or -dT). Within the linker script, SECTIONS commands define how i| MaskRay
Updated in 2024-02. (In celebration of my 2800th llvm-project commit) Happy Halloween! This article describes relative relocations and how the RELR format can greatly decrease file sizes. An ELF linke| MaskRay
This article introduces CREL (previously known as RELLEB), a new relocation format offering incredible size reduction (LLVM implementation in my fork). ELF's design emphasizes natural size and alignme| MaskRay
This article describes ABI and toolchain considerations about systems without a Memory Management Unit (MMU). We will focus on FDPIC and the in-development FDPIC ABI for RISC-V, with updates as I delv| MaskRay
LLVM 18 will be released. As usual, I maintain lld/ELF and have added some notes to https://github.com/llvm/llvm-project/blob/release/18.x/lld/docs/ReleaseNotes.rst. I've meticulously reviewed nearly| MaskRay
This article describes some notes about z/Architecture with a focus on the ELF ABI and ELF linkers. An lld/ELF patch sparked my motivation to study the architecture and write this post. z/Architecture| MaskRay
Updated in 2024-04. GNU indirect function (ifunc) is a mechanism making a direct function call resolve to an implementation picked by a resolver. It is mainly used in glibc but has adoption in FreeBSD| MaskRay
My journey with the LLVM project began with a deep dive into the world of lld and binary utilities. Countless hours were spent unraveling the intricacies of object file formats and shaping LLVM's rele| MaskRay
This article describes SHF_ALLOC|SHF_COMPRESSED sections in ELF and lld's linker option --compress-sections to compress arbitrary sections.| MaskRay
Updated in 2023-11. For a user who only uses one C++ standard library, such as libc++, there are typically three compatibility goals, each with increasing compatibility requirements: Can the program,| MaskRay
I do not use Apple products myself, but I sometimes delve into Mach-O due to my interest in object file formats. Additionally, my LLVM/Clang changes sometimes require some understanding of Mach-O. Occ| MaskRay
When linking an oversized executable, it is possible to encounter errors such as relocation truncated to fit: R_X86_64_PC32 against `.text' (GNU ld) or relocation R_X86_64_PC32 out of range (ld.lld).| MaskRay
This article provides a description of popular assemblers and their architecture-specific differences. Assemblers GCC generates assembly code and invokes GNU Assembler (also known as "gas"), which is| MaskRay
UNDER CONSTRUCTION This article describes target-specific details about AArch32 in ELF linkers. I described AArch64 in a previous article. AArch32 is the 32-bit execution state for the Arm architectur| MaskRay
This article describes an interesting overflow bug in the ELF hash function. The System V Application Binary Interface (generic ABI) specifies the ELF object file format. When producing an executable| MaskRay
This article describes target-specific details about AArch64 in ELF linkers. AArch64 is the 64-bit execution state for the Arm architecture. The AArch64 execution state runs the A64 instruction set. T| MaskRay
This article describes target-specific details about Power ISA in ELF linkers. Initially there was IBM POWER. The 1991 Apple–IBM–Motorola alliance created PowerPC. In 2006, the architecture was rebran| MaskRay
Updated in 2024-08. Note: The article will likely get frequent updates in the next few days. This article describes some approaches to distribute debug information. Commands below will use two simple| MaskRay