I finally had some time over the holidays to complete the first panel of the TorchStation. The core idea is to have a monitor box The post TorchStation Prototype V1 – GPUs panel first appeared on Terra Incognita.| Terra Incognita
We have been training language models (LMs) for years, but finding valuable resources about the data pipelines commonly used to build the datasets for training The post Large language model data pipelines and Common Crawl (WARC/WAT/WET) first appeared on Terra Incognita.| Terra Incognita
Just sharing ~100 slides about PyTorch 2 internals focusing on recent innovations (Dynamo, Inductor, and ExecuTorch). I had a lot of fun preparing this and The post PyTorch 2 Internals – Talk first appeared on Terra Incognita.| Terra Incognita
I really like to peek into different ML codebases for distributed training, and this is a very short post on some things I found interesting The post Torch Titan distributed training code analysis first appeared on Terra Incognita.| Terra Incognita
This is just a fun experiment to answer the question: how can I share a memory-mapped tensor from PyTorch to NumPy, JAX and TensorFlow in The post Memory-mapped CPU tensor between Torch, Numpy, Jax and TensorFlow first appeared on Terra Incognita.| Terra Incognita
PS: thanks for all the interest; here are some discussions about VectorVFS as well: Hacker News (discussion thread), Reddit (discussion thread). When I released The post VectorVFS: your filesystem as a vector database first appeared on Terra Incognita.| Terra Incognita
Note: This is a continuation of the previous post: Thoughts on Riemannian metrics and its connection with diffusion/score matching [Part I], so if you haven’t The post The geometry of data: the missing metric tensor and the Stein score [Part II] first appeared on Terra Incognita.| Terra Incognita
Happy New Year! This is the first post of 2025, and this time it is not a technical article (but it is about philosophy The post Notes on Gilbert Simondon’s “On the Mode of Existence of Technical Objects” and Artificial Intelligence first appeared on Terra Incognita.| Terra Incognita
Introduction One of the most interesting, but also obscure and difficult, parts of Kant's critique is schematism. Every time I reflect on generalisation in machine learning and how concepts should be grounded, it leads to the same central problem of schematism. Friedrich H. Jacobi said that schematism was "the most wonderful and most mysterious| Terra Incognita
Introduction Hi! I was going to publish this content on arXiv, but I decided to write a blog post this time so I can write it a bit more informally =) It is no secret that diffusion models have become the workhorses of high-dimensional generation: start with Gaussian noise and, through a| Terra Incognita