How to quickly use llama.cpp for LLM inference (no GPU needed).| /dev/posts/
How to quickly use vLLM for LLM inference using CPU.| /dev/posts/
Overview of neural network distillation| /dev/posts/
Some notes on how transformer-decoder language models work,| /dev/posts/
Extracting the system prompt from GitHub CoPilot.| /dev/posts/
AIStore + HuggingFace: Distributed Downloads for Large-Scale Machine Learning| AIStore
AIStore + HuggingFace: Distributed Downloads for Large-Scale Machine Learning| AIStore
This year’s SIGIR conference featured the “LiveRAG Challenge”. In this competition participants received 500 questions synthesized from a given corpus (FineWeb10BT) and had 2 hours time to generate answers... The post SIGIR LiveRAG Challenge Report appeared first on OpenSource Connections.| OpenSource Connections
This might be beating a dead horse, but there are several "mysterious" problems LLMs are bad at that all seem to have the same cause. I wanted an article I could reference when this comes up, so I wrote one. LLMs can't count the number of R's in strawberry. LLMs …| Brendan Long
Implementing OpenAI's Whisper speech-to-text model in Elixir with Bumblebee - a simple GenServer solution that replaces a complex Python microservices stack| Lucas Sifoni
In this post we look at the new Prompt Guard 2 model from Meta, and introduce a concept I've been calling "Tokenization Confusion" which aims to confuse Unigram tokenization into generating tokens which will result in the misclassification of malicious prompts. We'll also look at why building up our ML knowledge will lead to better findings when assessing LLM API’s, as I discovered during a flight across the Atlantic.| XPN InfoSec Blog
Advanced shape restrictions, such as combinations of monotonicity and convexity / concavity using polyhedral cones| Alex Shtoff
Fitting shape-restricted functions with ease using PyTorch.| Alex Shtoff
We develop an efficient alternative to PyTorch built-in dataloader class for the case of in-memory datasets, and lightweight models.| Alex Shtoff
We demonstrate how we can reduce model size by pruning un-needed neurons.| Alex Shtoff
We study a way to represent a tilted loss as an average of losses by lifting to a higher dimensional space, and employing regular SGD| Alex Shtoff
We study various polynomial bases from the bias-variance perspective, and the derivative-control properties of the Bernstein basis. This concludes our series on polynomial regression.| Alex Shtoff
We demonstrate an important use-case for Bernstein basis regularization in model calibration. We briefly discuss the use-cases of a well-calibrated machine learned classification model, and develop a simple calibrator that improves upon the ones provided by Scikit-Learn using regularization of the Bernstein basis.| Alex Shtoff
Are your users searching for particular brands? Can we use LLMs for brand detection and use this to improve search & drive business? The post More Query Understanding: Brand Detection with LLMs appeared first on OpenSource Connections.| OpenSource Connections
Looking at bias and variance from another perspective..| Good Audience - Medium
A deep dive into DeepSeek’s Multi-Head Latent Attention, including the mathematics and implementation details. The layer is recreated in Julia using Flux.jl.| liorsinai.github.io
There are many excellent AI papers and tutorials that explain the attention pattern in Large Language Models. But this essentially simple pattern is often obscured by implementation details and opt…| Bartosz Milewski's Programming Cafe
Guest post by Nat Jeffries, Founding Engineer at Useful Sensors. At Useful Sensors we love using disposable frameworks to deploy on-device transformers. Having built several such frameworks, I real…| Pete Warden's blog
Introduction Natural Language Processing is a fast-advancing field. And it is also one of the fields that require a huge amount of computational resources to make important progress. And although breakthroughs are openly announced, and papers are released in free-to-access repositories such as arXiv, Open Review, Papers with Code, etc., and despite (sometimes) having the code freely available on GitHub, using those language models is not something widely accessible and easy. Let me provide mo...| Posts by Rito Ghosh
As most of us know, artificial intelligence (AI) has taken some big steps forward in the past few years, with the advent of Large Language Models (LLM) like ChatGPT. With these programs, you can enter a query in plain language, … Continue reading →| Letters to Creationists
Generative models, in particular energy-based models, are often used to sample from conditional distributions–a process known as inpainting. One of the most fundamental kinds of generative energy-based models is called a restricted Boltzmann machine (RBM), which is essentially a bipartite, glassy Ising model. Inpainting with RBMs is usually done by sampling from the visible layer while fixing the value of some visible nodes, which is an intervention, not a passive observation. I could not f...| Abel Jansma
Implicit neural networks| vitalab.github.io
TensorFlow.js is an incredibly powerful JavaScript library for training and deploying machine learning models in the browser and Node. js. Let’s explore this library by building a teachable machine!| iO tech_hub
I’ve been using the ONNX Runtime a lot recently, and while it has been a lot of fun, there are a few things I’ve missed from the TensorFlow Lite world. The biggest (no pun intended) is the lack of tools to shrink the model file size, something that’s always been essential in the mobile app … … Continue reading →| Pete Warden's blog
Diffusion networks As there’s a lot of recent developments around image generation and diffusion models in general, I took a deep dive in the fundamentals of...| vitalab.github.io
Slides and materials for my MLOps Scotland 2024 talk about Xournal++ HTR.| Blog
I implemented an Online Handwritten Text Recognition (HTR) system using PyTorch, based on Google paper.| Blog
Slides and materials for my mini Machine Learning course at Princeton University’s Datta Lab.| Blog
My presentation notes for a seminar that I recently gave at the University of Edinburgh on dense representation learning for financial applications.| Blog
This post was adapted from a paper I originally wrote and extended for a school project. The full notebook can be found as a .ipynb file on my GitHub. The post assumes some background knowledge of linear algebra and eigenvalue decomposition. If you don’t have these prerequisites, I highly recommend watching 3Blue1Brown’s playlist on linear algebra.| dmicz devblog
A series on automatic differentiation in Julia. Part 5 shows how the MicroGrad.jl code can be used for a machine learning framework like Flux.jl. The working...| liorsinai.github.io
A series on automatic differentiation in Julia. Part 4 extends part 3 to handle maps, getfield and anonymous functions. It creates a generic gradient descent...| liorsinai.github.io
A series on automatic differentiation in Julia. Part 3 uses metaprogramming based on IRTools.jl to generate a modified (primal) forward pass and to reverse d...| liorsinai.github.io
A series on automatic differentiation in Julia. Part 2 uses metaprogramming to generate a modified (primal) forward pass and to reverse differentiate it into...| liorsinai.github.io
A series on automatic differentiation in Julia. Part 1 provides an overview and defines explicit chain rules.| liorsinai.github.io
If you asked me to sum up my experience as an intern at SAS® this summer in one word, it would probably be this: “Growth.| The SAS Data Science Blog
In this post, we will discuss how to build a Prompt Injection detector using a simple classification task with Scikit-learn’s Logistic Regression. Logistic Regression is a statistical method …| Shekhar Gulati
A transformer for generating text in Julia, trained on Shakespeare’s plays. This model can be used as a Generative Pre-trained Transformer (GPT) with further...| liorsinai.github.io
I had a friend the other day interested in a hypothesis along the lines of “I think the mix of crime at a location is different”, in particular they think it will be pushed to more lower level prop…| Andrew Wheeler
Video and GitHub repo to go along with this post.| dmicz devblog
There is now a widespread concern that the algorithms deployed to set prices, may `learn’ to collude and set prices above what one might consider to be the competitive level. Indeed, the FTC …| The Leisure of the Theory Class
(TL;DR: For a university course I’ve built a classifier that returns an emoji –that should make sense 😝 – given some input text: https://github.com/javierhon...| hondu.co
The news| inkdroid.org
I came across a 2 minute video where Ilya Sutskever — OpenAI’s chief scientist — explains why he thinks current ‘token-prediction’ large language models will be able to become sup…| R&A IT Strategy & Architecture
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text| dmicz devblog
At a recent Algorithmic Fairness meeting, there was some discussion of algorithmic homogenization. The concern, as expressed, for example in Kleinberg and Raghavan (2021) is that “the quality…| The Leisure of the Theory Class
The Coral Dev Board is a TPU-enabled development board for testing out machine learning models with a requirement for near-real-time inference. For instance, image classification or object detection on video feeds, where a CPU would struggle to keep up.| questionable services
I have another post explaining function calls as used by GPT-4, as well as other updates made by OpenAI recently.| dmicz devblog
Singular Value Decomposition (SVD) is a fundamental concept in linear algebra, and it is particularly important in the field of machine learning for tasks such as dimensionality reduction, data compression, and noise reduction.| dmicz devblog
Attaining ViT/ConvNeXt performance with a couple of simple modifications to ResNet.| Frank’s Ramblings
If companies want to get value from their data, they need to focus on accelerating human understanding of data, scaling the number of modeling questions they can ask of that data in a short amount of time, and assessing their implications.| Looking for data in all the right places...
Step-by-step implementation of Hierarchical Navigable Small Worlds (HNSW).| Frank’s Ramblings
It’s important to be able to deploy a machine learning model when trained. But how do we approach serving ML models correctly?| alexandruburlacu.github.io
Tutorial on building production-ready serverless Machine Learning pipeline on AWS Lambda and solving common problems: bundle size, performance, latency.| Better Dev .blog