Language models prompted with a user description or persona are being used to predict the user’s preferences and opinions. However, existing approaches to building personas mostly rely on a user’s demographic attributes and/or prior judgments, but not on the underlying reasoning behind those judgments. We introduce PB&J (Psychology of Behavior and Judgments), a framework that improves LM personas by incorporating potential rationales for why the user could have made a certain judgment… | Apple Machine Learning Research
Entity Linking (EL) has traditionally relied on large annotated datasets and extensive model fine-tuning. While recent few-shot methods leverage large language models (LLMs) through prompting to reduce training requirements, they often suffer from inefficiencies due to expensive LLM-based reasoning. ARTER (Adaptive Routing and Targeted Entity Reasoning) presents a structured pipeline that achieves high performance without deep fine-tuning by strategically combining candidate generation, …
Simulation-based inference (SBI) is a statistical inference approach for estimating latent parameters of a physical system when the likelihood is intractable but simulations are available. In practice, SBI is often hindered by model misspecification—the mismatch between simulated and real-world observations caused by inherent modeling simplifications. RoPE, a recent SBI approach, addresses this challenge through a two-stage domain transfer process that combines semi-supervised calibration with …
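RoPE’s two-stage procedure is not spelled out in this teaser, but the basic SBI setting it operates in (an intractable likelihood, a runnable simulator) can be sketched with a generic rejection-ABC loop. This is a textbook SBI baseline, not RoPE itself; the toy Gaussian simulator and all names and parameters below are illustrative assumptions:

```python
import numpy as np

def simulate(theta, rng, n=50):
    """Toy simulator: Gaussian draws with unknown mean `theta`.
    Stands in for a physical simulator whose likelihood we treat as intractable."""
    return rng.normal(theta, 1.0, size=n)

def rejection_abc(observed, prior_sample, n_draws=20000, eps=0.05, rng=None):
    """Generic rejection ABC: keep parameter draws whose simulated summary
    statistic lands within `eps` of the observed one."""
    rng = rng or np.random.default_rng(0)
    obs_stat = observed.mean()
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)          # draw a candidate parameter from the prior
        sim = simulate(theta, rng)         # run the simulator at that parameter
        if abs(sim.mean() - obs_stat) < eps:
            accepted.append(theta)         # simulated data matched the observation
    return np.array(accepted)

rng = np.random.default_rng(1)
observed = rng.normal(2.0, 1.0, size=50)   # "real-world" data with true mean 2.0
posterior = rejection_abc(observed, lambda r: r.uniform(-5, 5))
print(round(float(posterior.mean()), 2))   # posterior samples concentrate near 2.0
```

Modern SBI methods replace this brute-force rejection step with learned density estimators, and approaches like RoPE additionally have to correct for the simulator being misspecified relative to the observed data.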
Fine-tuning large language models (LLMs) with backpropagation — even for a subset of parameters such as LoRA — can be much more memory-consuming than inference and is often deemed impractical for resource-constrained mobile devices. Alternative methods, such as zeroth-order optimization (ZO), can greatly reduce the memory footprint but come at the cost of significantly slower model convergence (10× to 100× more steps than backpropagation). We propose a memory-efficient implementation of …
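The zeroth-order idea above can be illustrated with a minimal sketch of a two-point (SPSA-style) gradient estimator, which trades backpropagation’s activation memory for extra forward evaluations — the source of the slower convergence the abstract mentions. The function names, toy quadratic loss, and hyperparameters below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def zo_gradient(loss_fn, theta, mu=1e-3, n_samples=8, rng=None):
    """Two-point zeroth-order gradient estimate (SPSA-style).

    Only forward evaluations of `loss_fn` are needed, so no backprop
    graph (and none of its activation memory) is ever built.
    All names and defaults here are illustrative, not from the paper.
    """
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        u = rng.standard_normal(theta.shape)               # random direction
        delta = loss_fn(theta + mu * u) - loss_fn(theta - mu * u)
        grad += (delta / (2.0 * mu)) * u                   # directional-derivative estimate
    return grad / n_samples

# Toy usage: ZO-SGD on a quadratic; convergence is noisier and slower than backprop.
loss = lambda w: float(np.sum((w - 3.0) ** 2))
w = np.zeros(2)
for _ in range(1000):
    w -= 0.05 * zo_gradient(loss, w)
print(np.round(w, 2))  # approaches [3. 3.]
```

The estimator is unbiased in expectation over the random directions, but its variance is what forces the 10×–100× extra steps cited above — the trade the paper’s memory-efficient implementation targets.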
This paper was accepted at the Evaluating the Evolving LLM Lifecycle Workshop at NeurIPS 2025. Existing video understanding benchmarks often conflate knowledge-based and purely image-based questions, rather than clearly isolating a model’s temporal reasoning ability, which is the key aspect that distinguishes video understanding from other modalities. We identify two major limitations that obscure whether higher scores truly indicate stronger understanding of the dynamic content in videos: …
Hallucinations pose a significant obstacle to the reliability and widespread adoption of language models, yet their accurate measurement remains a persistent challenge. While many task- and domain-specific metrics have been proposed to assess faithfulness and factuality concerns, the robustness and generalization of these metrics are still untested. In this paper, we conduct a large-scale empirical evaluation of 6 diverse sets of hallucination detection metrics across 4 datasets, 37 language models …
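As a concrete (if deliberately naive) example of the kind of task-specific metric under evaluation, a lexical-overlap faithfulness score checks whether generated tokens are supported by the source text. This sketch is a generic illustration, not one of the metric families the paper evaluates:

```python
from collections import Counter

def lexical_faithfulness(source: str, generated: str) -> float:
    """Naive faithfulness proxy: fraction of generated tokens that also
    appear in the source (multiset containment). Illustrative only; real
    hallucination metrics use entailment models, QA, or fact checking."""
    tokenize = lambda s: [t.strip(".,!?;:") for t in s.lower().split()]
    available = Counter(tokenize(source))
    gen = tokenize(generated)
    if not gen:
        return 1.0
    supported = 0
    for tok in gen:
        if available[tok] > 0:
            available[tok] -= 1        # consume one source occurrence
            supported += 1
    return supported / len(gen)

doc = "The battery lasts ten hours and charges over USB-C."
print(lexical_faithfulness(doc, "The battery lasts ten hours."))     # 1.0
print(lexical_faithfulness(doc, "The battery lasts twenty hours."))  # 0.8
```

A proxy this brittle (it cannot see paraphrase or negation) is exactly why robustness and generalization of detection metrics have to be stress-tested empirically across datasets and models rather than trusted from a single score.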
Recent advances in multimodal models have demonstrated remarkable text-guided image editing capabilities, with systems like GPT-4o and Nano-Banana setting new benchmarks. However, the research community’s progress remains constrained by the absence of large-scale, high-quality, and openly accessible datasets built from real images. We introduce Pico-Banana-400K, a comprehensive 400K-image dataset for instruction-based image editing. Our dataset is constructed by leveraging Nano-Banana to generate …
Knowledge graphs (KGs) are foundational to many AI applications, but maintaining their freshness and completeness remains costly. We present ODKE+, a production-grade system that automatically extracts and ingests millions of open-domain facts from web sources with high precision. ODKE+ combines modular components into a scalable pipeline: (1) the Extraction Initiator detects missing or stale facts, (2) the Evidence Retriever collects supporting documents, (3) hybrid Knowledge Extractors apply …
As the adoption of language models advances, so does the need to better represent individual users to the model. Are there aspects of an individual’s belief system that a language model can utilize for improved alignment? Following prior research, we investigate this question in the domain of opinion prediction by developing PrimeX, a dataset of public opinion survey data from 858 US residents with two additional sources of belief information: written explanations from the respondents for why …
A dangerous assumption that can be made from prior work on the bias transfer hypothesis (BTH) is that biases do not transfer from…
While federated learning (FL) and differential privacy (DP) have been extensively studied, their application to automatic speech recognition…
Apple believes that privacy is a fundamental human right. As AI experiences become increasingly personal and a part of people's daily lives…
Recent generations of frontier language models have introduced Large Reasoning Models (LRMs) that generate detailed thinking processes…
At Apple, we believe privacy is a fundamental human right. And we believe in giving our users a great experience while protecting their…
Large generative models are becoming increasingly capable and more widely deployed to power production applications, but getting these…
This study explores using embedding rank as an unsupervised evaluation metric for general-purpose speech encoders trained via…
Self-play has powered breakthroughs in two-player and multi-player games. Here we show that self-play is a surprisingly effective strategy…
Nonverbal behaviors such as posture, gestures, and gaze are essential for conveying internal states, both consciously and unconsciously, in…
This paper introduces a framework, called EMOTION, for generating expressive motion sequences in humanoid robots, enhancing their ability to…
At Apple, we believe privacy is a fundamental human right. Our work to protect user privacy is informed by a set of privacy principles, and…
This paper was accepted at the Efficient Natural Language and Speech Processing (ENLSP) Workshop at NeurIPS 2024. The pre-training phase of…
This paper was accepted at the Efficient Natural Language and Speech Processing (ENLSP) Workshop at NeurIPS 2024. While large language…
Apple is sponsoring the annual meeting of the Association for Computational Linguistics (ACL), which takes place in person from August 11 to…
At the 2024 Worldwide Developers Conference, we introduced Apple Intelligence, a personal intelligence system integrated deeply into…
Today, we are excited to release optimizations to Core ML for Stable Diffusion in macOS 13.1 and iOS 16.2, along with code to get started…