A new technical paper titled “MAHL: Multi-Agent LLM-Guided Hierarchical Chiplet Design with Adaptive Debugging” was published by researchers at the University of Minnesota – Twin Cities. Abstract: “As program workloads (e.g., AI) increase in size and algorithmic complexity, the primary challenge lies in their high dimensionality, encompassing computing cores, array sizes, and memory hierarchies. To...” | Semiconductor Engineering
A new technical paper titled “Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System” was published by researchers at Rensselaer Polytechnic Institute and IBM. Abstract: “Large Language Model (LLM) inference is increasingly constrained by memory bandwidth, with frequent access to the key-value (KV) cache dominating data movement. While attention sparsity reduces some...” | Semiconductor Engineering
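The abstract's claim that KV cache reads dominate data movement can be illustrated with a generic sketch (not the paper's system): in autoregressive decoding, each new token re-reads the entire cached key/value history, so memory traffic grows with sequence length even though compute per token is small. All names below are illustrative.

```python
import numpy as np

def attend(q, K, V):
    """Single-head scaled dot-product attention for one query vector."""
    scores = K @ q / np.sqrt(q.shape[-1])   # one score per cached position
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

d, steps = 64, 10
rng = np.random.default_rng(0)
K_cache = np.empty((0, d))   # cached keys, one row appended per decoded token
V_cache = np.empty((0, d))   # cached values

bytes_moved = 0
for t in range(steps):
    x = rng.standard_normal(d)              # stand-in for the new token's projection
    K_cache = np.vstack([K_cache, x])       # append this step's key
    V_cache = np.vstack([V_cache, x])       # append this step's value
    out = attend(x, K_cache, V_cache)
    # Each step re-reads the *entire* cache, so traffic grows linearly with t
    # (quadratically over the whole sequence) -- the bandwidth pressure the
    # paper targets by placing cache entries across heterogeneous memory tiers.
    bytes_moved += K_cache.nbytes + V_cache.nbytes

print(bytes_moved)
```

Even at this toy scale (10 steps, 64 dims, one head), the cumulative cache traffic already dwarfs the per-token state, which is why cache *placement* rather than compute becomes the lever at production scale.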
A new technical paper titled “Power Stabilization for AI Training Datacenters” was published by researchers at Microsoft, OpenAI, and NVIDIA. Abstract: “Large Artificial Intelligence (AI) training workloads spanning several tens of thousands of GPUs present unique power management challenges. These arise due to the high variability in power consumption during training. Given the synchronous...” | Semiconductor Engineering
It can be difficult to control the camera viewpoint in text-to-video systems, which rely chiefly on the user's text instructions to generate tracking shots. A more precise way of imposing a camera viewpoint is to extract motion from an existing video into the generated video. However, the newest crop of T2V systems cannot use prior methods that provide this functionality, because their architectures differ. Therefore, researchers have now presented a novel and, they argue, superior meth... | Metaphysic.ai
A new collaboration between China and Denmark offers a way to extract traditional CGI meshes and textures from implicit neural human avatars - a task that is extraordinarily challenging, but which could pave the way for more controllable AI-generated imagery and video in the future. The post Extracting Controllable CGI From the ‘Black Box’ of Neural Human Avatars first appeared on Metaphysic.ai.
Text-to-image and text-to-video models such as Stable Diffusion and Sora rely on datasets of images with captions that accurately describe them. Most often, these captions are inadequate or inaccurate, and frequently both. Sometimes they're downright deceptive, damaging models trained on them. But the research sector's hope that multi-modal large language models can create better captions is challenged in a recent paper from NVIDIA and Chinese researc... | Metaphysic.ai
Using AI to modify selfies is a relatively hot pursuit in the research sector, not least because it is a potentially lucrative line of inquiry. This new paper proposes using a Generative Adversarial Network to re-imagine a selfie so that the subject appears to have been photographed by someone else, while also removing the distortion that wide lenses and close proximity can introduce. The post Using Generative Adversarial Networks to Rethink the Selfie first appeared on Metaphysic.ai.
A new system for detecting AI-generated images trains partly on the noise maps typical of Stable Diffusion and similar generative systems, and also uses reverse image search to compare images against online images from 2020 or earlier, before the advent of high-quality AI image systems. The resulting fake detector works even on genAI systems with no public access, such as the DALL-E series and MidJourney. The post Detecting AI-Generated Images With Inverted Stable Diffusion Im... | Metaphysic.ai
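Detectors of this family feed a classifier on noise fingerprints rather than raw pixels. As a generic illustration only (not the paper's actual pipeline), one common way to expose such a fingerprint is a high-pass residual: subtract a local blur from the image so that low-frequency content drops out and generator artifacts remain. The function name and kernel size here are assumptions for the sketch.

```python
import numpy as np

def noise_residual(img, k=3):
    """Crude noise fingerprint: subtract a k-by-k box blur from the image.
    Scene content lives mostly in the blurred copy; generator artifacts
    tend to survive in the high-frequency residual that remains."""
    h, w = img.shape
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    blurred = np.zeros((h, w), dtype=float)
    for dy in range(k):          # accumulate the k*k shifted copies
        for dx in range(k):
            blurred += padded[dy:dy + h, dx:dx + w]
    blurred /= k * k
    return img - blurred

rng = np.random.default_rng(1)
img = rng.random((32, 32))
res = noise_residual(img)
print(res.shape)
```

A constant image yields an all-zero residual, confirming that only high-frequency structure (where diffusion-model noise patterns live) passes through.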
Making people move convincingly in text-to-video AI systems requires that the system have some prior knowledge about the way people move. But baking that knowledge into a huge model presents a number of practical and logistical challenges. What if one were instead free to obtain motion priors from a much wider net of videos, rather than training them, at great expense, into a single model? The post Powering Generative Video With Arbitrary Video Sources first appeared on Metaphysic.ai.
Turning human poses into skeletal stick figures and back into new images via generative AI is fraught with pitfalls, particularly when the pose in question is uncommon, taken from an unusual angle, or otherwise 'out of distribution' for the target generative system. Among systems that perform these tasks for Stable Diffusion, ControlNet's openpose module has become very popular over the last year or so, but new research has improved on it, bringing us nearer to the dream of... | Metaphysic.ai
Gaussian Splatting has taken the VFX scene by storm over the last 6-8 months, and new research backed by Synthesia has produced some of the most impressive Splat-based human avatars yet seen, overtaking the state of the art in tests. But is the added complexity of bolting on neural systems too high a price to pay for the improvements? The post Better Human Facial Synthesis With Gaussian Splatting and Parametric Heads first appeared on Metaphysic.ai.
After years of struggling to make NeRF and GANs more user-friendly, the advent of 3D Gaussian Splatting (based on an older technique formerly used in medical imaging) has enlivened the AI synthesis research scene, which hopes to obtain a newer, more user-friendly alternative to older and less effective CGI technologies. One example is this new project, which allows the user to deform 3D Gaussian Splats with an external cage, just as Hollywood has been doing with CGI fo... | Metaphysic.ai
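The cage idea referenced above is the classic free-form deformation long used in CGI pipelines. As a minimal stand-in for what such a system does to splat centers (this is not the project's actual method), each point inside a box cage is expressed as a trilinear blend of the eight cage corners; moving the corners drags the points along. All names here are illustrative.

```python
import numpy as np

# Unit-cube cage: 8 corners indexed by (i, j, k) bits.
corners = np.array([[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)], float)

def deform(points, moved_corners):
    """Trilinear free-form deformation: points (n, 3) inside the unit cube
    are re-expressed as weighted blends of the 8 (possibly displaced) corners."""
    u, v, w = points[:, 0], points[:, 1], points[:, 2]
    out = np.zeros_like(points)
    idx = 0
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                # Weight is the trilinear basis for corner (i, j, k).
                weight = (u if i else 1 - u) * (v if j else 1 - v) * (w if k else 1 - w)
                out += weight[:, None] * moved_corners[idx]
                idx += 1
    return out

pts = np.array([[0.5, 0.5, 0.5], [0.25, 0.0, 0.0]])
sheared = corners.copy()
sheared[corners[:, 2] == 1, 0] += 0.5   # push the cage's top face +x
print(deform(pts, corners))             # identity cage: points unchanged
print(deform(pts, sheared))             # top-face shear drags interior points
```

Leaving the cage at its rest pose reproduces the input points exactly, while shearing the top face moves only points with height in the cage, mirroring how an artist drags a coarse lattice instead of millions of individual splats.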