Metaphysic, alongside industry leaders such as Amazon, Anthropic, Civitai, Google, Meta, Microsoft, Mistral AI, OpenAI, and Stability AI, has committed to implementing robust child safety measures in the development, deployment, and maintenance of generative AI technologies. This initiative, led by Thorn, a nonprofit dedicated to defending children from sexual abuse, and All Tech Is Human, an organization dedicated to collectively tackling tech and society's complex problems, aims to mitigate...
It can be difficult to control the camera viewpoint in text-to-video systems, which chiefly rely on the user's text instructions to generate tracking shots. A more precise way of imposing camera viewpoint is to extract motion from an existing video into the generated video. However, the newest crop of T2V systems cannot use prior methods that obtain this functionality, because their architectures are different. Therefore, researchers have now presented a novel and - they argue - superior meth...
A new collaboration between China and Denmark offers a way to extract traditional CGI meshes and textures from implicit neural human avatars - a task that is extraordinarily challenging, but which could pave the way for more controllable AI-generated imagery and video in the future. The post Extracting Controllable CGI From the ‘Black Box’ of Neural Human Avatars first appeared on Metaphysic.ai.
Text-to-image and text-to-video models such as Stable Diffusion and Sora rely on datasets of images that include captions which accurately describe the photos in the collection. Most often, these captions are either inadequate or inaccurate - frequently both. Sometimes they're downright deceptive, damaging models trained on them. But the research sector's hope that multi-modal large language models can create better captions is challenged in a recent paper from NVIDIA and Chinese researc...
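To make the recaptioning idea concrete, below is a minimal sketch of the kind of MLLM-based captioning pipeline under discussion, assuming an off-the-shelf model (BLIP-2 via the HuggingFace transformers library) rather than the specific models examined in the paper; the checkpoint ID is the commonly used public one, and the input file name is a placeholder.

```python
# A minimal sketch of MLLM-based recaptioning, using BLIP-2 via HuggingFace
# transformers. This is an illustrative stand-in, not the pipeline from the
# paper discussed above; "photo.jpg" is a placeholder file name.
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=dtype
).to(device)

image = Image.open("photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt").to(device, dtype)
generated = model.generate(**inputs, max_new_tokens=40)
caption = processor.decode(generated[0], skip_special_tokens=True)
print(caption)  # a machine-written caption that would replace the original alt-text
```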
Using AI to modify selfies is a relatively hot pursuit in the research sector, not least because it is a potentially lucrative line of inquiry. This new paper proposes using a Generative Adversarial Network to re-imagine an already-taken selfie so that the subject appears to have been photographed by someone else, while also removing the distortion that wide lenses and close proximity can introduce. The post Using Generative Adversarial Networks to Rethink the Selfie first appeared on Metaphysic.ai.
A new system for detecting AI-generated images trains partially on the noise-maps typical of Stable Diffusion and similar generative systems, and also uses reverse image search to compare suspect images against online images from 2020 or earlier, before the advent of high-quality AI image systems. The resulting fake detector works even on generative systems whose internals are not publicly accessible, such as the DALL-E series and Midjourney. The post Detecting AI-Generated Images With Inverted Stable Diffusion Im...
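As a rough illustration of the first of those two signals, the sketch below shows a small classifier over inverted Stable Diffusion latents. It is an assumed, simplified stand-in for the paper's detector, not its actual architecture; the 4x64x64 latent shape simply matches standard SD 1.5 latents.

```python
# An assumed, simplified sketch of the "noise-map" half of such a detector:
# a small CNN that scores a latent recovered by DDIM inversion as real vs.
# AI-generated. It is not the architecture from the paper discussed above.
import torch
import torch.nn as nn

class NoiseMapClassifier(nn.Module):
    def __init__(self, latent_channels: int = 4):  # Stable Diffusion latents have 4 channels
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(latent_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 1),  # single logit: higher means "likely AI-generated"
        )

    def forward(self, inverted_latent: torch.Tensor) -> torch.Tensor:
        return self.net(inverted_latent)

# Hypothetical usage with a precomputed inverted latent (batch of one, 4x64x64):
clf = NoiseMapClassifier()
score = torch.sigmoid(clf(torch.randn(1, 4, 64, 64)))
print(f"probability the image is AI-generated: {score.item():.2f}")
```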
Making people move convincingly in text-to-video AI systems requires that the system have some prior knowledge about the way people move. But baking that knowledge into a huge model presents a number of practical and logistical challenges. What if one were instead free to obtain motion priors from a much wider net of videos, rather than training them, at great expense, into a single model? The post Powering Generative Video With Arbitrary Video Sources first appeared on Metaphysic.ai.
Turning human poses into skeletal stick-figures and back into new images via generative AI is fraught with pitfalls, particularly when the pose in question is uncommon, taken from an unusual angle, or in some way 'out of distribution' for what the target generative system is expecting. Among systems that perform these tasks for Stable Diffusion, ControlNet's openpose module has become very popular in the last year or so - but new research has improved on it, bringing us nearer to the dream of...
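For readers unfamiliar with the baseline being improved upon, the snippet below is a minimal sketch of pose-conditioned generation with ControlNet's openpose module through the diffusers and controlnet_aux libraries; the checkpoint names are the commonly used public ones, and the file names and prompt are placeholders.

```python
# A minimal sketch of the incumbent openpose workflow for Stable Diffusion,
# using the diffusers and controlnet_aux libraries. File names and the prompt
# are placeholders; this illustrates the baseline, not the new research above.
import torch
from PIL import Image
from controlnet_aux import OpenposeDetector
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# 1. Extract a skeletal stick-figure from a reference photograph.
openpose = OpenposeDetector.from_pretrained("lllyasviel/ControlNet")
pose_image = openpose(Image.open("reference_person.jpg"))

# 2. Generate a new image constrained to that pose.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

result = pipe(
    "a chef juggling in a commercial kitchen, photorealistic",
    image=pose_image,
    num_inference_steps=30,
).images[0]
result.save("posed_output.png")
```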
Gaussian Splatting has taken the VFX scene by storm over the last 6-8 months, and new research backed by Synthesia has produced some of the most impressive Splat-based human avatars ever seen, overtaking the state of the art in tests. But is the added complexity of bolting on neural systems too high a price to pay for the improvements? The post Better Human Facial Synthesis With Gaussian Splatting and Parametric Heads first appeared on Metaphysic.ai.
After years in which researchers struggled to get NeRF and GANs into a more user-friendly configuration, the advent of 3D Gaussian Splatting (based on an older technique formerly used in medical imaging) has enlivened the AI synthesis research scene, which hopes to obtain a more user-friendly successor to older and less effective CGI technologies. One example is this new project, which allows the user to deform 3D Gaussian Splats with an external cage - just as Hollywood has been doing with CGI fo...
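To give a sense of what cage-based deformation means in practice, here is a minimal, assumed sketch of deforming Gaussian centres with a single axis-aligned box cage and trilinear weights. Real cage deformers (and the project discussed above) use general cages and more sophisticated coordinates; the function and variable names here are illustrative.

```python
# An assumed, simplified sketch of cage-based deformation applied to Gaussian
# Splat centres: an axis-aligned box cage whose eight corners are displaced,
# with displacements spread to the enclosed points by trilinear interpolation.
import numpy as np

def trilinear_cage_deform(points, cage_min, cage_max, corner_offsets):
    """points: (N, 3); corner_offsets: (2, 2, 2, 3) displacement per cage corner."""
    t = np.clip((points - cage_min) / (cage_max - cage_min), 0.0, 1.0)
    displacement = np.zeros_like(points)
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                # Trilinear weight of corner (i, j, k) for every point.
                w = ((t[:, 0] if i else 1 - t[:, 0]) *
                     (t[:, 1] if j else 1 - t[:, 1]) *
                     (t[:, 2] if k else 1 - t[:, 2]))
                displacement += w[:, None] * corner_offsets[i, j, k]
    return points + displacement

# Hypothetical usage: lift the cage's top (max-y) face, stretching the splats it encloses.
splat_centres = np.random.rand(1000, 3)        # stand-in for Gaussian centres
offsets = np.zeros((2, 2, 2, 3))
offsets[:, 1, :, 1] = 0.25                     # move the four max-y corners upward in y
deformed_centres = trilinear_cage_deform(splat_centres, np.zeros(3), np.ones(3), offsets)
```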