AI at Meta

Research Services

Menlo Park, California 912,988 followers

Together with the AI community, we’re pushing boundaries through open science to create a more connected world.

About us

Through open science and collaboration with the AI community, we are pushing the boundaries of artificial intelligence to create a more connected world. We can’t advance the progress of AI alone, so we actively engage with the AI research and academic communities. Our goal is to advance AI in Infrastructure, Natural Language Processing, Generative AI, Vision, Human-Computer Interaction and many other areas, and to enable the community to build safe and responsible solutions to address some of the world’s greatest challenges.

Industry
Research Services
Company size
10,001+ employees
Headquarters
Menlo Park, California
Specialties
research, engineering, development, software development, artificial intelligence, machine learning, machine intelligence, deep learning, computer vision, speech recognition, and natural language processing

Updates

    🎥 Today we’re excited to premiere Meta Movie Gen: the most advanced media foundation models to date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike.

    More details and examples of what Movie Gen can do ➡️ https://go.fb.me/00mlgt
    Movie Gen Research Paper ➡️ https://go.fb.me/zfa8wf

    🛠️ Movie Gen models and capabilities
    • Movie Gen Video: A 30B parameter transformer model that can generate high-quality, high-definition images and videos from a single text prompt.
    • Movie Gen Audio: A 13B parameter transformer model that can take a video input, along with optional text prompts for controllability, and generate high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music and foley sound, delivering state-of-the-art results in audio quality, video-to-audio alignment and text-to-audio alignment.
    • Precise video editing: Using a generated or existing video and accompanying text instructions as input, the model can perform localized edits such as adding, removing or replacing elements, or global changes such as background or style changes.
    • Personalized videos: Using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement.

    We’re continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.

    Meta Sparsh is the first general-purpose encoder for vision-based tactile sensing that works across many tactile sensors and many tasks. The family of models was pre-trained on a large dataset of 460,000+ tactile images using self-supervised learning.

    To help foster new generations of robotics AI research in the academic community, we've released:
    • The PyTorch implementation
    • Pre-trained model weights on Hugging Face
    • Datasets
    • A new research paper

    You can find details on this work here ➡️ https://go.fb.me/hmlavg

    As part of our continued work to help assure the future security of deployed cryptographic systems, we recently released new code that will enable researchers to benchmark AI-based attacks on lattice-based cryptography — and compare them to new and existing attacks going forward. We shared more on our work on Salsa — as well as seven other new releases for the open source community in this post ➡️ https://go.fb.me/h3f1fl

  • View organization page for AI at Meta, graphic

    912,988 followers

    Following the release of our latest system-level safeguards, today we're sharing new research papers outlining our work and findings on Llama Guard 3 1B & Llama Guard 3 Vision, models that support input/output safety in lightweight applications on the edge and in multimodal prompts. Llama Guard 3 1B research paper ➡️ https://go.fb.me/o8y8m1 Llama Guard 3 Vision research paper ➡️ https://go.fb.me/1cb0xh Our hope in releasing this research openly is that it helps practitioners build new customizable safeguard models, and that this work inspires further research and development in LLM safety.

    The NVIDIA team shared more on how they optimized Llama 3.2 on-device and vision models for performance and cost-efficiency from data-center scale all the way to low-power edge devices.

    Whether you're attending #EMNLP2024 in person or following from your feed, here are five research papers being presented by AI research teams at Meta to add to your reading list.
    1. Distilling System 2 into System 1: https://go.fb.me/5l9832
    2. Altogether: Image Captioning via Re-aligning Alt-text: https://go.fb.me/1eanji
    3. Beyond Turn-Based Interfaces: Synchronous LLMs for Full-Duplex Dialogue: https://go.fb.me/e25irp
    4. Memory-Efficient Fine-Tuning of Transformers via Token Selection: https://go.fb.me/c67v9h
    5. To the Globe (TTG): Towards Language-Driven Guaranteed Travel Planning: https://go.fb.me/9cknbp

    Newly published in the latest issue of Science Robotics today: "NeuralFeels with neural fields: Visuotactile perception for in-hand manipulation."

    Sudharshan Suresh, Research Scientist, Boston Dynamics

    For robot dexterity, a missing piece is general, robust perception. Our new Science Robotics article combines multimodal sensing with neural representations to perceive novel objects in-hand. See it on the cover of the November issue! https://lnkd.in/ezZRs5dN

    We estimate pose and shape by learning neural field models online from a stream of vision, touch, and proprioception. The frontend achieves robust segmentation and depth prediction for vision and touch. The backend combines this information into a neural field, while also optimizing for pose.

    Vision-based touch (digit.ml/digit) perceives contact geometries as images, and we train an image-to-depth tactile transformer in simulation. For visual segmentation, we combine powerful foundation models (SAMv1) with robot kinematics. It doubles as a multimodal pose tracker when provided with CAD models of the objects at runtime.

    For different levels of occlusion, we find that “touch, at the very least, refines and, at the very best, disambiguates visual estimates during in-hand manipulation.”

    We release a large dataset of real-world and simulated visuo-tactile interactions and tactile transformer models on Hugging Face: bit.ly/hf-neuralfeels

    This has been in the pipeline for a while, thanks to my amazing collaborators from AI at Meta, Carnegie Mellon University, University of California, Berkeley, Technische Universität Dresden, and CeTI: Haozhi Qi, Tingfan Wu, Taosha F., Luis Pineda, Mike Maroje Lambeta, Jitendra Malik, Mrinal Kalakrishnan, Roberto Calandra, Michael Kaess, Joseph Ortiz, and Mustafa Mukadam

    Paper: https://lnkd.in/ezZRs5dN
    Project page: https://lnkd.in/dCPCs4jQ
    #ScienceRoboticsResearch

    • Science Robotics November Cover

    Together with Reskilll, we hosted the first official Llama Hackathon in India. This hackathon brought together 270+ developers & 25+ mentors from across industries in Bengaluru. The result? 75 impressive new projects built with Llama in just 30h of hacking! Read the full recap, including details on some of the top projects like CurePharmaAI, CivicFix, Evalssment and Aarogya Assist ➡️ https://go.fb.me/0n8xkz

    Join us in San Francisco (or online) this weekend for a Llama Impact Hackathon! Teams will be spending two days building new ideas and solutions with Llama 3.1 + Llama 3.2 vision and on-device models.

    Three challenge tracks for this hackathon:
    1. Expanding Low-Resource Languages
    2. Reducing Barriers for Llama Developers
    3. Navigating Public Services

    Join us and build for a chance to win awards from a $15K prize pool ➡️ https://go.fb.me/vnzbd3
