
Mark Zuckerberg Was Early in AI. Now Meta Is Trying to Catch Up.

By Karen Hao, Salvador Rodriguez and Deepa Seetharaman
June 16, 2023 11:00 pm ET


Meta is doing something Mark Zuckerberg doesn’t like: playing catch-up.

A decade ago, the company founder and CEO saw the promise of artificial intelligence and invested large sums of money in its advancement. He hired one of its early visionaries, Yann LeCun, to lead the charge. Now, just months after OpenAI’s ChatGPT burst into the consumer marketplace, Meta is falling behind in the very same technology.

Meta is now scrambling to refocus its resources to generate usable AI products and features, including its own chatbots, after spending years prioritizing academic discoveries and sharing them freely while struggling to capitalize on their commercial potential.


That’s a tall order, as many of Meta’s top AI employees have departed and the company has undergone its own rounds of layoffs in what Zuckerberg has called a “year of efficiency.” About a third of Meta workers who co-authored published AI research related to large language models—the complex systems that power AI systems like ChatGPT—have left in the last year, according to a Wall Street Journal analysis.

Zuckerberg himself and other top executives have taken more control of the company’s AI strategy. They created a new generative AI group that reports directly to Chief Product Officer Chris Cox, one of the longest-serving and most trusted executives at Meta. The group is training generative AI models—which produce content, such as text, images or audio—intended to be infused into “every single one of our products,” Zuckerberg said. He has touted Meta’s flagship AI language model, called LLaMA, which — after its code leaked — spurred the emergence of homegrown tools that could one day compete with the products that Google and OpenAI are trying to sell.

If Meta succeeds in commercializing its AI efforts, it could help boost its user engagement, create a better metaverse and make the company more attractive to the young users who are now proving harder for it to attract. If Meta can’t capitalize on this technology fast enough, it runs the risk of losing relevance as competitors, including a fast-growing crop of scrappy AI startups, leap ahead.

In a statement, Joelle Pineau, VP of AI research at Meta, said the company is not behind in AI and defended its focus on research and its structure, saying they will position Meta for success. Meta’s AI research unit “is one of the world’s leading destinations for AI researchers and open science, and its research output has increased significantly over the last year alone,” said Pineau. “Our research breakthroughs have provided a tremendous foundation to build on as we bring a new class of generative AI-powered experiences to our family of apps. We’re proud of the contributions that Meta’s AI researchers, past and present, are making to help shape the future of advanced state-of-the-art AI.”

Illustration: Kathleen Fu

Zuckerberg on Friday announced an AI model called Voicebox that can read text prompts aloud in a number of different ways, or edit audio recordings with the help of text prompts to remove background noise, like the bark of a dog. Meta didn’t say when the research project will become available to the public.

This article is based on interviews with more than a dozen current and former Meta employees, reviews of LinkedIn and social-media profiles and startup news announcements.

Zuckerberg and other executives have called AI a third leg to Meta’s stool, believing it essential to the company’s long-term growth and relevance, alongside global connectivity and virtual and augmented reality. Lagging behind in AI threatens to make Meta appear stodgy and slow, instead of the nimble, aggressive upstart that coined the phrase “move fast and break things” and set the pace of innovation in Silicon Valley. 

In May, the White House didn’t invite Meta to a summit of AI leaders, billed as a meeting of “companies at the forefront of AI innovation.” 

Meta has taken sharp turns before at moments when it has appeared behind, such as when it transitioned Facebook from a desktop to a mobile-first ads business, or in 2016, when it launched its Stories feature on Instagram to lure people away from Snapchat.

Meta faces other strategic, political and financial challenges. The longtime heavy focus on original research in Meta’s AI division disincentivized work on generative AI, the class of systems, like ChatGPT, that produce humanlike text and media. Executives misstepped in designing the hardware required to run such AI programs, a mistake the company is now trying to correct. Years of scrutiny into the company’s handling of user data and human-rights violations have made some executives indecisive and wary of launching new AI products for consumers.

Meta began investing in AI in 2013. Zuckerberg and then-CTO Mike Schroepfer personally sought to recruit one of the leading minds in AI to lead a new research division to advance the technology. They found their lieutenant in LeCun, a New York University professor whose breakthrough work in the field was renowned.

Yann LeCun has run Meta’s artificial intelligence research unit for a decade and has maintained an academic approach to breakthroughs in AI focused on publishing and sharing discoveries.

Photo: Nathan Laine/Bloomberg News

LeCun, deeply rooted in academia and fundamental research, was instrumental in creating a culture that reflected his priorities: hiring scientists over engineers and emphasizing academic outputs, such as research papers, over product development for the company’s end users. The strategy made Meta’s fundamental AI research lab highly attractive to top talent over the years, but challenged the company’s ability to commercialize its advancements, people familiar with the matter said. 

It also encouraged a diffuse, bottom-up approach to research direction and resource allocation. Researchers drove their own agendas, pursuing independent projects in different directions rather than working toward a cohesive companywide strategy, the people said. Meta divvied up hardware into small pools across each project: Some researchers, given more computer chips than they needed, would tie them up in unnecessary tasks to avoid relinquishing them, some of the people said.

Meanwhile, Meta was slow to equip its data centers with the most powerful computer chips needed for AI development. Even as the company acquired more of these chips, it didn’t have a good system for getting them into the hands of engineers and researchers. At times thousands of pieces of coveted and expensive hardware sat around unused, some of the people said. 

Meta is in the process of overhauling its data centers, a transition that could have contributed to the logjams. As of May, Meta’s latest supercomputer for AI projects had 16,000 such chips, according to a company blog post.

As large language models began to show increasingly impressive capabilities in 2020, tension mounted within Meta’s AI research division between those who urged the company to invest seriously in the industry’s new direction, and those, including LeCun, who believed such models were fads that lacked scientific value, people familiar with the matter said. LeCun’s strong opposition to large language models, voiced both internally and publicly (he believes they don’t bring AI closer to human-level intelligence), made it difficult for researchers with opposing views to amass the support and vast resources needed for those kinds of projects, some of the people said.

Some Meta researchers pressed forward anyway with fewer resources, using around 1,000 chips to produce a large language model in 2022 known as OPT, or Open Pretrained Transformer, and around 2,000 chips to produce Meta’s flagship model called LLaMA in 2023. The industry standard, by contrast, is 5,000 to 10,000 chips. Meta initially allowed a limited group of outside researchers access to LLaMA before it leaked online, sparking a burst of innovation that executives cite as a prime example of Meta’s goal to share its AI technology.

Meta has since lost numerous AI researchers who worked on these and other key generative AI projects in the last year, many citing burnout or a lack of confidence in Meta to keep up with competitors. Six of the 14 authors listed on the research paper for LLaMA have left or announced they will be departing, according to their LinkedIn profiles and people familiar with the matter. Eight of the 19 co-authors on the paper for OPT have left as well.

The departures have accelerated following OpenAI’s release of ChatGPT in November of last year. Some have been lured by AI startup fever, which has fueled staffing changes at Silicon Valley companies across the board, including at Google. As of March, the number of job listings on LinkedIn mentioning GPT is up 79% year-over-year, the professional social network told The Wall Street Journal.

A Meta spokesman said the company has continued to recruit and brought in new AI talent.

After ChatGPT’s debut, Zuckerberg and Cox joined Chief Technology Officer Andrew Bosworth in overseeing all of the company’s AI-related efforts. The three executives are now spending hours a week on AI, participating in meetings and approving AI projects.

The new generative AI group is focused exclusively on building usable products and tools instead of on scientific research. It received over 2,000 internal applications and has rapidly amassed hundreds of people from different teams. Hardware resources have shifted over from the AI research division and are being used to train new generative AI models, people familiar with the work said.

In March, Zuckerberg said that “advancing AI and building it into every one of our products” was the company’s single largest investment. Speaking at Meta’s annual shareholder meeting in May, Zuckerberg said the company also hopes to extend the technology to the metaverse.

Meta headquarters in Menlo Park, Calif.

Photo: David Paul Morris/Bloomberg News

At a town hall meeting with employees earlier this month, Zuckerberg announced a number of generative AI products that the company is currently working on, the Meta spokesman said. The initiatives include AI agents for Messenger and WhatsApp, AI stickers that users can generate from text prompts and share in their chats and a photo generation feature that will allow Instagram users to modify their own photos using text prompts and then share them in Instagram Stories. 

Zuckerberg also shared some internal-only generative AI tools geared toward employees, including one called Metamate, a productivity assistant that pulls information from internal sources to perform tasks at employees’ request. Metamate was recently rolled out to a large group of employees as part of a trial run, the Meta spokesman said. 

“In the last year, we’ve seen some really incredible breakthroughs—qualitative breakthroughs—on generative AI,” Zuckerberg said at the town hall.

Meta still faces broad challenges. The company’s increasingly low tolerance for risk following seven years of intense government and media scrutiny for its user-privacy practices has created friction about how and when to introduce AI products, people familiar with the matter said.

In the past, Meta has had to consider its public reputation when developing and releasing large language models, which can be prone to churning out incorrect answers or offensive remarks. 

Several years ago, AI researchers were working on a chatbot code-named Tamagobot, based on an early version of a large-language-model system, according to people familiar with the matter. The team was impressed by its performance, but concluded that it wasn’t worth launching while the company was facing intense criticism for allowing misinformation to flourish on its platform during the 2016 presidential election, one of the people said. 

The concern around public scrutiny was also on display when Meta released its BlenderBot 3 chatbot in August 2022. Within a week of launching, BlenderBot 3 was panned for making false statements, offensive remarks and racist comments. The system also called Zuckerberg “creepy and manipulative.”

The Meta spokesman said the project remained available for over a year, until the research concluded, and that the company maintained an open and transparent approach throughout its life cycle. Meta has released and seen through many other projects that demonstrate the company’s willingness to take risks, he added.

But the scenario played out again in November 2022, when the company released Galactica, a science-focused large language model. Meta shut the system down within three days of its release after scientists criticized it for producing incorrect and biased answers.

Two weeks later, OpenAI released ChatGPT.

Write to Karen Hao at [email protected], Salvador Rodriguez at [email protected] and Deepa Seetharaman at [email protected]

