Vibe Coding

Explore articles tagged with Vibe Coding

#AI#Artificial Intelligence#OpenAI

Why Weibo’s tiny VibeThinker-3B has the AI world arguing over benchmarks again

On Sunday, a team of nine researchers at Sina Weibo — the Chinese social media giant better known for its microblogging platform than for cutting-edge artificial intelligence — quietly posted a 14-page technical report to arXiv that sent shockwaves through the AI research community. Their claim: a language model with just 3 billion parameters can match or exceed the reasoning performance of flagship systems from Google DeepMind, OpenAI, Anthropic, and DeepSeek that are hundreds of times larger.

#AI#Technology#Vibe Coding

EU publishes its AI content labelling playbook ahead of the AI Act’s August deadline

The European Union has published its AI content labelling playbook, a voluntary Code of Practice meant to help companies meet transparency rules that become law across the bloc on August 2. The European Commission released the final Code on 10 June, setting out practical steps for the businesses that build and use generative AI to mark […] The post EU publishes its AI content labelling playbook ahead of the AI Act’s August deadline appeared first on AI News.

#AI#Large Language Models#NLP

Kimi K2.7-Code cuts thinking tokens 30% — but practitioners say the benchmarks don't check out

Moonshot AI released Kimi K2.7-Code this week, an open-source update to its K2 coding model family, claiming leaner reasoning and double-digit performance gains.

#AI#Anthropic#Claude

Anthropic brings Mythos to the masses with Claude Fable 5, its most powerful generally available model ever

Anthropic today launched two new AI models, Claude Fable 5 and Claude Mythos 5, marking the company’s first broad release of the powerful “Mythos-class” AI capabilities it previously made available only to participating organizations in its restricted cybersecurity program, Project Glasswing, which it announced two months ago. The company says Fable 5, which is the version most users and developers will get starting today, exceeds every Claude model it has previously made generally available.

#AI#Technology#Vibe Coding

Agentic AI solved coding — and exposed every other problem in software engineering

Agentic AI is now a core part of the engineering process, driving massive execution leverage and helping us generate more code than ever before. Yet, a difficult question I’ve increasingly heard from business leaders is: if we’re shipping code faster than ever, why aren’t our products improving at the same rate? The reason is that writing code was never the rate limiter.

#AI#Meta#Technology

Meta Business Agent drives AI-powered conversational commerce

Meta has launched Business Agent to automate conversational commerce workflows directly inside its messaging applications. The software allows global retail brands to execute transactions and field support tickets without human intervention.

#AI#Anthropic#Claude

Anthropic says 80% of its new production code is now authored by Claude — how your enterprise can keep up

Anthropic co-founder and CEO Dario Amodei said it was coming, but it still feels like a milestone: More than 80% of the code merged into Anthropic’s production codebase in May wasn't authored by humans, but by its own AI model, Claude, according to a new report shared by the record-breaking AI startup today. This transformation has triggered an 8x increase in the volume of code shipped per engineer per quarter compared to the company’s 2021–2025 baseline.

#AI#Large Language Models#NLP

Walmart’s AI workflows meet the realities of the balance sheet

Walmart has reportedly begun limiting employees’ use of an internal AI assistant called Code Puppy after demands placed on the LLM backing the tool were higher than expected. Employees of Walmart were encouraged to use Code Puppy without any stricture or stipulations as to the limits of use, but Walmart is now assigning employees a […] The post Walmart’s AI workflows meet the realities of the balance sheet appeared first on AI News.

#AI#Anthropic#Claude

Anthropic IPO filing marks AI maturing into enterprise utility

Anthropic’s IPO filing marks the maturation of generative AI from a research-heavy venture phase into a stabilised enterprise utility. Model developers operating in private markets have prioritised rapid iteration and maximum compute performance over predictable billing cycles.

#AI#Microsoft#Technology

Enterprise AI agents keep creating data silos. Microsoft's Build answer is Microsoft IQ and Rayfin.

Every new AI agent your team deploys starts from scratch: no memory of how the business works, where data lives, or what rules apply. And as agentic coding tools spin up applications faster than anyone can govern them, each one risks becoming another silo outside your data layer entirely.

#AI#Artificial Intelligence#Google AI

AI in video game development: How artificial intelligence is reshaping the industry

A Google Cloud survey found that 90% of developers are already integrating AI into their daily work, and on Steam, 7,818 titles disclosed AI use in 2025 alone, a 681% increase over the previous year. AI in video game development is not a side experiment.

#AI#Large Language Models#NLP

MIT's MeMo lets teams swap in a better LLM without retraining — and performance jumps 26%

Enabling LLMs to acquire new knowledge after training remains a major hurdle for enterprise AI — current solutions are either too expensive, too slow, or constrained by context window limits. MeMo, a framework from researchers at multiple universities, encodes new knowledge into a dedicated smaller memory model that operates separately from the main LLM.

#AI#Anthropic#Claude

Anthropic's Claude Opus 4.8 is here with 3X cheaper fast mode and near-Mythos level alignment

Anthropic today released Claude Opus 4.8, an upgrade to its flagship model that ships at the same price as its predecessor, alongside a dramatically cheaper "fast mode" tier and a new feature that lets the model spawn hundreds of parallel subagents for codebase-scale work.

#AI#ChatGPT#OpenAI

DeepSWE blows up the AI coding leaderboard, crowns GPT-5.5, and finds Claude Opus exploiting a benchmark loophole

For months, the leading AI coding benchmarks have told enterprise buyers a comforting but misleading story: the top models are all roughly the same. OpenAI's GPT-5 family, Anthropic's Claude Opus, and Google's Gemini Pro have clustered within a narrow band on Scale AI's SWE-Bench Pro leaderboard, making it nearly impossible for engineering leaders to determine which agent will actually perform best inside their codebases.

#AI#AI Models#Machine Learning

Why prompt debt, retrieval debt, and evaluation debt are quietly reshaping enterprise AI risk

Over the past two decades, technical debt meant outdated architecture, messy code, and poorly maintained documentation. That definition is no longer sufficient in the AI era, where failure modes are more subtle and often non-linear.

#AI#AI Models#Machine Learning

Your AI agents need a terminal, not just a vector database

When agentic workflows fail, developers often assume the problem lies in the underlying model’s reasoning abilities. In reality, the limited information provided by the retrieval interface is often the primary limiting factor.

#AI#Anthropic#Claude

The Download: coding’s future, the ‘Steroid Olympics,’ and AI-driven science

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. Anthropic’s Code with Claude showed off coding’s future—whether you like it or not: At Anthropic’s developer event in London this week, Code with Claude, attendees were asked if they’d shipped code….

#AI#Anthropic#Claude

Alibaba's proprietary Qwen3.7-Max can run for 35 hours autonomously and supports external harnesses like Anthropic's Claude Code

The AI industry has fully entered the "agent era," a paradigm where AI models do far more than generate text — they now actively plan, execute, and course-correct complex tasks over days rather than seconds. Thus, it's perhaps unsurprising to see Chinese e-commerce giant Alibaba's famed Qwen Team of AI researchers release a model capable of performing autonomous agentic AI work over multiple days: that model has arrived in the form of Qwen3.

#Anthropic#AI#Claude

Anthropic’s Code with Claude showed off coding’s future—whether you like it or not

The vibes were strong at Code with Claude, Anthropic’s two-day event for software developers in London that kicked off on May 19, the same day as Google’s I/O in Palo Alto. (A coincidence, not a flex, Anthropic staffers assured me.)

#AI#Artificial Intelligence#Google AI

Google says Gemini 3.5 Flash can slash enterprise AI costs by more than $1 billion a year

Google unveiled Gemini 3.5 Flash at its annual I/O developer conference on Tuesday, a new artificial intelligence model that the company says shatters what had become a seemingly iron law of the AI industry: that the smartest models must also be the slowest and most expensive to run.