site:www.marktechpost.com

LongCat-Flash-Omni: A SOTA Open-Source Omni-Modal Model with 560B Parameters with 27B activated, Excelling at Real-Time Audio-Visual Interaction

How do you design a single model that can listen, see, read, and respond in real time across text, image, video, and audio without losing efficiency? Meituan’s LongCat team has released LongCat ...

marktechpost

Comparing the Top 6 OCR (Optical Character Recognition) Models/Systems in 2025

Optical character recognition has moved from plain text extraction to document intelligence. Modern systems must read scanned and digital PDFs in one pass, preserve layout, detect tables, extract key ...

Google AI Unveils Supervised Reinforcement Learning (SRL): A Step Wise Framework with Expert Trajectories to Teach Small Language Models to Reason through Hard Problems

How can a small model learn to solve tasks it currently fails at, without rote imitation or relying on a correct rollout? A team of researchers from Google Cloud AI Research and UCLA has released a ...

Ant Group Releases Ling 2.0: A Reasoning-First MoE Language Model Series Built on the Principle that Each Activation Enhances Reasoning Capability

Every Ling 2.0 model uses the same sparse Mixture of Experts layer. Each layer has 256 routed experts and one shared expert. The router picks 8 routed experts for every token, the shared expert is ...

Microsoft Releases Agent Lightning: A New AI Framework that Enables Reinforcement Learning (RL)-based Training of LLMs for Any AI Agent

How do you convert real agent traces into reinforcement learning (RL) transitions to improve policy LLMs without changing your existing agent stack? The Microsoft AI team has released Agent Lightning to help ...

IBM AI Team Releases Granite 4.0 Nano Series: Compact and Open-Source Small Models Built for AI at the Edge

What is new in the Granite 4.0 Nano series? Granite 4.0 Nano consists of four model lines and their base counterparts. Granite 4.0 H 1B uses a hybrid SSM-based architecture and has about 1.5B parameters.

MiniMax Releases MiniMax M2: A Mini Open Model Built for Max Coding and Agentic Workflows at 8% Claude Sonnet Price and ~2x Faster

Can an open-source MoE truly power agentic coding workflows at a fraction of flagship model costs while sustaining long-horizon tool use across MCP, shell, browser, retrieval, and code? MiniMax team ...

Zhipu AI Releases ‘Glyph’: An AI Framework for Scaling the Context Length through Visual-Text Compression

Can we render long texts as images and use a VLM to achieve 3–4× token compression, preserving accuracy while scaling a 128K context toward 1M-token workloads? A team of researchers from Zhipu AI ...

5 Common LLM Parameters Explained with Examples

Max Tokens is the maximum number of tokens the model can generate during a run. The model will try to stay within this limit across all turns. If it exceeds the specified number, the run will stop and ...

A New AI Research from Anthropic and Thinking Machines Lab Stress Tests Model Specs and Reveal Character Differences among Language Models

AI companies use model specifications to define target behaviors during training and evaluation. Do current specs state the intended behaviors with enough precision, and do frontier models exhibit ...

Google vs OpenAI vs Anthropic: The Agentic AI Arms Race Breakdown

In this article we will analyze how Google, OpenAI, and Anthropic are productizing ‘agentic’ capabilities across computer-use control, tool/function calling, orchestration, governance, and enterprise ...

Salesforce AI Research Introduces WALT (Web Agents that Learn Tools): Enabling LLM agents to Automatically Discover Reusable Tools from Any Website

Web agents often fail when layouts shift or when tasks require long sequences. WALT targets this failure mode by mining site functionality offline, then exposing it as tools that encapsulate ...
