Wolf Digest — 2026-04-25

#1

DeepSeek V4 deep dive: 1.6T-A49B Pro / 284B-A13B Flash, runnable on Huawei Ascend

Frontier LLMs 2026-04-25 Latent SpaceMIT Tech Review AIAI Explained

8.7

I 8.5 Im 8.0 P 8.5

The DeepSeek V4 follow-up coverage that landed Friday and Saturday is denser than the initial Thursday drop and has shifted the framing of the model. Latent Space's AINews newsletter on Saturday morning leads with the hardware-sovereignty angle: V4 in both its sizes — Pro at one point six trillion total parameters with forty-nine billion active and Flash at two hundred eighty-four billion total with thirteen billion active — is now confirmed runnable on Huawei Ascend chips, not just on NVIDIA. That detail is being read as the practical end of the export-control bottleneck for Chinese frontier deployments, since Ascend supply is domestic and the V4 inference stack appears to be hand-tuned for the Ascend memory hierarchy. Latent Space frames the release as DeepSeek stepping out of the benchmark-leadership game — V4 is no longer claiming to top the public leaderboards — and into a compute-economics game, where the relevant comparison is intelligence per dollar at deployment. The prodigal-tiger-returns metaphor in the headline captures both: DeepSeek skipped a quarter of public model news entirely between V three point two Speciale in November and now, then dropped two preview models with full open weights and aggressive pricing in one week.

MIT Technology Review's Friday piece, Three Reasons Why DeepSeek's New Model Matters, takes a different cut. The first reason is the size milestone — at one point six trillion total parameters under MIT license, V4 Pro is the largest open-weights model ever released, larger than Kimi K two point six at one point one trillion and twice the size of V three point two. The second is the disclosed training cost, which DeepSeek continues to claim is roughly an order of magnitude below comparable Western runs; the report notes the same caveats every analyst has been raising, namely that the figure excludes prior research compute and likely undersells the real total, but also notes that even a charitable correction leaves DeepSeek noticeably ahead on cost efficiency. The third reason, and the most consequential for downstream deployments, is pricing: V4 Flash at fourteen cents per million input tokens and twenty-eight cents output now sits below GPT five point four Nano, Gemini three point one Flash Lite, and Claude Haiku four point five on every dimension, and V4 Pro at one dollar seventy four input and three forty eight output is roughly a fifth the price of GPT five point four and an order of magnitude cheaper on output than Claude Opus four point seven or GPT-5.5. The piece closes on the export-control angle Latent Space also flags — that the marginal effect of restrictions on Chinese AI capability has diminished sharply.

AI Explained's Friday video synthesizes both threads under a single 'compute war intensifies' frame. The argument is that the dominant axis of model competition has shifted from one-dimensional intelligence scores to intelligence-per-dollar Pareto curves, and that V4 has now planted a flag at the cost-efficient frontier in a way no closed lab can match without disclosing pricing. The video also highlights the qualitative gap that remains — V4's tool-use, agentic-coding, and reasoning behaviours are described as roughly Claude four point five class, not Opus four point seven or GPT-5.5 class, but the open-weights property and the cost differential more than compensate for many production workloads. Together the three takes form the consensus view that V4 is the new floor for what 'good enough' open-weights frontier-class deployment looks like, and that the next round of pricing pressure on US-hosted APIs is now baked in.

How it was discussed

Latent Space emphasized that V4 is now runnable on Huawei Ascend chips — a hardware-sovereignty signal beyond the parameter count.
MIT Tech Review framed V4 around three reasons it matters: training-cost claims, MIT-licensed open weights at frontier scale, and pricing pressure on US-hosted APIs.
AI Explained placed V4 alongside GPT-5.5 in a single 'compute war intensifies' frame, arguing the relevant axis is now intelligence-per-dollar Pareto curves rather than headline benchmarks.

deepseekv4open-weightsmoehuawei-ascend

#2

Google to invest up to $40B in Anthropic in cash and compute

Industry 2026-04-24 TechCrunch AI

8.5

I 8.5 Im 8.0 P 8.0

TechCrunch's reporting on Friday details Google's commitment to invest up to forty billion dollars in Anthropic in a mix of cash and Google Cloud compute credits, structured to be drawn against incrementally as Anthropic scales its training infrastructure. The number puts Google's total exposure to Anthropic in roughly the same order of magnitude as Microsoft's exposure to OpenAI and follows Anthropic's separate announcement Monday of an expanded Amazon collaboration for up to five gigawatts of new compute. Read together, the two deals mean Anthropic is now contracting compute capacity across both major cloud incumbents — AWS for Trainium-based pre-training and inference, Google Cloud for TPU access — at a scale that puts the lab's total compute footprint into the hundreds of thousands of accelerators by the time the announced commitments come fully online.

The strategic shape is unusual. The Amazon partnership has been the headline relationship since 2023, with Anthropic running its largest training jobs on Trainium 2 and now Trainium 3, and Amazon distributing Claude through Bedrock as the lab's primary cloud customer surface. The Google relationship has historically been smaller, focused on TPU access for select workloads. The forty-billion-dollar commitment, if drawn down at anything like the headline rate, repositions Google as a co-equal infrastructure backer rather than a secondary cloud option, and aligns with Sundar Pichai's recent posture of treating AI capacity as a multi-vendor strategic priority rather than a Google-internal advantage. For Google specifically, the deal locks in Anthropic as a TPU customer and gives Google's silicon programme another large frontier-lab workload to optimise against — a hedge against the possibility that NVIDIA's pricing power continues to dominate the alternative.

For Anthropic, the read is that the company is now financed to keep pace with OpenAI's compute trajectory through at least the next two model generations after Opus four point seven. The fundraise pattern — drawing from both cloud incumbents simultaneously rather than picking one — means the lab maintains optionality on training stack and avoids the lock-in dynamics that have shaped OpenAI's relationship with Microsoft. It also continues a pattern across the frontier-lab landscape where capital is moving in increments of tens of billions of dollars at a time, and where the marginal model of competitive moat has shifted away from algorithmic innovation toward access to power, land, transformer capacity, and chip allocation. The downstream policy question — already being raised by analysts — is what concentration of compute access this implies for US frontier labs versus Chinese open-weights alternatives like DeepSeek V4, which became fully MIT-licensed and Huawei-Ascend-runnable the same week. The two stories together represent the shape of competition heading into the back half of 2026: capital-heavy Western labs versus open-weights Chinese alternatives running on domestic silicon.

anthropicgooglefundingtpu

#3

Apple's new CEO, and why Elon Musk wants to buy Cursor for $60B

Industry 2026-04-24 TechCrunch AI

8.1

I 7.5 Im 7.5 P 8.5

Friday's reporting in TechCrunch covers two intertwined transitions that together reshape the AI-coding-assistant market and Apple's posture in the platform layer. The first is Apple's CEO transition: Tim Cook is reportedly stepping down after fourteen years and the board has selected a successor — naming details have been confirmed in Friday's coverage and Stratechery's same-day weekly roundup positions the change as the formal end of the Cook era. The relevance for AI is that Cook's tenure has been unusually conservative on AI strategy compared to peers, and the transition coincides with Apple Intelligence still trailing Google's Gemini integration and Microsoft's Copilot in both capability and platform integration. Whether the new CEO accelerates Apple's posture — including possible acquisitions in the AI infrastructure or models space — is now the open question.

The second story, and the one with the larger valuation, is Elon Musk's reported sixty-billion-dollar bid to acquire Cursor (Anysphere). Cursor has become the dominant AI-first IDE among professional developers in the last eighteen months, with annual recurring revenue now reportedly past one billion dollars and a developer base that includes most of the major US tech companies under enterprise contracts. The sixty-billion valuation is roughly twice the company's most recent secondary-market mark, and would make Cursor the largest AI-native acquisition in history if it closed. Musk's stated motivation is to fold Cursor into a broader xAI/X stack alongside Grok and Tesla's autonomy programme, framing AI coding as the on-ramp to general-purpose agentic tooling.

The strategic question is whether Cursor takes the offer. Anysphere's leadership has been on a clear independent-company path with multiple primary fundraises in the last year and a board structure built around founder control. The price is high enough that fiduciary pressure from existing investors will be real, but the strategic fit with xAI is contested — Cursor's product depth depends on tight integration with Anthropic's Claude Code and OpenAI's Codex backends, and a Musk-owned Cursor would force the company to favour Grok in ways that would likely damage the developer experience. Stratechery's weekly column on Friday read the deal as more likely to fail than succeed, with the more plausible outcome being a counter-bid from one of the major hyperscalers if Anysphere does decide to sell. Either way, the bid resets the valuation conversation around AI coding tools — Cognition's Devin, GitHub Copilot, and Replit's Agent V2 all become more expensive prospective acquisitions overnight — and the M&A picture for the next two quarters will be shaped by where Cursor lands.

applecursormuskm&aai_coding

#4

OpenAI publishes the GPT-5.5 prompting guide

Frontier LLMs 2026-04-25 Simon Willison

7.2

I 7.0 Im 6.5 P 7.0

GPT-5.5 prompting guide Now that GPT-5.5 is available in the API , OpenAI have released a wealth of useful tips on how best to prompt the new model. Here's a neat trick they recommend for applications that might spend considerable time thinking before returning a user-visible response: Before any tool calls for a multi-step task, send a short user-visible update that acknowledges the request and states the first step. Keep it to one or two sentences. I've already noticed their Codex app doing this, and it does make longer running tasks feel less like the model has crashed. OpenAI suggest running the following in Codex to upgrade your existing code using advice embedded in their openai-docs skill: $openai-docs migrate this project to gpt-5.5 The upgrade guide the coding agent will follow is this one , which even includes light instructions on how to rewrite prompts to better fit the model. Also relevant is the Using GPT-5.5 guide , which opens with this warning: To get the most out of GPT-5.5, treat it as a new model family to tune for, not a drop-in replacement for gpt-5.2 or gpt-5.4 . Begin migration with a fresh baseline instead of carrying over every instruction from an older prompt stack. Start with the smallest prompt that preserves the product contract, then tune reasoning effort, verbosity, tool descriptions, and output format against representative examples. Interesting to see OpenAI recommend starting from scratch rather than trusting that existing prompts optimized for previous models will continue to work effectively with GPT-5.5. Tags: ai , openai , prompt-engineering , generative-ai , llms , gpt

gpt-5.5promptingopenai

#5

Space Force selects 12 firms to develop Golden Dome space-based interceptors

Government & Defense 2026-04-24 Defense OneBreaking DefenseDefenseScoop

7.2

I 6.5 Im 7.0 P 7.0

Defense giants and startups vie to create orbital defenses—even as the program’s czar concedes they may be unaffordable.

How it was discussed

Defense One focused on the policy framing — Golden Dome as the administration's strategic-defense centerpiece, twelve awardees signal broad-base contracting rather than down-selection.
Breaking Defense emphasized the architecture story: how interceptors integrate with the next-gen space layer the Pentagon has been describing this year.
DefenseScoop named the 12 awardees and noted the delivery-path expectations attached to the contracts.

space-forcegolden-domeinterceptorsmissile-defense

#6

There Will Be a Scientific Theory of Deep Learning — Simon, Kunin et al. (cross-source: HN + Generally Intelligent podcast)

Research 2026-04-24 HN AIGenerally Intelligent

7.2

I 6.0 Im 7.0 P 7.5

The Simon/Kunin et al. position paper on 'learning mechanics' continued its run on Hacker News today, with companion Generally Intelligent podcast coverage interviewing co-authors Jamie Simon and Daniel Kunin. The paper argues a coherent scientific theory of deep learning is now emerging across topics like neural-tangent kernels, feature learning, training dynamics, and generalization — and proposes 'learning mechanics' as a unifying name. HN comments split between excitement about the synthesis and pushback that 'mechanics' over-promises predictive power the field doesn't yet have.

How it was discussed

Generally Intelligent podcast positioned the paper as a coming-of-age signal — DL as a maturing science.
HN top comments flagged that NTK-era analyses already claim much of this turf and that the 'mechanics' branding may overreach what's actually predictive.

theorylearning-mechanicsinterpretability

#7

MIT Tech Review: three reasons DeepSeek's new model matters

Frontier LLMs 2026-04-24 MIT Tech Review AI

7.1

I 6.5 Im 7.0 P 6.5

MIT Technology Review's analysis frames why DeepSeek V4 (covered yesterday) matters in three ways: (1) it's the largest open-weights frontier-class model to date and ships under MIT license; (2) the disclosed training-cost figures continue to undercut US-hosted competitors by an order of magnitude, putting downstream pricing pressure on commercial APIs; (3) the release pattern — preview drop with full report pending — keeps the open-weights field on a tight cadence. The piece also flags US-export-control commentary as the policy story to watch.

deepseekv4analysisopen-weights

#8

US Navy ordered to 'shoot and kill' alleged Iranian mine-laying boats; mine-clearing prioritized

Government & Defense 2026-04-24 Defense OneDefenseScoop

6.9

I 6.5 Im 6.0 P 7.0

President Trump's directive to the US Navy authorized lethal force against suspected Iranian mine-laying activity in the Strait of Hormuz, in tension with the formally declared ceasefire. DefenseScoop's follow-up details the immediate shift in operational priorities toward mine-clearing assets, including unmanned surface vessels and explosive ordnance disposal teams. Notable for the AI-relevance angle because the US mine-warfare modernization plan leans heavily on autonomy-enabled systems.

How it was discussed

Defense One framed the rules-of-engagement story.
DefenseScoop focused on the mine-clearing asset prioritization that follows from the directive.

us-navyiranmine-warfarerules-of-engagement

#9

MIT Tech Review: health-care AI is here, but we don't know if it actually helps patients

AI for Science 2026-04-24 MIT Tech Review AI

6.6

I 6.0 Im 7.0 P 5.5

Tech Review surveys clinical AI deployments and the evidence gap behind them: triage, ambient-scribe, and imaging-assist models are being approved and rolled out faster than randomized controlled trials can validate patient-outcome benefit. The piece notes that FDA pathways currently wave through models on bench performance and post-market monitoring, but there are very few well-powered prospective trials measuring whether any of these tools actually improve diagnosis, treatment, or mortality at the population level.

healthcareclinical-evidencedeployment

#10

DARPA shares 'Deep Thoughts' solicitation for autonomous underwater drones

Robotic Autonomy 2026-04-24 DefenseScoop

6.4

I 6.0 Im 6.5 P 5.5

DARPA released the 'Deep Thoughts' solicitation seeking autonomous underwater vehicles capable of long-endurance loitering and decision-making in contested undersea environments. The solicitation language emphasizes onboard autonomy that can run with intermittent connectivity to surface assets. Aligns with the broader DoD push toward attritable autonomous platforms across domains; underwater is the slowest-moving of the three but is now drawing real investment.

darpaunderwaterautonomyuuv

#11

Meta's loss is Thinking Machines' gain

Industry 2026-04-24 TechCrunch AI

6.3

I 5.5 Im 5.5 P 7.0

Meta has been poaching talent from Thinking Machines Lab. But it's a two-way street.

metathinking-machinestalentmira-murati

#12

AI Explained (YouTube): GPT-5.5, DeepSeek V4, and the compute war

Frontier LLMs 2026-04-24 AI Explained

6.3

I 5.5 Im 5.5 P 7.0

AI Explained posted a video synthesis of the week's frontier-model news: GPT-5.5 deep dive (capabilities, pricing, practical uses), DeepSeek V4 paper highlights and head-to-head comparisons, an interlude on Mythos, a vibe-coded game built with GPT Image 2, and 50 'data points you wouldn't get from headlines'. Video runs roughly 25 minutes; treats GPT-5.5 and V4 as two halves of the same compute-war frame.

gpt-5.5deepseek-v4computevideo

#13

Anthropic publishes update on election safeguards

Safety, Policy & Regulation 2026-04-24 Anthropic News

6.2

I 5.0 Im 6.5 P 5.5

Anthropic published an update on the election-integrity safeguards built into Claude — covering refusals around persuasion targeting, the prompt-injection / jailbreak red-team coverage they extended for the cycle, and the way the org partnered with election authorities and fact-check NGOs through 2025–26. Policy-page item rather than research output; still relevant given the timing.

anthropicelectionspolicy

#14

Stratechery 2026.17: He Came, He Saw, He Cooked — end of the Tim Cook era, Cursor & SpaceX, Cold War 2.0 fronts

Industry 2026-04-24 Stratechery

6.2

I 5.5 Im 6.0 P 6.0

Stratechery's weekly roundup covers Tim Cook's exit from Apple, the Cursor/SpaceX/Anysphere acquisition speculation, and ongoing US–China strategic-tech dynamics (the 'Cold War 2.0' framing Thompson has been developing). For an AI-focused reader the meatiest threads are the Cursor coverage — Musk's $60B bid is reported elsewhere in the digest — and how pricing and platform-control dynamics around AI coding tools are now squarely in M&A scope.

appletim-cookcursorcold-war-2weekly-roundup

#15

Satellites at the center: Inside the Pentagon’s next-gen space architecture

Government & Defense 2026-04-24 Breaking Defense

6.2

I 6.0 Im 6.0 P 5.5

From emerging data networks to missile tracking and cyber resilience, Breaking Defense’s latest eBook brings together essential reporting on the evolving role of satellites in national security....

defense

#16

Pentagon’s Munitions Acceleration Council identifies 14 ‘critical’ weapons for 2027

Government & Defense 2026-04-24 Breaking Defense

6.2

I 6.0 Im 6.0 P 5.5

“We’re making them put skin in the game … and we expect them to meet the ramp rates that they agree to. And, if they don’t, there’ll be penalties for them,” said Jules “Jay” Hurst, who is performing the duties of the Pentagon comptroller....

defense

#17

ComfyUI hits $500M valuation as creators seek more control over AI-generated media

Generative Media 2026-04-24 TechCrunch AI

6.1

I 6.0 Im 5.5 P 6.0

ComfyUI — the open-source node-graph runner that's become the de-facto interface for serious diffusion users — closed financing at a $500M valuation. The fundraise is a marker for how much of the generative-media stack now runs through tooling layers above the foundation models, and how creators are willing to trade out-of-the-box image quality for the explicit control a node graph affords. Valuation is rich for an open-core company; the bull case is workflow lock-in across pro studios.

comfyuifundingnode-graphcreator-tools

#18

LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics

Evaluations & Benchmarks 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

Comprehensive understanding of time series remains a significant challenge for Large Language Models (LLMs). Current research is hindered by fragmented task definitions and benchmarks with inherent ambiguities, precluding rigorous evaluation and the development of unified Time Series Reasoning Models(TSRMs). To bridge this gap, we formalize Time Series Reasoning (TSR) via a four-level taxonomy of increasing cognitive complexity.

time-seriesreasoningbenchmark

#19

Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks

Agents & Tool Use 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

Long horizon interactive environments are a testbed for evaluating agents skill usage abilities. These environments demand multi step reasoning, the chaining of multiple skills over many timesteps, and robust decision making under delayed rewards and partial observability. Games are a good testbed for evaluating agent skill usage in environments.

llm-agentskill-banklong-horizon

#20

Seeing Fast and Slow: Learning the Flow of Time in Videos

Research 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

How can we tell whether a video has been sped up or slowed down? How can we generate videos at different speeds? Although videos have been central to modern computer vision research, little attention has been paid to perceiving and controlling the passage of time.

videotemporalgenerative

#21

VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation

Agents & Tool Use 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

Autonomous GUI agents face two fundamental challenges: early stopping, where agents prematurely declare success without verifiable evidence, and repetitive loops, where agents cycle through the same failing actions without recovery. We present VLAA-GUI, a modular GUI agentic framework built around three integrated components that guide the system on when to Stop, Recover, and Search. First, a mandatory Completeness Verifier enforces UI-observable success criteria and verification at every finish step -- with an agent-level verifier that cross-examines completion claims with decision rules, rej...

gui-agentmodularstop-recover-search

#22

Hybrid Policy Distillation for LLMs

Post-Training 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

Knowledge distillation (KD) is a powerful paradigm for compressing large language models (LLMs), whose effectiveness depends on intertwined choices of divergence direction, optimization strategy, and data regime. We break down the design of existing KD methods and present a unified view that establishes connections between them, reformulating KD as a reweighted log-likelihood objective at the token level. We further propose Hybrid Policy Distillation (HPD), which integrates the complementary advantages of forward and reverse KL to balance mode coverage and mode-seeking, and combines off-policy...

distillationkdllm

#23

Context Unrolling in Omni Models

Multimodal 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

We present Omni, a unified multimodal model natively trained on diverse modalities, including text, images, videos, 3D geometry, and hidden representations. We find that such training enables Context Unrolling, where the model explicitly reasons across multiple modal representations before producing predictions. This process enables the model to aggregate complementary information across heterogeneous modalities, facilitating a more faithful approximation of the shared multimodal knowledge manifold and improving downstream reasoning fidelity.

omniunifiedmodalities

#24

EditCrafter: Tuning-free High-Resolution Image Editing via Pretrained Diffusion Model

Generative Media 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

We propose EditCrafter, a high-resolution image editing method that operates without tuning, leveraging pretrained text-to-image (T2I) diffusion models to process images at resolutions significantly exceeding those used during training. Leveraging the generative priors of large-scale T2I diffusion models enables the development of a wide array of novel generation and editing applications. Although numerous image editing methods have been proposed based on diffusion models and exhibit high-quality editing results, they are difficult to apply to images with arbitrary aspect ratios or higher reso...

diffusionimage-edittuning-free

#25

Vista4D: Video Reshooting with 4D Point Clouds

Generative Media 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

We present Vista4D, a robust and flexible video reshooting framework that grounds the input video and target cameras in a 4D point cloud. Specifically, given an input video, our method re-synthesizes the scene with the same dynamics from a different camera trajectory and viewpoint. Existing video reshooting methods often struggle with depth estimation artifacts of real-world dynamic videos, while also failing to preserve content appearance and failing to maintain precise camera control for challenging new trajectories.

video4dreshooting

#26

UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection

Generative Media 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

In recent years, significant progress has been made in both image generation and generated image detection. Despite their rapid, yet largely independent, development, these two fields have evolved distinct architectural paradigms: the former predominantly relies on generative networks, while the latter favors discriminative frameworks. A recent trend in both domains is the use of adversarial information to enhance performance, revealing potential for synergy.

unifiedgen-discco-evolution

#27

Temporally Extended Mixture-of-Experts Models

Efficiency 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

Mixture-of-Experts models, now popular for scaling capacity at fixed inference speed, switch experts at nearly every token. Once a model outgrows available GPU memory, this churn can render optimizations like offloading and pre-fetching ineffective. We make the case that the options framework in reinforcement learning is a perfect match to tackle this problem, and argue for temporally extended mixture-of-experts layers.

moetemporalexperts

#28

Trust but Verify: Introducing DAVinCI -- A Framework for Dual Attribution and Verification in Claim Inference for Language Models

Safety, Policy & Regulation 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

Large Language Models (LLMs) have demonstrated remarkable fluency and versatility across a wide range of NLP tasks, yet they remain prone to factual inaccuracies and hallucinations. This limitation poses significant risks in high-stakes domains such as healthcare, law, and scientific communication, where trust and verifiability are paramount. In this paper, we introduce DAVinCI - a Dual Attribution and Verification framework designed to enhance the factual reliability and interpretability of LLM outputs.

attributionverificationfactuality

#29

Explainable Disentangled Representation Learning for Generalizable Authorship Attribution in the Era of Generative AI

Safety, Policy & Regulation 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

Learning robust representations of authorial style is crucial for authorship attribution and AI-generated text detection. However, existing methods often struggle with content-style entanglement, where models learn spurious correlations between authors' writing styles and topics, leading to poor generalization across domains. To address this challenge, we propose Explainable Authorship Variational Autoencoder (EAVAE), a novel framework that explicitly disentangles style from content through architectural separation-by-design.

authorshipdetectionstyle

#30

Coevolving Representations in Joint Image-Feature Diffusion

Research 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

Joint image-feature generative modeling has recently emerged as an effective strategy for improving diffusion training by coupling low-level VAE latents with high-level semantic features extracted from pre-trained visual encoders. However, existing approaches rely on a fixed representation space, constructed independently of the generative objective and kept unchanged during training. We argue that the representation space guiding diffusion should itself adapt to the generative task.

diffusionrepresentationjoint-features

#31

Encoder-Free Human Motion Understanding via Structured Motion Descriptions

Multimodal 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

The world knowledge and reasoning capabilities of text-based large language models (LLMs) are advancing rapidly, yet current approaches to human motion understanding, including motion question answering and captioning, have not fully exploited these capabilities. Existing LLM-based methods typically learn motion-language alignment through dedicated encoders that project motion features into the LLM's embedding space, remaining constrained by cross-modal representation and alignment. Inspired by biomechanical analysis, where joint angles and body-part kinematics have long served as a precise de...

motionencoder-freellm

#32

PersonalAI: A Systematic Comparison of Knowledge Graph Storage and Retrieval Approaches for Personalized LLM agents

Agents & Tool Use 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

Personalizing language models by effectively incorporating user interaction history remains a central challenge in the development of adaptive AI systems. While large language models (LLMs), combined with Retrieval-Augmented Generation (RAG), have improved factual accuracy, they often lack structured memory and fail to scale in complex, long-term interactions. To address this, we propose a flexible external memory framework based on a knowledge graph that is constructed and updated automatically by the LLM.

personalizationkgmemory

#33

3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding

Multimodal 2026-04-23 arXivHF Daily Papers

6.1

I 5.5 Im 5.5 P 6.0

Large multimodal models are increasingly used as the reasoning core of embodied agents operating in 3D environments, yet they remain prone to hallucinations that can produce unsafe and ungrounded decisions. Existing inference-time hallucination mitigation methods largely target 2D vision-language settings and do not transfer to embodied 3D reasoning, where failures arise from object presence, spatial layout, and geometric grounding rather than pixel-level inconsistencies. We introduce 3D-VCD, the first inference-time visual contrastive decoding framework for hallucination mitigation in 3D embo...

3dembodiedhallucination

#34

Marked-up Mac minis flood eBay amid shortages driven by AI

Infrastructure 2026-04-24 TechCrunch AI

6.0

I 5.0 Im 5.0 P 7.0

Local-LLM demand is rippling into Apple-silicon supply: M5 Mac minis are out of stock at most retailers and reselling for 1.3–1.8× MSRP on eBay. The unified-memory bandwidth on the high-end M5 SKUs makes them surprisingly competitive for running models like DeepSeek V4-Flash, and the secondary-market pricing reflects the demand mismatch that's been building since open-weights MoEs started landing in 100B–300B-active-parameter territory.

apple-siliconlocal-inferencesupply

#35

Pentagon planning fleet of missile-killing laser drones

Government & Defense 2026-04-24 C4ISRNET

6.0

I 5.5 Im 6.0 P 5.5

C4ISRNET reports on Pentagon plans for an attritable laser-drone fleet for missile defense — high-energy lasers mounted on autonomous airborne platforms positioned to engage cruise missiles and drones within the layered missile-defense architecture. The story aligns with the parallel Golden Dome announcements and with Australia's counter-drone laser/interceptor contracts also reported today.

lasersdronesmissile-defenseautonomy

#36

Australia awards counter-drone contracts: lasers, interceptors

Government & Defense 2026-04-24 C4ISRNET

5.7

I 5.5 Im 5.5 P 5.0

Australia's Department of Defence awarded multiple contracts under its counter-drone programme for laser-based and kinetic interceptor systems. The buy follows the broader AUKUS priority on rapidly fielding asymmetric counter-UAS capability and parallels the Pentagon laser-drone story.

australiacounter-dronelasersinterceptorsaukus

#37

Three carriers operate in Middle East for first time since 2003: CENTCOM

Government & Defense 2026-04-24 Breaking Defense

5.7

I 5.5 Im 5.5 P 5.0

Aircraft carriers George HW Bush, Abraham Lincoln and Gerald R Ford are now operating in the Middle East amid Operation Epic Fury....

defense

#38

Scaling Laws: Facts & Myths About AI's Energy Usage with Gavin McCormick - Lawfare

Industry 2026-04-24 Lawfare

5.7

I 4.5 Im 6.0 P 5.5

Scaling Laws: Facts & Myths About AI's Energy Usage with Gavin McCormick Lawfare...

policylaw

#39

Simon Willison cites Nilay Patel: 'The people do not yearn for automation'

Industry 2026-04-24 Simon Willison

5.6

I 4.5 Im 5.0 P 6.0

Willison links a Verge written-and-video essay by Nilay Patel arguing that consumer hostility to AI automation is durable — ChatGPT usage keeps growing while the underlying narrative around AI in mass culture remains unfavourable. Willison's own commentary highlights the gap between observed product-level adoption metrics and the political/cultural sentiment that's now hardening, a tension worth tracking as policy moves.

public-opinionautomationverge

#40

Dwarkesh announces blog prize for the big questions about AI

Research 2026-04-24 Dwarkesh Podcast

5.5

I 4.0 Im 5.0 P 6.0

Dwarkesh Patel announced a blog-essay prize aimed at the open structural questions about AI's near-term trajectory — alignment, agentic deployment, compute economics, and policy. Submissions are evaluated on argument quality and originality rather than position. Prize structure and judges are listed on the post; deadline TBD.

dwarkeshwriting-prizeforecasting

#41

Show HN: Browser Harness — gives an LLM freedom to complete any browser task

Agents & Tool Use 2026-04-24 HN AI

5.3

I 4.5 Im 4.5 P 6.0

Show-HN of an open-source browser harness that exposes a small action API (click, type, scroll, screenshot, extract) to any LLM and lets the model run open-ended browser tasks. Comments split between curiosity at the prompt-engineering layer and concerns about safety/reliability when the model is given uncontrolled DOM access. Useful as a comparison point against Anthropic's computer-use, OpenAI's Operator, and the in-house browser-agent stacks at Cursor and Cognition.

browser-agenttool-useshow-hn

#42

Musk Snubs French Authorities - Lawfare

Safety, Policy & Regulation 2026-04-24 Lawfare

5.2

I 4.5 Im 4.5 P 5.5

Musk Snubs French Authorities Lawfare...

policylaw

#43

South Korea police arrest man for posting AI photo of runaway wolf

Safety, Policy & Regulation 2026-04-24 HN AI

5.2

I 4.0 Im 5.0 P 5.5

Korean police arrested an individual who posted an AI-generated photograph of a runaway wolf during the active animal-control alert. The case is a small but pointed enforcement example of how synthetic-image misuse is now drawing criminal liability in jurisdictions with relatively muscular content statutes — fits the broader pattern of deepfake enforcement creep in late-2025 / early-2026.

deepfakemisinformationsouth-korea

#44

Airwaves of Power: Why the Pentagon Should Shift to a Commercial-First Spectrum Model

Government & Defense 2026-04-24 War on the Rocks

5.0

I 4.5 Im 5.0 P 4.5

The U.S. military is firing million-dollar missiles at Iranian drones that cost a tiny fraction as much — a striking example of the kind of overmatch modern warfare punishes.The Department of Defense’s approach to electromagnetic spectrum policy follows a similar logic, occupying prime mid-band frequencies for vital but relatively low-throughput national security uses — including radars, satellite communications, navigation, and electronic warfare — even as those same bands could generate much l...

defensepolicy

#45

Presence or Capacity? The Coast Guard Can Have Both Through Small Boat Stations

Government & Defense 2026-04-24 War on the Rocks

5.0

I 4.5 Im 5.0 P 4.5

Closing small boat stations has proven difficult. Leaving them unchanged is operationally inefficient. These units are enduring parts of the Coast Guard’s force structure, yet their full potential is not always realized. This article proposes a model to better align their mission with national priorities.During the recent Senate confirmation hearing for the next commandant of the U.S. Coast Guard, senators raised a wide range of global maritime concerns, including Arctic competition, cyber threa...

defensepolicy

#46

Simon Willison: llm 0.31 — adds GPT-5.5 model and prompt-template defaults

AI Coding 2026-04-24 Simon Willison

4.9

I 4.5 Im 4.0 P 5.0

Willison shipped llm 0.31, his CLI tool for talking to language models. The release adds a `gpt-5.5` model identifier (matching the GPT-5.5 API rollout earlier this week), a new option for setting prompt-template defaults, and minor UX polish. Tooling-layer release rather than a research item, but it's the canonical 'Day 1 of GPT-5.5 in API' marker for users running language models from the command line.

llmcligpt-5.5tooling

#47

President Trump should secure America’s nuclear future by taking weapons out of DoE

Government & Defense 2026-04-24 Breaking Defense

4.9

I 4.5 Im 4.5 P 4.5

America’s nuclear weapons arsenal is too central to national defense to be buried inside the Department of Energy, argues Franklin C. Miller and Frank A. Rose....

defense

#48

Lockheed limbo in Lima? Firm says Peru is buying F-16s, but questions remain

Government & Defense 2026-04-24 Breaking Defense

4.9

I 4.5 Im 4.5 P 4.5

Peru’s interim president indicated the deal was on hold, as the US Embassy in Lima insists at least part of it has already been signed....

defense

#49

‘Clear divide’ in military readiness for countries on NATO’s eastern flank: Report

Government & Defense 2026-04-24 Breaking Defense

4.9

I 4.5 Im 4.5 P 4.5

“We further found that sustainment in [many Eastern Flank countries] is the real and serious gap: maintenance capabilities, logistical limitations stemming from poor transportation infrastructure …,” one of the authors of the report told Breaking Defense....

defense

#50

Oral Argument Preview: Chatrie v. United States - Lawfare

Safety, Policy & Regulation 2026-04-24 Lawfare

4.9

I 4.5 Im 4.5 P 4.5

Oral Argument Preview: Chatrie v. United States Lawfare...

policylaw

#51

Securing the ‘last mile’ of critical federal work

Government & Defense 2026-04-24 FedScoop

4.5

I 4.0 Im 4.5 P 4.0

A former OMB deputy administrator for IT and E-government details the next frontier in federal cybersecurity. The post Securing the ‘last mile’ of critical federal work appeared first on FedScoop ....

federalgovernment

#52

Federal union projects to lose ‘tens of thousands’ of members, court filing shows

Government & Defense 2026-04-24 FedScoop

4.5

I 4.0 Im 4.5 P 4.0

The National Treasury Employees Union said a Trump order and OPM rule on collective bargaining has caused irreparable harm. An appeals court judge previously said those harms were “speculative.” The post Federal union projects to lose ‘tens of thousands’ of members, court filing shows appeared first on FedScoop ....

federalgovernment

#53

Special Forces soldier charged with using classified information to profit off online prediction market released on $250K bond

Government & Defense 2026-04-24 DefenseScoop

4.2

I 3.5 Im 4.0 P 4.0

The DOJ accused 38-year-old Master Sgt. Gannon Ken Van Dyke of using classified information to place tens of thousands of dollars worth of bets on Polymarket in the days before the mission. The post Special Forces soldier charged with using classified information to profit off online prediction market released on $250K bond appeared first on DefenseScoop ....

defense