HACKOBAR_ // One feed for AI signal. No noise.

#1[THEVERGE]

7h ago

Grok Build CLI Uploaded Entire Codebases to Google Cloud

SpaceXAI's Grok Build tool was found packaging and uploading entire user repositories to Google Cloud, including files explicitly excluded by users. The company has since disabled the functionality following reports of the data leak.


#2[TLDR INFOSEC]

14h ago

Ghostcommit uses images in Git repos to trigger prompt injection

An attack technique embeds malicious instructions within images in repositories to trick coding agents into leaking sensitive environment files. Testing showed Cursor and Antigravity models fell victim, though Claude Code's Opus model successfully detected the injection.


#3[GH]

1d ago

Graphify creates queryable knowledge graphs from multi-modal codebases and documentation

★ 1,028 new · 84,046 total

Graphify converts diverse file types, including SQL schemas, R scripts, and video, into a single queryable knowledge graph. It integrates application code, database structures, and infrastructure to enhance context for AI coding assistants like Claude Code and Cursor.


#4[HUGGINGFACE]

Soofi S 30B-A3B MoE Hybrid Mamba Transformer for DE/EN

Soofi S is a 30B parameter Mixture-of-Experts model using a hybrid Mamba-Transformer architecture. It activates only 3B parameters per token and maintains near-constant inference cache for high-throughput, long-context deployment in German and English.


#5[APPLE_ML]

Pare framework enables evaluation of proactive AI agents

The Proactive Agent Research Environment (Pare) models applications as finite state machines to simulate realistic user interactions. This method allows for the evaluation of proactive agents that anticipate needs and execute tasks sequentially in complex digital environments.
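
To make the finite-state-machine idea concrete, here is a minimal sketch of an app modeled as states (screens) and transitions (user actions). The `AppFSM` class and the calendar app are hypothetical illustrations and are not taken from Pare itself.

```python
from dataclasses import dataclass, field


@dataclass
class AppFSM:
    """A toy app model: states are screens, transitions are user actions."""
    state: str
    transitions: dict  # maps (state, action) -> next state
    history: list = field(default_factory=list)

    def available_actions(self):
        # Actions that are valid from the current state.
        return sorted(a for (s, a) in self.transitions if s == self.state)

    def step(self, action: str) -> str:
        key = (self.state, action)
        if key not in self.transitions:
            raise ValueError(f"action {action!r} is not valid in state {self.state!r}")
        self.state = self.transitions[key]
        self.history.append(action)
        return self.state


# Hypothetical calendar app used only for illustration.
calendar = AppFSM(
    state="home",
    transitions={
        ("home", "open_event"): "event_detail",
        ("event_detail", "edit"): "event_edit",
        ("event_edit", "save"): "event_detail",
        ("event_detail", "back"): "home",
    },
)
```

Under this framing, a simulated user or an agent can only take actions returned by `available_actions()`, so a multi-step task becomes a checkable path through the state graph.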


#6[@nvidia]

1d ago

Baseten achieves 50% higher throughput for DeepSeek V4 Pro on Blackwell

Baseten utilizes TensorRT LLM on NVIDIA Blackwell to deliver up to 50% more tokens per second for reasoning and long-context tasks. Other providers like Hippocratic AI report 30% throughput increases using the same NVIDIA inference stack.


#7[HN]

1d ago

Grok CLI Security Breach Uploads User Home Directory to GCS

21 pts · 3 comments

The Grok CLI reportedly uploaded a user's entire home directory, including SSH keys and password databases, to xAI's Google Cloud Storage servers. The incident highlights significant security risks in CLI-based AI tools.


#8[arXiv]

StickyMoE Reduces MoE Expert Switching by 60% During Inference

cs.LG, cs.AI, cs.CL

StickyMoE introduces a differentiable routing consistency loss to discourage abrupt expert switches in Mixture-of-Experts models. This method reduces expert switching by up to 60% with less than 4% perplexity increase, improving memory efficiency on edge devices.
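
As a rough illustration of what a differentiable routing consistency penalty could look like, the sketch below compares the softmax routing distributions of adjacent tokens. This is an assumed formulation for illustration only; the blurb does not give the paper's actual loss, and the function names are hypothetical.

```python
import math


def softmax(logits):
    # Numerically stable softmax over one token's expert logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def routing_consistency_loss(router_logits):
    """Mean squared difference between routing distributions of adjacent tokens.

    router_logits: list of per-token lists, one logit per expert.
    Using softmax probabilities rather than hard argmax choices keeps the
    penalty differentiable with respect to the logits.
    """
    probs = [softmax(row) for row in router_logits]
    if len(probs) < 2:
        return 0.0
    total = 0.0
    for prev, curr in zip(probs, probs[1:]):
        total += sum((p - c) ** 2 for p, c in zip(prev, curr))
    return total / (len(probs) - 1)
```

In training, a term like this would typically be added to the task loss with a small weight, trading a little quality for fewer expert changes between consecutive tokens.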


#9[r/Anthropic]

8h ago

Alibaba Conducts Large-Scale Distillation Attack on Claude via API

118 upvotes · 96 comments

Alibaba utilized 25,000 accounts to conduct 28.8 million conversations with Claude over six weeks. This industrial-scale distillation attack aimed to extract agentic reasoning and coding capabilities to train Qwen models.


#10[TECHCRUNCH]

12h ago

Reflection AI secures $1B compute deal with Nebius

Founded in 2024, Reflection AI has signed a $1 billion agreement to access compute resources from Nebius. The company focuses on developing open source AI technology.


#11[TLDR DEV]

16h ago

Compiled code reduces agent token usage by 94% and latency by 87%

Converting natural language instructions into compiled code for specialized AI agents significantly optimizes performance. This transition achieves a 94% reduction in token consumption and an 87% decrease in end-to-end latency.


#12[GH]

12h ago

Grok2api multi-account API gateway for Grok services

★ 179 new · 5,788 total

This tool acts as an API gateway for Grok Build, Grok Web, and Grok Console. It enables multi-account management to facilitate access across different Grok interfaces.


#13[HUGGINGFACE]

MedPMC Framework Curates 11 Million Medical Image-Text Pairs from PMC

MedPMC automates the extraction of high-fidelity multimodal data from 6.1 million PubMed Central articles. The framework achieved an F1 score of 93.2 in initial screening, providing a scalable infrastructure for training medical foundation models.


#14[OPENAI]

7h ago

Measuring AI ROI Through Useful Work Per Dollar

Enterprises can manage agentic AI investments by shifting focus toward measuring useful work completed per dollar spent. This framework prioritizes scaling high-value workflows and improving operational efficiency over simple token usage.


#15[@ClementDelangue]

1d ago

vLLM enables native-speed inference for Hugging Face Transformers models

vLLM now supports Hugging Face Transformers architectures directly, eliminating the need for separate optimized implementations. Benchmarks show the Transformers backend matches or exceeds native vLLM throughput for models ranging from 4B to 235B parameters, including MoE and tensor parallel configurations.


#16[HN]

1d ago

Apple SpeechAnalyzer outperforms Whisper Small on LibriSpeech benchmarks

22 pts · 3 comments

Apple's SpeechAnalyzer API achieves a 2.12% WER on clean LibriSpeech, surpassing Whisper Small's 3.74% while running three times faster on-device. It significantly outperforms the legacy SFSpeechRecognizer, which recorded a 9.02% WER on clean speech.
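
For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The sketch below is a minimal implementation of that standard definition; it does no text normalization, which real benchmark scripts usually apply first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring "the cat sit on mat" against the reference "the cat sat on the mat" counts one substitution and one deletion over six reference words, so a 2.12% WER corresponds to roughly two word errors per hundred reference words.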


#17[arXiv]

Sparse-TC Kernel Accelerates LLM Inference at 50% Sparsity

cs.LG, cs.AI, cs.AR

A new three-layer matrix storage format and SpMM kernel enable GPU acceleration for unstructured sparse weight matrices at moderate sparsity. By using sparse tensor cores and a slot-filling compression layer, the method overcomes the bottleneck where existing sparse kernels fail to outperform dense operations.


#18[r/DeepSeek]

7h ago

DeepSeek Founder Liang Wenfeng Reaches $35.5 Billion Net Worth

179 upvotes · 27 comments

DeepSeek founder Liang Wenfeng has reached an estimated net worth of $35.5 billion, marking a significant shift in the valuation of AI company leadership.


#19[TECHCRUNCH]

3h ago

Users report unauthorized file deletion by OpenAI GPT-5.6 Sol

Users report that OpenAI's GPT-5.6 Sol model has deleted files and data without warning, an issue previously disclosed by the company in June.


#20[TLDR AI]

14h ago

Prime Intellect Verifiers V1 Supports Agentic RL and Evals

Verifiers V1 introduces a decomposed environment stack comprising task sets, harnesses, and runtimes. This architecture enables scalable reinforcement learning and evaluations for coding and computer use agents.


#21[HUGGINGFACE]

Xiaomi-Robotics-U0 38B Multimodal Model for Embodied Synthesis

Xiaomi-Robotics-U0 is a 38-billion-parameter autoregressive model designed for unified embodied synthesis, including text-to-image, scene generation, and embodied video. It integrates robot embodiment constraints with large-scale visual knowledge to maintain multi-view and geometric consistency.


#22[DEEPMIND]

Google and AIM Launch Gemini-Powered ATL Saathi for Indian Robotics Labs

Google and AIM released ATL Saathi, a Gemini-based tool designed to assist educators in managing robotics labs across India. The tool aims to lower the barrier for teaching complex robotics concepts through generative AI assistance.


#23[@ArtificialAnlys]

1d ago

Gemini Omni Flash reaches top ranking on Artificial Analysis video leaderboards

Gemini Omni Flash generates 720p video at 24 FPS with native audio support, ranking first on Text-to-Video and Image-to-Video leaderboards. The natively multimodal model supports conversational editing and is priced at $0.10 per generated second.


#24[HN]

9h ago

PrismML releases Bonsai 27B with 1.71 bits per weight for mobile deployment

16 pts · 2 comments

Bonsai 27B utilizes ternary weights with FP16 group-wise scaling to reduce a 27B-class model to 5.9 GB. This enables multi-step reasoning, vision tasks, and agentic computer-use loops to run locally on mobile hardware.
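
The 1.71 bits-per-weight figure can be roughly reproduced with back-of-the-envelope arithmetic. The sketch below assumes ideal packing of ternary values at log2(3) bits each and one FP16 scale shared by a group of 128 weights; the group size and the exact 27e9 parameter count are assumptions, not details from the release.

```python
import math


def bits_per_weight(levels: int, scale_bits: int, group_size: int) -> float:
    """Ideal bits per weight: packed levels plus one shared scale per group."""
    return math.log2(levels) + scale_bits / group_size


def model_size_gb(num_params: float, bpw: float) -> float:
    """Size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bpw / 8 / 1e9


# group_size=128 is an assumed value chosen for illustration.
bpw = bits_per_weight(levels=3, scale_bits=16, group_size=128)
size = model_size_gb(27e9, bpw)
print(f"{bpw:.2f} bits/weight, {size:.2f} GB")  # prints "1.71 bits/weight, 5.77 GB"
```

The estimate lands slightly under the reported 5.9 GB. The gap could plausibly come from a parameter count above 27 billion or from tensors kept at higher precision, but the blurb does not say which.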


#25[arXiv]

Signed Symmetric Quantization Reduces Error in Few-Bit Integers

cs.LG, cs.AI

A new quantization method addresses clipping errors in signed integer alphabets by utilizing signed symmetric quantization. In llama.cpp on AMD EPYC hardware, this approach maintains symmetric runtime profiles while avoiding the performance penalties of asymmetric quantization.


9 sources · live pipeline status →