Hallucinated AI & agentic coding news. Some of it is real.
  1. Enhancing Multilingual LLM-based ASR with Mixture of Experts and Dynamic Downsampling (arxiv.org)

    5h mixture-of-experts

  2. SpenseGPT: Practical One-shot Pruning Enabling Sparse and Dense GEMMs for LLM Inference (arxiv.org)

    5h

  3. How Does Reasoning Flow? Tracing Attention-Induced Information Flow for Targeted RL in LLMs (arxiv.org)

    5h

  4. RedAct: Redacting Agent Capability Traces for Procedural Skill Protection (arxiv.org)

    5h

  5. Training LLMs to Enforce Multi-Level Instruction Hierarchies via Gravity-Weighted Direct Preference Optimization (arxiv.org)

    5h

  6. Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories (arxiv.org)

    5h

  7. What Really Matters for Table LLMs? A Meta-Evaluation of Model and Data Effects (arxiv.org)

    5h

  8. Automated Alignment between Elicitation Interviews and Requirements (arxiv.org)

    5h

  9. Mitigating hallucinations in healthcare LLMs with granular fact-checking and domain-specific adaptation (arxiv.org)

    5h

  10. inversedMixup: Data Augmentation via Inverting Mixed Embeddings (arxiv.org)

    5h

  11. ProbeLLM: Automating Principled Diagnosis of LLM Failures (arxiv.org)

    5h

  12. GhazalBench: Evaluating LLM Understanding and Canonical Surface-Form Access in Persian Ghazals (arxiv.org)

    5h

  13. An Industrial-Scale Insurance LLM Achieving Verifiable Domain Mastery and Hallucination Control without Competence Trade-offs (arxiv.org)

    5h hallucination

  14. Beyond Memorization: Distinguishing Between Pattern-Based and Epistemic Reasoning in LLMs Using Epistemic Puzzles (arxiv.org)

    5h

  15. Who Wrote the Book? Detecting and Attributing LLM Ghostwriters (arxiv.org)

    5h

  16. On Cost-Effective LLM-as-a-Judge Improvement Techniques (arxiv.org)

    5h

  17. Skill-RAG: Failure-State-Aware Retrieval Augmentation via Hidden-State Probing and Skill Routing (arxiv.org)

    5h rag

  18. HarDBench: A Benchmark for Draft-Based Co-Authoring Jailbreak Attacks for Safe Human-LLM Collaborative Writing (arxiv.org)

    5h jailbreak security

  19. From Confident Closing to Silent Failure: Characterizing False Success in LLM Agents (arxiv.org)

    5h

  20. LLM-as-a-Discriminator: When Synthetic Tables Still Look Real (arxiv.org)

    5h

  21. Disjoint or Overlapping? Inference Windowing for Reconstruction-Based Time Series Anomaly Detection (arxiv.org)

    5h

  22. Calibrating Overconfidence Without Sacrificing Confidence: Probe-Conditioned Head Intervention for LLMs (arxiv.org)

    5h

  23. Operator Fusion for LLM Inference on the Tensix Architecture (arxiv.org)

    5h operator

  24. Hasse Diagrams for Attention: A Partial Order Framework for Designing Transformer Masks (arxiv.org)

    5h

  25. Alignment Defends LLMs from Property Inference Attacks (arxiv.org)

    5h

  26. Spatiotemporal Graph Transformer for 3D Neighborhood Interaction and Quality Prediction in Metal Additive Manufacturing (arxiv.org)

    5h

  27. When Design Rules Break: Benchmark Composition Determines Whether Label Informativeness Predicts GNN Aggregator Choice (arxiv.org)

    5h

  28. A Comprehensive Inference-Time Augmentation Framework in Physiological Signals: Application to PPG-Based AF Detection (arxiv.org)

    5h

  29. GRAFT: Gain-Recalibrated Adapters for Transformer-Based Neural Population Activity Modeling (arxiv.org)

    5h

  30. Overcoming Rank Collapse in Feedback Alignment (arxiv.org)

    5h

built with hx. last updated 2026-06-10 09:00 UTC. some of this is real.