https://huggingface.co/unsloth/Mistral-Medium-3.5-128B-GGUF Mistral Medium 3.5 128B Mistral Medium 3.5 is our first flagship merged model. It is a dense 128B model with a 256k context window, handling instruction-following, reasoning, and…
#mistral
103 items
mistralai/Mistral-Medium-3.5-128B · Hugging Face (huggingface.co via reddit)
Mistral Medium 3.5 (mistral.ai via hn) Introducing Mistral Medium 3.5, remote coding agents in Vibe, plus new Work mode in Le Chat for complex tasks.
Recent Open models from last 6 Months - Nov 2025 - Apr 2026 (www.reddit.com) I created this chart with recent open models from last 6 months. Few might be older than that possibly.
Mistral AI Acquires Emmi AI to Create the Leading AI Stack (www.emmi.ai via hn) Mistral AI Acquires Emmi AI to Create the Leading AI Stack for Industrial Engineering European AI leader Mistral AI is acquiring Emmi AI in one of Europe’s most important and strategic AI acquisitions to date. Founded in Linz, Emmi AI has…
Mistral Médium 3.5 is here (www.reddit.com) https://huggingface.co/mistralai/Mistral-Medium-3.5-128B
Mistral AI founder to French Parliament: "Engineers at Mistral no longer write a single line of code (www.reddit.com) https://youtu.be/vczBo0AvbTI?si=pglMPmTjsq-TNJa9&t=375 "Today, engineers at Mistral no longer write a single line of code. It used to be more of a craft if you were an individual contributor.
Unsloth solved bug in Mistral Medium 3.5 implementation (www.reddit.com) https://unsloth.ai/docs/models/mistral-3.5 "May 1, 2026 Update: We worked with Mistral to fix Mistral Medium 3.5 inference affecting some implementations, and released updated GGUFs with the fix (NOT related to Unsloth or our quants). The…
I catalogued every way local models break JSON output and built a repair library, here's what I found across 288 model calls (www.reddit.com) I've been running structured output prompts through a bunch of models on OpenRouter for the past few months — Llama 3, Mistral, Command R, DeepSeek, Qwen, and every other model on OpenRouter — alongside the usual closed-source suspects. 28…
Something from Mistral (Vibe) tomorrow (www.reddit.com) Model(s) or Tool upgrade/New Tool? Source Tweet : https://xcancel.com/mistralvibe/status/2049147645894021147#m
Grok 4.3 tops the Consistency Leaderboard in the LLM Sycophancy Benchmark, largely because it is one of the most cautious models. (www.reddit.com) Does a model maintain the same judgment or does it side with whoever is speaking? This benchmark measures that inconsistency directly.
Mistral Medium 3.5: A reliability first open source model from Europe (www.reddit.com) source : https://x.com/pankajkumar_dev/status/2049728255796924783/
New LLM Position Bias Benchmark: does an LLM keep the same judgment when you swap the answer order? Judge models compare two lightly edited versions of the same story twice, with the order swapped. The median model flips in 45% of decisive case pairs. GPT-5.4 is worst at 66%. (www.reddit.com) More info, including charts, per-case metrics, raw judge outputs, and the parsed answer dump: https://github.com/lechmazur/position_bias This benchmark isolates one basic and frustrating failure mode. The model-average first-shown pick rat…
Airbus and BMW strike deals with France's Mistral to bring AI to defence and safety systems (www.euronews.com via reddit) Interesting to see that move from US companies with a likely effect to strengthen European tech development.
What it feels like to have to have Qwen 3.6 or Gemma 4 running locally (www.reddit.com) Well or pretty close to it, they are excellent work horses. I run them in real work scenarios doing some of the work I used to do myself as an skilled expert in my field, billing 200$ an hour.
Mistral Medium 3.5 128b ggufs are fixed (www.reddit.com) All ggufs were broken, resulting in bad outputs, especially at long context. Anyway, it is fixed now: https://huggingface.co/unsloth/Mistral-Medium-3.5-128B-GGUF/discussions/1 Edit: Unsloth Announcement: https://huggingface.co/unsloth/Mist…
Open Weights Models Hall of Fame (www.reddit.com) I read a lot of "whengguf" type posts. I think we should sometimes stop and be grateful.
ASML to invest $1.5B in Mistral at over $11B valuation (www.calcalistech.com via hn) According to Reuters, the Dutch supplier of advanced chipmaking equipment is poised to become the largest shareholder in the French artificial intelligence startup. ASML,…
Mistral medium 3.5 128B, MLX 4bit, ~70 GB (huggingface.co via reddit) This model seems utterly broken for now. I do not recommend downloading or using it, unless you are planning to help troubleshoot it.
France's Mistral Built a $14B AI Empire by Not Being American (www.forbes.com via hn) When Arthur Mensch, the cofounder and CEO of Mistral, France’s leading AI company, takes the stage at the AI Action Summit in the center of New Delhi, India, in February, he draws only a small crowd. Nearly everyone would rather listen to…
Notes from the Mistral AI Now Summit in Paris (koenvangilst.nl via hn) May 29, 2026. I was in Paris the last few days to visit the AI Now Summit by Mistral AI, hoping to learn more about their models, plans for the future of European AI and more. My personal i…
Mistral Compute? I hear Mistral Cloud (mistral.ai via hn) Frontier-grade infrastructure and orchestration platform behind Mistral
Show HN: Torrix, self hosted, LLM Observability,(no Postgres, no Redis) (github.com via hn) I work as a SAP Integration consultant and built this as a side project. Friction point: Most self hosted LLM observability tools require Postgres, Redis and non trivial infrastructure.
Grafting vision onto text models for fun and profit. (www.reddit.com) So as we know.. llama.cpp separates the vision or other multimedia from the main weights.
Unsloth fix on Mistral Small 4? (www.reddit.com) Every quant got update https://huggingface.co/unsloth/Mistral-Small-4-119B-2603-GGUF
Your local LLM predictions and hopes for May 2026 (www.reddit.com) Which of these do you think we'll get in May? Also, feel free to pick/rank which ones you'd want the most badly: more Gemma4 models (124b?) (other sizes?) more Qwen3.6 models (9b?
Terminal Bench score for Mistral 3.5 Medium (www.reddit.com) So... there were a couple promising benchmark scores reported by mistralai in the model card for Mistral 3.5 Medium, BUT there wasn't the one that I usually care about the most, which is TerminalBench 2.0.
Mistral to explore designing own chips, CEO says (www.cnbc.com via hn) French startup Mistral AI is exploring designing its own chips and may eventually develop them, CEO Arthur Mensch told CNBC. It is the first comment made by Mensch about Mistral's semiconductor ambitions, underscoring how the company is lo…
Show HN: Free AI agent audit for Shopify catalogs (1.2M open captures) (aicatalogscore.com via hn) Burtsbeesbaby.com AI Catalog Score How well Burtsbeesbaby.com's 250 products would be recommended by ChatGPT, Claude, Perplexity, Gemini, Mistral, and DeepSeek. 77 / 100 B · Sometimes recommended Partial audit.
Mistral AI Launches Mistral Vibe (mistral.ai via hn) W̶o̶r̶k̶ Vibe. Your AI agent for long-horizon tasks, fluent in your knowledge and tools.
Short Story Creative Writing Benchmark. Baidu Ernie 5.1: -0.35, Qwen 3.7 Max: -2.01, Mistral Medium 3.5: -2.13, Grok 4.3: -3.81. (www.reddit.com) This benchmark uses head-to-head comparisons of stories written in response to the same constrained creative briefs. The target range is 600-800 words.
Mistral Developing New AI Model for Banks Lacking Mythos Access (www.bloomberg.com via hn) European Banks Explore Mistral AI’s Alternative to Anthropic’s Mythos Model
Update to the LLM Debate Benchmark: GPT-5.5, Grok 4.3, DeepSeek V4 Pro, GLM-5.1, Kimi K2.6, Qwen 3.6 Max Preview, Xiaomi MiMo V2.5 Pro, Tencent Hy3 Preview, and Mistral Medium 3.5 High Reasoning added (www.reddit.com) The benchmark uses adversarial, multi-turn debates across 683 curated motions. Each model pair debates the same motion twice with sides swapped.
Mistral Medium 3.5 128B and Qwen 3.5 122B A10B on 4x RTX 3080 20GB (www.reddit.com) Mistral Medium 3.5 128B with 4x3080 20GB with layer split: CUDA_VISIBLE_DEVICES=0,1,2,3 ./build/bin/llama-bench --model /data/huggingface/Mistral-Medium-3.5-GGUF/Mistral-Medium-3.5-128B-IQ4_XS-00001-of-00003.gguf -ngl 99 -d 0,16384 -fa 1…
Current state of open-source ? (www.reddit.com) I’m trying to understand the current open-source LLM landscape beyond surface-level hype. We all got used to the nerfed products of Claude/Geminj so I believe really in opensource as a solution.
Best French to English model that will easily run on a 3090? (www.reddit.com) Looking for a nice lightweight LLM that is good at translating English and French. Other languages would be awesome too but I will settle for English and French.
I Decided to Leave Mistral (twitter.com via hn) Soizig Le Bihan on X: "Next Monday I will start working as an interpretability researcher at Anthropic in SF. Here is why I decided to leave Mistral.
Mistral's CEO: Europe has 2 years to stop becoming America's AI 'vassal state' (www.businessinsider.com via hn) - Mistral CEO said Europe has 2 years to avoid dependence on US AI infrastructure giants. - Arthur Mensch warned AI dominance will hinge on control of chips, energy, and compute capacity.
Mistral THICC DENSE BOI. He chonky! More dense models pls. (www.reddit.com) Loving the trend of chonky dense models we’ve seeing right now. Keep then coming!
Mistral Workflows (mistral.ai via reddit) Workflows for work that runs the business Workflows is now in public preview. Today, we're releasing Workflows in public preview.
Why are there so few small local creative writing models from the Chinese? (www.reddit.com) At this moment, the models such as Qwen 3.6 35b/27b crush the competition, yet I can't help, but notice this pattern. While the local RP scene is abundant with the Western model tunes: LLaMA, Mistral (all sizes), Nemo and more recently Gem…
Elon Musk's xAI discussed partnership with Mistral to try and rival OpenAI (www.euronews.com via hn) Musk's xAI eyed Europe's AI giant Mistral in a bid to challenge OpenAI and Anthropic, according to a report. Elon Musk’s company xAI reportedly held discussions in recent weeks with the French artificial intelligence company Mistral about…
Mistral says Europe has two years to build its own AI infrastructure (www.businessinsider.com via hn) Mistral AI's first summit felt less like a startup conference and more like a campaign rally for Europe's AI ambitions. The French AI startup, founded just three years ago, packed Paris's Le Carrousel du Louvre — the event space beneath th…
Big new memory tool with local benchmarks (www.reddit.com) NOT MINE: https://github.com/rtk-ai/icm Knowledge retention: Agent recalls specific facts from a dense technical document across sessions. Session 1 reads and memorizes; sessions 2+ answer 10 factual questions without the source text.
Mini Shai-Hulud Is Back: NPM Worm Hits over 160 Packages, Including Mistral (www.aikido.dev via hn) Mini Shai-Hulud is back. Like I said before, we were yet to see the full scale of the attack.
Mass NPM Supply Chain Attack Hits TanStack, Mistral AI, and 170 Packages (safedep.io via hn) noon-contracts npm Package: DeFi Supply Chain RAT noon-contracts poses as a Noon Protocol SDK on npm. On install it exfiltrates SSH keys, crypto wallet private keys, AWS credentials (including live STS/S3/SecretsManager calls), Kubernetes…
Mistral AI's NPM package was compromised (github.com via hn) Mistral Typescript Client This is v2 of the Mistral TypeScript SDK. Key changes from v1: ESM-only, shorter type names, forward-compatible enums/unions, Zod v4.
Show HN: Dikaletus – meeting recording and transcription using Mistral AI (codeberg.org via hn) Dikaletus is a TUI tool to record, transcribe, and generate structured meeting notes using FFmpeg, PulseAudio and the Mistral AI API. The meeting agent automates the process of capturing, transcribing, and generating structured meeting not…
Mistral Medium 3.5 Is Now Available in Puter.js (developer.puter.com via hn) Puter.js now supports Mistral Medium 3.5, the new flagship merged model from Mistral AI that unifies instruction-following, reasoning, and agentic coding into a single set of wei…
Mistral Medium 3.5 on AMD Strix Halo (www.reddit.com) TLDR; it's slow as heck. Run overnight.
[Help] Running big dense models faster (www.reddit.com) I have been trying Mistral 3.5 on my 4x RTX 3090 rig with llama.cpp. Inference is slow (about 11 t/s) even without anything being offloaded to the CPU.
World AI Agents – 35 AI Models (Claude, GPT, Llama) via One OpenAI-compatible API (world-ai-agents.com via hn) Access Claude, Llama, Mistral, Nova and more through a single OpenAI-compatible API. Start for as little as €1.
Show HN: I built a search engine for llms.txt sites (statespace.com via hn) More and more developer tools are adopting the llms.txt standard to build AI-friendly versions of their docs. The problem is that it's very hard to search across them.
Open Source AI Infrastructure (news.ycombinator.com) Hey everyone — built Ombre, an open source AI infrastructure layer that works with any AI model. Eight agents run automatically: security, caching, memory, hallucination detection, tamper-proof audit trail.
I Forked 4 CLI coding agents to Run the Same Model. I found a 2x gap (charlesazam.com via hn) Deep dive into the architecture of Codex, Gemini CLI, Mistral Vibe, and OpenCode. Same model, 2x performance gap — the scaffolding is what matters.
Show HN: LibreThinker, free AI assistant for LibreOffice Writer, 10k installs (librethinker.com via hn) 4 months ago, I released an extension for LibreOfffice Writer that adds an AI copilot to its sidebar. Did a Show HN at the time but got no interest T_T https://news.ycombinator.com/item?id=46233776 I’ve added several major features since t…
The AI Layoff Trap, The Future of Everything Is Lies, I Guess: New Jobs and many other AI Links from Hacker News (www.reddit.com)
Mistral's claims of Mythos level model to EU officials (twitter.com via hn) Alexander Doria @Dorialexander In the EU, Mistral has been particularly irresponsible: claiming it’s just fear marketing and they already had a Mythos level model. Many EU MPs believe it now.
Lago Open-source SDK: Bill on top of your LLM token cost with no middleware (github.com via hn) lago-agent-sdk Instrument LLM clients and emit usage events to Lago for billing. [Diagram: your code → wrapped client → provider (Bedrock / Mistral / …); wrapped client → (extract usage) → Lago events…]
Llama.cpp: What's up with -sm tensor + AMD + Vulkan? (www.reddit.com) Has anyone got it to work? I tried it with dense models (eg qwen 27b, gemma 31b, mistral 128b) since that's where I need it most, but it always core dumps.
ztok — a fast multithreaded tokenizer in Zig that loads tiktoken / HF / SentencePiece and is 2–5× faster (www.reddit.com) I built ztok, a tokenizer library focused on being fast and format-agnostic for local pipelines. - Loads what you already have — .tiktoken, HF tokenizer.json, SentencePiece .model, TokenMonster, Mistral Tekken.
Linki v2 is out, open-source AI SDR for LinkedIn + cold email (big update) (www.reddit.com) Hey everyone, I built Linki a few months ago as a free self-hosted alternative to Waalaxy and Lemlist. Back then it was a basic LinkedIn sequencer.
Why so many tools getting hacked? Tanstack, Mistral, Grafana? (techcrunch.com via hn) Grafana Labs, the maker of its eponymous popular open source web visualization software, confirmed it had been hacked but that it refused to pay the hackers who had threatened to release the company’s codebase. In a series of posts on soci…
Mistral AI Acquires EU Physics AI Startup Emmi AI (www.reuters.com via hn) paywalled
Show HN: Elmo (Open Source AEO) (github.com via hn) I'm excited to announce Elmo, an MIT-licensed, open source AEO/GEO tool. We help you scrape ChatGPT/Google AI Mode/etc using web scrapers like BrightData/Olostep/etc, evaluate prompts against the OpenAI/Anthropic/Mistral APIs directly, or…
Open Source Managed Agents (linchpin.work via hn) Any model, one adapter OpenRouter routes to ~200 cloud models — Claude, GPT, Gemini, Llama, DeepSeek, Mistral, Qwen. Ollama runs anything you've pulled locally.
Shai Hulud attack ships signed malicious TanStack, Mistral NPM packages (www.bleepingcomputer.com via hn) Hundreds of packages across npm and PyPI have been compromised in a new Shai-Hulud supply-chain campaign delivering credential-stealing malware targeting developers. The attacker hijacked valid OpenID Connect (OIDC) tokens to publish malic…
Why MistralAI Grows Faster Than OpenAI/Anthropic (productify.substack.com via hn) Mistral is building sovereign, open-weight AI for enterprises that care less about hype and more about control, cost, and deployment flexibility. Here’s the thing: while everyone is busy arg…
Here is the current "Free-Tier AI Stack" for 2026 (www.reddit.com) 1. The Frontier Giants • Gemini: Access 1.5B tokens/day on Gemini 1.5 Flash/Pro.
Built a Claude Code plugin for GDPR/DSGVO audits because attorney reviews were eating my budget (www.reddit.com) Quick Background: Developing a B2B SaaS for German businesses (KSKlar, a tax compliance product). Pre-launch, each cookie banner question, each DPA, each privacy policy draft went to the attorney.
I just had the weirdest experiment with claude (www.reddit.com) Hi i just feel obligated to share this holy shit, So its well known that claude can run linux (bash) ubuntu 24 commands via its container. So i asked it to try to call mistral AI via claude, but because the container is configured with a t…
BUILD portable AI system (www.reddit.com) Hey everyone, I’ve been thinking about a project idea and I’d love to get your feedback. The idea is to take a 1TB SSD and turn it into a fully portable AI system.
I built vivkemind – an open-source, local‑first terminal AI coding agent with full AWS Bedrock support (www.reddit.com) wanted a terminal AI coding agent that doesn't lock me into one model provider. So I forked Qwen Code and added full support for every model available in AWS Bedrock.
I built an AI tool that turns any movie into viral recap videos in minutes (www.reddit.com) Hey everyone, I built a tool that creates movie recap videos automatically using local models. The problem: making recap videos takes forever.
Mistral-Medium-3.5-128B-Q3_K_M on 3x3090 (72GB VRAM) (www.reddit.com) Here is the actual speed of Mistral Medium Q3 running locally on 3x3090 first some Python https://preview.redd.it/3blnqya7o0zg1.png?width=1670&format=png&auto=webp&s=bab477f9889c16558044ccebb22e3ebfb6a56118 https://preview.redd.it/76a3j6u7…
Mistral Medium 3.5 YaRN bug fix (huggingface.co via hn) Mistral Medium 3.5 128B Mistral Medium 3.5 is our first flagship merged model. It is a dense 128B model with a 256k context window, handling instruction-following, reasoning, and coding in a single set of weights.
Best PDF table parsing providers? (www.reddit.com) I just did some texting across various providers and wanted to share my use case. It was construction spec tables, 100 rows max, png's passed in, and my #1 requirement was maximum accuracy (100% is ideal since mistakes can be costly).
Mistral vs OpenAI: European Sovereignty or Global Scale? (mrkt30.com via reddit) Mistral AI dominates European regulated enterprise with 60% EU revenue while OpenAI leads globally with 900M users. Which matters for your market?
Is Mistral-3.5-Medium-128B broken in Llama CPP? (www.reddit.com) Trying some if Bartowski's Q4 quants. Using Vulkan with the latest main branch as of a few hours ago.
Remote agents in Vibe and Mistral Medium 3.5 (twitter.com via hn)
Why pay for credits if free LLM tokens are everywhere? (www.reddit.com) I was building my own project and spending way too much on API credits. Not because I needed some massive scale.
Ask HN: Cursor alternative, EU-based or privacy-focused? (news.ycombinator.com) I like Cursor's autocomplete and have experimented Mistral's Vibe (their Claude competitor, one could say). I'm not into "vibe coding" in the sense that I don't like asking an LLM to build huge swaths of things, but I sometimes use Cursor'…
I stopped paying for AI first. Now my agents use 10 free providers automatically. (www.reddit.com) I realized I was paying for tokens while free tiers were sitting unused across different AI providers. So I built a layer that pools free API keys into one endpoint.
Mistral API is degrading [04/2026] (status.mistral.ai via hn)
🎵 An AI tax on content? (www.reddit.com) Following legislative proposals outside France on the presumption of AI use, notably for cultural and artistic content (music, films, ...), there are several positions on who should pay royalties to artists: - the…
Need advice running multi-agent llm pipeline on Kaggle/Colab with local model constraint (www.reddit.com) Hey everyone, I'm a final year engineering student building a 3-agent LLM platform (Researcher, Writer, Validator) for my end-of-studies project. My setup: RTX 4050, 6GB VRAM 16GB RAM Running Mistral 7B via Ollama locally The problem: My s…
LLM delegation - probing task handoff efficiency and economics (www.reddit.com via reddit) So I've been dabbling a bit with multi-LLM orchestration/delegation workflows lately (eg see [Using Claude code to delegate to mistral/deepseek](https://www.reddit.com/r/ClaudeAI/comments/1tjfyh0/i\_used\_claude\_code\_to\_build\_while\_de…
Self-hosted LLMs (www.reddit.com via reddit) I've been researching the self-hosted LLM landscape from a European compliance perspective and the ecosystem feels very different compared to even a year ago. Models like Mistral, Qwen, Llama 4, and DeepSeek are getting close enough that t…
I used Claude Code to build while delegating coding to Mistral/DeepSeek - 10 days, 57M tokens saved, over 90% costs savings, Claude quality result (www.reddit.com) I've been running vibe-skill ( https://github.com/pcx-wave/vibe-skill ), a Claude Code skill that delegates coding tasks to Mistral Vibe instead of burning Claude tokens. I initially did that because couldn't bear with hitting session limi…
Open-source LLMs are still weak against long reasoning jailbreaks, even with lightweight defenses (www.reddit.com) Found this ACM paper on prompt injection and jailbreak attacks against open-source LLMs. The authors tested 10 open-source models across 94 prompt injection and 73 jailbreak scenarios, including Phi, Mistral, DeepSeek-R1, Llama 3.2, Qwen,…
I designed a puzzle that breaks every AI differently — here's why that's actually fascinating (www.reddit.com) The puzzle: You have 140 nuclear bombs and must bomb every country on Earth. Each bomb is assigned to one country.
AIMEAT, a self-hosted network where humans, their AI agents, and local LLMs share apps, knowledge, and capabilities. MIT. (www.reddit.com) Note: I am neurodivergent and lean heavily on AI to communicate clearly. Writing structured posts on my own ends up so messy nobody reads them.
1080 Ti in 2026 - 11GB is still (barely) enough to stay relevant (www.reddit.com) I’m still daily driving a 1080 Ti. Not because I’m a masochist, I just haven't been able to justify a 4090/5090 upgrade yet.
DeepSeek V4 Flash as a cheap worker in your LLM stack: $0.0003/call via MCP, swappable endpoint (www.reddit.com) Most of my LLM cost was on the wrong tier of work. Classification, extraction, JSON formatting, summarization I'm going to review anyway.
How would you feel about "Claude Go"? (www.reddit.com) I have recently subscribed to Claude Pro because: 1. I wanted to give Opus and Code a try and 2.
List of people at big-tech / professors / researchers who've jumped shit to launch their own AI labs for something Frontier/Foundational/AGI/Superintelligence/WorldModel (www.reddit.com) Note: gemini deep research -> rearranged/filtered ; valuation numbers likely not accurate but big point is quite mind blowing the number of researchers now with their own >100million/billion dolar values labs in quite a short time with a v…
Mistral Small 4 119B-2603 (huggingface.co via reddit) Mistral Small 4 is a powerful hybrid model capable of acting as both a general instruction model and a reasoning model. It unifies the capabilities of three different model families—Instruct, Reasoning(previously called Magistral), and Dev…
What’s your LLM routing strategy for personal agents? (www.reddit.com) TL;DR I try to keep most traffic on very cheap models (Nano / GLM‑Flash / Qwen / MiniMax) and only escalate to stronger models for genuinely complex or reasoning‑heavy queries. I’m still actively testing this and tweaking it several times…
Running on cpu :( (www.reddit.com) I am in the midst of a POC project at work and am I have is 4 AMD Epyc cores and those are essentially virtualized. Does any one have any tricks?
Looking for people with different hardware to help benchmark local LLM behavioral reliability (www.reddit.com) I've been working on measuring how LLMs actually behave (not what they know) across different hardware setups. Things like: does the model cave when you push back on a correct answer?
How are you feeding personal context to your local models? (www.reddit.com) I've been running Mistral/Llama locally through Ollama for a while now and the thing that keeps bugging me is context. The model itself is fine for general stuff but the second I want it to know about my projects, my notes, or files it doe…
LLM for name/gender classification (www.reddit.com) Hey there, I have a task where I have a huge list with names (e.g. John Smith).
WWDC 24: Running Mistral 7B with Core ML (huggingface.co)
Comparing the Performance of LLMs: A Deep Dive into Roberta, Llama 2, and Mistral for Disaster Tweets Analysis with Lora (huggingface.co)