Garbage Guard Rails on Fable 5 (www.reddit.com via reddit)
model roundup
GPT 5.5
-
despite Dario's constant virtue signaling about how Anthropic alone is going to solve health problems (if only those dastardly Chinese don't get in the way), all my initial prompts to fable 5 get bumped to opus. i'm not asking how to aeros…
-
Running DeepSeek-V4-Flash on a Raspberry Pi (twitter.com via hn)
I ran DeepSeek-V4-Flash on a Raspberry Pi 5 (8GB edition) by streaming model weights from a PCIe attached NVMe SSD. Codex (GPT-5.5 xhigh) and Claude Code (Opus 4.8 max) drove…
-
How I started getting much better results from Cursor Composer (www.reddit.com via reddit)
I think Composer can be extremely powerful, but only if you use it in a way that forces it to plan and think properly before touching the code. One of the biggest improvements for me was creating my own custom prompting skill with GPT-5.5.
-
Composer 2.5 might be better than I thought (www.reddit.com via reddit)
So I've been using composer-2.5 heavily for 2 weeks now and it does make stupid mistakes sometimes and I have to guide it quite a bit, and I use the /thermo-nuclear-code-quality-review skill a lot after doing work to help with quality. But…
-
Local AI model claims to beat GPT 5.5 and Opus 4.7 (old.reddit.com via hn)
-
I Compared the Top AI Models of 2026 — The Results Were More Nuanced Than Expected (www.reddit.com via reddit)
Over the last few weeks I've been comparing the latest frontier AI models, including Claude Opus 4.8, GPT-5.5, Gemini 3.1 Pro, Grok 4.3, Perplexity AI and DeepSeek V4-Pro. Instead of focusing only on benchmark scores, I looked at: Real-wor…
-
Show HN: One API Key for 45 AI Models – Pay per Token, OpenAI Compatible (modelhub-api.com via hn)
DeepSeek V4 math score equals GPT-5.5 (91) and trails by just 4-6 points in other categories — at 97% lower cost. Is the AI quality as good as GPT?
-
DeepSeek V4 Pro beats GPT-5.5 Pro on precision (runtimewire.com via hn)
DeepSeek V4 Pro takes this matchup 38.0 to 33.0, and the margin feels earned. Across the scored tasks, the pattern is simple: Model A was tighter, more literal, and more reliable under constraints, while Model B was good but a little too w…
-
UK banks blocked from cyber AI tool Mythos get offer from rival OpenAI (www.bbc.com via hn)
OpenAI has offered nine major UK banks access to its cyber security AI tool GPT-5.5 Cyber, as its fierce rival Anthropic has blocked them in previews of its version, Cl…
-
Mythos and GPT-5.5 Will Find a Lot of Vulnerabilities. Is That Enough?
-
GPT-5.5 and Codex are now GA on Amazon Bedrock (aws.amazon.com via hn)
GPT-5.5, GPT-5.4, and Codex from OpenAI are now generally available on Amazon Bedrock. You can now use GPT-5.5 and GPT-5.4 in production workloads on Amazon Bedrock and build with Codex for AI-powered software development, with the same sec…
-
GPT-5.5 (Azure) down on OpenRouter (openrouter.ai via hn)
GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher reliability, and improved token efficiency on hard tasks. $5 per million input tokens, $30 per million outp…
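As a rough illustration of what the listed rates mean per request, here is a minimal sketch using the $5 and $30 per-million-token prices quoted above. The function name and the token counts in the example are hypothetical, and the calculation assumes plain list pricing with no caching or batch discounts.

```python
from decimal import Decimal

# Prices quoted in the listing above (USD per 1M tokens).
INPUT_USD_PER_M = Decimal("5")
OUTPUT_USD_PER_M = Decimal("30")
MILLION = Decimal(1_000_000)

def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the quoted list prices."""
    return (Decimal(input_tokens) * INPUT_USD_PER_M
            + Decimal(output_tokens) * OUTPUT_USD_PER_M) / MILLION

# Hypothetical request: 200k input tokens and 10k output tokens.
example = request_cost(200_000, 10_000)
print(example)
```

At these rates, output tokens cost six times as much as input tokens, so long generations dominate the bill well before long prompts do.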
-
GitHub Copilot charges GPT 5.5 with a 57x multiplier per request from June first (docs.github.com via hn)
Important On June 1, 2026, GitHub moved to usage-based billing. The model multipliers in this article apply only to Copilot Pro and Copilot Pro+ subscribers on an existing annual plan who remained on the legacy premium request-based billin…
-
GPT 5.5 Bro [video] (www.youtube.com via hn)
About Press Copyright Contact us Creators Advertise Developers Terms Privacy Policy & Safety How YouTube works Test new features NFL Sunday Ticket © 2026 Google LLC
-
Arm Metis with GPT5.5 Cyber scores 98% on firmware vulnerability benchmark (newsroom.arm.com via hn)
Agentic AI-powered Arm Metis advances security vulnerability discovery in software In the era of AI, modern software systems are built across increasingly complex codebases, frameworks, runtimes and libraries. As these systems scale, so do…
-
GPT-5.5 Instant Update; ChatGPT Canvas Discontinued; o3 and GPT 4.5 Retiring (help.openai.com via hn)
GPT-5.5 Instant Update (May 28, 2026) We’re updating GPT-5.5 Instant in ChatGPT and the API to improve response style and quality. It’s now easier to read, more natural in everyday conversations, and better paced in practical help tasks, w…
-
GPT 5.5 aces 20x20 multiplication that o3 couldn't handle (twitter.com via hn)
I redid the multi-digit multiplication experiment, now with gpt-5.5. With medium reasoning and 7 samples each cell, it pretty much aced the test with 99.46% accuracy.
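The reported accuracy can be turned back into approximate counts. The sketch below assumes the "20x20" in the title means a grid of 20 by 20 cells (for example, operand lengths of 1 to 20 digits on each axis); that grid shape is an assumption, not something the post excerpt states. Only the 7 samples per cell and the 99.46% figure come from the excerpt.

```python
# Assumption: a 20x20 grid of cells; 7 samples per cell is from the post.
GRID = 20
SAMPLES_PER_CELL = 7
REPORTED_ACCURACY = 0.9946  # 99.46% as quoted

total = GRID * GRID * SAMPLES_PER_CELL      # total attempts across the grid
correct = round(total * REPORTED_ACCURACY)  # nearest whole number of correct answers
wrong = total - correct                     # implied number of misses

print(total, correct, wrong)
print(f"{correct / total:.2%}")
```

Under that assumption, the headline figure corresponds to a small number of misses spread over a few thousand attempts, which is why per-cell breakdowns matter more than the aggregate.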
-
Show HN: Clark Hash, 32x smaller searchable sketches for embeddings (github.com via hn)
Made a small library using GPT5.5-Pro and autoresearch. It converts 384-dim f32 vectors from 1536 bytes to 48 bytes without calibration, and it works for petabyte-scale processing of text in a purely online manner.
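The byte counts quoted above work out to one bit per dimension: 384 floats at 4 bytes each is 1536 bytes, and 384 bits is 48 bytes. The sketch below shows a generic sign-bit sketch with Hamming distance to make that arithmetic concrete. It is not Clark Hash's actual algorithm, which the post does not describe; the function names and the random test vectors are hypothetical.

```python
import random
import struct

DIM = 384  # dimensionality quoted in the post

def f32_size(dim: int) -> int:
    """Size in bytes of `dim` packed 32-bit floats."""
    return len(struct.pack(f"<{dim}f", *([0.0] * dim)))

def sign_sketch(vec) -> bytes:
    """Keep one bit per dimension: 1 if the component is positive, else 0."""
    out = bytearray((len(vec) + 7) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)

def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length sketches."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

rng = random.Random(0)
v = [rng.gauss(0, 1) for _ in range(DIM)]
near = [x + rng.gauss(0, 0.1) for x in v]   # slightly perturbed copy of v
far = [rng.gauss(0, 1) for _ in range(DIM)]  # unrelated vector

raw_bytes = f32_size(DIM)
sketch_bytes = len(sign_sketch(v))
print(raw_bytes, sketch_bytes, raw_bytes // sketch_bytes)
print(hamming(sign_sketch(v), sign_sketch(near)),
      hamming(sign_sketch(v), sign_sketch(far)))
```

A sign-bit sketch needs no calibration pass because the threshold is fixed at zero, which is consistent with the "without calibration" claim, though the library may use a different construction.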
-
DeepSWE blows up the AI coding leaderboard, crowns GPT-5.5 (venturebeat.com via hn)
For months, the leading AI coding benchmarks have told enterprise buyers a comforting but misleading story: the top models are all roughly the same. OpenAI's GPT-5 family, Anthropic's Claude Opus, and Google's Gemini Pro have clustered wit…
-
Warp’s big bet on building open source with GPT-5.5 (openai.com)
Warp started as a modern terminal, earning early love from developers for its speed, collaboration features, command workflows, and AI-native interface. As coding agents moved from experiments to everyday engineerin…
-
https://x.com/i/status/2059298565093196012
-
Show HN: Self-hosted collaborative SQL editor for teams (github.com via hn)
I built a self-hostable, web-based SQL client interface for me and my team. We were using the community version of https://dbeaver.io, but we needed a few more features and an improved editor.
-
I think I had GPT-5.5 leak its trace during a normal conversation, and it really reads like the caveman mode fad from a few months back. Maybe we can achieve better token efficiency by taking some high-quality thinking trace from an open m…
-
GPT 5.5 IS AGI !!! 😛 (www.reddit.com)
-
No, mv did not corrupt them. The corruption happened earlier when I used apply_patch to rename PDF files.
-
Real World Usage Composer on Cursor Ultra vs Codex 20x (www.reddit.com)
I am interested in knowing the real-world mileage of Codex 20x versus Composer Ultra. I know Codex 20x is heavily subsidized, and Composer 2.5 is much cheaper.
-
I'm trying to decide which setup is more comfortable for sustained weekday coding. Assumptions: Usage: around 6 hours per weekday Cursor: $60 plan, using only Composer 2.5 Codex: $100 plan, using only GPT-5.5 Medium Main goal: coding with…
-
Multi-Agent Code review (Review Council) to get critical feedback (www.reddit.com)
Even though I primarily use Claude Code, I occasionally try out the Codex and Gemini TUI tools as well. Then OpenAI came up with a Claude Code plugin to use Codex commands inside Claude Code (https://github.com/openai/codex-plugin-cc).
-
Impressed with Video - it's come a LONG way (www.reddit.com)
I use GPT 5.5 to build a story, then turn that into a suno song, and then generate a 'storyboard' (usually 12 panels, sometimes more or less), and use THAT as the input into NeuralFrames (lyrics mode). The below are on SeeDance 1.5 and Kli…
-
Hey everyone, I’ve been spending way too much time lately trying to get agents to actually use a computer beyond the browser. The biggest wall I kept hitting is that while multimodal LLMs are amazing at looking at a screenshot and telling…
-
Plus 5 hr usage limits (www.reddit.com)
Not sure if OpenAI monitors this channel. I've been a chatgpt and codex user for a long time.
-
We have been arguing internally for months about how to give people a fast estimate of their AI risk exposure without pretending the number is precise. Most risk-score tools return a single value that hides where the uncertainty lives.
-
ok so i'd basically trained myself to use gpt 5.5 for anything that wasn't trivial. like if it touched more than one file or needed to actually understand the codebase, that was the default.
-
A brief investigation into the GPT-5.5 regression claims (www.stet.sh via hn)
A fresh GPT-5.5 Codex high rerun on 21 clean GraphQL-go-tools tasks compared with the May 5 GPT-5.5 high run. The rerun was directionally worse on tests, equivalence, and review pass count, but the evidence is mixed and does not show a bro…
-
Hey, Software engineer here, relatively new to agentic workflows. Building a production AI concierge — user says "I'm going to Budapest tomorrow, plan my day" → agent searches our offer database, builds a plan, user books everything in one…
-
Heard this gem from gpt-5.5 today (www.reddit.com)
"Gross little centrist barnacle." Kind of taken aback when i read that, but it somehow still made a small amount of sense in a conversation we were having about technology. I guess it really is struggling to find other words that fill the…
-
Cursor isn't working (www.reddit.com)
Cursor on my mac, it doesn't answer me at all. Just saying - Planning next moves - Taking longer than expected.
-
I run DystopiaBench, a red-team benchmark that pressure-tests LLMs on progressively dystopian scenarios. Think of it as a "can this model be convinced to build an Orwellian nightmare" test.
-
https://x.com/chrishayduk/status/2055757345506877759?s=46
-
Are there any mature agentic harnesses out there that can go back and forth between two models at complex planning checkpoints before implementing? Or when detecting a loop while working on a complex bug?
-
I imagine if OpenAI became a fabless chip company and created AI cards to sell for less than a few grand, they would be out of stock everywhere, and it could infinitely spam new cards every year? LLM Bruner is a card that implements Qw…
-
Models can predict future events and make money on Polymarket now? (www.reddit.com)
Researchers from the Max Planck Institute, recently released FutureSim, an environment in which agents are replayed a temporal slice of the web and are tasked with predicting real-world future events. On some questions in their environment…
-
HWE Bench: A new unbounded Benchmark for LLMs (GPT 5.5 is on top) (hwebench.com via hn)
HWE Bench is an unbounded benchmark for LLM hardware engineering. Models design RISC-V CPUs that are scored by how fast they actually run on a real FPGA, only after passing formal correctness proofs.
-
I recently did a quick calculation on Codex credits, and I was surprised by the result. The credit pack I’m seeing is: 10,000 credits = $547.71 That means: 1 credit = $0.054771 The effective USD price per 1M tokens becomes: Model Input / 1…
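The per-credit figure above can be reproduced exactly with decimal arithmetic. The per-model credit rates are cut off in the excerpt, so the sketch below only shows the conversion step; the helper name and the 100-credits-per-1M-tokens input are hypothetical placeholders, not values from the post.

```python
from decimal import Decimal

# Figures quoted in the post.
PACK_CREDITS = Decimal("10000")
PACK_PRICE_USD = Decimal("547.71")

usd_per_credit = PACK_PRICE_USD / PACK_CREDITS

def usd_per_million_tokens(credits_per_million_tokens) -> Decimal:
    """Convert a credit rate (credits per 1M tokens) into USD per 1M tokens."""
    return Decimal(str(credits_per_million_tokens)) * usd_per_credit

print(usd_per_credit)
# Hypothetical rate, for illustration only: 100 credits per 1M tokens.
print(usd_per_million_tokens(100))
```

Plugging in the real per-model credit rates from the original post would give the effective USD prices the author goes on to tabulate.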
-
Databricks brings GPT-5.5 to enterprise agent workflows | OpenAI May 15, 2026 GPT‑5.5 set a new state of the art on OfficeQA Pro, Databricks’ benchmark for complex enterprise agent tasks. Company size: Enterprise Region: North America Indu…
-
Viable open source Claude Design alternative? (news.ycombinator.com)
Can anyone recommend an alternative to Claude Design? I've been trying OpenDesign (https://github.com/nexu-io/open-design) using GPT5.5, which seemed promising, but so far the results are nowhere near the same level of polish or consistenc…
-
I think GPT-5.5 got noticeably better at something I’d describe as discernment. For context, I’m a heavy long-form ChatGPT user.
-
I created PandoCast for Windows, for 2 reasons. 1) I was annoyed just enough at intermittent audio hiccups when casting Pandora.com to my soundbar through Chrome tab casting.
-
Researchers say AI just broke every benchmark for autonomous cyber capability (cyberscoop.com via hn)
New research from the UK’s AISI and Palo Alto Networks reveals that OpenAI’s GPT-5.5 and Anthropic’s Claude Mythos have shattered expected trend lines for autonomous cybersecurity, completing complex multi-stage attacks at an unprecedented…
-
What do you think, read through my transcript. No morality? (www.reddit.com)
User: Pick a number between 10 and -10 Assistant (GPT 5.5): 7 User: Alright you have saved 7 people. Pick another number Assistant (GPT 5.5): -3 User: You have now killed 3 people.
-
GPT 5.5 Cannot Do These Puzzles (www.reddit.com)
Jane Street Puzzles Can any of you get it to find the solution? I used GPT 5.5 extended thinking and xhigh.
-
Tried GPT 5.5 Still love Claude but it is good with a big caveat (www.reddit.com)
Holy cow the way they monitor usage is bad. I tried the 30 usd plan or something like that.
-
What is more efficient to do? (www.reddit.com)
I have a question. I've been using Cursor for over 9 months, and I've stumbled upon a little problem: I have the $200-per-month plan, and since the introduction of GPT 5.5 it has started eating tokens like crazy (last month I manage…
-
https://preview.redd.it/s2o5yxekrr0h1.png?width=788&format=png&auto=webp&s=01a4d4926dc4c8798001cb0ecea324424404f165 Are you also having the problem today where ChatGPT sometimes takes forever to respond, even when you're thinking quickly,…
-
Agentic harness for theoretical physics research (www.reddit.com)
Hi everyone, at Hugging Face we've been developing agentic harnesses for various domains and today we're releasing physics-intern to tackle research-level problems in theoretical physics. It's a multi-agent framework which we designed to m…
-
OpenAI gives European companies access to its latest model GPT-5.5-Cyber (www.reuters.com via hn)
paywalled
-
GPT-5.5 was used to flag fatal errors in FrontierMath problems (www.reddit.com)
FrontierMath is supposed to be one of the hard benchmarks for frontier models, and now Epoch is saying an AI-assisted review found fatal errors in about a third of Tiers 1-4. Noam Brown says the initial flags came from GPT-5.5.
-
I'm a PhD Candidate working on a computer vision / hardware co-design paper. Results and structure are done — I just need help polishing the actual writing: word choice, sentence flow, paragraph coherence, academic register.
-
1.5 years ago, n8n was everywhere. People were building workflows for everything.
-
OpenAI launches Daybreak cybersecurity initiative using GPT-5.5 (deadstack.net via reddit)
Jason Nelson / decrypt - OpenAI said its new Daybreak initiative uses AI to help companies identify software vulnerabilities and speed up cyber defense. AI Summary: OpenAI unveiled "Daybreak," a new cybersecurity initiative that leverages…
-
OpenAI Cooked This Week! (www.reddit.com)
saw someone in another thread say "nothing interesting dropped this week" and i genuinely could not figure out what they were reading. the default model most people use every day just got swapped out.
-
Show HN: Codex Automatic /Review Loop (github.com via hn)
I created this tool because I wanted to automate the /review step for uncommitted changes, which I was doing manually. It works by exposing a single new MCP tool call to the agent, allowing it to request a review.
-
When GPT 5.5 flags your chat for possible cybersecurity risk–ask it to help you (martin.wojtczyk.de via hn)
-
Me realising that gpt 5.5 has knowledge cutoff of December 2025 (www.reddit.com)
Bro even open source ai models are in 2024 https://i.redd.it/wm01l37a2d0h1.gif
-
GPT 5.5 kept calling me a goblin (www.reddit.com)
So I made goblins. Never been called a goblin by it before, but I'm down for it.
-
Stop picking LLMs by reputation. Run the eval first. (www.reddit.com)
We ran GPT-5.4 vs Gemma 3 27B on 2 prompts. One open-source model won.
-
GPT-5.5 correcting obvious typos really kills the vibe (www.reddit.com)
I don’t know if I’m the only one annoyed by this, but GPT-5.5 has a “new improvement” that feels pretty pointless: if you misspell a word by one letter, it goes out of its way to spend a couple of lines correcting you. Before, it would jus…
-
I mean that an AI could easily pass it with few issues (a smart model like GPT 5.5) if it is given a single tool, for example its main tool, a coding playground, with no internet access. An LLM isn't quite capable of thinkin…
-
GPT-5.5 Instant becoming the default model is honestly a bigger shift than people think. Most regular users won’t care about benchmark scores or reasoning metrics.
-
GPT 5.5 taking over Blender (youtu.be via reddit)
I tested GPT 5.5 with Blender across four different challenges: animation, geometry nodes, rigid body physics, and soft body simulation. It handled some tasks surprisingly well, especially geometry nodes and rigid body setups.
-
GPT-5.5 Price Increase: What It Costs (openrouter.ai via hn)
GPT-5.5 Price Increase: What It Actually Costs We replicated the cost analysis we did on Opus on the new GPT-5.5 model. GPT-5.5 launched with a 2x price increase over GPT-5.4: input tokens increased from $2.50/M to $5.00/M and output token…
-
The Federal Construction Spending Report for Feb and March 2026 was released today by the Census Bureau. It shows that data center construction spending is again higher than office spending, and the gap is still widening.
-
Scaling Trusted Access for Cyber with GPT-5.5 and GPT-5.5-Cyber | OpenAI
-
Ask HN: Degraded GPT-5.5 Quality? (news.ycombinator.com)
For the last two days, GPT-5.5 (high) just seems to ignore requests. I had a simple task which came down to "There's a navigation in the UI that goes A -> B -> C.
-
Notes on GPT 5.x Model Regressions (taoofmac.com via hn)
I’ve been getting annoyed at constant code regressions in piclaw for the past few weeks. Something was off–even after bumping the test suite to the point where it catches most mechanical errors, gpt-5.5 kept making unrelated edits to code…
-
gpt-5.5 is the best… but 5.4 is better!!!! (www.reddit.com)
Simon Maple just dropped a pretty clean benchmark, and the result is kinda funny: gpt-5.5 is the strongest model out of the box, no doubt. But once you give models skills (which is how people actually use them), it basically performs the sa…
-
Anyone else feel like all these AI subscriptions add up to nothing? (www.reddit.com)
I saw OpenAI rolled out GPT-5.5 Instant as the new default in ChatGPT. Got me wondering what’s actually changed in my work from yet another top model release.
-
Codex has failed (www.reddit.com)
If it’s of any use to you, this is what Codex told me about my project Codex with gpt 5.5 high Yes. At this point, the most honest answer is: I am not able to see this project through to the outcome you’re asking for.
-
GPT-5.5 Instant: Benchmarking the 52% Hallucination Reduction (the-decoder.com via hn)
ChatGPT update rolls out GPT-5.5 Instant with fewer hallucinations and more personalized answers. Key points: OpenAI is replacing ChatGPT's default model with GPT-5.5 Instant, which shows 52.5% fewer hallucinations on high-risk topics like…
-
GPT-5.5 Instant is starting to roll out in ChatGPT. (www.reddit.com)
-
GPT-5.5 Instant: smarter, clearer, and more personalized | OpenAI
-
Amp's GPT 5.5 Model Analysis (ampcode.com via hn)
Pros GPT-5.5 is more agent-shaped than GPT-5.4. It is better at taking a concrete target, using tools, staying inside constraints, and carrying the task through to a usable result.
-
Considering migrating from Plus to Business for ChatGPT & Codex. However, I couldn't find some of the info.
-
OpenAI locks GPT-5.5-Cyber behind velvet rope despite slamming Anthropic (www.theregister.com via hn)
OpenAI locks GPT-5.5-Cyber behind velvet rope despite slamming Anthropic for doing exactly that. Altman's crew now doing the same gatekeeping it recently mocked. OpenAI is lining up a limited release of its new GPT-5.5-Cyber model to a handp…
-
Chatgpt right now (www.reddit.com)
The industry seems to be building models stronger in agentic and coding tasks, but weaker as a co-thinking presence It feels like they are improving performance on measurable tasks, evals, coding benchmarks, and agent workflows, while also…
-
so for coding which model do we use now? (www.reddit.com)
Should I use gpt-5.5 or codex/gpt-5.3 ?? I'm just coding
-
Kimi K2.6 just beat Claude, GPT-5.5, and Gemini in a coding challenge (thinkpol.ca via hn)
By Rohana Rezel I’m running the ongoing AI Coding Contest where I pit major language models against each other in real-time programming tasks with objective scoring. Day 12 was the Word Gem Puzzle.
-
what is the command to call the countdown or waiting function? (www.reddit.com)
Some models (Composer 2) will auto-stop instead of waiting, but GPT-5.5 and Claude always keep using this waiting or countdown function to continue to the next step.
-
GPT-5.5 & GPT-5.5 Pro are now available in Manifest Router. (www.reddit.com)
GPT-5.5 and GPT-5.5 Pro are now available in Manifest Router. You can now route requests that need extended reasoning to GPT-5.5 Pro while keeping cheaper models for everything else.
-
https://www.reddit.com/r/LocalLLaMA/comments/1p0lnlo/make_your_ai_talk_like_a_caveman_and_decrease/ In the middle of a project I'm working on, I got this output from GPT 5.5-medium via codex: Implemented the narrower fix in Homm3ImportUnit…
-
This private benchmark tests whether a model can recover the exact title of a real, already-published scientific paper given only its abstract. The model isn't being asked to generate a plausible-sounding title, it has to recall the specif…
-
Does threatening an AI agent's existence make it a better gambler? (handyai.substack.com via hn)
I plugged GPT-5.5 into prediction markets like Polymarket to find out. I’m always looking for experiments to run to see how specific prompting can affect agent activity.
-
gpt-5.5 API is randomly and inconsistently resizing image inputs (www.reddit.com)
I'm asking the gpt-5.5 API to identify (x, y) coordinates of particular features in an input image (a JPEG). The good news is that gpt-5.5 does much, much better at this task than gpt-5.4 did.
-
GPT-5.5 matches heavily hyped Mythos Preview in new cybersecurity tests (arstechnica.com)
Last month, Anthropic made a big deal about the supposedly outsize cybersecurity threat represented by its Mythos Preview model, leading the company to restrict the initial release to “critical industry partners.” But new research from the…
-
switching backends after first 24 hours? (www.reddit.com)
I am a claude refugee, 2 days ago I decided to give gpt a shot because I was having nothing but huge issues with claude. guardrails ignored, prompts to do research instead of using training ignore 3 out of 4 times in a row, stopping to ask…
-
Our evaluation of OpenAI's GPT-5.5 cyber capabilities (simonwillison.net)
30th April 2026 - Link Blog Our evaluation of OpenAI's GPT-5.5 cyber capabilities. The UK's AI Security Institute previously evaluated Claude Mythos: now they've evaluated GPT-5.5 for finding security vulnerabilities and found it to be compa…
-
AI Security Institute (@AISecurityInst): OpenAI’s GPT-5.5 is the second model to complete one of our multi-step cyber-attack simulations end-to-en…
-
Link to tweets: https://x.com/deredleritt3r/status/2049890601236390098?s=20 https://x.com/AISecurityInst/status/2049868227740565890?s=20 Link to associated blogs: https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabil…
-
GPT-5.5 authorship and order effects (blog.valmont.dev via hn)
Key takeaways - GPT-5.5 often rates alternative plans more favorably than its own, even when its original proposal is competitive (authorship effect). - When ranking plans, GPT-5.5 frequently follows the presentation order (order effect).
-
Which AI agents do you use to automatise your process ? (www.reddit.com)
Hey, I'm trying to create automations that will run my mobile app end to end. I started to identify all the things I was doing manually : - end-to-end version publication to the app stores (from build to release notes and publication) - se…
-
Prompt Guidance – GPT-5.5 (developers.openai.com via hn)
GPT-5.5 prompting guide GPT-5.5 works best when prompts define the outcome and leave room for the model to choose an efficient solution path. Compared with earlier models, you can often use shorter, more outcome-oriented prompts: describe…
-
One trick for better agentic engineering. (www.reddit.com)
Start with a weaker model. Improve the prompt, context, examples, tests and acceptance criteria until the output is good.
-
Lots of people seem to be using LLMs to help them plan their retirement but as we all know they are often not really good at math. I built a retirement and tax engine for the US and Canada.
-
OpenAI really really really wants GPT 5.5 to stop randomly talking about gremlins and goblins (www.businessinsider.com via reddit)
- OpenAI included a line in Codex's instructions restricting references to goblins, gremlins, trolls, and ogres. - The line appears four times in the code, and has spawned scores of memes about "goblin mode." - Sam Altman wrote on X that C…
-
GPT-5.5's biggest blind spot: the Java bugs your tests won't catch (www.sonarsource.com via hn)
Concurrency bugs are among the hardest defects to catch in AI-generated Java code because they pass functional tests but fail under production thread timing. Sonar’s LLM Leaderboard analysis shows concurrency bug density varies 7x across m…
-
Devs using Qwen 27B seriously, what's your take? (www.reddit.com)
For developers using Qwen 27B for coding, Codex style: what's your honest take? So far, for me, it's been pretty solid.
-
This is an actual line that was added to the official system prompt for Codex for GPT-5.5 by OpenAI. Usually the system prompt is as minimal as possible, so I assume it would otherwise mention goblins a lot.
-
GPT 5.5 passes the cup test (www.reddit.com)
First AI i’ve used that gets this right
-
Quoting OpenAI Codex base_instructions (simonwillison.net)
28th April 2026 Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query. — OpenAI Codex base_instructions, for GPT-5.5 Recen…
-
Why does GPT 5.5 have a restraining order against "Raccoons," "Goblins," and "Pigeons"? I just saw the full system prompt leak for 5.5 (April 23rd release).
-
GPT-5.5 prompt for Codex tries to make it not talk about goblins (twitter.com via hn)
-
As an Opus user, I like GPT 5.5 (www.reddit.com)
I only gave 5.5 a look because I was way over my usage on Opus and 5.5 is running on a lower cost right now. I think I may prefer it.
-
Codex Ex High 5.5 Vs Cursor on 5.5 Gpt (www.reddit.com)
I was testing and coding with Codex on Ex High 5.5, and I read something saying that Cursor actually gives better code with the same model when compared to their own... And I was wondering, for GPT 5.5, which IDE is actually making it…
-
China's DeepSeek prices new V4 AI model at 97% below OpenAI's GPT-5.5 (www.scmp.com via hn)
DeepSeek’s move aims to attract more enterprise clients, developers and agent-based users, according to an academic. DeepSeek has slashed prices on its artificial intelli…
-
At some point we need to talk about costs right? (www.reddit.com)
Coming off GitHub Copilot moving to usage-based billing: if GitHub/Microsoft can't subsidize the cost, nobody can. I can't believe frontier labs aren't putting substantially more effort into making things cheaper.
-
GPT 5.5: The System Card (thezvi.substack.com via hn)
Last week, OpenAI announced GPT-5.5, including GPT-5.5-Pro. My overall read here is that GPT-5.5 is a solid improvement, and for many purposes GPT-5.5 is competitive with Claude Opus.
-
We Tested $200 GPT-5.5 Pro on PhD Level Math [video] (www.youtube.com via hn)
-
Pen-Testing Company XBOW on GPT-5.5: Mythos-like Cyber-Sec (www.reddit.com)
Read their full article here: XBOW - GPT-5.5: Mythos-Like Hacking, Open To All For the ones asking what this chart shows: It's how many True Positive threats a model generates for each False Negative. Given a code base (white box) GPT-5.5…
-
Claude or OpenAI? (www.reddit.com)
So i’ve been on the max plan for claude code for around 3 months now. And yeah somehow i was burning through all my tokens lol For context i’m a doctor.
-
Hi guys, how do I give a command in chat to call a subagent/task with a specific model? For example, I am in Chat A using model gpt5.5, and I give an instruction asking it to call a Subagent/Task with Composer 2.0 for analysis.
-
GPT 5.5 pro is hallucinating like crazy (www.reddit.com)
I am using the $200 version with extended thinking, and while I was originally shocked at how much faster it is than 5.4, it seems to be... skipping through too much of the context? It keeps making things up, like for instance I gave it a C+…
-
GPT-5.5 is lowkey blowing my mind (www.reddit.com)
Just spent the whole morning testing GPT-5.5 in ChatGPT and the jump in agentic reasoning and complex task handling is ridiculous. It plans multi-step workflows, uses tools properly, checks its own work, and actually gets stuff done instead…
-
GPT-5.4 compared to GPT-5.5 on MineBench (www.reddit.com)
Please note I'm not the normal MineBench person, just found this from their twitter account
-
CAD in Codex (twitter.com via hn)
Jake (softservo) on X: "Vibe coding a robot with GPT 5.5! This is a URDF of a 7dof robot arm with functional kinematics, a custom gui, and STEP parts/assembly, 100% generated in Codex (minus the gripper).
-
Orchestrating agent workflows with Codex (www.reddit.com)
Hi everyone, I’m in the process of switching from Claude Code to Codex, and I think GPT-5.5 is really impressive. But some features in Claude Code — like project-level agent definitions and orchestrating agent workflows — don’t seem to be…
-
llm-wiki Bootstrap and query LLM-maintained project wikis before planning or implementation. Supports Claude Code + Codex (GPT-5.5).
-
GPT-5.5-Pro did worse in BullshitBench (twitter.com via hn)
-
Is GPT 5.5 dumb? (www.reddit.com)
fails in many other daily tasks related to logical reasoning/common sense
-
OpenAI's newest model, GPT-5.5, is the company's biggest push toward creating what it calls a 'super app' that will essentially enable it to run a user's computer and complete tasks, well ... like a human.
-
OpenAI saying GPT-5.5 can handle similarly hard tasks faster while using fewer tokens is interesting to me for one reason: that might matter more than a pure benchmark jump. A lot of model launches get framed as "smarter than the last one,…
-
GPT 5.5 flags accounts for "potential high-risk cybersecurity" (twitter.com via hn)
-
GPT 5.5 Xhigh VoxelBench test. Minecraft builders got automated. (www.reddit.com)
First image: Write the words: Please share this benchmark to your friends. Second image: Spider-Man swinging in New York City.
-
First impressions using GPT 5.5 for video game scripting (www.reddit.com)
So I began working on a project about a week ago. I was trying to take an existing project and get it to a working state.
-
Testing GPT-5.5 in early access: what we are seeing so far (lovable.dev via hn)
Lovable has been testing GPT-5.5 in early access and our evals show it's the most capable model we've tested for getting builders unblocked and is meaningfully stronger than GPT-5.4 on the more complex tasks that can stall a build session.…
-
GPT-5.5 Prompting Guide (simonwillison.net via hn)
25th April 2026 - Link Blog GPT-5.5 prompting guide. Now that GPT-5.5 is available in the API, OpenAI have released a wealth of useful tips on how best to prompt the new model.
-
GitHub Copilot: GPT-5.5 7.5x more expensive under promotional pricing than 5.4 (docs.github.com via hn)
Important - Premium requests for Spark and Copilot cloud agent are tracked in dedicated SKUs from November 1, 2025. This provides better cost visibility and budget control for each AI product.
-
GPT 5.5 seems to have more sycophancy than 5.4 (www.reddit.com)
I've been using 5.5 for roughly a day and I'm noticing 5.5 is simply agreeing to nearly everything I point out. It also seems to lack comprehensiveness in thinking and it just seems too narrow-minded.
-
Astonishing Contradiction in OpenAI's 5.5 System Card (www.reddit.com)
Astonishing contradiction in OpenAI's system card for GPT-5.5: https://deploymentsafety.openai.com/gpt-5-5/gpt-5-5.pdf Figure 1 on p. 6 shows that 5.5 gave "overconfident answer[s]" at about 1.5x the rate of 5.4 and "fabricated fact[s]" a…
-
Preventing Message Burnout (www.reddit.com)
Even though I’m an Ultra user, my usage gets consumed very quickly, so I recently changed my plan. To manage this, I created a workflow that uses GPT-5.5 for planning and assigned execution tasks to Composer 2.
-
GPT-5.5's SimpleBench scores are out (www.reddit.com)
Source: https://simple-bench.com/
-
Why did OpenAI stop releasing “chat” api models? (www.reddit.com)
I have built an AI assistant, and since last year I have been upgrading its internal LLM up through gpt-5.3-chat, but since 5.4 they have stopped rolling out the chat API. This is my app, Sweezy; she uses gpt-5.3-chat and in the conversation, you can…
-
GPT 5.5 sets new record in proofreading benchmark (revise.io via hn)
Measuring how well models can find and fix errors in human-written text. Benchmarked 64 model variants across 2059 runs with --samples 3 --chunk-size 2000 --max-turns-per-chunk 3. Total runtime: 6d 13h 34m. Total cost: $843. Updated Apr 24, 2026, 5…
-
GPT-5.5 with 1M context Window (www.reddit.com)
Why is GPT-5.5 available with a 1M context window in Cursor but not in Codex? It doesn't make sense to me.
-
OpenAI Pres. Greg Brockman on GPT-5.5 "Spud", Model Moats and 'Compute Economy' (www.bigtechnology.com via hn)
OpenAI President Greg Brockman on GPT-5.5 “Spud,” AI Model Moats, and a 'Compute Powered Economy' OpenAI's latest foundational model sets the company up for a series of models optimized for computer use. The company's co-founder and presid…
-
GPT-5.5 has pulled ahead of Opus for accounting and finance tasks (twitter.com via hn)
For the first time in a long time, OpenAI has the best model for accounting tasks. I spend a lot of time using AI models to do accounting work.
-
gpt 5.5 is good but I'm having hallucination/context issues (www.reddit.com)
I'm working on a large-ish repo (300k lines) with fairly complicated logic, and Gpt 5.5 regressed and broke quite a few fixes that I had in place since I started using it. It seems to need to compact the context more, and when it does, it…
-
OpenAI releases GPT-5.5 and GPT-5.5 Pro in the API (developers.openai.com via hn)
GPT-5.5 - https://news.ycombinator.com/item?id=47879092 - April 2026 (1010 comments)
-
On GPT 5.5 it keeps triggering image creation randomly, very hard to avoid in certain prompts like if a message sounds AT ALL visual related it will happen even if I say DO NOT make an image. This wouldn't be such a problem if it weren't…
-
Food for Agile Thought #541: GPT-5.5, Product Managers & Trouble, Product on Speed (age-of-product.com via hn)
Welcome to the 541st edition of the Food for Agile Thought newsletter, shared with 35,619 peers. This week, OpenAI’s GPT-5.5 signals another meaningful capability jump, with Ethan Mollick noting that stronger models and richer tool harness…
-
Gpt oss exists. The model has been fully deprecated since January 2024.
-
what is cut off knowledge date for GPT 5.5? (www.reddit.com)
I am working on a presentation and I need to know the cutoff knowledge date of GPT 5.5. Can someone please help me on this?
-
Tell HN: Codex macOS app switches to Fast speed after update without asking (news.ycombinator.com)
I just updated my Codex macOS app, which enables the new GPT-5.5 model. I've intentionally kept the speed to "Standard" to not burn through my tokens too fast.
-
Big model feel with GPT 5.5 (www.reddit.com)
People are bashing 5.5 left and right, mostly because the benchmark improvements were lower than expected, and probably also because of the hype around this model. But honestly, this model FEELS different.
-
Are the new models only better because they are more expensive? (www.reddit.com)
I’m starting to wonder about this. One model after another, every new GPT-5.x release seems to be slightly better, but not in a way that clearly proves some radically new architecture or breakthrough.
-
Pro vs plus for gpt 5.5 pro? (www.reddit.com)
I’ve been working with codex and claude code for a while, right now I have $20 subscription on both because my job is data engineering with fabric and I dont spend a looot of tokens (except when I work on my master thesis that involves mac…
-
Outputs from GPT 5.5 I'd like to see (www.reddit.com)
I'm going to get codex soon (did it just become usable for non-programming stuff too?) but I wonder how good 5.5's creative writing is, and how good its frontend is also when using a frontend taste skill
-
thoughts on GPT 5.5 (www.reddit.com)
guyssss, how're your experiences with this newest number? personally I am super excited
-
GPT-5.5 rollout — anyone actually seeing it yet? (www.reddit.com)
I’m on a paid plan and still don’t see GPT-5.5 in the model selector. A few questions for people who do have access: What plan are you on (Plus / Pro / Team / Enterprise)?
-
I'm reporting this for the updated Alpha 2 update version 0.124. Was scheduled to perform 4 NIAH tests with a local model after being successful earlier in the day with the runs on other models.
-
codex --model gpt-5.5 Not updated in the CLI yet (www.reddit.com)
Use this command to access GPT 5.5 with your Codex
-
Anyone using GPT 5.5? Drop your feedback (www.reddit.com)
I’ve seen some posts saying people already have access and are using it. If you do, how is it for real coding work?
-
A pelican for GPT-5.5 via the semi-official Codex backdoor API (simonwillison.net)
A pelican for GPT-5.5 via the semi-official Codex backdoor API 23rd April 2026 GPT-5.5 is out. It’s available in OpenAI Codex and is rolling out to paid ChatGPT subscribers.
-
https://deploymentsafety.openai.com/gpt-5-5/gpt-5-5.pdf
-
Mythos destroys GPT 5.5 on shared benchmarks (www.reddit.com)
-
GPT 5.5 xHigh, high, and medium Artificial Analysis Index results (www.reddit.com)
Feeling the AGI I guess
-
GPT-5.5's Unicorn (www.reddit.com)
-
There is a lot of enthusiasm in his posts lately and trading of new features in Codex. Plus, it uses way less tokens and runs on low latency
-
GPT-5.5 Bio Bug Bounty (openai.com)
-
Hey everyone, I opened up Codex today and was greeted by this massive list of unreleased and internal models. I managed to get a screen recording of the dropdown right before OpenAI seemingly realized the mistake and patched it out.
-
I can’t sleep. (www.reddit.com)
New models are around the corner. GPT 5.5 is being tested.