June 5, 2025

Marek’s Dev Diary: June 5, 2025

What is this

Every Thursday, I will share a dev diary about what we’ve been working on over the past few weeks. I’ll focus on the interesting challenges and solutions that I encountered. I won’t be able to cover everything, but I’ll share what caught my interest.

Why am I doing it

I want to bring our community along on this journey, and I simply love writing about things I’m passionate about! This is my unfiltered dev journal, so please keep in mind that what I write here are my thoughts and will be outdated by the time you read this, as so many things change quickly. Any plans I mention aren’t set in stone and everything is subject to change. Also, if you don’t like spoilers, then don’t read this.

A few months ago, I restructured my schedule to alternate weekly between Space Engineers 2 and AI People. This approach lets me maintain a deep focus on each project. I’m really someone who needs to dig deep into something and work on it until it’s done, rather than switching context every few hours.

AI People

This week’s focus was AI People, specifically exploring AI-assisted programming (what we call “vibe coding”) to accelerate development of our AI NPCs. However, instead of diving into planned features, we spent the week refining our methodology.

Our experience with Cursor + Opus 4 and Gemini Pro 2.5 revealed a frustrating pattern. Initial progress feels promising, but then you hit a wall: request a change, the AI modifies code, you skip review and test directly, discover new bugs or no improvement, repeat. Hours later, you realize you’re going nowhere.

The core issue? Current AI agents approach software engineering fundamentally differently than experienced developers.

How AI agents work today: You describe a feature → AI reasons briefly → finds files → implements changes → declares completion.

How expert developers actually work:

Fully understand requirements and context
Study relevant code thoroughly (nothing more, nothing less)
Break complex changes into testable chunks
Implement with constant awareness of ripple effects
Review for edge cases and unintended consequences
Update all affected elements – comments, references, documentation, architecture diagrams
Write comprehensive tests and iterate based on results
Access running systems for real-time debugging and observation

Current tools can’t replicate this workflow. They also lack game runtime access, can’t insert diagnostic traces, and miss the holistic view that makes great code.
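
To make the contrast concrete, here is a minimal Python sketch of just one slice of that workflow: break the change into testable chunks, test each one, and iterate on the result instead of declaring completion. It is an illustration under stated assumptions, not how Cursor, Claude Code, or any other tool named here is implemented. `Chunk`, `LoopResult`, `run_swe_loop`, and `max_attempts` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Chunk:
    """One small, independently testable piece of a larger change (hypothetical)."""
    description: str
    apply: Callable[[], None]   # makes (or revises) the change
    test: Callable[[], bool]    # returns True when this chunk's test passes


@dataclass
class LoopResult:
    completed: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)


def run_swe_loop(chunks: List[Chunk], max_attempts: int = 3) -> LoopResult:
    """Apply chunks in order; a chunk only counts as done once its test passes."""
    result = LoopResult()
    for chunk in chunks:
        for _ in range(max_attempts):
            chunk.apply()
            if chunk.test():
                result.completed.append(chunk.description)
                break
        else:
            # Never verified: record the failure and stop, because later
            # chunks would be built on top of a broken one.
            result.failed.append(chunk.description)
            break
    return result


# Toy usage: the "codebase" is a single counter.
state = {"value": 0}


def bump() -> None:
    state["value"] += 1


result = run_swe_loop([
    Chunk("reach 2", apply=bump, test=lambda: state["value"] >= 2),
    Chunk("never passes", apply=bump, test=lambda: False),
    Chunk("unreached", apply=bump, test=lambda: True),
])
print(result)
```

The point of the sketch is the `else` branch: work that never passes its test is reported as failed rather than quietly declared finished, and nothing further is stacked on top of it.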

This realization led us to explore building our own SWE agent. We’re studying Claude Code, which implements some of these concepts plus additional capabilities like sub-agents.

Key insights from this exploration:

LLMs feel superhuman in their domains. Yes, they still have gaps and can only handle minute-long tasks rather than day-long projects, but within their scope? Opus 4 writes a complete Tetris game in seconds – a task that would take me days. The bottleneck isn’t intelligence; it’s the scaffolding.

When AI programming fails, it’s rarely the model’s fault. It’s the inadequate tooling around it. I’m convinced 2025 will bring revolutionary improvements: a correct SWE loop; specialized agents for exploration, coding, testing, reviewing, evaluating, validating, etc.; sophisticated code search and indexing; intelligent test automation; and multimodal feedback loops. Imagine Gemini analyzing gameplay video to fix bugs autonomously.

We’re validating our approach on smaller codebases and design documents. Design docs are particularly revealing – text changes are far easier to track than code modifications, exposing flawed agent behavior immediately.

Case in point: I asked Cursor to reformat log specifications in our design doc. It updated one section, missed another, left duplicates, never reviewed its work. In a text document, these mistakes jump out instantly. In code, they’d hide among thousands of lines. Classic junior developer behavior – making changes without verifying impact.
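
The review step that was skipped here is cheap to automate for text. Below is a minimal sketch, assuming the old format can be recognized with a regular expression. The `LOG:` and `[log]` formats and the `review_reformat` helper are hypothetical and were made up for illustration; the actual log specification is not shown in this post.

```python
import re
from collections import Counter
from typing import Dict, List


def review_reformat(text: str, old_pattern: str) -> Dict[str, List[str]]:
    """Report lines still in the old format and lines that appear more than once."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    leftovers = [ln for ln in lines if re.search(old_pattern, ln)]
    counts = Counter(lines)
    duplicates = [ln for ln, n in counts.items() if n > 1]
    return {"leftovers": leftovers, "duplicates": duplicates}


# Hypothetical document after an incomplete reformat from "LOG: x" to "[log] x".
doc = """
[log] npc_spawned
[log] npc_moved
LOG: npc_died
[log] npc_moved
"""

report = review_reformat(doc, r"^LOG:")
print(report)
```

An agent that ran a check like this after editing would catch both the missed section and the duplicate before reporting that it was done.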

How about costs? Sure, spending $100 on tokens in a day might seem expensive. But if the AI delivers in one day what would take me two weeks? That $100 is cheap. Plus you’re iterating in hours instead of weeks. Clear win.
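
As a rough back-of-the-envelope check, the arithmetic can be written out. This sketch makes two assumptions that are not stated above: "two weeks" means 10 working days, and the developer still spends the full AI day supervising. The function name `break_even_daily_rate` is hypothetical.

```python
def break_even_daily_rate(token_cost_per_day: float,
                          ai_days: float,
                          human_days: float) -> float:
    """Developer cost per day above which the AI-assisted route is cheaper.

    AI route:    (token_cost_per_day + dev_rate) * ai_days
    Human route: dev_rate * human_days
    """
    if human_days <= ai_days:
        raise ValueError("the AI route must be faster for a break-even to exist")
    return token_cost_per_day * ai_days / (human_days - ai_days)


# Figures from the paragraph above; 10 working days is an assumption.
token_cost = 100.0
ai_days = 1.0
human_days = 10.0

speedup = human_days / ai_days
rate = break_even_daily_rate(token_cost, ai_days, human_days)
print(f"speedup: {speedup:.0f}x, break-even developer cost: ${rate:.2f}/day")
```

Under these assumptions the tokens pay for themselves as soon as a developer day costs more than about $11, which is why the $100 looks cheap.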

We’re not quite there yet, but I’m confident this year will bring the breakthrough. Once we crack this, we’ll accelerate AI People development dramatically, running parallel experiments and iterating at unprecedented speed.

Space Engineers 2

Given my AI agent focus this week, my SE2 time went into writing the SE2 Vision document – a comprehensive guide defining requirements, constraints, and KPIs for the team.

Our north star: Make SE2 mainstream while delivering 10x improvements across every dimension – art, code, design, quality, performance.

The key insight: SE2 will match or exceed SE1’s complexity, but we’re wrapping it in an accessibility layer. New players start with intuitive, manageable systems. Complexity reveals itself progressively as skills develop.

We’re also prioritizing engaging gameplay loops and meaningful progression. The complexity remains – it just becomes fun to master rather than overwhelming to encounter.

Comments

June 5, 2025
Anonymous

I think that water needs some splashing when it falls. The foam alone feels weird.

▾ Replies
- June 5, 2025
  Anonymous
  
  The way the water stream doesn’t move makes it look almost solid, like laminar flow. Except the foaming makes that impossible, so it just looks off.
- June 5, 2025
  K.
  
  Step by step, guys. There was no water, now there is, there were no water physics, now there are, there was no foam, now there is. Turbulence and splashing will definitely come.
- June 6, 2025
  Anonymous
  
  Personally, I thought the presentation was excellent. It looks good already; although it may not be finished, it’s looking good.
June 5, 2025
Grove

Anon, you really want our PCs to spontaneously combust over some splashes of water? It’s already going to look fantastic. This is the unpolished version we’re seeing anyway.

June 5, 2025
Deon

The term ‘mainstream’ worries me a little. As I look through SteamDB charts, the interest in most games there appears to be fleeting and fickle. The few enduring titles, mainly FPS, serve a specific audience need. Others may remain in the top 100 for a few years and then descend into obscurity or, if they are lucky, maintain a steady but dwindling player base. There are exceptional games that break the mould, but they are not mainstream. Mainstream games are subject to fashion and end up on dusty shelves; great games create tradition and clearly are well loved.

▾ Replies
- June 6, 2025
  Anonymous
  
  Same. SE1 was loved by the player base for the complexity. Trying to streamline it might result in the loss of some of the player base for not much gain… CIV 7 is the best example of this. I am a big fan of leaving core rules in place and expanding on them; modifications are… worrying 🙂
- June 16, 2025
  Anonymous
  
  Me too. I loved SE1 at first, but now I find it so simple. When I play, I always have to add mods like Industrial Overhaul to make things a little more complex; otherwise my mind gets bored after finishing the first vehicle/ship and I quit. I really hope they add a little more complexity to the game, although with a gentle start so as not to scare new players.
June 5, 2025
Anonymous

I’m very excited about Space Engineers 2 and appreciate the direction Keen Software House is taking with accessibility: being easy to learn yet challenging to master. I believe an essential aspect to achieving this goal is a robust and highly customizable input system. Specifically, I’d encourage the implementation of extensive configuration options for various input devices, including joystick axes, customizable linearity curves, deadzones, inversion settings, and comprehensive remapping capabilities. Such flexibility allows all players, from casual to deeply invested, to tailor controls to their unique playstyles. Additionally, considering VR support early in development, even if it remains a niche feature initially, aligns perfectly with your ambitious vision for SE2. VR provides unparalleled immersion and precise spatial awareness, crucial for engineering-focused gameplay. For instance, enabling players to intuitively look around their vehicles through virtual screens or cameras dramatically enhances the experience, particularly when operating complex machinery involving precise tools like drills mounted on articulated pistons and hinges. Ultimately, offering sophisticated controls and VR compatibility not only broadens accessibility but deeply enriches the engineering and design challenges that are central to Space Engineers’ identity.

▾ Replies
- June 6, 2025
  Anonymous
  
  Every vr headset would explode
- June 16, 2025
  Anonymous
  
  The VR idea is great, but for that to be truly interesting, you need more immersion in objects, and this is really lacking in SE1 (and also SE2). I think they should play games like Stationeers and Archean to get inspired. Archean even lets you customize a dashboard with stuff (see watch?v=ERzYpkEkj6U on YouTube for example). SE2 would hugely benefit from more immersion like that imho.
June 5, 2025
Anonymous

If the accessibility abstractions at the start don’t simplify the base layer, I’m not opposed to seeing how it goes. However, consider one thing: the difficulty of the early stages of Space Engineers (finding critical ore and making sure your rover doesn’t flip, among others) has, in a way, created a common struggle that bonds the community together. P.S: Does the 10x improvement target apply to planetary diameters as well? 😉

June 6, 2025
generic_sf

A huge fan of complex sandboxes, but yeah, the game needed to become more accessible, and neither its loose sandbox progression nor the UI was helping with that. While SE1 has a slew of content, the one thing I wish it did better was to take advantage of that content so players wouldn’t create just for the sake of creation, but also for the sake of solving a problem, which is what has always driven humans to create throughout history. For example, one of the many changes on my modded server was setting Alien Planet’s gravity to 3, slightly buffing wheel power, and making uranium available only on that planet. As a result, with these small changes, players:
- Relied more on solar/wind/hydrogen power (which are barely used in vanilla), since uranium wasn’t readily available.
- Started building vehicles (which are more or less redundant in vanilla), because they became an efficient means of short-distance transport in high gravity.
- Actually utilized parachutes.
- Built various stationary mining rigs and on-site assemblies for uranium veins (which are also rarely used in vanilla).
In general, they had a blast because they needed to visit a place with a new kind of problem and engineer solutions for it.

June 6, 2025
Anonymous

I really love the progress of both SE1 and SE2, your work is phenomenal, keep it up, it all looks and feels amazing!

June 6, 2025
Sardaukai

“Our experience with Cursor + Opus 4 and Gemini Pro 2.5 revealed a frustrating pattern. Initial progress feels promising…. Hours later, you realize you’re going nowhere.” That’s totally correct. I see it the same way. AI is ‘nice’ to get a quick start or just to look at something from another direction. Maybe it will point you, the developer, in the right direction. But I would never give an AI the whole code and say, ‘…do some optimizations.’ We’ll see what happens next year.

June 7, 2025
DGIVER

Please consider adding aerodynamics; it would be such an amazing addition. It could be a toggleable gameplay option, so please consider the idea. This would open up a huge number of possibilities and incredible buildings ✨

June 21, 2025
Anonymous

Keen shall reign supreme
