Knowledge Hub
A curated collection of articles, papers, and resources on topics I find interesting. Each link comes with a short note on what it covers and why I think it's worth reading.
AI Coding / LLMs for Software Engineering
A comprehensive 75-page survey that goes well beyond code generation. It maps out a structured taxonomy of SE tasks, identifies the real technical bottlenecks limiting current AI tools, and proposes concrete research directions. Great for understanding where the field actually is vs. where the hype says it is.
The creator of Claude Code shares how he actually uses it — subagents, hooks for formatting, permission workflows, MCP integrations with Slack and BigQuery, and why giving Claude a way to verify its own work is the single most important tip.
Mitchell (of HashiCorp fame) documents his honest progression from AI skeptic to strategic user across six phases. The key insight: success comes from "engineering the harness" — continuously improving prompts, tools, and workflows rather than expecting AI to just work out of the box.
A practical guide that argues humans must establish clear architectural decisions, documentation, and testing frameworks upfront. Covers useful patterns like marking code review levels, creating debug systems, and breaking tasks into manageable chunks to keep quality high.
Understanding Agents
Anthropic's engineering team makes a compelling case that the most successful agent implementations use simple, composable patterns rather than complex frameworks. Distinguishes clearly between workflows (predefined paths) and agents (LLM-directed), and details patterns like prompt chaining, routing, and parallelization. The core message: start simple, add complexity only when it demonstrably helps.
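The workflow-vs-agent distinction is easy to make concrete. Here is a minimal sketch of the prompt-chaining pattern, where `llm` is a stand-in for any model call (a hypothetical callable, not a specific API):

```python
from typing import Callable

def prompt_chain(llm: Callable[[str], str], steps: list[str], task_input: str) -> str:
    """Run a fixed sequence of prompts, feeding each output forward.

    This is a workflow (a predefined path), not an agent: the LLM never
    chooses the control flow. `llm` stands in for any text-completion call.
    """
    output = task_input
    for instruction in steps:
        output = llm(f"{instruction}\n\nInput:\n{output}")
    return output
```

With a real model behind `llm`, the steps might be "extract requirements", "draft the code", "review the draft"; the point is that the sequence is fixed in code, which is exactly what makes it simple and debuggable.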
OpenAI's counterpart to the Anthropic piece. A hands-on guide covering agent design patterns, tool use, and orchestration strategies from the perspective of deploying agents in real business contexts.
Use-Cases / Strong Examples
A community-driven open-source platform where AI agents write code, use command lines, and browse the web, just like human developers. Evaluated across 15 benchmarks, and built by 188+ contributors who have made 2.1K+ contributions. Interesting as a concrete, accessible foundation for anyone wanting to build or study autonomous coding agents.
Technical report on Lingxi, an open-source AI agent focused on intelligent task automation and software development workflows. Worth a look for its practical approach to building and deploying agentic systems.
This one is wild — an agent that starts with minimal capabilities and autonomously develops its own tools at runtime while solving real software problems. Hit 77.4% on SWE-bench Verified, surpassing all existing agents. The idea of self-evolving agents that get better as they work, rather than being statically designed, feels like an important direction.
Interesting Concepts
A framework where AI agents dynamically build their own task hierarchies instead of following pre-defined workflows. Uses just five primitives (spawn, fork, ask, complete, read_tree) to let agents decide how to decompose and coordinate work at runtime. The distinction between "spawn" (clean-slate) and "fork" (context-inheriting) is elegant.
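To make the spawn/fork distinction concrete, here is a toy task tree of my own (a sketch, not the framework's implementation; the ask and complete primitives, which handle communication and results, are omitted):

```python
from dataclasses import dataclass, field

@dataclass
class TaskNode:
    """One node in an agent-built task hierarchy."""
    goal: str
    context: dict = field(default_factory=dict)
    children: list["TaskNode"] = field(default_factory=list)

    def spawn(self, goal: str) -> "TaskNode":
        # spawn: clean-slate child with no inherited context
        child = TaskNode(goal)
        self.children.append(child)
        return child

    def fork(self, goal: str) -> "TaskNode":
        # fork: child inherits a copy of the parent's context
        child = TaskNode(goal, context=dict(self.context))
        self.children.append(child)
        return child

    def read_tree(self, depth: int = 0) -> str:
        # read_tree: render the hierarchy so an agent can inspect it
        lines = ["  " * depth + self.goal]
        for child in self.children:
            lines.append(child.read_tree(depth + 1))
        return "\n".join(lines)
```

The asymmetry is the interesting design choice: a forked subtask can reuse everything the parent has learned, while a spawned one starts fresh and avoids inheriting irrelevant context.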
A contrarian finding: repository context files (like AGENTS.md) actually tend to reduce task success rates while increasing costs by 20%+. Both machine-generated and human-written files cause agents to explore too broadly. The takeaway: keep context files minimal and focused only on what's truly essential.
Argues that how LLMs are instructed to edit code — the "harness" — matters as much as the model itself. Introduces "hashline," a technique using content hashes instead of exact text matching, which improved 15 different LLMs' coding performance by 5-14 percentage points. A strong case for investing in better interfaces between models and code, not just better models.
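A rough sketch of how a hash-addressed edit interface could work (my own illustration of the general idea; the article's actual hashline format and hash scheme may differ):

```python
import hashlib

def line_hash(line: str) -> str:
    # short content hash per line; 6 hex chars is an arbitrary choice here
    return hashlib.sha256(line.encode("utf-8")).hexdigest()[:6]

def annotate(source: str) -> str:
    # show the model each line prefixed with its hash, so edits can
    # target lines by hash instead of by exact-text matching
    return "\n".join(f"{line_hash(l)}|{l}" for l in source.splitlines())

def apply_edit(source: str, target_hash: str, new_line: str) -> str:
    # replace the line whose content hash matches; no brittle
    # search-and-replace on text the model must reproduce verbatim
    return "\n".join(
        new_line if line_hash(l) == target_hash else l
        for l in source.splitlines()
    )
```

The appeal is that the model only has to echo a short stable identifier rather than reproduce the target text exactly, which is precisely the kind of harness improvement the article argues for.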