Work — Henry Hu

Dec 2025 - Present

Self-initiated project

Catan Learning Environment

An agent harness for LLM play in Settlers of Catan, built around a pure-Python engine and a multi-agent training interface.

Built a pure-Python game engine with a PettingZoo AEC wrapper, dual-channel observations, and an async player-trade state machine.
Implemented a multimodal agent loop with persistent strategic memory, turn-trace state, and an event queue for opponent actions.
Built a replay pipeline over 8.5K expert Colonist.io games, mapping Colonist coordinates into engine state and reconstructing observation-action pairs.

Python, Agents, Multi-agent RL, Game engines

May 2026 - Present

Self-initiated research project

A study of whether behavior-conditioned steering vectors for personality facets reflect the structure of human psychometric categories.

Generated positive and negative contrastive prompt pairs for facets of the Big Five framework.
Captured per-layer activations from Qwen 2.5 and derived facet-level steering vectors with repeng.
Comparing vectors across facets and domains to test whether model-internal trait geometry matches the established human factor structure.

Interpretability, Activation steering, Qwen, repeng

Mar 2026 - May 2026

Software Engineer Intern

Built core product surfaces for a research workspace: graph canvas, PDF reading, paper chat, and browser capture.

Built the context graph canvas: a force-directed, Yjs-backed workspace with node creation, optimistic edits, focus mode, share links, and a library view.
Shipped the paper-chat and PDF reading surface with streaming chat UI, math rendering, geometry-based PDF highlights, and an embedding trigger feeding the context store.
Designed the browser extension capture pipeline for Twitter, LinkedIn, and Reddit clips, including FastAPI auth refresh and library-side rendering.

Next.js, Yjs, PDF UI, Browser extensions

Jan 2026 - Feb 2026

Software Engineer Intern

Worked on agentic voice infrastructure, deployment tooling, and codebase consolidation.

Refactored the voice-agent architecture for modular prompt design and dynamic tool binding, reducing time-to-speech latency by 400 ms.
Built internal tooling to move agent deployment logic into a database-backed system.
Consolidated the codebase into a monorepo, removing redundant per-agent codebases and deployments.

Voice agents, TypeScript, Tool binding, Infra

Feb 2026

TartanHacks 2026, team of 3

An AI physiotherapist that uses an interactive 3D anatomical model to diagnose muscle pain and generate rehab plans.

Owned the agent backend on the Dedalus SDK with model tiering: a high-reasoning orchestrator dispatches to a faster sub-agent for clinical lookup.
Connected a clinical knowledge base built on Qdrant and Nomic to streaming tool calls over SSE.
Fed a Convex-backed reactive store that drove the 3D model and chat UI.

Agents, Qdrant, SSE, Convex

Jun 2023 - Aug 2023

Summer Research

Research on code generation behavior from crowdsourced microtask specifications.

Evaluated GPT-3.5 code generation against crowdsourced microtask specs using BDD and TDD test harnesses.
Characterized recurring failure modes in generated code.
Developed prompt patterns that measurably reduced generation errors.

Research, Code generation, Testing, Prompting