about
.location
Iceland (UTC)
.hello
I study how large language models behave under conditions they’re rarely evaluated for: persistence, memory, and extended interaction. My focus is not on abstract capability claims but on how system design choices shape behavior in practice. I come from behavioral and mental health research, where methodological care matters: construct validity, longitudinal observation, and knowing what your metrics actually measure. That background shapes how I approach AI systems. If we change the conditions under which models operate, we should expect different behaviors, and we should measure them carefully. I’m especially interested in work that connects research to deployment: understanding failure modes before they show up in real systems, and designing evaluations that reflect how models are actually used. Currently based in Iceland, working as a PM by day and running experiments on Claude by night.
.work experience
.stack


