Monika Jotautaitė
AI Safety Researcher
About me
I'm an independent AI researcher in a two-person team, focusing on AI control. My current work is supported by Coefficient Giving (formerly Open Philanthropy) and the Berkeley Existential Risk Initiative.
Previously, I worked on designing model evaluations and, in particular, moral value evals.
Research
My work
UK AISI Bounty Program: As an evaluations scientist, I designed and implemented multiple evaluation proposals that were accepted for the UK AISI bounty program. The evaluations I worked on cover the following capabilities: LLM elicitation, online gambling, collusion in AI debate, and decreasing test-time token usage. I also worked on the SmartBackdoor paper. In addition, I was a technical program manager for the ASET Benchmarks program at Arcadia Impact, where I mentored a team of engineers implementing cybersecurity evals in Inspect.
I organize Women in AI Safety London, a series of networking events. To receive updates on events and opportunities, join our mailing list. If you're interested in organizing a local event, you can apply here.
I occasionally teach at ML4Good bootcamps as a head teacher or a TA. I created new materials on LLM evaluations and RL. Find upcoming programs at ml4good.org/upcoming.
I created AI Safety materials for GirlsWhoML (slides). If you'd like to run this lecture series at your university, reach out here.