Benchmark for proactive personal assistant agents in long-horizon workflows.

40 stars
0 forks
Python