Tech · Apple Machine Learning
From Where Things Are to What They’re For: Benchmarking Spatial–Functional Intelligence for Multimodal LLMs
Compiled by KHAO Editorial — aggregated from 1 outlet. See llms.txt for citation guidance.
★ Tier-1 Source
From Where Things Are to What They’re For: Benchmarking Spatial–Functional Intelligence for Multimodal LLMs.
Key facts
- SFI-Bench is designed to systematically evaluate two complementary dimensions of advanced reasoning: (1) Structured Spatial Reasoning, understanding complex layouts and forming coherent spatial
- To bridge this gap, they introduce the Spatial-Functional Intelligence Benchmark (SFI-Bench), a video-based benchmark with over 1700 questions derived from diverse, egocentric indoor video scans
- From Where Things Are to What They’re For: Benchmarking Spatial–Functional Intelligence for Multimodal LLMs
- Authors Le Zhangâ **, Jihan Yangâ¡, Soundarya Krishnan, Jimit Majmudar, Xiou Ge, Prasoon Puri, Prathamesh Saraf, Shruti Bhargava, Dhivya Piraviperumal, Yinan Ling, Cindy Pan, Hong Yu, Aishwarya
Summary
Authors Le Zhangâ **, Jihan Yangâ¡, Soundarya Krishnan, Jimit Majmudar, Xiou Ge, Prasoon Puri, Prathamesh Saraf, Shruti Bhargava, Dhivya Piraviperumal, Yinan Ling, Cindy Pan, Hong Yu, Aishwarya Agrawalâ, Bo-Hsiang Tseng. True spatial intelligence for multimodal agents transcends low-level geometric perception, evolving from knowing where things are to understanding what they are for. SFI-Bench is designed to systematically evaluate two complementary dimensions of advanced reasoning: (1) Structured Spatial Reasoning, understanding complex layouts and forming coherent spatial representations, and (2) Functional Reasoning, inferring object affordances and context-dependent utility.  Mila, Université de Montréal.