P1 Programs

Functions and Representations in AI Systems

Program Co-Directors

Program Description

Understanding how AI systems represent data can support quality assurance in various domains. Identifying the specific representational contents at work in healthcare-related contexts, for example, could improve transparency in clinical decision-making tools and diagnostic systems and uncover hidden biases in AI models.

Most technical approaches to AI interpretability assume that deep neural networks learn representations of task-relevant features. Yet there is no consensus on how to pinpoint the content of these representations. This derives partly from a lack of clarity over the correct theory of content – an account of how internal states map to contents (represented facts or features). Theories of content have been developed and extensively debated by philosophers for many decades. A highly influential thread in that literature, the teleological approach, appeals to functions: on this account, a representation's content is partly determined by the function (job, purpose, role) that it plays within the broader system. Recently, a number of authors have independently invoked teleological theories of representation in debating the contents represented by large language models (LLMs). Yet they have reached different conclusions about which kinds of AI systems represent which contents, and some dismiss the relevance of functions in determining language model contents altogether. This suggests that how functions are determined, and what role (if any) they play in fixing the content of AI representations, are in need of elucidation.

Addressing this issue, the P1 program, Functions and Representations in AI Systems, aims to strengthen ties within the Danish AI ecosystem and beyond through structured activities and workshops, bringing together researchers across disciplines, including AI and Philosophy, and focusing on four key research questions:


  1. What is the relation between the functions of representations (components within an AI system) and the tasks or success criteria of the system as a whole? And how do system-level tasks relate to the objective functions on which AI systems are trained? 
  2. What role do other factors, such as user and designer intentions, play in shaping the functions of components? 
  3. How does representational content determination differ between biological organisms (which have evolutionary histories and direct contact with the physical world) and LLMs (which don’t)?
  4. How do these questions relate to the practical goals of technical AI interpretability – e.g. control, prediction, safety?