I am sure Copilot Studio is only just beginning to reach its full potential.
📅 In October 2024, the ability to reason about multimedia content such as images, charts, and diagrams was added.
Going beyond simple text, the system understands how images can contribute to the meaning of the document, offering richer and more informed answers.
I immediately wanted to test this feature! So, I created an agent by uploading a document that contained text and images, where the images provided information that could not be obtained from the text alone.
🗨️ Through testing, I noticed how the agent takes into account the content of the image, its position, and its relevance in the context of the document.
Images are not examined in isolation; the following are also considered:
💠 Layout
💠 Interaction with text
💠 Surrounding tables
This process allows for more nuanced responses, making the best use of not only the textual part but also the supporting images.
🔎 By doing some research, I also understood how knowledge sources are managed within an agent, a question I had asked myself in the early days of Copilot Studio.
The underlying knowledge service, powered by Microsoft Dataverse, indexes the uploaded files and generates vector embeddings of the file contents.
This enables Copilot Studio to provide semantic results when users ask questions about the contents of the files. All files are stored securely in Dataverse and benefit from the security and governance controls that apply across the Microsoft Power Platform.
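To make the idea of "index the files, embed them, then answer semantically" more concrete for myself, I put together a minimal sketch in Python. It is not how Dataverse or Copilot Studio implement it: the `embed`, `build_index`, and `search` functions are names I made up, and the word-count vector is a toy stand-in for a real learned embedding model, so it only matches on shared words. What it does show is the general pattern: turn each chunk into a vector once, turn the question into a vector, and rank the chunks by cosine similarity.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks):
    """Embed every chunk once, at upload time (hypothetical helper)."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index, question, top_k=1):
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


# Made-up chunks standing in for the text extracted from an uploaded file.
chunks = [
    "The chart shows quarterly revenue growing from Q1 to Q4.",
    "The diagram describes the approval workflow for new suppliers.",
    "Employees must submit expense reports within thirty days.",
]
index = build_index(chunks)
best = search(index, "Which diagram explains the supplier approval workflow?")
print(best[0])
```

With a real embedding model in place of `embed`, the same ranking step would also match questions that share meaning with a chunk but not its exact words, which is what "semantic results" refers to.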
My reflection today is: what could we achieve once we are able to upload video and audio as well?
