Collage in dark blue and orange with a brain, a lock, DNA strands, people and stars

Selecting Your AI Toolset

Module 4.

 

Purpose and Topics

Purpose

To support medical educators in making informed decisions about which AI tools or models to use by evaluating their capabilities, data practices and transparency in the context of educational and clinical work.

Note: Before selecting and evaluating AI tools individually, confirm whether your institution offers any licensed options. Institutional licenses frequently provide faculty with tools at no charge, along with data protection measures and internal support.

Topics/Learning Objectives

Upon completion of this module, individuals will be able to:

  1. Identify types of GenAI tools: standalone LLMs, multi-agent tools, etc.

  2. Compare AI tools across features (e.g., capabilities, data handling, transparency).

The words Capabilities, Data Handling and Transparency in white on a dark background.
Note:
The text and graphics in these modules were co-developed with the assistance of generative AI models such as OpenAI's ChatGPT, Google's Gemini and NotebookLM and Microsoft's Copilot, drawing on the indicated reference materials. Materials were then edited for relevance and accuracy.


Part 1

Types of GenAI tools

Categorizing GenAI Tools for Medical Education

When selecting an AI tool, it’s important to understand not only the brand or model but also what class of tool it belongs to. AI tools fall into several categories based on how they function and how users interact with them. Below is a categorized summary with examples relevant to medical education.

Please note that there is often overlap; some tools have features that would place them in more than one class. The information below is current as of Spring 2026.

Tool Classes

 Standalone LLMs (aka Chatbots)

  • Definition: These are foundational language models trained on large-scale data. They generate human-like responses to user prompts and support a wide range of tasks.
    You can also upload documents to these systems to help refine results or scope the AI's work.
  • Representative Tools: ChatGPT (OpenAI), Claude (Anthropic), Copilot (Microsoft), Gemini (Google)
  • Common Educational Uses: Explaining complex content, tutoring, brainstorming clinical reasoning and generating study aids

 Learning Companions / Notebooks

  • Definition: Tools that allow uploading of multiple types of materials for direct use and analysis. These tools act as assistants that summarize information (text and audio/video), answer questions and help generate insights (through briefing guides, concept maps and infographics) based solely on the provided sources.
  • Representative Tools: NotebookLM (Google)
  • Common Educational Uses: Content summarization and note-taking (e.g., condensing long academic papers, meeting notes and reports) and research organization (e.g., structuring and managing research materials for articles and projects)

 Embedded AI Assistants

  • Definition: AI integrated directly into existing software platforms, enhancing workflow without requiring a separate interface.
  • Representative Tools: Copilot (Microsoft) in Word, PowerPoint and Excel; Gemini (Google) in Docs, Slides and Sheets
  • Common Educational Uses: Drafting emails and slide content, lesson planning, developing formulas and functions, and summarizing documents in educational workflows

 Multi-LLM Platforms

  • Definition: Tools that draw on multiple LLMs or databases and emphasize accuracy, sourcing or synthesis of cited information.
  • Representative Tools: Perplexity
  • Common Educational Uses: Cost-effective search and side-by-side comparison across multiple models

 Research Tools

  • Definition: Tools that streamline activities such as clinical decision support, literature review and synthesis by extracting and summarizing key information from academic papers.
  • Representative Tools: OpenEvidence, Consensus, SciSpace, ResearchRabbit
  • Common Educational Uses: Research literacy, evidence-based practice, citation tracking and clinical Q&A

 Open-Source and Developer Tools

  • Definition: These tools offer transparency, flexibility and customization for those with programming or system-integration needs.
  • Representative Tools: Mistral, LLaMA (Meta), GitHub, Hugging Face
  • Common Educational Uses: Custom medical AI models, integration into research platforms and educational tool development

Sample Use Case Alignment

  • ChatGPT, Claude, Copilot, Gemini → General questions, academic brainstorming and case-based Q&A

  • NotebookLM → Creating teaching or study guides from sources in formats like PDF, TXT, MD, MP3, etc.

  • Copilot → Creating teaching materials for direct output to .pptx, .docx or .xlsx format

  • Consensus, OpenEvidence → Scientific or clinical queries with citation-backed answers

  • Perplexity → Exploratory research with live web and source access


Part 2

Comparing Tools Across Features

How do medical educators select the best GenAI tools for academic work?

Now that you have considered the different types of GenAI tools, let's consider features such as capability, cost, safety and transparency. No single tool is best for all situations. The ideal choice depends on what you need the tool to do, how you'll use it and what constraints exist (e.g., privacy policies or budgets). A note on terminology: the general public sometimes calls these tools "models," "chatbots," "AI companions," "AI" or "bots," but over time, we hope you'll get more specific as you learn more about them.

Here are four essential criteria for comparison, with questions to guide tool selection:

Criterion 1. Capabilities

AI tools, sometimes called models, vary widely in what they are designed to do. While many may appear similar on the surface, each tool has distinct strengths, limitations and intended use cases. Selecting the right tool depends not only on what it can generate, but also on how well its capabilities align with your specific educational, clinical or scholarly task.

Rather than viewing AI tools as interchangeable, faculty should consider what each tool does best—whether that is generating content, synthesizing information, working with documents or providing evidence-based responses.

When choosing an AI tool, it is helpful to evaluate how its capabilities match your needs, including the type of task, the level of complexity and the importance of accuracy and sourcing.

Consider: Does the model support your task (e.g., lesson planning, question creation, research synthesis)? Can it handle long documents or images? Does it provide sources or links? Can it produce output in a particular format (e.g., .pptx)?

Criterion 2. Cost and Access

In addition to capabilities and data practices, faculty must consider the cost structure and access model of AI tools. These factors directly influence which tools can be used equitably, sustainably and in alignment with institutional policies.

AI tools vary widely in how they are accessed—ranging from free public versions to subscription-based models and institutionally licensed platforms. Importantly, access level often determines not only cost, but also available features, data protections and compliance with institutional standards.

When selecting AI tools, faculty should consider:

  • Whether a free version is sufficient for the intended task

  • What additional features are included in paid or subscription tiers

  • Whether an institutional license is available or required

  • Whether all learners have equitable access to the tool

Understanding these differences helps ensure that tool selection supports both educational goals and responsible, equitable use.

Criterion 3. Safety and Privacy

How do different AI models handle data?

AI tools evolve rapidly, and their features and capabilities may change over time. Rather than focusing only on specific features, it is more important to understand how each tool handles data—particularly how it stores, processes and potentially retains user inputs.

A key distinction across AI systems is how they manage memory and data storage. Some tools may retain, log or use inputs to improve models, while others—particularly enterprise systems—are designed to limit retention and protect user data.

Understanding these differences helps faculty make informed decisions about when and how to use AI tools, especially when working with sensitive educational, research or clinical information.

About Enterprise-Level Protections

Many institutions now provide access to enterprise-level AI tools (e.g., Copilot in Microsoft 365) designed for use in academic, clinical and administrative environments. These systems differ from publicly available AI tools in how they manage data, privacy and security.

Enterprise-level protection refers to institutionally managed AI environments that are designed to safeguard sensitive information and meet regulatory requirements.

When using AI through your institution’s enterprise login, your prompts and uploaded content are handled within a secure, controlled environment. In these systems:

  • Data are not used to train public AI models

  • Information remains within the institution’s secure environment

  • Systems are configured to comply with FERPA, HIPAA (when applicable) and enterprise data security standards

Using enterprise AI tools is the preferred approach when working with institutional, educational or clinical information, as they provide significantly stronger protections than publicly available platforms.

Criterion 4. Transparency

How do medical educators evaluate transparency in AI models?

In addition to understanding how AI tools handle data, it is equally important to evaluate how transparent they are in generating responses. Transparency refers to how clearly a tool communicates the sources, reasoning and limitations behind its outputs.

AI systems vary widely in this regard. Some tools provide explicit citations or allow users to trace information back to original sources, while others generate responses without clear attribution or insight into how conclusions were formed.

For faculty, transparency is essential for determining whether AI-generated content can be trusted, verified and used in educational or clinical contexts.

When evaluating AI tools, consider the following:

  • Are citations provided?

  • Can the cited sources be verified?

  • Is it possible to trace how the answer was generated?

  • Are potential errors or “hallucinations” easy to identify and correct?

Understanding these differences helps faculty select appropriate tools and model critical evaluation skills for learners.

Conclusion

Selecting the right tool means balancing functionality with trustworthiness. Educators and students should start by identifying the task, then use these feature criteria—capabilities, cost, safety and transparency—to evaluate fit.


Supplementary Materials & Resources

Supplementary Materials

  • PowerPoint

  • Flashcards

  • Google NotebookLM or Gemini Gem

Resources

Articles

Videos

Association of American Medical Colleges (AAMC) Webinar Series:

Websites

Association of American Medical Colleges (AAMC) 
Elsevier
Ethan Mollick (Substack)
Purdue University
