Friday, February 14, 2025

Data Governance for Reliable AI: From Source to Insight

I am building a workshop to help adult students and professionals make the best use of emerging AI tools in both organizational and personal settings. Two themes that will run through the workshop are using Retrieval-Augmented Generation (RAG) to improve context and being mindful of information privacy when working with AI.

The overall theme of the workshop is to unlock the potential of AI while ensuring quality and reliability. This workshop explores the critical role of continuous improvement, data governance, data provenance, and data lineage in building trustworthy AI systems. Discover practical strategies for implementing robust data management frameworks to address challenges in data quality, compliance, and model performance, leading to more effective AI solutions.

Key Topics:

  • Continuous Improvement: Learn why iteratively refining data is crucial for reliable AI outcomes.
  • Data Governance: Understand how clear policies for data ownership, quality, and access underpin trustworthy AI.
  • Data Provenance & Lineage: Discover how tracking data's origin, journey, and transformations enhances transparency, supports ethical practices, improves decision-making, and reduces hallucinations in AI applications (see the sketch below).
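
To make provenance and lineage concrete, here is a minimal Python sketch of how a document's origin and transformation history might be recorded as it moves into an AI pipeline. The names (DocumentRecord, LineageStep) and the sample file path are illustrative assumptions, not part of any specific tool.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LineageStep:
    """One transformation applied to the data (e.g. cleaning, chunking)."""
    description: str
    performed_by: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

@dataclass
class DocumentRecord:
    """Tracks where a document came from and what has been done to it."""
    source: str   # original location (file path, URL, system of record)
    owner: str    # accountable person or team (governance)
    steps: list[LineageStep] = field(default_factory=list)

    def add_step(self, description: str, performed_by: str) -> None:
        self.steps.append(LineageStep(description, performed_by))

    def lineage_report(self) -> str:
        """Human-readable trail, useful when auditing an AI answer."""
        lines = [f"Source: {self.source} (owner: {self.owner})"]
        lines += [f"- {s.timestamp}: {s.description} by {s.performed_by}"
                  for s in self.steps]
        return "\n".join(lines)

# Example: a policy document prepared for a RAG knowledge base
record = DocumentRecord(source="hr/policies/leave_policy_2024.pdf", owner="HR Operations")
record.add_step("Converted PDF to plain text", "ingestion script")
record.add_step("Split into 500-word chunks for retrieval", "chunking step")
print(record.lineage_report())

A record like this is what lets you answer the governance question "where did the AI get that?" after the fact.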

AI Approaches: RAG

The workshop will also touch on Retrieval-Augmented Generation (RAG), a technique that fundamentally improves AI systems by enabling them to access and use specific, current information from organizational documents, databases, and knowledge bases, rather than relying solely on their training data. RAG enhances accuracy and reliability, as demonstrated in applications like healthcare and legal work. Unless an organization builds its own Large Language Model (LLM), nearly everything it does to ground AI output in its own information can be considered a form of RAG.
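
As a rough illustration of the RAG pattern rather than any particular product, the sketch below retrieves the passages most relevant to a question from a small in-memory document set and builds a grounded prompt. The keyword-overlap scoring and the sample documents are stand-in assumptions; a real system would use vector embeddings and send the final prompt to an actual LLM.

# Minimal RAG sketch: retrieve relevant passages, then build a grounded prompt.
DOCUMENTS = [
    "Employees may carry over up to five unused vacation days into the next year.",
    "All expense claims must be submitted within 30 days of purchase.",
    "Remote workers must complete the annual privacy training by March 31.",
]

def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by shared words with the question (stand-in for embeddings)."""
    q_words = set(question.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Augment the question with retrieved context so the model answers from our data."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

question = "How many vacation days can I carry over?"
passages = retrieve(question, DOCUMENTS)
print(build_prompt(question, passages))  # this prompt would then go to the LLM of your choice

The key design point is that the model is asked to answer from the retrieved context, which ties its output back to governed, traceable organizational data.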

For more detail on why moving beyond good prompt engineering means considering RAG, please see this blog post for further insight: https://criticaltechnology.blogspot.com/2024/12/rag-and-agents-how-ai-is-learning-to.html

Tools for Data Governance

The workshop will include a demo of NotebookLM. NotebookLM is designed with robust privacy features that make it particularly relevant for Canadian professionals handling sensitive information. Its key safeguard is that uploaded documents are never used to train its AI models, ensuring data remains private and secure.

Most of the demos will use NotebookLM, the RAG tool built by Google. To better understand NotebookLM and its security posture, please see this recent blog post: https://criticaltechnology.blogspot.com/2025/02/keeping-your-data-private-in-notebooklm.html

PIPEDA and Data Governance

It's important to understand Canada's Personal Information Protection and Electronic Documents Act (PIPEDA). While PIPEDA doesn't explicitly address AI, its technology-neutral principles establish crucial guidelines for handling personal data in AI projects. These include obtaining proper consent, limiting data collection, implementing security measures, maintaining transparency, ensuring data accuracy, and practicing accountability.
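
As a toy illustration of the "limiting data collection" principle only (not legal advice, and the regular expressions below are simplistic assumptions rather than a complete solution), the sketch strips obvious personal identifiers from text before it is sent to an external AI service.

import re

# Toy example of data minimization: remove obvious personal identifiers before
# text leaves the organization. Real PIPEDA compliance also requires consent,
# retention limits, security safeguards, and accountability.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
PHONE = re.compile(r"\b(?:\+?1[-. ]?)?\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}\b")

def minimize(text: str) -> str:
    """Replace emails and phone numbers with placeholders before AI processing."""
    text = EMAIL.sub("[EMAIL REMOVED]", text)
    text = PHONE.sub("[PHONE REMOVED]", text)
    return text

note = "Contact Jean at jean.tremblay@example.ca or 613-555-0199 about the claim."
print(minimize(note))
# Contact Jean at [EMAIL REMOVED] or [PHONE REMOVED] about the claim.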

For more detail on how AI intersects with PIPEDA, enjoy this blog post highlighting seven important impacts: https://criticaltechnology.blogspot.com/2025/02/ai-and-your-personal-project-navigating.html

If you are interested in attending this workshop, feel free to sign up. All are welcome. Reserve your spot here: https://lnkd.in/ecKQ-reB