World 4 · AI Ethics & Safety (Beginner, Ages 10+) · Topics: Ethics, Safety, Society

AI Ethics and Safety

Explore the real-world impact of AI. Essential knowledge for responsible AI use.

3 hours · 6 lessons · 600 XP total

Course Syllabus

6 lessons
Lesson 1

What Can Go Wrong with AI?

12 min · 25 XP

A grounded look at real AI failures — from self-driving car accidents to hiring algorithms that discriminate. Why ethics matters now.

  • AI systems can fail dramatically when they encounter situations that differ from their training data — a self-driving car trained mostly on sunny California roads may behave dangerously in Finnish winter conditions it was never designed for.
  • Autonomous vehicle accidents have caused real fatalities — the 2018 Uber self-driving car death in Arizona revealed that edge cases (unexpected pedestrian behavior at night) remain genuinely unsolved problems in AI safety.
  • Amazon's internal AI hiring tool, built to screen resumes, was scrapped after it was discovered (as reported by Reuters in 2018) to systematically penalize resumes that included the word 'women's' (e.g. 'women's chess club') because historical hiring data skewed male.
  • AI-generated medical advice and misdiagnoses have caused patient harm — an AI skin cancer detector trained mostly on light-skinned patients performed significantly worse on darker skin tones, a bias that could lead to missed diagnoses.
  • Failures in automated decision systems can cascade: a flawed automated trading system at Knight Capital malfunctioned for 45 minutes in 2012 and lost $440 million — illustrating how software errors in high-stakes systems can compound faster than humans can respond.
  • Legal accountability for AI harm is an unresolved problem — when a self-driving car causes an accident, is it the manufacturer, the software developer, the AI company, or the passenger who bears legal responsibility? Courts worldwide are still deciding.
  • AI can fail confidently — models produce wrong answers with high certainty scores, showing no signs of uncertainty or hesitation. This 'hallucination with confidence' is particularly dangerous in medical, legal, and financial contexts (a toy numeric illustration follows this list).
  • Real-world AI deployments reveal failure modes that laboratory testing never anticipated — this is why staged rollouts, human-in-the-loop review systems, and robust monitoring are essential components of responsible AI deployment.
  • Understanding AI failures isn't about being afraid of AI — it's about deploying it wisely, with appropriate human oversight, clear accountability chains, and honest acknowledgment of what the technology cannot yet do reliably.
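
The 'confident failure' point is easy to see numerically. Below is a minimal Python sketch (weights and inputs are invented) of how a simple classifier's softmax output can report near-certainty on an input far outside its training distribution:

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Toy 2-class linear classifier with made-up weights, standing in for a
# model trained on a narrow data distribution.
W = np.array([[ 2.0, -1.0],      # class 0 weights
              [-2.0,  1.0]])     # class 1 weights

x_typical = np.array([0.5, 0.4])     # resembles the training data
x_strange = np.array([30.0, -25.0])  # far outside anything seen in training

print(softmax(W @ x_typical))   # ~[0.77, 0.23]: moderate confidence
print(softmax(W @ x_strange))   # ~[1.00, 0.00]: near-certain, yet meaningless
```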
Lesson 2

AI Bias: Real Examples

15 min · 30 XP

Understand how AI systems learn and amplify social biases. Explore landmark cases in facial recognition, criminal justice, and lending.

  • AI bias does not come from a malicious programmer — it originates from the training data itself, which reflects real-world historical inequalities, and the algorithm faithfully learns and reproduces those patterns at scale and speed.
  • COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) is an AI tool used in US courts to predict recidivism — ProPublica's 2016 investigation found it incorrectly flagged Black defendants who did not reoffend as high risk at nearly twice the rate of white defendants.
  • MIT researcher Joy Buolamwini found that facial recognition systems from major tech companies had error rates of up to 34% for darker-skinned women, compared with under 1% for lighter-skinned men — meaning the AI was significantly less accurate for people least like its training data.
  • Word embedding models like Word2Vec, trained on internet text, encode societal stereotypes — 'doctor' and 'engineer' cluster near 'he', while 'nurse' and 'receptionist' cluster near 'she', reflecting historical gender imbalances in those professions (probed in the first sketch after this list).
  • Feedback loops amplify bias over time — a predictive policing AI sends more police to 'high risk' neighborhoods, leading to more arrests there, which reinforces the 'high risk' classification, creating a self-fulfilling prophecy that data can't distinguish from truth.
  • Historical data is particularly dangerous for AI: a loan approval AI trained on 50 years of lending data will learn that certain zip codes (which correlate with race due to historical redlining) are 'high risk', perpetuating decades-old discrimination in a new form.
  • Fairness is mathematically difficult — researchers have proven that several intuitive definitions of fairness (equal accuracy, equal false positive rates, equal positive prediction rates) are mathematically incompatible whenever groups have different base rates, so no algorithm can satisfy all definitions simultaneously (the second sketch after this list makes this concrete with toy numbers).
  • Mitigation strategies exist but require conscious effort: diverse and representative training data, regular auditing across demographic subgroups, fairness-aware algorithms, diverse AI development teams, and external bias audits by independent organizations.
  • Being able to identify AI bias — and ask hard questions about where a system's training data came from and whose perspective it reflects — is a critical skill for anyone who uses, deploys, or is affected by AI systems in the modern world.
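
The embedding associations above can be probed directly. This short sketch uses gensim's downloader API and the public GloVe vectors; exact neighbours and scores vary by embedding, so treat the printed values as illustrative:

```python
# Probe a public word embedding for gendered associations.
# Requires: pip install gensim (first run downloads the GloVe vectors).
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # embeddings trained on Wikipedia text

# Analogy arithmetic: doctor - man + woman -> ?
print(vectors.most_similar(positive=["doctor", "woman"], negative=["man"], topn=3))

# Compare similarities to gendered pronouns directly.
print(vectors.similarity("engineer", "he"), vectors.similarity("engineer", "she"))
print(vectors.similarity("nurse", "he"), vectors.similarity("nurse", "she"))
```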
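
And a toy numeric sketch (labels invented, not a proof) of the fairness tension: once two groups have different base rates, the per-group metrics start pulling apart:

```python
import numpy as np

def rates(y_true, y_pred):
    """Per-group false positive rate and positive prediction rate."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    fpr = y_pred[y_true == 0].mean()   # flagged among people who are actually negative
    ppr = y_pred.mean()                # fraction of the group flagged at all
    return fpr, ppr

# Invented labels: group A's base rate (3/5) is higher than group B's (1/5).
yA_true = [1, 1, 1, 0, 0];  yA_pred = [1, 1, 1, 1, 0]
yB_true = [1, 0, 0, 0, 0];  yB_pred = [1, 1, 0, 0, 0]

print("Group A (fpr, ppr):", rates(yA_true, yA_pred))  # (0.50, 0.80)
print("Group B (fpr, ppr):", rates(yB_true, yB_pred))  # (0.25, 0.40)
# With different base rates, making one metric equal across groups
# forces the other apart; no decision threshold fixes both at once.
```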
Lesson 3

Privacy in the Age of AI

18 min · 35 XP

How AI systems collect, infer, and exploit personal data. Understand your rights and practical steps to protect your privacy.

  • AI can infer highly sensitive personal attributes from seemingly harmless data — your browsing history, app usage patterns, typing speed, and location data can collectively reveal your political beliefs, health conditions, financial stress, and sexual orientation.
  • A widely cited 2013 study showed that a simple machine learning model trained on Facebook likes could predict personality traits, political orientation, sexual orientation, and drug use — from data users never considered private.
  • Browsing history reveals far more than people realize — regularly visiting certain health websites, attending support group forums, or researching specific medications can tell an advertiser (or insurer) about conditions you've never disclosed.
  • Federated learning is a privacy-preserving technique where AI is trained locally on your device and only aggregated model updates (not your raw data) are shared with a central server — Google uses this for Gboard's keyboard prediction, and Apple applies related on-device training techniques (the first sketch after this list shows the averaging idea).
  • Differential privacy adds carefully calibrated statistical noise to data before it's used for AI training — Apple and Google use this to collect usage statistics while ensuring no individual's behavior can be reconstructed from the aggregate data (the second sketch after this list shows the noise mechanism).
  • GDPR (the EU's General Data Protection Regulation) gives European citizens the right to access what data is held about them, correct inaccuracies, request deletion (the 'right to be forgotten'), and object to automated decision-making.
  • GDPR Article 22 specifically addresses AI — if an algorithm makes a significant decision about you (credit, insurance, hiring), you have the right to request a human review and an explanation of how the decision was made.
  • Data minimization is a core GDPR principle: organizations should collect only the minimum data necessary for their stated purpose — in practice, many AI systems collect far more data than needed, creating privacy risks and legal liability.
  • Practical steps to protect your privacy: use a VPN, review app permissions, opt out of ad personalization, use privacy-focused search engines, and exercise your data rights under GDPR or CCPA using tools like access request forms.
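
A minimal numpy sketch of the federated averaging idea (devices, data, and learning rate all invented): each simulated device takes a gradient step on its private data, and only the averaged weights return to the server:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented setup: a shared linear model, three devices with private data.
true_w = np.array([1.0, -2.0, 0.5])
device_data = []
for _ in range(3):
    X = rng.normal(size=(20, 3))
    y = X @ true_w + 0.1 * rng.normal(size=20)
    device_data.append((X, y))

def local_update(w, X, y, lr=0.1):
    """One gradient step on a device's private data; raw data never leaves."""
    grad = 2 * X.T @ (X @ w - y) / len(y)    # least-squares gradient
    return w - lr * grad

global_w = np.zeros(3)
for _ in range(20):
    # Each device trains locally, then only the averaged weights go back.
    local_models = [local_update(global_w, X, y) for X, y in device_data]
    global_w = np.mean(local_models, axis=0)

print(global_w)   # approaches true_w without any device sharing raw data
```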
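
And a minimal sketch of the differential privacy mechanism, releasing a single count with Laplace noise; the emoji scenario and epsilon values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

def dp_count(true_count, epsilon, sensitivity=1.0):
    """Release a count with Laplace noise scaled to sensitivity / epsilon."""
    return true_count + rng.laplace(scale=sensitivity / epsilon)

# Invented scenario: 873 users typed a particular emoji today.
true_count = 873
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps:>4}: reported count ~ {dp_count(true_count, eps):.1f}")
# Smaller epsilon -> more noise -> stronger privacy, lower accuracy.
```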
Lesson 4

Deepfakes and Misinformation

20 min · 40 XP

Understand how deepfakes are made, how to spot them, and the societal threat of AI-generated misinformation at scale.

  • Deepfakes use AI models (originally GANs, now primarily diffusion models) to synthesize convincing video and audio of real people — replacing faces, cloning voices, and generating entirely fictional footage of events that never happened.
  • Creating a convincing deepfake once required a Hollywood-level team and weeks of work — today, free consumer apps can produce a realistic face-swap video in minutes from a single photo, with no technical skills required.
  • The vast majority of deepfakes online — estimates suggest over 96% — are non-consensual intimate imagery (NCII) targeting women: fake explicit content created without consent and used for harassment, blackmail, and reputation destruction.
  • Political deepfakes represent a serious democratic threat — fabricated videos of world leaders declaring war, making racist statements, or announcing policy reversals have already been distributed on social media and briefly mistaken for real events.
  • Voice cloning deepfakes are increasingly used for fraud — criminals clone a CEO's voice from public recordings to call the CFO and authorize wire transfers, a form of 'vishing' (voice phishing) that has cost companies millions of dollars.
  • Detection clues for deepfakes include: unnatural blinking patterns, hair that doesn't move realistically, teeth and earring anomalies, facial edges that blur or flicker at the boundary, inconsistent lighting between the face and background, and the 'uncanny valley' feeling.
  • C2PA (Coalition for Content Provenance and Authenticity) metadata cryptographically signs content at the moment of capture, recording the camera, location, time, and any edits — Adobe, BBC, Sony, and major AI companies support this standard (the first sketch after this list illustrates the underlying signature pattern).
  • Watermarking AI-generated content — invisibly embedding markers in pixels that survive compression and cropping — is being developed by companies like Google (SynthID) as a way to identify AI-generated images and video at scale (a toy embed-and-detect sketch follows this list).
  • Critical media literacy is the most robust long-term defense against deepfakes: question the source, verify through multiple trusted outlets before sharing, use reverse image search, and remember that seeing is no longer believing.
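
To illustrate the provenance idea only (this is not the actual C2PA manifest format), here is a sketch that signs content plus capture metadata using the Python cryptography library, making later tampering detectable:

```python
# Toy illustration of the provenance idea: sign content plus capture metadata
# so any later tampering is detectable. NOT the real C2PA manifest format.
# Requires: pip install cryptography
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

camera_key = Ed25519PrivateKey.generate()    # stand-in for a device's signing key

image_bytes = b"...raw sensor data..."       # placeholder for the captured image
manifest = json.dumps({"device": "ExampleCam", "time": "2024-05-01T12:00Z"}).encode()
signature = camera_key.sign(image_bytes + manifest)

# Anyone holding the public key can detect a change to pixels or metadata.
public_key = camera_key.public_key()
try:
    public_key.verify(signature, image_bytes + manifest)
    print("provenance intact")
except InvalidSignature:
    print("content or metadata was altered")
```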
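
And a deliberately simple least-significant-bit watermark in numpy; production systems such as SynthID embed far more robust statistical patterns, but the embed-then-detect idea is similar:

```python
import numpy as np

rng = np.random.default_rng(7)

def embed(image, key_bits):
    """Overwrite each pixel's lowest bit with a secret bit pattern."""
    return (image & 0xFE) | key_bits

def detect(image, key_bits):
    """Fraction of lowest bits matching the pattern (1.0 = marked, ~0.5 = not)."""
    return np.mean((image & 1) == key_bits)

image    = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)  # fake 'AI image'
key_bits = rng.integers(0, 2,   size=(64, 64), dtype=np.uint8)  # secret pattern

marked = embed(image, key_bits)
print(detect(marked, key_bits))   # 1.0  -> watermark present
print(detect(image,  key_bits))   # ~0.5 -> chance level, no watermark
```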
Lesson 5

AI and Jobs: Fear vs Reality

15 min30 XP

Separate hype from reality on AI's job market impact. Which roles are at risk, which are safe, and how to position yourself for the future.

  • McKinsey's 2023 research estimates that 12 million workers in the US alone may need to change occupations by 2030 due to AI automation — and those transitions are historically hardest for lower-income workers with fewer resources to retrain.
  • Routine cognitive tasks face the highest automation risk: data entry, basic writing and translation, simple customer service responses, standard legal document review, and repetitive data analysis are all being automated with current AI tools.
  • Creative, empathetic, and complex physical tasks remain harder to automate — therapists, nurses, plumbers, teachers, and creative directors rely on human connection, physical dexterity, and genuine improvisation that AI still cannot replicate reliably.
  • AI is simultaneously creating entirely new job categories: prompt engineers, AI trainers who label data and give feedback, AI auditors who test for bias and safety, AI product managers, and AI ethicists who embed responsible practices in development.
  • Historical precedent suggests technology eliminates tasks within jobs rather than eliminating jobs entirely — the calculator didn't eliminate accountants, it freed them to do more complex financial analysis. ATMs didn't eliminate bank tellers, they freed tellers for relationship work.
  • The key strategic question isn't 'will AI take my job?' but 'how do I use AI to be dramatically more productive than my peers?' — the real labor market risk is not AI replacing workers, but workers who use AI replacing workers who don't.
  • The workers most at risk are those in the middle of the skill distribution — AI is automating many mid-skill cognitive tasks while demand grows for both highly specialized expertise and for human judgment, creativity, and interpersonal skills.
  • Learning to use AI tools effectively right now, while the technology is still relatively new, provides a significant first-mover advantage — the skills, workflows, and intuitions you build today will compound into a meaningful career edge over the next decade.
  • The most resilient career strategy combines deep domain expertise in a field you care about with strong AI tool literacy — being the person in your industry who understands both the domain AND how to leverage AI within it is an increasingly rare and valuable position.
Lesson 6

Building Responsible AI

22 min · 45 XP

Learn the principles and frameworks used by AI labs and governments to ensure AI is safe, fair, and beneficial to all of humanity.

  • The EU AI Act (2024) is the world's first comprehensive AI regulation — it classifies AI systems by risk level (unacceptable, high, limited, minimal) and imposes requirements proportionate to that risk, from outright bans to mandatory audits.
  • Unacceptable-risk AI banned by the EU AI Act includes real-time biometric surveillance of public spaces, social scoring systems (like China's citizen credit scores), and AI that manipulates people using subliminal psychological techniques.
  • High-risk AI applications — including medical devices, credit scoring, hiring tools, educational assessment, law enforcement, and critical infrastructure — require extensive documentation, third-party auditing, and human oversight.
  • Constitutional AI (developed by Anthropic) trains models using a written set of principles — instead of relying solely on human raters, the AI learns to evaluate and improve its own outputs against a codified list of values, making alignment more transparent.
  • RLHF (Reinforcement Learning from Human Feedback) is the alignment technique that transformed raw language models into helpful assistants — human raters compare pairs of model outputs, their preferences train a reward model, and that reward model guides further training (the sketch after this list shows the pairwise preference loss).
  • Red-teaming means deliberately trying to break your own AI system before deployment — a team of researchers attempts prompt injections, jailbreaks, adversarial inputs, and misuse scenarios to discover vulnerabilities that normal testing would miss.
  • Model cards are standardized documentation that every responsible AI release should include — they describe the model's intended use cases, training data sources, performance metrics across different demographics, known limitations, and ethical considerations.
  • The six principles of responsible AI in Microsoft's framework: Fairness (equitable across groups), Reliability and Safety (performs reliably and safely as intended), Privacy and Security (protects personal data), Inclusiveness (works for all people), Transparency (explainable decisions), and Accountability (people remain answerable for AI systems).
  • Responsible AI is not just an ethical choice — it's increasingly a legal and business necessity. Companies that deploy prohibited or unsafe AI face regulatory fines (up to 7% of global annual turnover for the most serious violations of the EU AI Act), lawsuits, and severe reputational damage.
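
To make the RLHF bullet concrete, here is a minimal numpy sketch of the pairwise preference (Bradley-Terry) loss used to train reward models; the 'responses' here are invented feature vectors and the reward model is linear, whereas real reward models are neural networks over text:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Invented setup: each 'response' is a feature vector; the reward model is
# linear. Raters preferred the 'chosen' response in every pair.
chosen   = rng.normal(size=(100, 5)) + 0.5   # features of preferred responses
rejected = rng.normal(size=(100, 5))         # features of rejected responses
w = np.zeros(5)                               # reward model parameters

for _ in range(200):
    margin = (chosen - rejected) @ w          # r(chosen) - r(rejected) per pair
    # Pairwise (Bradley-Terry) loss: -log sigmoid(margin); take a gradient step.
    grad = -((1 - sigmoid(margin))[:, None] * (chosen - rejected)).mean(axis=0)
    w -= 0.5 * grad

# After training, preferred responses score higher on average.
print((chosen @ w).mean(), (rejected @ w).mean())
```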

Ready to Start Learning?

Create a free account to track your progress, earn XP and badges, and unlock your certificate.