Module 2 Assignment: Data handling with Python libraries

Scenario

You are advising an engineering team that is converting prototype AI code into a maintainable application component. The stakeholders are a software engineer, an ML engineer, a QA lead, a product owner, and an operations reviewer.

Task

Answer the module question: How do Python tools support reproducible data work?

Use the module lab and course readings to produce a tested Python AI component with an interface contract, CI evidence, and deployment notes. Focus on data handling with Python libraries: load, validate, and transform a dataset.
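The load, validate, and transform steps can be sketched as three small functions. This is a minimal illustration under stated assumptions, not the lab's reference solution: the column names, the inline CSV text, and the helper names (`load_rows`, `validate_row`, `transform_row`) are all hypothetical, and the sketch uses only the standard library so that it runs anywhere. The module lab may use a different library such as pandas.

```python
import csv
import io

# Hypothetical synthetic data standing in for a real dataset file.
RAW_CSV = """request_id,latency_ms,status
r1,120,ok
r2,,ok
r3,95,error
r4,abc,ok
"""

# Assumed schema for this sketch only.
REQUIRED_COLUMNS = {"request_id", "latency_ms", "status"}
ALLOWED_STATUS = {"ok", "error"}


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def validate_row(row):
    """Return a list of problems found in one row (empty if the row is valid)."""
    problems = []
    missing = REQUIRED_COLUMNS - set(row)
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")
    latency = row.get("latency_ms") or ""
    if not latency.isdecimal():
        problems.append(f"latency_ms is not a non-negative integer: {latency!r}")
    if row.get("status") not in ALLOWED_STATUS:
        problems.append(f"unexpected status: {row.get('status')!r}")
    return problems


def transform_row(row):
    """Convert a validated row of strings into typed values."""
    return {
        "request_id": row["request_id"],
        "latency_s": int(row["latency_ms"]) / 1000,
        "is_error": row["status"] == "error",
    }


rows = load_rows(RAW_CSV)
valid, rejected = [], []
for row in rows:
    problems = validate_row(row)
    if problems:
        # Keep rejected rows with their reasons so a reviewer can audit them.
        rejected.append((row["request_id"], problems))
    else:
        valid.append(transform_row(row))

print(f"{len(valid)} valid rows, {len(rejected)} rejected rows")
```

Keeping rejected rows together with their reasons, rather than dropping them silently, gives the QA lead something concrete to review and makes the validation step testable on its own.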

Required Evidence

Define the decision or system boundary in one paragraph.
Identify the dataset, proxy data, or evidence source you used: synthetic API requests, validation outcomes, latency measurements, and test-case results.
Compare at least two alternatives, baselines, policies, or designs.
Report one quantitative result or structured scoring table.
Explain two failure modes and one mitigation for each.
State what additional evidence would be required before real deployment.
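One way to meet the comparison and quantitative-result requirements is a small scoring table computed from test-case results. The sketch below assumes each alternative has a list of latency measurements and pass/fail outcomes; every number in it is a placeholder invented for illustration, not a measured result, and the `summarize` helper is hypothetical. Replace the placeholders with values from your own lab runs.

```python
from statistics import median

# Placeholder measurements, invented for illustration only.
# Replace them with results from your own lab runs.
runs = {
    "baseline_or_manual_process": {
        "latency_ms": [140, 150, 160],
        "passed": [True, True, False],
    },
    "ai_assisted_or_advanced_option": {
        "latency_ms": [90, 100, 130],
        "passed": [True, True, True],
    },
}


def summarize(name, run):
    """Reduce one alternative's raw results to a single table row."""
    passed = run["passed"]
    return {
        "option": name,
        "n_cases": len(passed),
        "pass_rate": sum(passed) / len(passed),
        "median_latency_ms": median(run["latency_ms"]),
    }


table = [summarize(name, run) for name, run in runs.items()]

# Print a fixed-width table that can be pasted into the memo.
print(f"{'option':<32}{'n':>4}{'pass_rate':>11}{'median_ms':>11}")
for r in table:
    print(
        f"{r['option']:<32}{r['n_cases']:>4}"
        f"{r['pass_rate']:>11.2f}{r['median_latency_ms']:>11}"
    )
```

Reporting the number of cases next to each rate makes the proxy-data limits visible: a pass rate computed from three cases supports a much weaker claim than one computed from three hundred.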

Submission

Submit the completed notebook plus a memo of 900 to 1,200 words. The memo must include clearly labeled headings for context, method, evidence, risks, recommendation, and open questions.

# Assignment workspace for Module 2: Data handling with Python libraries
module = 2
decision = "How do Python tools support reproducible data work?"
artifact = "tested Python AI component with interface contract, CI evidence, and deployment notes focused on data handling with python libraries: Load, validate, and transform a dataset."

alternatives = [
    {"option": "baseline_or_manual_process", "strength": "", "risk": "", "evidence": ""},
    {"option": "ai_assisted_or_advanced_option", "strength": "", "risk": "", "evidence": ""},
]

recommendation = {
    "decision": decision,
    "recommended_option": "",
    "minimum_evidence_before_pilot": [],
    "monitoring_metric": "",
    "rollback_trigger": "",
}

# Display the assembled workspace as the cell output
{"module": module, "artifact": artifact, "alternatives": alternatives, "recommendation": recommendation}

{'module': 2,
 'artifact': 'tested Python AI component with interface contract, CI evidence, and deployment notes focused on data handling with python libraries: Load, validate, and transform a dataset.',
 'alternatives': [{'option': 'baseline_or_manual_process',
   'strength': '',
   'risk': '',
   'evidence': ''},
  {'option': 'ai_assisted_or_advanced_option',
   'strength': '',
   'risk': '',
   'evidence': ''}],
 'recommendation': {'decision': 'How do Python tools support reproducible data work?',
  'recommended_option': '',
  'minimum_evidence_before_pilot': [],
  'monitoring_metric': '',
  'rollback_trigger': ''}}

Acceptance Criteria

Your submission is complete only if another reviewer can reproduce your reasoning from the evidence you provide. You do not need production-grade data, but you must be explicit about proxy-data limits and what would change with real institutional data.
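A lightweight way to help a reviewer reproduce your reasoning is to record a small manifest alongside the notebook: a hash of the exact data used, the random seed, and the Python version. The following is a sketch under stated assumptions. The `build_manifest` and `sample_rows` helpers, the manifest fields, and the inline data are hypothetical, and a real project would likely also pin library versions.

```python
import hashlib
import json
import platform
import random


def build_manifest(data_bytes, seed):
    """Record what a reviewer needs to confirm they are using the same inputs."""
    return {
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        "seed": seed,
        "python_version": platform.python_version(),
    }


def sample_rows(rows, k, seed):
    """Draw a sample using a local, seeded generator instead of global state."""
    rng = random.Random(seed)
    return rng.sample(rows, k)


# Hypothetical proxy data; in practice, read the bytes of your dataset file.
data_bytes = b"request_id,latency_ms\nr1,120\nr2,95\nr3,210\n"
rows = data_bytes.decode().splitlines()[1:]  # skip the header line

manifest = build_manifest(data_bytes, seed=2)
first = sample_rows(rows, 2, manifest["seed"])
second = sample_rows(rows, 2, manifest["seed"])

# The same seed and the same data give the same sample on every run.
print(first == second)
print(json.dumps(manifest, indent=2))
```

If the reviewer's hash differs from the one in the manifest, the data changed between runs, and any difference in results should be investigated there before questioning the analysis itself.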