Module 2 Lab: Data handling with Python libraries

Load, validate, and transform a dataset.

Lab Context

This lab uses synthetic API requests, validation outcomes, latency measurements, and test-case results as a safe proxy for the course setting. The synthetic data is not a substitute for institutional data, but it lets you practice the reasoning, metrics, and documentation pattern before you work with real records.

Lab Tasks

  1. Run the baseline analysis.

  2. Identify the decision the metric supports.

  3. Change one threshold, score weight, or input assumption.

  4. Compare the result before and after your change.

  5. Record one deployment risk that the synthetic data cannot reveal.

import numpy as np

requests = [
    {"id": "r1", "input_valid": True, "latency_ms": 180, "expected": 1, "actual": 1},
    {"id": "r2", "input_valid": True, "latency_ms": 245, "expected": 0, "actual": 0},
    {"id": "r3", "input_valid": False, "latency_ms": 30, "expected": "reject", "actual": "reject"},
    {"id": "r4", "input_valid": True, "latency_ms": 390, "expected": 1, "actual": 0},
]

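# Score accuracy only on requests that passed input validation.
valid_predictions = [r for r in requests if r["input_valid"]]
accuracy = sum(r["expected"] == r["actual"] for r in valid_predictions) / len(valid_predictions)
# The p95 proxy covers all requests, including the rejected one; with four samples it is an interpolated estimate.
p95_latency_proxy = float(np.quantile([r["latency_ms"] for r in requests], 0.95))
# A contract failure is any request whose actual result differs from the expected result.
contract_failures = [r["id"] for r in requests if r["expected"] != r["actual"]]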

test_report = {
    "accuracy_on_valid_requests": accuracy,
    "p95_latency_proxy_ms": p95_latency_proxy,
    "contract_failures": contract_failures,
    "next_test_to_add": "Add one malformed-input and one slow-dependency case.",
}
test_report
{'accuracy_on_valid_requests': 0.6666666666666666,
 'p95_latency_proxy_ms': 368.24999999999994,
 'contract_failures': ['r4'],
 'next_test_to_add': 'Add one malformed-input and one slow-dependency case.'}
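Tasks 3 and 4 ask you to change one assumption and compare the result. The sketch below is one possible way to do that, not the lab's prescribed solution. It wraps the baseline metrics in a hypothetical helper named `summarize`, whose `include_rejected_in_latency` and `q` parameters are illustrative names that do not appear in the original cell. The assumption being changed is whether the fast rejection of `r3` should count toward the latency proxy.

```python
import numpy as np

# Same four synthetic records as the baseline cell.
requests = [
    {"id": "r1", "input_valid": True, "latency_ms": 180, "expected": 1, "actual": 1},
    {"id": "r2", "input_valid": True, "latency_ms": 245, "expected": 0, "actual": 0},
    {"id": "r3", "input_valid": False, "latency_ms": 30, "expected": "reject", "actual": "reject"},
    {"id": "r4", "input_valid": True, "latency_ms": 390, "expected": 1, "actual": 0},
]


def summarize(records, include_rejected_in_latency=True, q=0.95):
    """Hypothetical helper: recompute the baseline metrics under one adjustable assumption."""
    valid = [r for r in records if r["input_valid"]]
    accuracy = sum(r["expected"] == r["actual"] for r in valid) / len(valid)
    # Choose which requests feed the latency quantile.
    latency_pool = records if include_rejected_in_latency else valid
    latency = float(np.quantile([r["latency_ms"] for r in latency_pool], q))
    return {"accuracy_on_valid_requests": accuracy, "latency_quantile_ms": latency}


before = summarize(requests)  # baseline assumption: all requests count
after = summarize(requests, include_rejected_in_latency=False)  # changed assumption

comparison = {
    "before": before,
    "after": after,
    "latency_delta_ms": after["latency_quantile_ms"] - before["latency_quantile_ms"],
}
print(comparison)
```

Under this change, accuracy stays the same because it was already computed on valid requests only, while the latency proxy moves from about 368.25 ms to about 375.5 ms. With so few samples both figures are interpolated between the two slowest requests, so the difference says more about the assumption than about any real service.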
reflection = {
    "what_changed": "",
    "metric_before": "",
    "metric_after": "",
    "interpretation": "",
    "synthetic_data_limit": "",
    "next_real_world_evidence_needed": "",
}
reflection
{'what_changed': '',
 'metric_before': '',
 'metric_after': '',
 'interpretation': '',
 'synthetic_data_limit': '',
 'next_real_world_evidence_needed': ''}
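Before submitting, it can help to confirm that no reflection field was left blank. The sketch below assumes the reflection keeps the same six keys as the template above; the helper name `missing_reflection_fields` is hypothetical and not part of the lab materials, and the filled-in value is a placeholder rather than a model answer.

```python
def missing_reflection_fields(reflection):
    """Return the keys whose values are empty or contain only whitespace."""
    return [key for key, value in reflection.items() if not str(value).strip()]


# The blank template from the lab, copied here so the example is self-contained.
reflection = {
    "what_changed": "",
    "metric_before": "",
    "metric_after": "",
    "interpretation": "",
    "synthetic_data_limit": "",
    "next_real_world_evidence_needed": "",
}

# Every field is reported as missing for the untouched template.
print(missing_reflection_fields(reflection))

# Placeholder text only, to show that a filled field drops out of the list.
partially_filled = dict(reflection, what_changed="<describe your change here>")
print(missing_reflection_fields(partially_filled))
```

An empty list from the helper means every field has some text in it; it does not check whether the text is a sound interpretation.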