Claude AI · Open Government Data · Policy Analysis

Civic Analytics
Skills for
Claude

A framework for evidence-based city policy analysis. Built on world-class methodologies from the Bloomberg Centers at JHU & HKS, J-PAL at MIT, and The GovLab at NYU & Northeastern — connecting Claude AI to live open data from Boston, San Francisco, Seattle, and Washington DC.

5 Core Skills
4 City Data Sources
7 Supporting Files
29 MCP Eval Prompts

The Five Phases

Frame. Analyze.
Communicate. Benchmark. Perform.

Each skill guides Claude through a rigorous phase of city policy work, connecting directly to open data APIs via Model Context Protocol servers.

01 — FRAME
Problem Framing
Bloomberg · JHU & HKS

Stop solving the wrong problem. Scope challenges before touching data, map stakeholders, interrogate assumptions, and write a human-centered problem statement with explicit equity dimensions.

Key output: A bounded problem statement with success criteria before any data query runs.
02 — ANALYZE
Policy Analysis
J-PAL · MIT

Five levels of evidence — descriptive, diagnostic, equity, counterfactual, synthesis — with explicit claim-strength labeling so analysts never overstate what administrative data can prove.

Key output: Findings labeled by confidence level with honest limitations and equity analysis.
03 — COMMUNICATE
Communication
GovLab · NYU & Northeastern

Translate analysis into audience-appropriate deliverables: executive memos, policy briefs, community fact sheets, dashboards — each with inclusive design and genuine engagement mechanics.

Key output: The right document for the right audience, with feedback loops built in.
04 — BENCHMARK
Cross-City Benchmarking
Boston · San Francisco · Seattle · DC

Is Boston's problem a Boston problem, or every city's problem? Uses San Francisco, Seattle, and DC open data MCPs for rigorous cross-city comparison — including performance management efficiency ratios — with J-PAL-calibrated claim language.

Key output: Normalized benchmark table with confidence levels, cost-per-outcome comparisons, and peer-city learning recommendations.
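The normalization step behind that benchmark table can be sketched as follows. All counts and populations below are illustrative placeholders, not real city data:

```python
# Normalize raw 311 request counts to a per-100K-resident rate so
# cities of different sizes can be compared on one scale.
# All figures below are illustrative placeholders, not real data.

def per_100k(count: int, population: int) -> float:
    """Rate per 100,000 residents, rounded for tabular display."""
    return round(count / population * 100_000, 1)

cities = {
    # city: (annual_311_requests, population) -- placeholder values
    "Boston": (250_000, 675_000),
    "San Francisco": (310_000, 870_000),
    "Seattle": (180_000, 750_000),
    "Washington DC": (200_000, 690_000),
}

rates = {city: per_100k(n, pop) for city, (n, pop) in cities.items()}
for city, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{city:<15} {rate:>10,.1f} per 100K")
```

Normalizing before ranking keeps a larger city's raw volume from being mistaken for worse performance.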
05 — PERFORM
Performance Management
Results for America · PerformanceStat

Connect what the city spends and who it employs to what it actually delivers. Cost per outcome, workload per FTE, and overtime signals from budget, payroll, and operational data — 311, permits, violations, and public safety.

Key output: Performance summary with efficiency ratios, staffing stress indicators, and year-over-year investment trends.
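The two core PerformanceStat-style ratios can be sketched in a few lines. The budget, resolution, and FTE numbers below are hypothetical, not actual Boston figures:

```python
# Two core efficiency ratios: cost per outcome and workload per FTE.
# Inputs are illustrative placeholders, not real budget or payroll data.

def cost_per_outcome(dept_spend: float, outcomes: int) -> float:
    """Dollars spent per unit of delivered outcome
    (e.g. one 311 trash request resolved on time)."""
    return dept_spend / outcomes

def workload_per_fte(requests: int, ftes: float) -> float:
    """Service requests handled per full-time-equivalent employee;
    a rising ratio can flag staffing stress before overtime spikes."""
    return requests / ftes

# Hypothetical Public Works example
spend, on_time = 24_000_000, 150_000
print(f"Cost per on-time resolution: ${cost_per_outcome(spend, on_time):.2f}")
print(f"Workload per FTE: {workload_per_fte(on_time, 400):.0f} requests")
```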

Source Methodologies

Built on world-class
public innovation frameworks

Bloomberg Centers
Path to Public Innovation
Problem framing, stakeholder mapping, assumption interrogation, human-centered scoping
JHU & Harvard Kennedy School
J-PAL · MIT
Evidence-to-Policy
Claim strength labeling, equity analysis, counterfactual thinking, causal humility
The GovLab
Open Governance
Data collaboratives, collective intelligence, consequential engagement, democratic legitimacy
NYU & Northeastern University
InnovateUS
Democratic Engagement
Accessible public communication, co-design methodology, plain-language standards
Results for America
PerformanceStat
Cost-per-outcome analysis, workload-per-FTE ratios, budget-to-performance trending, staffing efficiency signals
CitiStat tradition

Open Data Sources

Four cities.
Live MCP connections.

Skills connect to open government data via Model Context Protocol servers — no manual downloads or API keys. Data flows directly into analysis on demand.

Primary · Massachusetts
Boston

Primary analysis city. Full dataset coverage: 311 service requests, building permits, public safety, demographics, housing, transportation, environment.

Boston Open Data MCP
Peer City · California
San Francisco

Peer city for benchmarking. Comparable population (~870K), dense urban, strong Socrata data infrastructure. Covers 311, permits, public safety, budget, and payroll for performance comparisons.

San Francisco Open Data · Socrata
Peer City · Washington
Seattle

Peer city for benchmarking. Comparable population (~750K), tech-sector city, Socrata platform. Find It, Fix It service request data enables direct 311 comparison.

Seattle Open Data · Socrata
Peer City · Washington DC
Washington DC

Peer city for benchmarking. Closest population match (~690K), CitiStat tradition (Boston's model), high urban density. DC Open Data uses ArcGIS platform.

DC Open Data · ArcGIS

MCP Quality Assurance

29 ground-truth prompts.
Four levels of rigor.

The Boston Open Data MCP eval suite measures whether Claude retrieves accurate, complete, and structurally correct data across all domains — from schema validation to cross-dataset analytical queries.

29 Total Eval Prompts
4 Rigor Levels
6 Data Domains
Ground truth · March 30, 2026
LEVEL 1 — STRUCTURAL
Schema & Discovery

Tests dataset discovery and schema retrieval. Can the MCP find the right datasets, return correct field names, and confirm structural integrity? No domain knowledge required — just that the plumbing works.

Target: 100% · Weight: 15%
LEVEL 2 — FACTS
Single-Dataset Retrieval

Tests accurate counts, rankings, and aggregations from individual datasets. Covers 311, crime, building permits, Blue Bikes, and population — all with ground-truth tolerances set against immutable historical records.

Target: ≥95% · Weight: 35%
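A ground-truth tolerance check of the kind Level 2 describes can be sketched as below. The tolerance values here are illustrative; the real ones live in evals.json:

```python
# Grade a retrieved value against ground truth with a relative
# tolerance. Tolerances here are illustrative placeholders; the
# authoritative values are defined per-prompt in evals.json.

def within_tolerance(observed: float, truth: float, rel_tol: float) -> bool:
    """Pass if the observed value is within rel_tol of ground truth."""
    if truth == 0:
        return observed == 0
    return abs(observed - truth) / abs(truth) <= rel_tol

# Immutable historical count: exact match required
print(within_tolerance(observed=18_342, truth=18_342, rel_tol=0.0))
# Slowly growing dataset: looser tolerance absorbs new records
print(within_tolerance(observed=18_500, truth=18_342, rel_tol=0.02))
```

Tying the tolerance to the dataset's stability class lets immutable records be graded strictly while live datasets are graded directionally.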
LEVEL 3 — MULTI-STEP
Filtering, Grouping & Ranking

Tests chained operations: filter by category, aggregate by group, sort, and interpret. Requires the MCP to execute multiple tool calls correctly and return ranked results in the right order.

Target: ≥90% · Weight: 30%
LEVEL 4 — CROSS-DATASET
Analytical Queries

Tests cross-dataset joins: per-capita rates, year-over-year comparisons, and multi-domain synthesis. Requires the MCP to navigate multiple datasets, calculate derived metrics, and reach directionally correct conclusions.

Target: ≥80% · Weight: 20%
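One derived metric of the kind Level 4 checks, year-over-year change in a per-capita rate, can be sketched as follows (counts and population are placeholders):

```python
# Derived metric for a Level 4 check: year-over-year change in a
# per-capita rate. Counts and population below are placeholders.

def yoy_change(current: float, prior: float) -> float:
    """Percent change from the prior year (positive = increase)."""
    return (current - prior) / prior * 100

rate_2024 = 30_000 / 675_000 * 100_000   # per-100K rate, placeholder
rate_2025 = 33_000 / 675_000 * 100_000
print(f"YoY: {yoy_change(rate_2025, rate_2024):+.1f}%")
```

Because the population term cancels in the ratio, a grader only needs the two counts to verify the direction and magnitude of the change.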
# Aggregate scoring formula
Overall = (0.15 × L1_Structure) + (0.35 × L2_Facts) + (0.30 × L3_MultiStep) + (0.20 × L4_CrossDataset)

# Stability classes: Immutable · Very Stable · Stable · Slowly Growing
# See evals.json + EVAL_PLAN.md in the Boston MCP Eval/ directory
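An executable form of the aggregate formula above, with hypothetical per-level pass rates:

```python
# Executable form of the aggregate scoring formula. The per-level
# scores in the example are illustrative, not real eval results.

WEIGHTS = {"L1_Structure": 0.15, "L2_Facts": 0.35,
           "L3_MultiStep": 0.30, "L4_CrossDataset": 0.20}

def overall_score(level_scores: dict) -> float:
    """Weighted sum of per-level pass rates (each in 0.0-1.0)."""
    assert set(level_scores) == set(WEIGHTS), "missing or extra level"
    return sum(WEIGHTS[k] * level_scores[k] for k in WEIGHTS)

example = {"L1_Structure": 1.00, "L2_Facts": 0.96,
           "L3_MultiStep": 0.90, "L4_CrossDataset": 0.80}
print(f"Overall: {overall_score(example):.3f}")
```

The weights mirror the per-level targets: the heavily weighted fact and multi-step levels are where retrieval errors are most consequential.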

Quickstart

Start analyzing in
three steps

Setup
1

Clone the repository

Get all skill files and supporting materials from GitHub.

2

Upload the skills to Claude

Add SKILL.md files to your Claude environment. The master orchestrator routes to sub-skills automatically.

3

Connect your MCP data sources

Add Boston, San Francisco, Seattle, and DC open data MCP servers to enable live data access.

Example prompts
# Full workflow
"Investigate whether Boston's 311 response times are equitable across neighborhoods. Frame the problem, analyze the data, compare to Seattle, write a brief for the Mayor."

# Cross-city benchmark
"Compare Boston, San Francisco, Seattle, and DC on building permit approval times. What can Boston learn from the better performers?"

# Communication package
"Write a 1-page memo for the Mayor AND a community fact sheet for Roxbury residents from the same 311 equity analysis."

# Performance management
"How much is Boston spending per 311 trash request resolved on time? Are Public Works employees being overworked relative to that volume?"

Supporting Materials

Everything a government
analyst needs

TEMPLATES.md
6 fill-in-the-blank output formats: exec memo, policy brief, one-pager, community fact sheet, benchmark summary, presentation deck
CHECKLISTS.md
Pre-flight and review checklists for all 5 phases. Catches the most common analytical errors before they reach stakeholders.
PROMPTS.md
25+ example prompts organized from simple to expert-level, covering every skill and common government use cases.
REFERENCE.md
Full dataset catalog for all four cities with resource IDs, field names, the 311 schema change table, and performance management datasets.
Performance_Management_Skill.md
Phase 5 skill: cost-per-outcome, workload-per-FTE, multi-year budget trends, and staffing efficiency — across 311, permits, violations, and public safety data.
EXAMPLE-311-equity.md
Complete end-to-end worked example: 311 response equity + cross-city benchmark, showing every tool call and analytical decision.
Boston MCP Eval/
29-prompt eval suite with ground truth (March 2026). evals.json for automated grading; EVAL_PLAN.md for methodology, scoring formula, and stability classifications.

Design Principles

Built for real
city work

Rigorous
Every claim labeled with confidence level. Limitations stated, not buried. Administrative data rarely proves causation — the skills say so.
Human-centered
Problems framed around residents' lived experience. Success measured by resident outcomes, not bureaucratic process.
Equity-focused
Every analysis asks who benefits and who is burdened. Asset-based framing required. Geographic proxies noted with their limits.
Transparent
Data sources cited with IDs. Methodology reproducible. Findings shareable as open knowledge for others to verify and build on.
Actionable
Communication designed to create action, not just awareness. Recommendations specify who, what, and when. Feedback loops built in.