Claude AI · Open Government Data · Policy Analysis

Civic Analytics
Skills for
Claude

A framework for evidence-based city policy analysis. Built on world-class methodologies from the Bloomberg Centers at JHU & HKS, J-PAL at MIT, and The GovLab at NYU & Northeastern — connecting Claude AI to live open data from Boston, San Francisco, Seattle, and Washington DC.

5 Core Skills
4 City Data Sources
7 Supporting Files
29 MCP Eval Prompts

The Five Phases

Frame. Analyze.
Communicate. Benchmark. Perform.

Each skill guides Claude through a rigorous phase of city policy work, connecting directly to open data APIs via Model Context Protocol servers.

01 — FRAME
Problem Framing
Bloomberg · JHU & HKS

Stop solving the wrong problem. Scope challenges before touching data, map stakeholders, interrogate assumptions, and write a human-centered problem statement with explicit equity dimensions.

Key output: A bounded problem statement with success criteria before any data query runs.
02 — ANALYZE
Policy Analysis
J-PAL · MIT

Five levels of evidence — descriptive, diagnostic, equity, counterfactual, synthesis — with explicit claim-strength labeling so analysts never overstate what administrative data can prove.

Key output: Findings labeled by confidence level with honest limitations and equity analysis.
03 — COMMUNICATE
Communication
GovLab · NYU & Northeastern

Translate analysis into audience-appropriate deliverables: executive memos, policy briefs, community fact sheets, dashboards — each with inclusive design and genuine engagement mechanics.

Key output: The right document for the right audience, with feedback loops built in.
04 — BENCHMARK
Cross-City Benchmarking
Boston · San Francisco · Seattle · DC

Is Boston's problem a Boston problem, or every city's problem? Uses San Francisco, Seattle, and DC open data MCPs for rigorous cross-city comparison — including performance management efficiency ratios — with J-PAL-calibrated claim language.

Key output: Normalized benchmark table with confidence levels, cost-per-outcome comparisons, and peer-city learning recommendations.
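The normalization step behind that benchmark table can be sketched as follows. All counts and populations below are illustrative placeholders, not real city data:

```python
# Normalize raw 311 request counts to a per-100K-resident rate so
# cities of different sizes can be compared on one scale.
# All figures below are illustrative placeholders, not real data.

def per_100k(count: int, population: int) -> float:
    """Rate per 100,000 residents, rounded for tabular display."""
    return round(count / population * 100_000, 1)

cities = {
    # city: (annual_311_requests, population) -- placeholder values
    "Boston": (250_000, 675_000),
    "San Francisco": (310_000, 870_000),
    "Seattle": (180_000, 750_000),
    "Washington DC": (200_000, 690_000),
}

rates = {city: per_100k(n, pop) for city, (n, pop) in cities.items()}
for city, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"{city:<15} {rate:>10,.1f} per 100K")
```

Normalizing before ranking keeps a larger city's raw volume from being mistaken for worse performance.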
05 — PERFORM
Performance Management
Results for America · PerformanceStat

Connect what the city spends and who it employs to what it actually delivers. Cost per outcome, workload per FTE, and overtime signals from budget, payroll, and operational data — 311, permits, violations, and public safety.

Key output: Performance summary with efficiency ratios, staffing stress indicators, and year-over-year investment trends.
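The two core PerformanceStat-style ratios can be sketched in a few lines. The budget, resolution, and FTE numbers below are hypothetical, not actual Boston figures:

```python
# Two core efficiency ratios: cost per outcome and workload per FTE.
# Inputs are illustrative placeholders, not real budget or payroll data.

def cost_per_outcome(dept_spend: float, outcomes: int) -> float:
    """Dollars spent per unit of delivered outcome
    (e.g. one 311 trash request resolved on time)."""
    return dept_spend / outcomes

def workload_per_fte(requests: int, ftes: float) -> float:
    """Service requests handled per full-time-equivalent employee;
    a rising ratio can flag staffing stress before overtime spikes."""
    return requests / ftes

# Hypothetical Public Works example
spend, on_time = 24_000_000, 150_000
print(f"Cost per on-time resolution: ${cost_per_outcome(spend, on_time):.2f}")
print(f"Workload per FTE: {workload_per_fte(on_time, 400):.0f} requests")
```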

Source Methodologies

Built on world-class
public innovation frameworks

Bloomberg Centers
Path to Public Innovation
Problem framing, stakeholder mapping, assumption interrogation, human-centered scoping
JHU & Harvard Kennedy School
J-PAL · MIT
Evidence-to-Policy
Claim strength labeling, equity analysis, counterfactual thinking, causal humility
The GovLab
Open Governance
Data collaboratives, collective intelligence, consequential engagement, democratic legitimacy
NYU & Northeastern University
InnovateUS
Democratic Engagement
Accessible public communication, co-design methodology, plain-language standards
Results for America
PerformanceStat
Cost-per-outcome analysis, workload-per-FTE ratios, budget-to-performance trending, staffing efficiency signals
CitiStat tradition

Open Data Sources

Four cities.
Live MCP connections.

Skills connect to open government data via Model Context Protocol servers — no manual downloads or API keys. Data flows directly into analysis on demand.

Primary · Massachusetts
Boston

Primary analysis city. Full dataset coverage: 311 service requests, building permits, public safety, demographics, housing, transportation, environment.

Boston Open Data MCP
Peer City · California
San Francisco

Peer city for benchmarking. Comparable population (~870K), dense urban, strong Socrata data infrastructure. Covers 311, permits, public safety, budget, and payroll for performance comparisons.

San Francisco Open Data · Socrata
Peer City · Washington
Seattle

Peer city for benchmarking. Comparable population (~750K), tech-sector city, Socrata platform. Find It, Fix It service request data enables direct 311 comparison.

Seattle Open Data · Socrata
Peer City · Washington DC
Washington DC

Peer city for benchmarking. Closest population match (~690K), CitiStat tradition (Boston's model), high urban density. DC Open Data uses ArcGIS platform.

DC Open Data · ArcGIS

MCP Quality Assurance

29 ground-truth prompts.
Four levels of rigor.

The Boston Open Data MCP eval suite measures whether Claude retrieves accurate, complete, and structurally correct data across all domains — from schema validation to cross-dataset analytical queries.

29 Total Eval Prompts
4 Rigor Levels
6 Data Domains
Ground truth · March 30, 2026
LEVEL 1 — STRUCTURAL
Schema & Discovery

Tests dataset discovery and schema retrieval. Can the MCP find the right datasets, return correct field names, and confirm structural integrity? No domain knowledge required — just that the plumbing works.

Target: 100% · Weight: 15%
LEVEL 2 — FACTS
Single-Dataset Retrieval

Tests accurate counts, rankings, and aggregations from individual datasets. Covers 311, crime, building permits, Blue Bikes, and population — all with ground-truth tolerances set against immutable historical records.

Target: ≥95% · Weight: 35%
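A ground-truth tolerance check of the kind Level 2 describes can be sketched as below. The tolerance values here are illustrative; the real ones live in evals.json:

```python
# Grade a retrieved value against ground truth with a relative
# tolerance. Tolerances here are illustrative placeholders; the
# authoritative values are defined per-prompt in evals.json.

def within_tolerance(observed: float, truth: float, rel_tol: float) -> bool:
    """Pass if the observed value is within rel_tol of ground truth."""
    if truth == 0:
        return observed == 0
    return abs(observed - truth) / abs(truth) <= rel_tol

# Immutable historical count: exact match required
print(within_tolerance(observed=18_342, truth=18_342, rel_tol=0.0))
# Slowly growing dataset: looser tolerance absorbs new records
print(within_tolerance(observed=18_500, truth=18_342, rel_tol=0.02))
```

Tying the tolerance to the dataset's stability class lets immutable records be graded strictly while live datasets are graded directionally.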
LEVEL 3 — MULTI-STEP
Filtering, Grouping & Ranking

Tests chained operations: filter by category, aggregate by group, sort, and interpret. Requires the MCP to execute multiple tool calls correctly and return ranked results in the right order.

Target: ≥90% · Weight: 30%
LEVEL 4 — CROSS-DATASET
Analytical Queries

Tests cross-dataset joins: per-capita rates, year-over-year comparisons, and multi-domain synthesis. Requires the MCP to navigate multiple datasets, calculate derived metrics, and reach directionally correct conclusions.

Target: ≥80% · Weight: 20%
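One derived metric of the kind Level 4 checks, year-over-year change in a per-capita rate, can be sketched as follows (counts and population are placeholders):

```python
# Derived metric for a Level 4 check: year-over-year change in a
# per-capita rate. Counts and population below are placeholders.

def yoy_change(current: float, prior: float) -> float:
    """Percent change from the prior year (positive = increase)."""
    return (current - prior) / prior * 100

rate_2024 = 30_000 / 675_000 * 100_000   # per-100K rate, placeholder
rate_2025 = 33_000 / 675_000 * 100_000
print(f"YoY: {yoy_change(rate_2025, rate_2024):+.1f}%")
```

Because the population term cancels in the ratio, a grader only needs the two counts to verify the direction and magnitude of the change.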
# Aggregate scoring formula
Overall = (0.15 × L1_Structure) + (0.35 × L2_Facts) + (0.30 × L3_MultiStep) + (0.20 × L4_CrossDataset)

# Stability classes: Immutable · Very Stable · Stable · Slowly Growing
# See evals.json + EVAL_PLAN.md in the Boston MCP Eval/ directory
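An executable form of the aggregate formula above, with hypothetical per-level pass rates:

```python
# Executable form of the aggregate scoring formula. The per-level
# scores in the example are illustrative, not real eval results.

WEIGHTS = {"L1_Structure": 0.15, "L2_Facts": 0.35,
           "L3_MultiStep": 0.30, "L4_CrossDataset": 0.20}

def overall_score(level_scores: dict) -> float:
    """Weighted sum of per-level pass rates (each in 0.0-1.0)."""
    assert set(level_scores) == set(WEIGHTS), "missing or extra level"
    return sum(WEIGHTS[k] * level_scores[k] for k in WEIGHTS)

example = {"L1_Structure": 1.00, "L2_Facts": 0.96,
           "L3_MultiStep": 0.90, "L4_CrossDataset": 0.80}
print(f"Overall: {overall_score(example):.3f}")
```

The weights mirror the per-level targets: the heavily weighted fact and multi-step levels are where retrieval errors are most consequential.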

Quickstart

Start analyzing in
three steps

Setup
1

Clone the repository

Get all skill files and supporting materials from GitHub.

2

Upload the skills to Claude

Add SKILL.md files to your Claude environment. The master orchestrator routes to sub-skills automatically.

3

Connect your MCP data sources

Add Boston, San Francisco, Seattle, and DC open data MCP servers to enable live data access.

Example prompts
# Full workflow
"Investigate whether Boston's 311 response times are equitable across neighborhoods. Frame the problem, analyze the data, compare to Seattle, write a brief for the Mayor."

# Cross-city benchmark
"Compare Boston, San Francisco, Seattle, and DC on building permit approval times. What can Boston learn from the better performers?"

# Communication package
"Write a 1-page memo for the Mayor AND a community fact sheet for Roxbury residents from the same 311 equity analysis."

# Performance management
"How much is Boston spending per 311 trash request resolved on time? Are Public Works employees being overworked relative to that volume?"

Supporting Materials

Everything a government
analyst needs

TEMPLATES.md
6 fill-in-the-blank output formats: exec memo, policy brief, one-pager, community fact sheet, benchmark summary, presentation deck
CHECKLISTS.md
Pre-flight and review checklists for all 5 phases. Catches the most common analytical errors before they reach stakeholders.
PROMPTS.md
25+ example prompts organized from simple to expert-level, covering every skill and common government use cases.
REFERENCE.md
Full dataset catalog for all four cities with resource IDs, field names, the 311 schema change table, and performance management datasets.
Performance_Management_Skill.md
Phase 5 skill: cost-per-outcome, workload-per-FTE, multi-year budget trends, and staffing efficiency — across 311, permits, violations, and public safety data.
EXAMPLE-311-equity.md
Complete end-to-end worked example: 311 response equity + cross-city benchmark, showing every tool call and analytical decision.
Boston MCP Eval/
29-prompt eval suite with ground truth (March 2026). evals.json for automated grading; EVAL_PLAN.md for methodology, scoring formula, and stability classifications.

Design Principles

Built for real
city work

Rigorous
Every claim labeled with confidence level. Limitations stated, not buried. Administrative data rarely proves causation — the skills say so.
Human-centered
Problems framed around residents' lived experience. Success measured by resident outcomes, not bureaucratic process.
Equity-focused
Every analysis asks who benefits and who is burdened. Asset-based framing required. Geographic proxies noted with their limits.
Transparent
Data sources cited with IDs. Methodology reproducible. Findings shareable as open knowledge for others to verify and build on.
Actionable
Communication designed to create action, not just awareness. Recommendations specify who, what, and when. Feedback loops built in.