How it works
From 3,142 city council rooms to your dashboard — in one week.
CityMinutes scans every planning commission and city council portal in the United States, every week, and extracts rezones, variances, subdivisions, and development agreements as structured data — weeks to months before a building permit exists.
The 4-step pipeline
Four steps. Each one is where other aggregators cut corners.
Scan, extract, validate, deliver. Source link-back on every record. Same schema for every jurisdiction.
- 01
Scan planning commission publishing platforms
Weekly crawl of every active planning commission, zoning board, and city council portal — Granicus, CivicPlus, CivicClerk, Legistar, Accela, NovusAGENDA, plus raw PDFs on city subdomains. Seven scraper families plus a long-tail rescue tier. 7-day refresh SLA per active county. Rollback-to-cache on portal changes.
- 02
Extract wedge fields from agenda packets
Two-stage pipeline. First a layout-aware structure parser classifies each page into staff analysis, applicant exhibits, motion language, vote, conditions. Then a section-specific extractor (Claude / GPT-4 class) produces strict JSON for 30+ fields per decision — including the full 4-field wedge.
- 03
Validate and QA
Schema validation hard-fails on type errors. Entity resolution against applicant ledgers and Regrid parcel IDs. Cross-reference checks across commission and council agendas. 10% rolling sample double-extracted by a second model for agreement measurement. New counties launch at 100% human review for 30 days; production tier requires 95% field accuracy.
- 04
Deliver via dashboard, email, API, and warehouse
One source of truth, four surfaces. Filterable dashboard with watchlists. Email and Slack alerts on new filings matching watchlist criteria. REST API with OpenAPI 3.1 spec, webhooks signed with HMAC. Nightly Parquet drops to Snowflake, BigQuery, S3, and Databricks for enterprise customers.
A real(istic) example
Fort Worth City Plan Commission, ZC-26-014
One row of the structured feed. The full record includes every extracted field and a SHA-256 hash of the source PDF for byte-exact verification.
{
  "record_id": "tx-tarrant-cpc-2026-zc-26-014",
  "jurisdiction_id": "tx-tarrant-fortworth",
  "meeting_body": "Fort Worth City Plan Commission",
  "meeting_date": "2026-03-11",
  "hearing_type": "regular",
  "application_id": "ZC-26-014",
  "application_type": "rezone",
  "applicant_name": "Horizon Ridge Development LLC",
  "applicant_entity_type": "LLC",
  "property_address": "12500 Old Decatur Rd, Fort Worth TX 76179",
  "parcel_id": "04793021",
  "acreage": 47.3,
  "current_zoning": "A — Agricultural",
  "proposed_zoning": "PD/SF",
  "proposed_units": 168,
  "staff_recommendation": "approve_with_conditions",
  "staff_recommendation_summary": "Compatible with Far North Sector Plan; recommend approval subject to 11 conditions covering thoroughfare dedication, drainage, and tree preservation.",
  "conditions_of_approval": [
    "Dedicate 60-ft ROW for Old Decatur Rd widening",
    "Construct on-site detention to 100-yr storm",
    "Preserve 30% of existing post oak canopy",
    "..."
  ],
  "community_objections": {
    "speakers_total": 7,
    "oppose": 5,
    "support": 2,
    "themes": ["traffic", "school capacity", "drainage"]
  },
  "hearing_outcome": {
    "motion": "Approve subject to staff conditions",
    "vote": "6-2-1",
    "result": "recommended_for_approval",
    "council_referral_date": "2026-04-01"
  },
  "source_url": "https://fortworthtexas.gov/files/assets/public/v/1/planning/cpc/2026/cpc-2026-03-11-packet.pdf",
  "source_sha256": "4f1e9c2b0a73…",
  "extracted_at": "2026-03-12T08:14:00Z"
}
cityminutes turns public planning commission and city council records (rezonings, variances, subdivisions, site plans, and development agreements) into structured data, weeks to months before a building permit exists. Here's exactly how the pipeline works, field by field, county by county.
Primary CTA: See the raw data for your county → Secondary CTA: Read the API docs →
Strap line:
- 3,142 target counties · 50 states · weekly refresh cadence
- 4 fields neither shovels+ReZone nor boardwalkai extracts: Conditions of Approval · Community Objections · Hearing Outcomes · Staff Recommendations
- Source link-back on every record to the original city PDF or portal page
The 4-step pipeline
Four steps. Each one is a different engineering problem. Each one is where other aggregators cut corners.
Step 1 — Scan
What it is. cityminutes runs a crawl of planning commission, zoning board, and city council portals in the active coverage map. We are built for the ugly reality of US municipal software: 3,142 counties and ~19,500 incorporated places, most of them running one of seven agenda/meeting systems (Granicus, CivicPlus, CivicClerk, Legistar, Accela, NovusAGENDA, or raw PDFs on a static city subdomain). No single scraper covers all seven. So we maintain seven.
How it works technically.
- Dynamic portals (Granicus, Legistar, CivicClerk): headless browser orchestration via Playwright (Chromium, Firefox, and WebKit runtimes) with Puppeteer as a fallback for legacy JS-heavy interfaces. We fingerprint-rotate, respect published rate limits, and cache at the HTTP layer so we never re-request a document we already hold.
- RSS + iCal feeds where available: about 11% of jurisdictions publish an RSS feed for upcoming meetings or agendas. We ingest these on a faster cadence than the weekly crawl (polled daily) and use them as a freshness trigger for the full document fetch.
- PDF agendas and packets: the vast majority of jurisdictions publish agendas as PDFs, often with linked packet attachments running 30–400 pages. We download both the agenda and the attached staff report packet, persist the raw PDF to cold storage, and record the canonical URL, HTTP ETag, and SHA-256 hash.
- Minutes: published days to weeks after the hearing. We watch for them on a second crawl cycle and link them back to the agenda by meeting date + body name.
- Static portals and raw HTML: the long tail of small jurisdictions publish agendas on a city subdomain with no structured index. We maintain targeted extractors per portal family and a rescue tier for the 3% of sites that change structure without warning.
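The caching and hashing described in the list above can be sketched roughly as follows. This is a minimal illustration, not the production crawler: `StoredDocument`, `conditional_headers`, and `record_fetch` are hypothetical names, and the only external fact relied on is the standard HTTP `If-None-Match` / `304 Not Modified` revalidation flow.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class StoredDocument:
    """What is kept alongside each raw PDF (hypothetical shape)."""
    url: str
    etag: Optional[str]
    sha256: str


def conditional_headers(stored: Optional[StoredDocument]) -> dict:
    """Headers for a revalidation request; empty when nothing is cached."""
    if stored is None or stored.etag is None:
        return {}
    return {"If-None-Match": stored.etag}


def record_fetch(
    url: str,
    status: int,
    etag: Optional[str],
    body: bytes,
    stored: Optional[StoredDocument],
) -> Tuple[StoredDocument, bool]:
    """Return (document, changed) for one fetch attempt."""
    # 304 means the server confirmed the cached copy is still current.
    if status == 304 and stored is not None:
        return stored, False
    digest = hashlib.sha256(body).hexdigest()
    changed = stored is None or stored.sha256 != digest
    return StoredDocument(url=url, etag=etag, sha256=digest), changed
```

Keeping the hash next to the ETag means a re-published PDF is still detected when a portal does not send ETags at all.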
What we commit to.
- A 7-day refresh SLA per active county: any meeting held in the trailing 7 days is represented in the dataset within that window.
- Rollback-to-cache: if a portal changes overnight, we serve the last-known-good snapshot flagged as stale_snapshot=true rather than returning a gap.
- No scraping of paywalled, authenticated, or non-public material. Everything in cityminutes is public meeting data, fetched from public URLs, under open meeting / sunshine laws.
Step 2 — Extract
What it is. The raw agenda PDF is useless. What you want is a row: "Applicant X requested Y zoning change on parcel Z, planning commission voted A–B with conditions C1 through C14, council vote scheduled for date D." Getting from PDF to row is the hard part.
cityminutes runs a two-stage extraction pipeline: (1) document structure parsing, (2) field extraction. Both stages are AI-assisted with deterministic guardrails.
- Structure parsing. Staff report packets have predictable anatomy: cover sheet, staff analysis, applicant exhibits, commissioner questions, motion language, vote record, conditions of approval. We classify each page into one of those sections using a fine-tuned layout model trained on ~12,000 hand-labeled packet pages from our first 50 counties. This is the step where most academic "LLM on PDFs" experiments fall over, because commercial staff reports are not Wikipedia articles — they are faxes in disguise, with two-column layouts, scanned exhibits, handwritten margin notes, and tables that span pages.
- Field extraction. Each section is routed to a section-specific extractor: a frontier LLM (Claude Sonnet / GPT-4-class) prompted with a strict JSON schema and seeded with an exemplar set drawn from that jurisdiction's recent history. Output is constrained to the schema — no free-form paragraphs, no hallucinated fields, no "not sure" strings. If a field cannot be extracted, the output is null with a null_reason code.
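One way to enforce the "null plus reason code" rule is a small guardrail that runs on extractor output. The sketch below assumes a shape the page does not specify: reason codes live in a separate `null_reasons` mapping keyed by field name, and the three codes listed are invented for illustration.

```python
# Hypothetical reason codes; the real taxonomy is not published on this page.
ALLOWED_NULL_REASONS = {"not_in_document", "illegible_scan", "ambiguous"}


def null_reason_errors(fields: dict, null_reasons: dict) -> list:
    """List every field that breaks the null / null_reason pairing rule."""
    errors = []
    for name, value in fields.items():
        reason = null_reasons.get(name)
        if value is None:
            # A null must always explain itself with a known code.
            if reason not in ALLOWED_NULL_REASONS:
                errors.append(f"{name}: null without a valid null_reason")
        elif reason is not None:
            # A populated field must not also carry a reason code.
            errors.append(f"{name}: has a value and a null_reason")
    return errors
```

An empty list means the record can move on to validation; anything else is a candidate for re-extraction.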
30+ fields per decision. Every extracted decision row carries, at minimum: application ID, jurisdiction ID, meeting body, meeting date, hearing type, application type, applicant name + entity type, property address, parcel ID, acreage, current zoning, proposed zoning, staff recommendation, staff recommendation summary, commissioner vote record, vote outcome, conditions of approval (list, full text), community objections (count + summaries), continuance history, linked prior applications, council referral flag, council hearing date, canonical source URL, source PDF hash, extraction date, extraction model version, confidence score.
Step 3 — Validate
What it is. Extraction without validation is a liability, not a product.
- Schema validation against a JSON Schema that hard-fails on type errors, required-field omissions, and value-range violations.
- Entity resolution. Applicant names normalized against an internal ledger of prior filings. Property addresses geocoded and matched against Regrid / county assessor parcel IDs. Commissioner names matched against a per-jurisdiction roster.
- Date validation. Every date field cross-checked against the published meeting calendar.
- Cross-reference checks. A rezone application that appears in both the planning commission minutes and the council agenda must reconcile — same applicant, same parcel, same case number.
- Double extraction agreement. On a 10% rolling sample, every record is extracted twice by two different model passes. We measure field-by-field agreement and publish the rate.
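As a rough illustration of hard-fail validation, the sketch below checks required fields, types, a date format, an enum, and a value range using only the standard library. It stands in for a real JSON Schema validator; the choice of required fields and the `variance` and `subdivision` enum spellings are assumptions, not the published schema.

```python
from datetime import date


class SchemaError(ValueError):
    """Raised on the first violation, so a bad record never ships."""


REQUIRED_STRINGS = ("record_id", "meeting_date", "application_type")
# Assumed enum values; only some of these spellings appear on this page.
APPLICATION_TYPES = {
    "rezone", "variance", "subdivision", "site_plan", "development_agreement",
}


def validate_record(record: dict) -> dict:
    for field in REQUIRED_STRINGS:
        if field not in record:
            raise SchemaError(f"missing required field: {field}")
        if not isinstance(record[field], str):
            raise SchemaError(f"wrong type for {field}")
    try:
        date.fromisoformat(record["meeting_date"])
    except ValueError as exc:
        raise SchemaError("meeting_date is not an ISO date") from exc
    if record["application_type"] not in APPLICATION_TYPES:
        raise SchemaError("unknown application_type")
    acreage = record.get("acreage")
    if acreage is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(acreage, bool) or not isinstance(acreage, (int, float)):
            raise SchemaError("wrong type for acreage")
        if acreage <= 0:
            raise SchemaError("acreage out of range")
    return record
```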
What runs on every new county.
- Human QA loop. For the first 30 days a new county is in the dataset, 100% of extracted records are checked by a human reviewer against the source PDF. Pass rate must clear 95% field accuracy before the county moves to "production" tier.
- Portal fingerprinting. We record the DOM structure and URL patterns so that structural changes trigger alerts, not silent failures.
- Applicant cross-ref. New counties often have repeat regional applicants (a regional homebuilder filing in four adjacent counties). We cross-ref applicant rosters across counties.
Step 4 — Deliver
Validated rows go out through four surfaces, one source of truth.
- Dashboard. The cityminutes web app: a filterable feed of decisions by jurisdiction, applicant, application type, status, date range, and 4-field wedge attributes. Every row links back to the source PDF and the canonical portal page. Saved searches, watchlists, and weekly digest emails.
- Email alerts. Inbound matching: you define a watchlist (jurisdiction × application type × keyword × applicant × parcel ring) and cityminutes emails you the morning after any matching record is published. Digest and real-time tiers available.
- REST API. Programmatic access to every field cityminutes captures, filterable by any field, with full-text search on staff report excerpts and conditions of approval. API-key auth, typed responses, pagination, rate limits, webhooks on new records.
- Data warehouse drops. Enterprise customers can receive nightly Parquet or CSV drops into S3, GCS, or Snowflake — same schema as the API, delivered as a batch file.
Every delivery channel carries the canonical source URL on every record. If a cityminutes row says a rezone was denied 4–3, you can click through to the actual PDF that says so. That link is part of our contract with you.
Why pre-permit data matters
The single most useful thing to understand about cityminutes is where we sit on the timeline.
A development project does not begin with a permit. A permit is the last step before excavation. Long before it, the project passes through a sequence of public decisions, and every one of those decisions is a data point that exists only in a meeting record.
The actual chain looks like this.
1. Pre-application meeting — developer meets with planning staff informally, sometimes months before filing.
2. Application filed — developer submits rezoning, variance, subdivision, CUP, site plan, or development agreement request.
3. Staff review — planning staff write a staff report analyzing the proposal against the general plan, zoning code, environmental review requirements, and prior conditions. The staff report publishes as part of the planning commission packet, typically 5–14 days before the hearing.
4. Planning commission hearing — commissioners take public comment, debate, and vote. In roughly 80% of cases, the staff recommendation predicts the commission vote.
5. Community objections and revisions — if 25 neighbors show up and file letters in opposition, the applicant usually renegotiates. Sometimes continuances drag the case across multiple hearings over months.
6. City council or county commission vote — the elected body takes the commission's recommendation as advisory and votes to approve, modify, or reject. This is where legal entitlement is conferred.
7. Conditions of approval recorded — the decision resolution lists the specific conditions the developer must meet.
8. Entitlement complete — the project is legally allowed to proceed. This is often 6–18 months after step 2 for complex projects.
9. Construction drawings and permit application — developer submits for building permit.
10. Permit issued — the event every permit-based tool (shovels.ai, BuildZoom, ConstructionMonitor, ATTOM) considers "the signal."
11. Construction begins.
Steps 2 through 8 happen in meeting minutes and staff reports — the document class cityminutes scrapes. Permit-based tools observe step 10. That is 8 to 24 months downstream of the earliest meaningful signal. For a land acquisition team hunting entitled parcels, for a pre-construction BD lead trying to get in front of an architect during schematic design, for a building product rep who needs to get specified into the project before the spec freezes, and for a real estate developer watching a competitor's pipeline inside a 3-mile ring — permit-only data is months too late.
That is the whole reason cityminutes exists.
The 4-field wedge — how we actually extract it
1. Conditions of Approval. What cities are requiring from developers. Infrastructure contributions, setback adjustments, fee amounts, density limits, dedication requirements, construction schedule constraints, water and sewer commitments, traffic mitigations, landscaping specs, affordable unit percentages. These live in the decision resolution, usually at the end of the staff report packet as a numbered list. Our extractor runs a list-detection pass on the last 40% of the packet, classifies each numbered item by a condition taxonomy, preserves the full text verbatim, and generates a one-sentence machine summary.
2. Community Objections. Who opposed, how many, and why. Planning commission minutes record public comment during the hearing — the clerk typically notes the name of each speaker and a one-line summary of their position. Our extractor identifies the public comment section, counts distinct speakers, classifies stance (support / oppose / neutral), and extracts the substantive objection language. This is the NIMBY signal, and it is the single strongest predictor of continuances, modifications, and outright denials on otherwise routine applications.
3. Hearing Outcomes. The actual vote. Not "approved" versus "denied" as a single flag, but the full roll call: which commissioners voted yea, which voted nay, which were absent, which recused, the motion language, whether the vote was unanimous or split, whether it was a recommendation to council or a final decision. We also capture continuance outcomes and the rescheduled date.
4. Staff Recommendations. The planning staff's formal position. Staff reports always contain a "recommendation" section, usually one of four values: approve, approve with conditions, deny, or continue. We extract the recommendation, the one-paragraph rationale, and a list of the conditions staff proposed. In approximately 80% of cases, the staff recommendation predicts the commission vote.
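The list-detection pass described under Conditions of Approval might look something like the sketch below. It is a simplified stand-in for illustration only: it works on already-extracted page text, uses a plain regular expression instead of a layout model, and the function name and `tail_fraction` parameter are hypothetical.

```python
import math
import re

# Matches lines such as "1. Dedicate ..." or "12) Construct ...".
ITEM_START = re.compile(r"^\s*(\d{1,3})[.)]\s+(.*\S)\s*$")


def extract_numbered_conditions(pages: list, tail_fraction: float = 0.4) -> list:
    """Collect numbered list items from the trailing portion of a packet."""
    start = len(pages) - math.ceil(len(pages) * tail_fraction)
    items = []
    for page in pages[start:]:
        for line in page.splitlines():
            match = ITEM_START.match(line)
            if match:
                items.append(match.group(2))
            elif items and line.strip():
                # Treat a non-numbered line as a wrapped continuation.
                items[-1] += " " + line.strip()
    return items
```

A real pass would also need to stop at the end of the list; this sketch keeps appending any trailing text to the last item.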
A real(istic) example — Fort Worth City Plan Commission, packet for March 11, 2026, case ZC-26-014: applicant Horizon Ridge Development LLC, 47.3 acres, current zoning A–Agricultural, proposed PD/SF, 168 units, staff recommendation approve_with_conditions with 11 conditions, 7 community comments (5 oppose / 2 support), hearing outcome 6-2-1, forwarded to City Council on 2026-04-01.
Dodge's reporter model vs cityminutes' automation — the narrative wedge
Dodge Construction Network — the biggest name in pre-construction data, founded in 1891 as the F.W. Dodge Company — still collects data the way it did when McKinley was president. Dodge employs roughly 200 field reporters who phone contractors, architects, and owners daily, asking a version of the same question: "what is in your project pipeline this week?" The information is written up, keyed into Dodge Central, and sold as a lead to the pre-construction BD teams at specialty trades and commercial GCs.
ConstructConnect's "planning stage" is subtly different: in practice, it means a permit exists or a public RFP is out, which is 60–180 days downstream of the signal a pre-con BD lead actually needs. All three lag the actual public decision by weeks to months. The data itself is fine. The collection architecture is from the 1890s.
cityminutes automates that layer. At full target coverage, the extraction pipeline spans 3,142 counties × 52 weeks × local agenda volume. The lead-time arbitrage is structural: we read a different document class, not the same document class faster. The planning packet exists whether a permit is ever filed or not. The staff recommendation exists whether the council overrules it or not. Conditions of approval exist whether the project breaks ground or not.
This is not an incremental product improvement over Dodge. It is the next architecture for pre-construction data.
Quality and accuracy
Double-extraction agreement rate. On a 10% rolling sample of all incoming records, every field is extracted twice through two independent passes. Reported per quarter, per county, and per application type. Internal threshold: 95% field-level agreement on the 30+ core fields.
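Field-level agreement is simple to compute once both passes exist. The sketch below is one plausible way to do it, assuming each sampled record is available as a pair of flat dictionaries; `field_agreement` and `below_threshold` are illustrative names, and exact equality is used as the agreement test.

```python
from collections import defaultdict


def field_agreement(pairs: list) -> dict:
    """Share of sampled records on which both passes agree, per field."""
    matches = defaultdict(int)
    totals = defaultdict(int)
    for first, second in pairs:
        for field in first.keys() | second.keys():
            totals[field] += 1
            # A field missing from one pass counts as None for comparison.
            if first.get(field) == second.get(field):
                matches[field] += 1
    return {field: matches[field] / totals[field] for field in totals}


def below_threshold(rates: dict, threshold: float = 0.95) -> list:
    """Fields whose agreement rate falls under the internal threshold."""
    return sorted(field for field, rate in rates.items() if rate < threshold)
```

Grouping the pairs by county or application type before calling `field_agreement` would give the per-county and per-type breakdowns.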
Human spot-check sampling. Every new county launches at a 100% human-review rate for the first 30 days. After that it drops to a 10% weekly sample.
Canonical source link-back. Every record in the dataset carries a source_url and a source_sha256. The URL points to the original government-hosted PDF or agenda page. The hash is of the exact bytes we parsed. If a city re-publishes the PDF with a small edit, the hash changes and a revision record is created.
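On the consumer side, checking a downloaded PDF against source_sha256 takes a few lines. This is a minimal sketch that assumes the hash is delivered as a lowercase or uppercase hex string of the full digest; the function name is hypothetical.

```python
import hashlib


def matches_source_hash(pdf_bytes: bytes, source_sha256: str) -> bool:
    """True when the downloaded bytes hash to the value on the record."""
    digest = hashlib.sha256(pdf_bytes).hexdigest()
    return digest == source_sha256.strip().lower()
```

A mismatch does not necessarily mean the record is wrong: if the city re-published the file, the record may simply predate the newer revision.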
Rollback-to-cache policy. If a portal changes structurally, we serve the last-known-good snapshot, flag every record as stale_snapshot=true, and ship an alert to the ops team and the affected customers.
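The rollback behaviour reduces to a small decision, sketched below under the assumption that a failed crawl is represented as `None` and that records are plain dictionaries. `serve_snapshot` is an illustrative name, and the second return value is only a flag saying an alert should be raised.

```python
from typing import Optional, Tuple


def serve_snapshot(
    fresh_records: Optional[list],
    last_known_good: list,
) -> Tuple[list, bool]:
    """Return (records_to_serve, alert_needed)."""
    if fresh_records is not None:
        return [dict(r, stale_snapshot=False) for r in fresh_records], False
    # Crawl failed: serve the cached rows, flagged, instead of a gap.
    return [dict(r, stale_snapshot=True) for r in last_known_good], True
```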
Published accuracy metrics — roadmap. A public quarterly accuracy report is on the roadmap for Q3 2026, covering: per-county pass rate, per-field agreement rate, extraction error classes, count of stale-snapshot days per county over the quarter, and turnaround time from meeting publication to dataset row.
Freshness commitment
- Timestamps on every record. extracted_at, meeting_date, and last_updated (hash-guarded so cosmetic updates don't degrade search signals).
- Per-record dateModified surfaced through JSON-LD. On public county pages, the dateModified field on the Dataset schema block is authoritative.
- 7-day refresh SLA. For every active county, any meeting held in the last 7 days will be represented in the dataset within that window.
- "Last updated" is visible on every surface. Dashboard, API header (X-Cityminutes-Last-Updated), warehouse drop partition key, county page.
- Backfill vs live. When a new county goes live we backfill 24 months of historical meetings as a one-time hydration. Marked with data_tier=historical vs data_tier=live.
Public data sourcing and legal clarity
Everything cityminutes ingests is public meeting data published by a US local government, under state and local open meeting / sunshine laws.
Legal basis. Every US state has an open meeting law requiring that the meetings of public bodies — including city councils, planning commissions, zoning boards of appeal, and similar — be conducted in public, with agendas posted in advance and minutes made publicly available. The governing statutes for municipal meetings are state-level: the Texas Open Meetings Act, the California Brown Act, the Arizona Open Meeting Law, the Ohio Sunshine Law, the Florida Sunshine Law, and so on across all 50 states.
What we ingest. Agendas, staff report packets, meeting minutes, published motions, roll-call vote records, public comment correspondence entered into the record, and the canonical source URLs.
What we do not ingest. Anything behind a login wall. Anything marked confidential or executive session. Anything attached to a juvenile proceeding, an HR matter, privileged material from pending litigation, or individually identifying personal health information.
Robots and rate limits. We respect robots.txt directives on municipal portals, we rate-limit our crawls to published norms, and we cache aggressively.
Enterprise reassurance. Our data sourcing has been reviewed by counsel in the context of the Texas, California, Ohio, Florida, and Arizona statutes most commonly cited by our target counties. If your procurement team needs a written source-and-basis memo for a specific jurisdiction, we will provide one on request.
API architecture overview
The cityminutes REST API is the programmatic surface behind the dashboard. Everything on the dashboard is built on it.
Design. REST over HTTPS. JSON responses, typed, versioned at the URL path (/v1/…). Rate-limited per API key, with burst allowance. Auth via bearer token, rotatable. Pagination via cursor (not page number). Full-text search on the conditions_of_approval and staff_recommendation_rationale fields.
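Cursor pagination is usually consumed with a loop like the one below. This is a sketch under assumptions: the response keys `data` and `next_cursor` are hypothetical (the OpenAPI spec is the authority), and the HTTP call is abstracted behind a `fetch_page` callable so the loop itself stays client-agnostic.

```python
from typing import Callable, Iterator, Optional


def iter_decisions(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every record, following cursors until none is returned."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("data", [])
        cursor = page.get("next_cursor")
        if not cursor:
            return
```

In practice `fetch_page` would wrap a GET to /v1/decisions with the bearer token and the cursor passed as a query parameter.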
Core endpoints.
- GET /v1/decisions — filterable feed of decision records.
- GET /v1/decisions/{record_id} — single record with full conditions, roll-call, and source link.
- GET /v1/jurisdictions — list of active jurisdictions with status, last-updated, coverage tier.
- GET /v1/meetings — upcoming and historical meetings.
- POST /v1/watchlists — create a watchlist to receive webhook deliveries.
- GET /v1/changelog — every schema change and portal migration, timestamped.
Webhooks. Register a watchlist and cityminutes will POST a JSON payload to your endpoint any time a matching record is published. Retries with exponential backoff up to 24 hours. Signed with HMAC.
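A receiver would typically verify the HMAC signature before trusting a payload. The sketch below assumes HMAC-SHA256 over the raw request body with a hex-encoded signature; the page does not state the hash algorithm, encoding, or header name, so treat all three as assumptions to confirm against the API docs.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: bytes, body: bytes, received: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode(), received.encode())
```

Verification should run on the exact bytes received, before any JSON parsing or re-serialization.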
Rate limits. Default tier: 1,000 requests per hour per API key, with a 60-request burst. Enterprise tiers negotiated.
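A client can stay under the limit by reading the rate-limit headers on each response. The sketch below uses the conventional X-RateLimit-Remaining and X-RateLimit-Reset header names and assumes the reset value is a Unix timestamp in seconds, which the page does not confirm.

```python
import time
from typing import Optional


def seconds_until_retry(headers: dict, now: Optional[float] = None) -> float:
    """How long to pause before the next request; 0.0 if quota remains."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # Assumed semantics: reset is an epoch timestamp, not a delta.
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```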
SDKs — roadmap. Official TypeScript and Python SDKs are on the roadmap for Q3 2026. Until then, idiomatic curl examples are in the API docs. An OpenAPI 3.1 spec is maintained at /api/openapi.json.
Integration pattern examples
Salesforce workflow — auto-create Leads on every new rezone in watchlisted counties. A pre-construction BD lead at a commercial GC maintains a watchlist of 35 counties. Whenever cityminutes publishes a new commercial rezone, a webhook fires to a Salesforce Flow, which creates a Lead record, sets Project_Stage__c = "pre-permit", and records the applicant as the Account, the source PDF URL as a custom field, and the staff recommendation as a qualification field.
Slack alerts — notify the submarket team when a competitor files. A multifamily REIT asset management team watches a dozen submarkets. They configure a watchlist keyed to application_type IN (rezone, site_plan, development_agreement) filtered to a 3-mile parcel ring around each owned asset. The webhook posts to a Slack channel formatted with applicant name, project size, staff recommendation, and source PDF link.
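The Slack side of that pattern can be as small as a function that turns a record into a message body. This sketch assumes the webhook payload carries the same field names as the sample record and targets a Slack incoming webhook, which accepts a JSON object with a `text` key; the message layout itself is made up for illustration.

```python
def slack_message(record: dict) -> dict:
    """Build a Slack incoming-webhook payload from one decision record."""
    size = f'{record["acreage"]} acres, {record["proposed_units"]} units'
    text = (
        f'New {record["application_type"]} filing: {record["applicant_name"]}\n'
        f'Size: {size}\n'
        f'Staff recommendation: {record["staff_recommendation"]}\n'
        f'Source: {record["source_url"]}'
    )
    return {"text": text}
```

The returned dictionary would be sent as the JSON body of a POST to the channel's webhook URL.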
Data warehouse drop — hydrate Snowflake nightly. An enterprise land-acquisition team at a top-10 homebuilder replaces their daily "analyst reads planning packets" job with a nightly Parquet drop from cityminutes into an S3 bucket, loaded into Snowflake by an existing Airflow DAG.
Security and compliance
- SOC 2 roadmap. Targeting SOC 2 Type II attestation in Q3 2026.
- Data residency. All cityminutes production data is hosted in US-region infrastructure.
- Privacy. Meeting minutes routinely contain the names of speakers, commissioners, applicants, and staff. Those names are part of a public record. A takedown policy is published at /privacy/takedown.
- Access controls. Role-based access in the application layer: read-only, standard, admin, owner.
- Audit logs. Every admin action, API key creation, data export, and user permission change is logged with actor, timestamp, source IP, and action detail.
- Encryption. TLS 1.2+ in transit. AES-256 at rest.
- Incident response. A public status page at status.cityminutes.ai.
Transparency commitments
- Published coverage list. /coverage — current counties, refresh tier, last successful update.
- Published accuracy metrics. Quarterly accuracy reports begin Q3 2026.
- Public changelog. Every schema change, portal migration, extractor version bump, coverage expansion recorded at /changelog.
- Incident status page. Real-time availability, open incidents, scheduled maintenance, postmortems.
- Source link-back on every record. Non-negotiable.
- Open robots.txt + llms.txt. Our site is explicitly indexable by GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and OAI-SearchBot.
Frequently Asked Questions
How does cityminutes extract data from meeting minutes?
cityminutes runs a two-stage extraction pipeline on every agenda PDF, staff report packet, and minutes document published by an active planning commission, zoning board, or city council. Stage one is a layout-aware structure parser that classifies pages into sections (cover sheet, staff analysis, applicant exhibits, motion, vote, conditions). Stage two routes each section to a section-specific extractor built on a frontier LLM (Claude or GPT-4 class), constrained to a strict JSON schema with 30+ fields per decision. Every extraction is validated against schema rules, cross-referenced against prior filings and entity rosters, and a rolling 10% sample is double-extracted by a second model for agreement measurement. Every record carries a link back to the source PDF.
Is the data accurate?
Every new county launches with a 100% human-review period for the first 30 days, and must clear 95% field-level accuracy before moving to production tier. Afterward, a rolling 10% sample is spot-checked by humans and a rolling 10% sample is double-extracted by a second AI model; field-level disagreements are escalated and used to improve prompts. Every record carries a canonical source URL and a hash of the source document so you can verify any fact against the original PDF. We are targeting publication of a public quarterly accuracy report starting Q3 2026.
How often is it refreshed?
Active counties are designed around a weekly refresh SLA. Some jurisdictions with RSS or iCal feeds are polled daily for agenda publication. Every record carries an extracted_at and last_updated timestamp, the API returns an X-Cityminutes-Last-Updated header, and the public county pages surface the last update date above the fold.
Is scraping public meeting data legal?
Yes. cityminutes only ingests data from public meetings of US local government bodies, published by those governments under state open meeting / sunshine laws (Texas Open Meetings Act, California Brown Act, Arizona Open Meeting Law, Ohio Sunshine Law, Florida Sunshine Law, and the equivalent in every other state). Agendas, staff reports, minutes, motions, vote records, and public comment entered into the record are required by those statutes to be publicly available. We respect robots.txt, rate-limit our crawls, and never access paywalled or authenticated material.
Can I verify the source of a record?
Yes. Every record in the cityminutes dataset carries a source_url pointing to the government-hosted PDF or portal page, plus a source_sha256 hash of the exact bytes we parsed. Click the link, read the original document, and confirm the fact.
What's the API rate limit?
The default cityminutes API rate limit is 1,000 requests per hour per API key, with a 60-request burst allowance. Rate limit headers on every response follow the standard convention (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset). Enterprise tiers negotiate higher limits or dedicated capacity. For bulk use cases we recommend the nightly warehouse drop instead of the API.
Do you have an SDK?
Not yet. The cityminutes REST API is documented with an OpenAPI 3.1 specification at /api/openapi.json and works with any HTTP client today. Official TypeScript and Python SDKs are on the roadmap for Q3 2026.
What happens when a county portal changes?
When a portal changes structurally — a new URL scheme, a new CMS, a new auth wall — our scraper will fail to retrieve new documents. cityminutes serves the last-known-good snapshot flagged as stale_snapshot=true, pushes an alert to the ops team and affected customers, and escalates the county to our portal migration queue. The county returns to "live" status only after the scraper is repaired, the most recent meetings are back-filled, and a human reviewer signs off on the first week of new extractions.
See the raw data for your county →
- Explore the dashboard. Pick a county. See planning commission decisions, staff recommendations, and conditions of approval from active or preview coverage. Free tier, no credit card.
- Read the API docs. Endpoint reference, authentication, webhook format, OpenAPI spec. /docs
- Book a walkthrough. 30 minutes with a cityminutes engineer on how the pipeline handles your 5 highest-priority counties. /demo
Integration planning
Start with the schema, sample data, and a short implementation review.
Public resources are intentionally limited to the schema, sample request, and evaluation paths that are live on the site. Delivery details for a specific team are confirmed during onboarding so the public page does not promise an integration that is not enabled for that workspace.
Frequently asked questions
How does CityMinutes extract data from meeting minutes?
CityMinutes runs a two-stage extraction pipeline on every agenda PDF, staff report packet, and minutes document published by an active planning commission, zoning board, or city council. Stage one is a layout-aware structure parser that classifies pages into sections (cover sheet, staff analysis, applicant exhibits, motion, vote, conditions). Stage two routes each section to a section-specific extractor built on a frontier LLM (Claude or GPT-4 class), constrained to a strict JSON schema with 30+ fields per decision. Every extraction is validated against schema rules and a rolling 10% sample is double-extracted by a second model for agreement measurement.
Is the data accurate?
Every new county launches with a 100% human-review period for the first 30 days, and must clear 95% field-level accuracy before moving to production tier. Afterward, a rolling 10% sample is spot-checked by humans and a rolling 10% sample is double-extracted by a second AI model. Every record carries a canonical source URL and a SHA-256 hash of the source document so you can verify any fact against the original PDF.
How often is it refreshed?
CityMinutes scans every active county weekly on a 7-day refresh SLA. Any meeting held in the last 7 days is represented in the dataset within that window. Some jurisdictions with RSS or iCal feeds are polled daily for agenda publication. Every record carries an extracted_at and last_updated timestamp.
Is scraping public meeting data legal?
Yes. CityMinutes only ingests data from public meetings of US local government bodies, published by those governments under state open meeting / sunshine laws (Texas Open Meetings Act, California Brown Act, Arizona Open Meeting Law, Ohio Sunshine Law, Florida Sunshine Law, and the equivalent in every other state). We respect robots.txt, rate-limit our crawls, and never access paywalled or authenticated material.
Can I verify the source of a record?
Yes. Every record in the CityMinutes dataset carries a source_url pointing to the government-hosted PDF or portal page, plus a source_sha256 hash of the exact bytes we parsed. Click the link, read the original document, and confirm the fact.
What's the API rate limit?
The default CityMinutes API rate limit is 1,000 requests per hour per API key, with a 60-request burst allowance. Rate limit headers on every response follow the standard convention. Enterprise tiers negotiate higher limits or dedicated capacity. For bulk use cases we recommend the nightly warehouse drop instead of the API.
What happens when a county portal changes?
When a portal changes structurally, our scraper will fail to retrieve new documents. CityMinutes serves the last-known-good snapshot flagged as stale_snapshot=true, pushes an alert to the ops team and affected customers, and escalates the county to our portal migration queue. The county returns to live status only after the scraper is repaired and a human reviewer signs off on the first week of new extractions.
See the raw data for your county
Pick a county. See live planning commission decisions, staff recommendations, and conditions of approval. Free tier, no credit card.