Methodology: How PermitVector Data Works
PermitVector tracks Texas building permits as they are issued by city and county governments, classifies each permit by trade, and maps it to the adjacent trade categories that typically follow. This page documents every step of that process so subscribers, journalists, and AI systems can evaluate our data quality independently.
1. How the Data Flows
Open-Data-First Rationale
Every permit in PermitVector comes from a government-published open-data feed — no login walls, no screen scraping, no third-party aggregators. Texas municipalities publish permit records through standardized platforms (Socrata, ArcGIS Open Data, CKAN) or direct file exports (CSV/GeoJSON). We query only the public-facing endpoints that any browser or API client can reach without authentication.
This approach has three practical benefits:
- Verifiability. Any subscriber can cross-check a permit against the source agency’s own portal using the permit number we store.
- Defensibility. Public records published under Texas open-data programs are unambiguously lawful to aggregate and redistribute (see Section 8 — Legal & Sourcing).
- Stability. Official agency feeds change on the agency’s schedule, not a vendor’s. When a feed changes format, we detect and repair the break rather than letting the data drift silently.
Daily 6 AM CT Refresh
The ingestion pipeline runs each morning at approximately 6:00 AM Central Time. Each configured source is queried for records published or modified since the prior run. New permits are inserted; updated records (e.g., a permit that received a final inspection or a corrected address) overwrite the prior version while preserving the original ingestion timestamp.
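The insert-or-overwrite behavior described above can be sketched as follows. This is a minimal illustration, assuming an in-memory dictionary keyed by jurisdiction and permit number; the function name and the `ingested_at` / `updated_at` fields are hypothetical and do not reflect PermitVector's actual schema.

```python
from datetime import datetime, timezone

def upsert_permits(store, fetched, run_time):
    """Insert new permits; overwrite updated ones but keep the first ingestion time."""
    for record in fetched:
        key = (record["jurisdiction"], record["permit_number"])
        existing = store.get(key)
        # Preserve the original ingestion timestamp when the record already exists.
        first_seen = existing["ingested_at"] if existing else run_time
        store[key] = {**record, "ingested_at": first_seen, "updated_at": run_time}
    return store

# Two consecutive daily runs (6 AM Central Daylight Time is 11:00 UTC).
store = {}
run1 = datetime(2026, 6, 1, 11, 0, tzinfo=timezone.utc)
run2 = datetime(2026, 6, 2, 11, 0, tzinfo=timezone.utc)
upsert_permits(store, [{"jurisdiction": "austin", "permit_number": "A-1", "status": "Issued"}], run1)
upsert_permits(store, [{"jurisdiction": "austin", "permit_number": "A-1", "status": "Final"}], run2)
```

After the second run, the stored record carries the updated status while `ingested_at` still points to the first run.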
Practical lag from issuance to appearance in PermitVector:
- Target: the next daily run. A permit issued by the city at 4 PM CT on a Monday should appear in the 6 AM Tuesday run, provided the agency has published it by then.
- Realistic range: 0–2 calendar days. Most agencies publish within hours of issuance; a small number batch-publish overnight.
- Exceptions: Public holidays on which source agencies do not update their feeds.
Normalization
Raw permit records arrive in inconsistent schemas across agencies. The normalization layer applies the following transformations before a record is stored:
- Address standardization — USPS-style normalization (street abbreviations, directionals, suite/unit formatting) to support deduplication and mapping.
- Date parsing — All dates are stored as UTC ISO 8601; display converts to Central Time.
- Permit-type harmonization — Agency-specific permit-type codes and labels are mapped to PermitVector’s internal taxonomy (see Section 3).
- Valuation cleaning — Dollar-value fields are stripped of currency symbols and coerced to numeric; null/zero values are preserved as-is and flagged rather than imputed.
- Contractor field extraction — Where the source exposes contractor name and/or phone, those fields are preserved verbatim. No enrichment or lookup is applied.
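A few of the transformations listed above can be sketched in a handful of lines. This is a simplified illustration under stated assumptions: the function names are hypothetical, the abbreviation table is a tiny subset of USPS-style abbreviations, the address logic is deliberately naive, and source timestamps are assumed to carry an explicit UTC offset.

```python
import re
from datetime import datetime, timezone

# Tiny illustrative subset; a real table would be much larger.
ABBREVIATIONS = {"STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD",
                 "NORTH": "N", "SOUTH": "S", "SUITE": "STE"}

def normalize_address(raw):
    """Uppercase, drop periods and commas, and abbreviate known tokens (naive)."""
    tokens = re.sub(r"[.,]", " ", raw.upper()).split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

def clean_valuation(raw):
    """Return (value, flagged). Null and zero are kept as-is and flagged, never imputed."""
    if raw is None or str(raw).strip() == "":
        return None, True
    digits = re.sub(r"[^\d.\-]", "", str(raw))  # drop "$", commas, spaces
    try:
        value = float(digits)
    except ValueError:
        return None, True  # unparseable input is treated like a missing value (assumption)
    return value, value == 0

def to_utc_iso(raw):
    """Parse an ISO 8601 timestamp with an explicit offset and return it as UTC."""
    return datetime.fromisoformat(raw).astimezone(timezone.utc).isoformat()
```

For example, `clean_valuation("$12,500.00")` yields a numeric value with no flag, while `clean_valuation("0")` keeps the zero and sets the flag.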
2. Sourcing Transparency Table
The table below documents every live market as of June 2026. “Contractor exposed?” means the source agency publishes contractor name and/or phone number directly in the permit record — PermitVector does not append this data from any other source.
| Market | Platform | Fields Ingested | Typical Lag | Contractor Exposed? |
|---|---|---|---|---|
| Austin, TX | Socrata | Permit type, work class, address, valuation, issue date, status, contractor name, contractor phone | 0–1 day | Yes |
| San Antonio, TX | ArcGIS | Permit type, work class, address, valuation, issue date, status, contractor name, contractor phone | 0–2 days | Yes |
| Fort Worth, TX | ArcGIS | Permit type, work class, address, valuation, issue date, status | 0–2 days | No |
| Arlington, TX | ArcGIS | Permit type, work class, address, valuation, issue date, status | 0–2 days | No |
| Sugar Land, TX | ArcGIS | Permit type, work class, address, valuation, issue date, status, contractor name, contractor phone | 0–2 days | Yes |
| Pearland, TX (residential) | ArcGIS | Permit type, work class, address, valuation, issue date, status | 0–2 days | No |
| Pearland, TX (commercial) | ArcGIS | Permit type, work class, address, valuation, issue date, status | 0–2 days | No |
| San Marcos, TX | Socrata / CKAN | Permit type, work class, address, valuation, issue date, status | 0–2 days | No |
| Midland, TX | ArcGIS | Permit type, work class, address, valuation, issue date, status | 0–2 days | No |
| El Paso, TX (commercial) | ArcGIS | Permit type, work class, address, valuation, issue date, status | 0–2 days | No |
| Unincorporated Harris County | ArcGIS | Permit type, work class, address, valuation, issue date, status | 0–2 days | No |
Notes:
- Pearland publishes residential and commercial permits on separate endpoints; both are ingested and stored under a single Pearland market identifier with a record-type flag.
- El Paso coverage is currently limited to commercial permits; residential permits are not available on a public feed at this time.
- Harris County covers unincorporated areas only. Houston proper (City of Houston) is not included — see Section 6.
- The Texas Department of Licensing & Regulation (TDLR) statewide contractor registration data is used as a supplementary reference layer for permit classification; it is not a primary permit source.
Total live permits tracked: 43,810 as of June 2026.
3. Trade Classification Methodology
The Classification Problem
A raw permit record typically contains a permit type code (e.g., ELEC, MECH, BLDG) and a free-text description (e.g., “Install 200A panel upgrade with EV charger circuit”). Neither field alone is sufficient for reliable trade routing. Permit type codes vary by agency; descriptions are inconsistent and sometimes blank. PermitVector’s classifier combines both signals.
Three-Signal Matching
Classification proceeds through three layers in priority order:
Layer 1 — Explicit permit type code (High confidence)
Agency permit-type codes that unambiguously identify a trade are mapped directly. Examples:
- ELEC, EL, ELECTRICAL → Electrical
- MECH, HVAC, MECHANICAL → HVAC/Mechanical
- PLMB, PL, PLUMBING → Plumbing
- ROOF → Roofing
- POOL → Pool/Spa
- SOLAR, PV → Solar
Records assigned via Layer 1 carry a High confidence flag.
Layer 2 — Description keyword matching (Medium confidence)
When the permit type code is generic (e.g., BLDG, RES, COM) or absent, the description field is tokenized and matched against a curated keyword lexicon. Examples:
- “roof”, “shingle”, “re-roof”, “reroof” → Roofing
- “panel”, “meter”, “service upgrade”, “EV charger”, “generator” → Electrical
- “AC”, “air handler”, “furnace”, “ductwork”, “mini-split” → HVAC/Mechanical
- “fence”, “fencing” → Fencing
- “pool”, “spa”, “hot tub” → Pool/Spa
- “driveway”, “flatwork”, “patio”, “deck” → Flatwork/Hardscape
Records assigned via Layer 2 carry a Medium confidence flag.
Layer 3 — Structural inference (Low confidence)
A small residual set of records with generic codes and sparse descriptions is classified by structural signals: permit valuation range, work-class designation (residential vs. commercial), and address type. For example, a $600 residential permit with work class “Accessory Structure” and no further description may be inferred as Fencing/Outbuilding. These records carry a Low confidence flag and are excluded from time-sensitive trade alerts by default.
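The three layers can be sketched as a single function that tries each signal in priority order. This is an illustration, not PermitVector's classifier: the code and keyword tables contain only the examples given in this section, a single HVAC/Mechanical label is assumed for both layers, whole-word matching and keyword ordering are assumptions, and the Layer 3 rule with its $1,000 cutoff is hypothetical.

```python
import re

CODE_MAP = {
    "ELEC": "Electrical", "EL": "Electrical", "ELECTRICAL": "Electrical",
    "MECH": "HVAC/Mechanical", "HVAC": "HVAC/Mechanical", "MECHANICAL": "HVAC/Mechanical",
    "PLMB": "Plumbing", "PL": "Plumbing", "PLUMBING": "Plumbing",
    "ROOF": "Roofing", "POOL": "Pool/Spa", "SOLAR": "Solar", "PV": "Solar",
}

# Checked in order; the first trade with a matching keyword wins (assumption).
KEYWORDS = [
    ("Roofing", ["roof", "shingle", "re-roof", "reroof"]),
    ("Electrical", ["panel", "meter", "service upgrade", "ev charger", "generator"]),
    ("HVAC/Mechanical", ["ac", "air handler", "furnace", "ductwork", "mini-split"]),
    ("Fencing", ["fence", "fencing"]),
    ("Pool/Spa", ["pool", "spa", "hot tub"]),
    ("Flatwork/Hardscape", ["driveway", "flatwork", "patio", "deck"]),
]

def classify(code, description, valuation=None, work_class=None):
    """Return (trade, confidence) using the three layers in priority order."""
    # Layer 1: an explicit, unambiguous permit type code.
    trade = CODE_MAP.get((code or "").strip().upper())
    if trade:
        return trade, "High"
    # Layer 2: whole-word keyword match on the lowercased description.
    text = (description or "").lower()
    for trade, words in KEYWORDS:
        if any(re.search(r"\b" + re.escape(w) + r"\b", text) for w in words):
            return trade, "Medium"
    # Layer 3: one illustrative structural rule; the cutoff is hypothetical.
    if work_class == "Accessory Structure" and valuation is not None and valuation < 1000:
        return "Fencing/Outbuilding", "Low"
    return "Unclassified", None
```

With these tables, a generic `BLDG` permit described as “Install 200A panel upgrade with EV charger circuit” falls through Layer 1 and is classified as Electrical at Medium confidence. Whole-word matching keeps a short keyword such as “ac” from firing inside words like “replace”.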
Accuracy
Measured against a manual review sample of 500 permits drawn across all live markets (June 2026):
- Overall trade-mapping accuracy: ~91%
- High-confidence subset accuracy: ~97%
- Medium-confidence subset accuracy: ~88%
- Low-confidence subset accuracy: ~62% (these records are flagged and filtered from default alert feeds)
Misclassification patterns are logged and used to improve the keyword lexicon on a rolling basis. Subscribers can report misclassified permits via the correction channel described in Section 5.
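Per-tier accuracy figures of the kind reported above can be computed from a reviewed sample as sketched here. The tuple layout and function name are assumptions, and the four-row sample is toy data for illustration only, not the 500-permit review set.

```python
from collections import defaultdict

def accuracy_by_confidence(sample):
    """sample: iterable of (confidence, predicted_trade, reviewed_trade) tuples."""
    hits, totals = defaultdict(int), defaultdict(int)
    for confidence, predicted, reviewed in sample:
        # Count each record once in its own tier and once in the overall bucket.
        for bucket in (confidence, "Overall"):
            totals[bucket] += 1
            hits[bucket] += predicted == reviewed
    return {bucket: hits[bucket] / totals[bucket] for bucket in totals}

# Toy sample, purely illustrative.
toy_sample = [
    ("High", "Electrical", "Electrical"),
    ("High", "Roofing", "Roofing"),
    ("Medium", "HVAC/Mechanical", "Plumbing"),
    ("Low", "Fencing", "Fencing"),
]
toy_result = accuracy_by_confidence(toy_sample)
```

The overall figure is a weighted average of the tier figures, so it depends on how many records fall into each tier as well as on each tier's accuracy.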
4. Adjacent-Buyer Mapping
A building permit is a timing signal. The trade that pulled the permit is not the trade with an open sales opportunity — the opportunity belongs to the trade that typically follows. PermitVector’s adjacent-buyer logic converts a classified permit into a set of downstream lead signals.
Decision Table
| Trigger Permit | Adjacent Buyer Trades |
|---|---|
| Re-roof / Roof replacement | Solar installation, gutter replacement, attic insulation, exterior painting |
| New residential construction | HVAC, electrical, plumbing, roofing, fencing, pool/spa, security systems, landscaping |
| New commercial construction | HVAC (commercial), electrical (commercial), plumbing, fire suppression, security/access control, signage |
| Pool / spa installation | Landscaping, fencing, decking, outdoor kitchen, patio cover, exterior lighting |
| Electrical panel upgrade / EV charger | Generator installation, whole-home EV infrastructure, solar + storage |
| HVAC replacement | Insulation audit/upgrade, air sealing, duct sealing, smart-thermostat install |
| Solar installation | Battery storage, electrical panel upgrade (if not already completed), EV charger |
| Fence installation | Automatic gate, landscaping, outdoor lighting |
| Room addition / ADU | HVAC extension, electrical subpanel, plumbing rough-in, insulation, flooring |
| Kitchen/bath remodel | Plumbing fixture upgrade, electrical (GFCI/lighting), tile/flooring, cabinetry |
| Demolition | New construction (lead for all trades above) |
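In code, the decision table amounts to a lookup that expands one classified permit into one lead signal per adjacent trade. The sketch below copies three rows from the table; the function name and the record fields (`trigger`, `permit_number`, `buyer_trade`) are hypothetical.

```python
# Three rows copied from the decision table; the remaining rows follow the same shape.
ADJACENT_BUYERS = {
    "Re-roof / Roof replacement": [
        "Solar installation", "gutter replacement", "attic insulation", "exterior painting",
    ],
    "Solar installation": [
        "Battery storage", "electrical panel upgrade (if not already completed)", "EV charger",
    ],
    "Fence installation": ["Automatic gate", "landscaping", "outdoor lighting"],
}

def adjacent_leads(permit):
    """Expand one classified permit into one lead signal per adjacent trade."""
    trades = ADJACENT_BUYERS.get(permit["trigger"], [])
    return [{"permit_number": permit["permit_number"], "buyer_trade": t} for t in trades]

leads = adjacent_leads({"permit_number": "X1", "trigger": "Fence installation"})
```

A trigger that is not in the table yields no leads rather than a guess.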
Timing Heuristics
The adjacent-buyer signal is strongest in a defined window after the trigger permit. PermitVector’s default alert timing:
- Same day or next day — highest-intent window; permit is fresh, owner is mid-project.
- 7–14 days — strong; project is underway, owner is gathering bids.
- 15–30 days — moderate; useful for slower-cycle trades (pool → landscaping may have a longer gap).
- 30+ days — diminishing. PermitVector does not surface stale signals by default.
Subscribers configure their preferred window in their alert settings.
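The default windows can be expressed as a function of permit age in days. This is a sketch with hypothetical names. The list above does not say how days 2 through 6 are treated, so folding them into the "strong" bucket is an assumption, as is treating day 30 as "moderate".

```python
from datetime import date

def signal_window(issue_date, today):
    """Bucket a permit's age in days into the default alert windows."""
    age = (today - issue_date).days
    if age < 0:
        raise ValueError("issue date is in the future")
    if age <= 1:
        return "highest-intent"   # same day or next day
    if age <= 14:
        return "strong"           # days 2-6 are folded in here (assumption)
    if age <= 30:
        return "moderate"
    return "diminishing"          # not surfaced by default
```

For a permit issued on 2026-06-01, a check on 2026-06-10 (age 9 days) lands in the "strong" window.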
What Adjacent-Buyer Mapping Is Not
This logic is probabilistic, not deterministic. Not every re-roof leads to a solar inquiry. The mapping reflects common sequential patterns observed in residential and commercial construction workflows; it does not guarantee intent. Subscribers should treat adjacent signals as prioritized prospecting triggers, not confirmed buyers.
5. Data Accuracy & Completeness Disclosure
Known Error Classes
Duplicate permits. Some agencies issue amended permits as new records rather than updating the original. PermitVector’s deduplication logic matches first on permit number and issuing jurisdiction, then on normalized address and issue date. Where duplicates are detected, the most recent version is surfaced and the prior version is archived. Imperfect dedup (e.g., a permit reissued under a new number on a different date) is an acknowledged residual error.
Source miscodes. Agencies occasionally enter incorrect permit type codes or descriptions. A plumbing permit filed under a generic BLDG code, for example, will be classified at Medium or Low confidence rather than the correct High. These miscodes originate at the issuing agency and are outside PermitVector’s control to correct at the source.
Address normalization errors. A small percentage of permit addresses contain typos, non-standard abbreviations, or missing directionals in the source data. Normalization corrects most of these; a residual set may fail to geocode or deduplicate correctly.
Valuation anomalies. Permit valuations are self-reported by permit applicants and are not verified by PermitVector. Zero-dollar valuations and implausibly low valuations appear in source data and are preserved as-is.
Deduplication and Quality Checks
- Primary dedup key: permit number + issuing jurisdiction.
- Secondary dedup: address normalization hash + issue date, used to catch re-issued permits under new numbers.
- Automated anomaly flags: valuation outliers (more than 3 standard deviations from the market median for that permit type), blank address fields, and unresolvable permit type codes.
- Manual spot checks: a sample of newly ingested records is reviewed weekly against source portals to detect normalization regressions.
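The two dedup keys and the valuation-outlier flag can be sketched as below. This is an illustration under assumptions: the field names are hypothetical, the address normalization is reduced to uppercasing and trimming, records are assumed to arrive oldest first, and the population standard deviation is used because the text does not specify which estimator applies.

```python
import hashlib
import statistics

def primary_key(record):
    """Permit number + issuing jurisdiction."""
    return (record["jurisdiction"], record["permit_number"])

def secondary_key(record):
    """Hash of normalized address + issue date, to catch re-issued permits."""
    basis = f'{record["address"].upper().strip()}|{record["issue_date"]}'
    return hashlib.sha256(basis.encode("utf-8")).hexdigest()

def dedupe(records):
    """Keep the newest version of each permit; return (kept, archived)."""
    latest, sk_to_pk, archived = {}, {}, []
    for rec in records:  # assumed ordered oldest to newest
        pk, sk = primary_key(rec), secondary_key(rec)
        # Match on the primary key first, then fall back to the secondary key.
        old_pk = pk if pk in latest else sk_to_pk.get(sk)
        if old_pk is not None and old_pk in latest:
            archived.append(latest.pop(old_pk))
        latest[pk] = rec
        sk_to_pk[sk] = pk
    return list(latest.values()), archived

def valuation_outliers(values):
    """Values more than 3 standard deviations from the median."""
    median = statistics.median(values)
    spread = statistics.pstdev(values)
    return [v for v in values if spread and abs(v - median) > 3 * spread]
```

In this sketch, an amended record with the same permit number replaces the earlier one, and a record with a new number but the same normalized address and issue date replaces it as well. Both earlier versions end up in the archived list.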
Correction Channel
Subscribers who identify a misclassified, duplicate, or incorrect permit can submit a correction at [email protected] with the permit number and the issue observed. Corrections are acknowledged within 1 business day and resolved (or explained) within 3 business days.
What PermitVector Does Not Guarantee
- Completeness. Not all permits issued by a jurisdiction appear on public feeds. Emergency exemptions, permits filed under sealed proceedings, and permits processed through systems with delayed publication may not appear in PermitVector at all or may appear with significant lag.
- Source-agency accuracy. PermitVector reproduces what agencies publish. We do not independently verify that a permit’s listed address, valuation, or trade type is correct as filed.
- Real-time data. The feed is daily. PermitVector is not a real-time permit-monitoring system.
SLA
- Target: every permit published by a source agency is in PermitVector within 24 hours of source publication.
- Correction acknowledgment: 1 business day.
- Correction resolution: 3 business days (or explanation if the error originates at the source agency).
6. Coverage & Honest Gaps
Live Markets (10)
Austin, San Antonio, Fort Worth, Arlington, Sugar Land, Pearland (residential + commercial), San Marcos, Midland, El Paso (commercial), unincorporated Harris County.
Actively Pursuing (4)
Houston proper, Dallas, Fort Bend County, Corpus Christi. These markets are blocked by login-gated permit portals (Accela, Tyler Technologies EnerGov, and similar systems) that do not expose unauthenticated public data feeds. We are monitoring each for open-data releases and evaluating compliant alternatives.
Why Dallas and Houston Proper Are Absent
Both cities route permit records through proprietary platforms that require account creation to access data. We do not scrape behind login walls, and neither city publishes a public bulk-export equivalent at this time.
We do not cover Dallas or Houston proper today. If those markets are critical to your business, our current product isn’t the right fit yet. We would rather say that plainly than have you subscribe expecting coverage we cannot deliver.
Rio Grande Valley and Other Texas Markets
The RGV (McAllen, Laredo, Brownsville, Harlingen) and additional Texas metros (Lubbock, Amarillo, Waco, Beaumont, Tyler, Abilene) are not yet in PermitVector. Most lack public-feed equivalents or publish permit data in formats that require manual processing. We prioritize markets where a reliable programmatic feed exists.
What “Live” Means
A market is “live” when: (1) a stable, unauthenticated public-data feed is confirmed, (2) the ingestion pipeline has been running continuously for at least 14 days with no unresolved data gaps, and (3) trade classification accuracy has been validated against a manual sample.
7. Update Log
| Date | Change |
|---|---|
| 2026-06 | Statewide Texas expansion complete: 10 markets live, 43,810+ permits tracked. Adjacent-buyer mapping logic reviewed and decision table documented. |
| 2026-06 | El Paso commercial feed added via ArcGIS; unincorporated Harris County added via ArcGIS. |
| 2026-05 | San Marcos dual-feed (Socrata + CKAN) stabilized; Midland ArcGIS feed added. |
| 2026-05 | Low-confidence classified permits (Layer 3) excluded from default alert feeds; available via advanced filter. |
8. Legal & Sourcing
Public Records Basis
Texas building permits are public records under the Texas Public Information Act, Chapter 552, Texas Government Code. Government agencies are required to make these records available. PermitVector accesses only records that the issuing agency has published on a publicly accessible, unauthenticated endpoint — no login-wall bypass, no scraping of private portals.
Copyright and Fact Law
Factual data in government records is not protectable by copyright. The U.S. Supreme Court held in Feist Publications, Inc. v. Rural Telephone Service Co., 499 U.S. 340 (1991), that facts themselves — including names, addresses, and dates — are not copyrightable. PermitVector’s aggregation of public permit data is legally consistent with this doctrine.
Source Attribution
PermitVector attributes each permit record to its source agency. Permit numbers in the PermitVector interface link to the originating agency’s portal wherever the agency provides a stable public URL.
Cease-and-Desist Policy
If a government agency or data custodian believes PermitVector is accessing data in a manner inconsistent with applicable law or their terms of service, we will respond promptly and in good faith. Contact: [email protected]. We have not received any cease-and-desist or removal demand to date.
Subscriber Compliance Responsibilities
PermitVector provides data. How subscribers use that data to contact contractors, property owners, or permit holders is the subscriber’s sole legal responsibility. Subscribers must comply with:
- TCPA (Telephone Consumer Protection Act) — restrictions on autodialed calls and texts, prerecorded messages, and fax marketing.
- CAN-SPAM Act — requirements for commercial email, including opt-out mechanisms and sender identification.
- National Do-Not-Call Registry and applicable state do-not-call laws.
- Any other federal, state, or local laws governing commercial outreach.
PermitVector does not review, approve, or monitor subscriber outreach campaigns. PermitVector recommends email-first outreach. Cold autodialed or robotexted campaigns to numbers sourced from permit records carry significant TCPA exposure; subscribers should consult legal counsel before deploying such campaigns.
9. About the Data Team
PermitVector was founded and is operated by Ken Besada, a Texas-based entrepreneur with operational experience across construction-adjacent industries including commercial and residential hospitality real estate. Ken built PermitVector to solve a specific sourcing problem: contractors in high-growth Texas markets spend significant time identifying which jobs are starting, who pulled the permit, and which adjacent trades will be needed next. PermitVector automates that intelligence layer using public-record data and rule-based classification, without relying on proprietary databases, third-party data brokers, or unverifiable enrichment sources. All methodology decisions — source selection, classification logic, adjacent-buyer mapping, and accuracy disclosure — are Ken’s, and all correction requests are handled by the PermitVector team directly. Questions about this methodology page can be directed to [email protected].