What is healthcare price transparency data?
Healthcare price transparency data refers to the negotiated rates between health insurers and healthcare providers that are now publicly available under federal regulations. This includes the actual prices insurers pay for medical services, procedures, and treatments. DeductibleData extracts, processes, and delivers this data in usable formats for analysis, benchmarking, and decision-making.
What is the Transparency-in-Coverage Rule?
The Transparency-in-Coverage (TiC) Rule is a federal regulation that requires health insurers and group health plans to publicly disclose their negotiated rates with healthcare providers. Since July 2022, they have been required to publish Machine Readable Files (MRFs) containing in-network rates and out-of-network allowed amounts. The rule also calls for a prescription drug pricing file, but enforcement of that requirement was deferred when the other two files took effect. These files enable consumers, researchers, and businesses to compare healthcare costs across providers and insurers.
What are Machine Readable Files (MRFs)?
Machine Readable Files are standardized data files that health insurers publish to comply with the Transparency-in-Coverage Rule. MRFs contain negotiated rates between insurers and providers in JSON format. However, these files are extremely large (individual files can exceed 100GB, with complete payer datasets reaching terabytes), making them impractical to download and process without specialized infrastructure. DeductibleData processes these files and extracts the specific data you need.
How does DeductibleData's custom data pull work?
Our custom data pull service lets you specify exactly the data you need: select specific payers (insurers), providers (by NPI or EIN), and billing codes (CPT, HCPCS, DRG). We extract matching records from the massive MRF datasets and deliver clean, structured data files. You see real-time pricing as you configure your request, and delivery typically takes 1-3 business days depending on scope.
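As a rough illustration of how the three filters combine, the sketch below applies them to a few in-memory rows. The field names (`payer`, `npi`, `billing_code`, `negotiated_rate`) and the sample values are hypothetical placeholders, not DeductibleData's actual schema or real rates.

```python
from typing import Iterable, Optional


def filter_rates(rows: Iterable[dict],
                 payers: Optional[set] = None,
                 npis: Optional[set] = None,
                 codes: Optional[set] = None) -> list:
    """Keep rows matching every filter that was supplied (None = no filter)."""
    selected = []
    for row in rows:
        if payers is not None and row["payer"] not in payers:
            continue
        if npis is not None and row["npi"] not in npis:
            continue
        if codes is not None and row["billing_code"] not in codes:
            continue
        selected.append(row)
    return selected


# Made-up sample rows for illustration only.
sample_rows = [
    {"payer": "Payer A", "npi": "1234567893", "billing_code": "99213", "negotiated_rate": 100.0},
    {"payer": "Payer A", "npi": "1234567893", "billing_code": "70553", "negotiated_rate": 900.0},
    {"payer": "Payer B", "npi": "1234567893", "billing_code": "99213", "negotiated_rate": 110.0},
]

# One payer, one billing code, any provider.
matches = filter_rates(sample_rows, payers={"Payer A"}, codes={"99213"})
```

Leaving a filter unset keeps every value for that dimension, which mirrors an "all providers" or "all codes" request.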
What payers and insurers do you have data for?
DeductibleData processes data from major national and regional health insurers including Blue Cross Blue Shield plans, UnitedHealthcare, Aetna, and others. Data availability and freshness vary by payer because publication schedules differ. Our payer coverage is continuously expanding as we process additional MRF sources. Contact us for current coverage details on specific payers.
What is an NPI number and how do I use it to filter data?
An NPI (National Provider Identifier) is a unique 10-digit identification number assigned to healthcare providers in the United States. Type 1 NPIs identify individual practitioners (doctors, nurses, therapists), while Type 2 NPIs identify organizations and facilities (hospitals, clinics, group practices). You can use either NPI type to filter our data extractions to specific providers, allowing you to get negotiated rates for exactly the entities you're researching.
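Before submitting a list of NPIs, it can help to catch typos locally. The last digit of an NPI is a check digit computed with the Luhn algorithm over the number prefixed with 80840. The sketch below is a minimal format check under that assumption; the function name `is_valid_npi` is ours, and passing the check does not prove the NPI has actually been assigned to a provider.

```python
def is_valid_npi(npi: str) -> bool:
    """Return True if `npi` is 10 ASCII digits and passes the Luhn check."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    # The check digit is computed over the NPI prefixed with "80840".
    digits = [int(ch) for ch in "80840" + npi]
    total = 0
    # Walk right to left, doubling every second digit (Luhn algorithm).
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(is_valid_npi("1234567893"))  # True: a commonly used sample NPI
print(is_valid_npi("1234567890"))  # False: wrong check digit
```

The check cannot tell a Type 1 NPI from a Type 2 NPI, since both share the same 10-digit format.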
What billing codes can I filter by?
You can filter by CPT (Current Procedural Terminology) codes for medical procedures, HCPCS (Healthcare Common Procedure Coding System) codes for services and equipment, and DRG (Diagnosis Related Group) codes for hospital inpatient services. This allows you to analyze pricing for specific procedures like office visits, surgeries, imaging, or any other coded medical service.
How is DeductibleData pricing calculated?
Pricing is based on three factors: data volume (in terabytes), filtering complexity (narrower filters reduce cost), and a platform fee per payer. Extracting all providers and all codes costs more than targeted queries for specific NPIs or billing codes. The price updates in real time as you configure your request, so you can see exactly how different filter selections affect cost. There are no hidden fees, and we offer the same pricing to everyone regardless of company size.
What format is the data delivered in?
For most requests, data is delivered in CSV format, which is compatible with Excel, Google Sheets, Python, R, SQL databases, and virtually any data analysis tool. Each file includes standardized columns for provider information, billing codes, negotiated rates, and payer details. We also provide documentation explaining each field. Note that Excel has a hard limit of 1,048,576 rows per worksheet, while large extractions like "all providers, all codes" for a major payer can run to billions of rows, far beyond what Excel can handle. For these large datasets, we deliver data in Parquet format via cloud storage, which is optimized for big data analysis and works with tools like Python, R, and cloud data warehouses. Contact us if you need guidance on working with large-scale deliverables.
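If you are unsure whether a CSV delivery will open in Excel, you can count its rows without loading the whole file into memory. This is a minimal sketch using Python's standard `csv` module; the function name `fits_in_excel` and the sample column names are illustrative assumptions.

```python
import csv
import io

EXCEL_MAX_ROWS = 1_048_576  # Excel's per-worksheet row limit, header included


def fits_in_excel(csv_file) -> bool:
    """Stream a CSV file object and report whether it fits in one worksheet."""
    row_count = 0
    for _ in csv.reader(csv_file):
        row_count += 1
        if row_count > EXCEL_MAX_ROWS:
            return False  # stop early; no need to read the rest
    return True


# Tiny in-memory example with made-up columns and values.
small = io.StringIO("npi,billing_code,negotiated_rate\n1234567893,99213,100.0\n")
print(fits_in_excel(small))  # True
```

For a file on disk, pass an open handle instead, for example `open(path, newline="")`, where `path` points to your delivered CSV.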
How current is the healthcare pricing data?
Health insurers are required to publish MRF data monthly under federal regulation. However, publication schedules vary by payer. We continuously monitor and index new data as it becomes available, typically within days of publication. When you request a data pull, we extract from the most recent available files. The data reflects currently negotiated rates, though actual prices may vary based on specific plan terms and patient circumstances.
Who uses healthcare price transparency data?
Our customers include healthcare consultants analyzing market rates, hospital systems benchmarking their negotiated rates, health tech companies building cost estimation tools, researchers studying healthcare economics, employers evaluating health plan options, and benefits consultants advising clients on healthcare spending.
Can I get a subscription for recurring data updates?
Yes, we offer subscription plans for customers who need regular data refreshes. Subscriptions are available monthly, quarterly, or semiannually (twice a year), and include access to updated data pulls as new MRF files are published. This is ideal for ongoing market monitoring, competitive analysis, or maintaining current pricing databases.
How long does data delivery take?
Delivery time depends on data volume and complexity. Small, targeted requests (under 10GB) typically complete within hours. Medium requests may take 1-3 business days. Very large extractions covering "all providers, all codes" for a major payer can take 3-7 business days due to the scale of processing (billions of rows). You'll receive email notifications with progress updates, and we monitor jobs to ensure successful completion.
Is healthcare price transparency data HIPAA protected?
No, MRF data does not contain Protected Health Information (PHI). The Transparency-in-Coverage Rule specifically requires disclosure of negotiated rates, which are contractual prices between insurers and providers. The data contains provider business identifiers (NPI, TIN), which are public information, not protected health information. There is no patient data whatsoever, and the files are explicitly designed for public disclosure.
Why can't I just download MRF files myself?
While MRF files are publicly available, the raw files are virtually unusable without significant engineering: individual files can exceed 100GB, there are thousands of files per payer, and the nested JSON format requires specialized parsing, joining across thousands of files, and normalization into a single queryable table. Processing a single payer's data requires substantial cloud infrastructure and engineering expertise. DeductibleData handles this complexity and delivers the end result: clean, structured data containing only the records you need, ready for analysis.
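To give a sense of what normalizing nested JSON involves, here is a minimal sketch that flattens a tiny, made-up fragment into one row per billing code, provider, and price. The field names follow our reading of the published in-network file schema and should be treated as assumptions; real files are far too large for `json.loads`, often reference provider groups indirectly, and need a streaming parser.

```python
import json

# A tiny, made-up fragment shaped like an in-network rate file.
raw = """
{
  "reporting_entity_name": "Example Payer",
  "in_network": [
    {
      "billing_code_type": "CPT",
      "billing_code": "99213",
      "negotiated_rates": [
        {
          "provider_groups": [
            {"npi": [1234567893], "tin": {"type": "ein", "value": "00-0000000"}}
          ],
          "negotiated_prices": [
            {"negotiated_type": "negotiated", "negotiated_rate": 100.0}
          ]
        }
      ]
    }
  ]
}
"""


def flatten_in_network(document: dict) -> list:
    """Expand nested items into one row per (code, provider NPI, price)."""
    rows = []
    for item in document.get("in_network", []):
        for rate in item.get("negotiated_rates", []):
            for group in rate.get("provider_groups", []):
                for npi in group.get("npi", []):
                    for price in rate.get("negotiated_prices", []):
                        rows.append({
                            "payer": document.get("reporting_entity_name"),
                            "tin": group.get("tin", {}).get("value"),
                            "npi": str(npi),
                            "billing_code_type": item.get("billing_code_type"),
                            "billing_code": item.get("billing_code"),
                            "negotiated_rate": price.get("negotiated_rate"),
                        })
    return rows


rows = flatten_in_network(json.loads(raw))
```

Each level of nesting multiplies the row count, which is one reason a single payer's files can expand to billions of rows once flattened.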
What is the difference between in-network and out-of-network rates?
In-network rates are the negotiated prices that insurers have agreed to pay contracted providers. These are typically lower than list prices. Out-of-network allowed amounts are what insurers may pay for services from non-contracted providers, often based on Medicare rates or usual and customary charges. DeductibleData primarily provides in-network negotiated rate data from MRF files.
What happens if my data processing job fails?
If a processing job encounters an issue, our team is notified immediately and will work to resolve it, typically within hours. We monitor all jobs and will proactively reach out with updates. In most cases, we can resume processing from where it left off rather than starting over. You won't be charged for incomplete deliveries, and we'll keep you informed via email until your data is ready for download.
How large will my data deliverable be?
Output size depends on your filter selections. A single payer with "all providers, all codes" can produce terabytes of data (e.g., BCBS-TX generated 4.2TB with 114 billion rows). Targeted pulls filtered to specific NPIs or billing codes are much smaller, often under 1GB. If you're unsure about storage capacity, we recommend starting with specific filters or discussing your needs with us before ordering. We also offer hosted query access for large datasets on a case-by-case basis.
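For a back-of-the-envelope storage estimate, you can derive an average row size from the BCBS-TX figures and scale it to your expected row count. The sketch below assumes decimal terabytes and that your pull has a similar per-row footprint, which may not hold across payers or file formats; the 5 million row figure is a hypothetical example.

```python
# Figures quoted above for the full BCBS-TX extraction.
TOTAL_BYTES = 4.2e12   # 4.2 TB, assuming decimal terabytes
TOTAL_ROWS = 114e9     # 114 billion rows

bytes_per_row = TOTAL_BYTES / TOTAL_ROWS  # roughly 37 bytes per row


def estimate_size_gb(row_count: int) -> float:
    """Extrapolate output size in GB from the average bytes per row."""
    return row_count * bytes_per_row / 1e9


# Hypothetical targeted pull of 5 million rows.
estimated_gb = estimate_size_gb(5_000_000)
print(f"{estimated_gb:.2f} GB")
```

Treat the result as an order-of-magnitude guide only; we can provide a firmer estimate once your filters are defined.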
Can DeductibleData host my data for team access?
Yes. For large datasets, we can provide hosted query access where your team can search and extract specific slices of data without downloading the entire file. This is ideal for organizations that need collaborative access or don't have infrastructure to store terabytes locally. Contact us to set up hosted access for your team.
Can I filter by provider type or geographic area?
Yes. Beyond NPI and billing code filters, we can filter by provider type (hospitals, physician groups, individual practitioners) and geographic area. If you need data for specific facility types or regions, contact us with your requirements and we'll provide a custom quote. You can also provide specific facility names or NPI lists for targeted extractions.