Files API Guide
Access historical pricing page scrape files by version, quarter, month, or ISO week. This API provides URLs to screenshots, markdown, HTML, and OCR data for pricing intelligence.
Overview
The Files API allows you to retrieve scrape file URLs from the PricingSaaS intelligence database. Each scrape includes multiple file formats:
- Screenshot (Cloudinary) - Visual screenshot of the pricing page
- Markdown - Text content extracted in markdown format
- HTML - Full HTML source of the page
- OCR - OCR data in JSON format
Authentication
Include your API key in the request headers using either method:
Endpoints
1. Get Files by Version
Retrieve scrape files for a specific version (date).
Endpoint: GET /files/:slug/version/:version
Version Format: YYYYMMDD (e.g., 20251010 for October 10, 2025)
Example Request:
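As an illustration of the version format, here is a minimal Python sketch that validates a YYYYMMDD string and builds the request path. The helper name `version_path` and the slug `acme` are hypothetical; the base URL and authentication headers are left out because this guide does not state them.

```python
import re
from datetime import datetime


def version_path(slug: str, version: str) -> str:
    """Build the version endpoint path, rejecting malformed dates."""
    # Require exactly eight ASCII digits before parsing, because
    # strptime on its own also accepts some non-padded inputs.
    if re.fullmatch(r"[0-9]{8}", version) is None:
        raise ValueError(f"expected YYYYMMDD, got {version!r}")
    # Raises ValueError for impossible dates such as 20250230.
    datetime.strptime(version, "%Y%m%d")
    return f"/files/{slug}/version/{version}"


print(version_path("acme", "20251010"))
```

Validating on the client avoids spending a request on a version string the API would reject.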
2. Get Files by Quarter
Retrieve all scrape files for a specific quarter.
Endpoint: GET /files/:slug/quarter/:quarter
Quarter Format: YYYYQ# where # is 1-4 (e.g., 2025Q4 for Q4 2025: October-December)
Quarter Mappings:
- Q1 = January - March
- Q2 = April - June
- Q3 = July - September
- Q4 = October - December
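The quarter mapping can be sketched in Python as a conversion from a YYYYQ# string to its first and last calendar day. The function name `quarter_bounds` is hypothetical, and this is a client-side illustration of the mapping, not the service's own implementation.

```python
import re
from datetime import date, timedelta


def quarter_bounds(quarter: str) -> tuple[date, date]:
    """Return (first_day, last_day) for a YYYYQ# string."""
    m = re.fullmatch(r"([0-9]{4})Q([1-4])", quarter)
    if m is None:
        raise ValueError(f"expected YYYYQ# with # in 1-4, got {quarter!r}")
    year, q = int(m.group(1)), int(m.group(2))
    start = date(year, 3 * q - 2, 1)  # Q1 -> Jan, Q2 -> Apr, Q3 -> Jul, Q4 -> Oct
    # The quarter ends one day before the next quarter starts.
    next_start = date(year + 1, 1, 1) if q == 4 else date(year, 3 * q + 1, 1)
    return start, next_start - timedelta(days=1)


print(quarter_bounds("2025Q4"))
```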
3. Get Files by Month
Retrieve all scrape files for a specific month.
Endpoint: GET /files/:slug/month/:month
Month Formats:
- YYYYM## (e.g., 2025M12 for December 2025)
- YYYY## (e.g., 202512 for December 2025)
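Since two month spellings are accepted, a client may want to normalize them before building a request. The following Python sketch parses either form into a (year, month) pair; the name `parse_month` is hypothetical and the range check on the month number is an assumption about what the API considers valid.

```python
import re


def parse_month(month: str) -> tuple[int, int]:
    """Parse 'YYYYM##' or 'YYYY##' into (year, month)."""
    # The 'M' separator is optional, so one pattern covers both spellings.
    m = re.fullmatch(r"([0-9]{4})M?([0-9]{2})", month)
    if m is None or not 1 <= int(m.group(2)) <= 12:
        raise ValueError(f"expected YYYYM## or YYYY##, got {month!r}")
    return int(m.group(1)), int(m.group(2))


print(parse_month("2025M12"), parse_month("202512"))
```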
4. Get Files by ISO Week
Retrieve all scrape files for a specific ISO 8601 week.
Endpoint: GET /files/:slug/week/:week
Week Format: YYYYW## where ## is the ISO week number 1-53 (e.g., 2025W42 for week 42 of 2025)
ISO Week Notes:
- Weeks start on Monday and end on Sunday
- Week 1 is the week containing the first Thursday of the year
- Some years have 53 weeks
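To make the ISO week rules concrete, this Python sketch converts a YYYYW## string into its Monday and Sunday using the standard library's `date.fromisocalendar` (Python 3.8+). The function name `iso_week_bounds` is hypothetical; it shows the calendar arithmetic only and makes no claim about how the service resolves weeks internally.

```python
import re
from datetime import date, timedelta


def iso_week_bounds(week: str) -> tuple[date, date]:
    """Return (monday, sunday) for a YYYYW## ISO week string."""
    m = re.fullmatch(r"([0-9]{4})W([0-9]{2})", week)
    if m is None:
        raise ValueError(f"expected YYYYW##, got {week!r}")
    # fromisocalendar raises ValueError for weeks that do not exist,
    # for example week 53 in a year that only has 52 ISO weeks.
    monday = date.fromisocalendar(int(m.group(1)), int(m.group(2)), 1)
    return monday, monday + timedelta(days=6)


print(iso_week_bounds("2025W42"))
```

Note that week 1 can begin in the previous calendar year: 2025W01 starts on Monday, December 30, 2024.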
5. Get Files by Flexible Period (Unified Endpoint)
Retrieve scrape files using any supported period format. This unified endpoint automatically detects the format and returns the appropriate results.
Endpoint: GET /files/:slug/period/:period
Supported Formats:
- Version: YYYYMMDD (e.g., 20251010 for October 10, 2025)
- Quarter: YYYYQ# (e.g., 2025Q4 for Q4 2025)
- Week: YYYYW## (e.g., 2025W42 for week 42 of 2025)
- Month: YYYYM## or YYYY## (e.g., 2025M12 or 202512 for December 2025)
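The four formats are distinguishable by shape alone, which a client can exploit to pre-validate user input. Below is a minimal Python sketch of such detection; `detect_format` and its patterns are assumptions inferred from the formats listed above, and the server's actual detection rules may differ.

```python
import re

# Order matters only for readability: eight digits is a version,
# six digits is a month, so the patterns cannot overlap.
_PATTERNS = [
    ("version", r"[0-9]{8}"),
    ("quarter", r"[0-9]{4}Q[1-4]"),
    ("week", r"[0-9]{4}W[0-9]{2}"),
    ("month", r"[0-9]{4}M?[0-9]{2}"),
]


def detect_format(period: str) -> str:
    """Guess which period format a string uses, based on its shape."""
    for name, pattern in _PATTERNS:
        if re.fullmatch(pattern, period):
            return name
    raise ValueError(f"unrecognized period format: {period!r}")


for p in ["20251010", "2025Q4", "2025W42", "2025M12", "202512"]:
    print(p, "->", detect_format(p))
```

This checks shape only; it does not verify that the date, week, or month actually exists.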
| Field | Type | Description |
|---|---|---|
| period.start | string | Start date of the period in YYYYMMDD format |
| period.end | string | End date of the period in YYYYMMDD format |
| format | string | Detected format: version, quarter, week, or month |
| version | string \| null | Specific version returned. For periods, this is the earliest version in the range |
- Simplicity: Single endpoint for all period queries
- Flexibility: No need to remember format-specific endpoints
- Clear Errors: Automatic format detection with helpful error messages
- Consistent Response: All formats return the same response structure
- ✅ Building new integrations or applications
- ✅ Accepting period inputs from users who may use different formats
- ✅ Dynamic queries where the period format varies
The format-specific endpoints (/version/{version}, /quarter/{quarter}, /week/{week}, /month/{month}) remain available and unchanged.
Response Schema
ScrapeFile Object
| Field | Type | Description |
|---|---|---|
| slug | string | Company slug identifier |
| version | string | Scrape version in YYYYMMDD format |
| page_url | string | URL of the original pricing page |
| screenshot_url | string \| null | Cloudinary URL to screenshot image |
| markdown_url | string \| null | S3 URL to markdown version |
| html_url | string \| null | S3 URL to HTML version |
| ocr_url | string \| null | Supabase storage URL to OCR JSON |
| scraper | string \| null | Scraper used: local-claude, scrapingBee, aws, firecrawl |
| valid | boolean \| null | Whether the scrape is valid |
| score | integer \| null | Quality score (0-10) |
| banner_free | boolean \| null | Whether the page is free of banners |
| created_at | string \| null | ISO 8601 timestamp |
| updated_at | string \| null | ISO 8601 timestamp |
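Because `valid` and `score` are nullable, client code needs to treat missing values explicitly. The Python sketch below filters a list of ScrapeFile dictionaries down to usable ones; the function name `high_quality` and the threshold of 7 are illustrative assumptions, not values recommended by the API.

```python
def high_quality(files: list[dict], min_score: int = 7) -> list[dict]:
    """Keep scrapes marked valid whose score meets the threshold."""
    return [
        f
        for f in files
        # A null 'valid' is treated as not valid; a null score as 0.
        if f.get("valid") is True and (f.get("score") or 0) >= min_score
    ]


sample = [
    {"version": "20251010", "valid": True, "score": 9},
    {"version": "20251011", "valid": True, "score": 3},
    {"version": "20251012", "valid": None, "score": 10},
]
print([f["version"] for f in high_quality(sample)])
```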
Error Handling
400 Bad Request - Invalid Format
401 Unauthorized - Missing/Invalid API Key
429 Too Many Requests - Insufficient Credits
500 Internal Server Error
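One way a client might react to these status codes is sketched below in Python. The function name `classify_status` and the returned labels are hypothetical, and retrying on server errors is a common convention assumed here, not documented behavior of this API.

```python
def classify_status(status: int) -> str:
    """Map an HTTP status code to a suggested client action."""
    if 200 <= status < 300:
        return "ok"
    if status == 400:
        return "fix_request"      # invalid period or version format
    if status == 401:
        return "check_api_key"    # missing or invalid API key
    if status == 429:
        return "out_of_credits"   # retrying before the reset will not help
    if status >= 500:
        return "retry_later"      # assumed transient server-side failure
    return "unexpected"


for code in (200, 400, 401, 429, 500):
    print(code, classify_status(code))
```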
Rate Limiting & Credits
- Cost: 1 credit per request (all endpoints)
- Default Limit: 1,000 credits per month
- Reset: Credits reset on the 1st of each month
- Headers: Check X-RateLimit-Remaining and X-RateLimit-Reset response headers
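A small Python sketch of reading the remaining-credits header from a response follows. It assumes the header value is a plain integer, which this guide does not confirm; the format of X-RateLimit-Reset is not specified here, so it is deliberately left unparsed. The helper name `credits_remaining` is hypothetical.

```python
from typing import Mapping, Optional


def credits_remaining(headers: Mapping[str, str]) -> Optional[int]:
    """Return the remaining credit count, or None if absent or unparseable."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get("x-ratelimit-remaining")
    if raw is None:
        return None
    try:
        return int(raw)  # assumption: the value is an integer string
    except ValueError:
        return None


print(credits_remaining({"X-RateLimit-Remaining": "412"}))
```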
Best Practices
- Cache responses - Files don’t change after creation, so cache aggressively
- Use version endpoint for exact dates when available
- Use period endpoints (quarter/month/week) for historical analysis
- Handle empty results - Not all companies have scrapes for all periods
- Check valid and score fields - Filter for high-quality scrapes
- Respect rate limits - Monitor credit usage
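The caching advice can be sketched as a thin in-memory wrapper around whatever function performs the request. In this Python sketch, `make_cached` and the `fetch` callable are hypothetical; `fetch(slug, period)` stands for your own code that calls the API and returns the parsed response.

```python
from typing import Callable


def make_cached(fetch: Callable[[str, str], dict]) -> Callable[[str, str], dict]:
    """Wrap a fetch function so each (slug, period) pair is requested once."""
    cache: dict[tuple[str, str], dict] = {}

    def cached(slug: str, period: str) -> dict:
        key = (slug, period)
        if key not in cache:
            cache[key] = fetch(slug, period)  # costs one credit, only on a miss
        return cache[key]

    return cached


def demo_fetch(slug: str, period: str) -> dict:
    # Stand-in for a real API call, used only to show the wrapper.
    print("fetching", slug, period)
    return {"slug": slug, "period": period}


get_files = make_cached(demo_fetch)
get_files("acme", "2025Q3")
get_files("acme", "2025Q3")  # served from the cache, no second fetch
```

One caveat worth considering: individual files do not change, but a period that is still in progress may gain new scrapes, so caching is safest for exact versions and for periods that have already ended.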
Code Examples
JavaScript/Node.js
Unified Period Endpoint
Python
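A minimal standard-library sketch of building a request to the unified period endpoint is shown here. The base URL `https://api.example.com`, the slug `acme`, and the `Authorization` header are placeholders, not values from this guide; substitute the real base URL and whichever API key header your account documentation specifies. The helper name `build_period_request` is hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request


def build_period_request(
    base_url: str, slug: str, period: str, headers: dict[str, str]
) -> Request:
    """Build (but do not send) a GET request for /files/:slug/period/:period."""
    # Percent-encode path segments so unusual slugs cannot break the URL.
    path = f"/files/{quote(slug, safe='')}/period/{quote(period, safe='')}"
    return Request(base_url.rstrip("/") + path, headers=headers, method="GET")


req = build_period_request(
    "https://api.example.com",                 # placeholder base URL
    "acme",                                    # placeholder slug
    "2025Q4",
    {"Authorization": "Bearer YOUR_API_KEY"},  # placeholder auth header
)
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen(req)` would send it; the response body can then be decoded with `json.load`.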
cURL
Use Cases
Track Pricing Changes Over Time
Monitor how a competitor’s pricing evolves quarter over quarter:
Download All Pricing Pages for Analysis
Retrieve all markdown files for a specific month for text analysis:
Related Documentation
Support
For questions or issues:
- GitHub Issues: pricingsaas/pricingsaas-valut
- Email: [email protected]