What We Are Building
By the end of this tutorial, you will have a Python script that:
- Fetches XBRL financial data for any list of companies from the SEC EDGAR API
- Extracts key metrics: revenue, net income, EPS, and total assets
- Calculates revenue growth year-over-year
- Filters and ranks companies by your criteria
- Outputs a clean table showing the top performers
No API keys. No paid data feeds. Just the free SEC EDGAR API and Python.
Prerequisites
pip install requests pandas
The SEC API identifies companies by CIK number rather than by ticker symbol. The script handles the ticker-to-CIK conversion automatically.
Step 1: Set Up the SEC API Client
First, create a reusable class for making SEC API requests with proper rate limiting and error handling:
import requests
import pandas as pd
import time


class SECClient:
    """Simple SEC EDGAR API client with rate limiting."""

    BASE_URL = 'https://data.sec.gov'
    # Placeholder contact address: replace it with your own email so the
    # SEC can identify who is making the requests.
    HEADERS = {'User-Agent': 'StockScreener your-email@example.com'}

    def __init__(self):
        self.session = requests.Session()
        self.session.headers.update(self.HEADERS)
        self.last_request = 0

    def _rate_limit(self):
        """Ensure we don't exceed 10 requests per second."""
        elapsed = time.time() - self.last_request
        if elapsed < 0.12:
            time.sleep(0.12 - elapsed)
        self.last_request = time.time()

    def get_company_facts(self, cik):
        """Fetch all XBRL financial facts for a company."""
        self._rate_limit()
        cik_padded = str(cik).zfill(10)
        url = f'{self.BASE_URL}/api/xbrl/companyfacts/CIK{cik_padded}.json'
        response = self.session.get(url)
        response.raise_for_status()
        return response.json()

    def get_ticker_map(self):
        """Download the SEC ticker-to-CIK mapping."""
        self._rate_limit()
        url = 'https://www.sec.gov/files/company_tickers.json'
        response = self.session.get(url)
        # Fail loudly on HTTP errors instead of trying to parse an error page
        response.raise_for_status()
        data = response.json()
        return {v['ticker'].upper(): str(v['cik_str']) for v in data.values()}
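To make the two lookups concrete, here is a minimal offline sketch of what the client does with a ticker. The sample payload is hand-written to mirror the shape of company_tickers.json, and the helper names build_ticker_map and companyfacts_url are illustrative, not part of the class above.

```python
# Hand-written sample shaped like company_tickers.json (keys are row indexes).
sample_tickers = {
    "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
    "1": {"cik_str": 789019, "ticker": "MSFT", "title": "MICROSOFT CORP"},
}


def build_ticker_map(data):
    """Map upper-case ticker -> CIK string, mirroring get_ticker_map."""
    return {row["ticker"].upper(): str(row["cik_str"]) for row in data.values()}


def companyfacts_url(cik, base_url="https://data.sec.gov"):
    """Build the companyfacts URL; the CIK is zero-padded to 10 digits."""
    return f"{base_url}/api/xbrl/companyfacts/CIK{str(cik).zfill(10)}.json"


ticker_map = build_ticker_map(sample_tickers)
url = companyfacts_url(ticker_map["AAPL"])
print(url)  # https://data.sec.gov/api/xbrl/companyfacts/CIK0000320193.json
```

The zero-padding matters: the endpoint expects a 10-digit CIK, so 320193 becomes CIK0000320193.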
Step 2: Extract Financial Metrics from XBRL Data
The companyfacts endpoint returns all XBRL-tagged data for a company. We need to extract specific metrics and filter for annual (10-K) data:
def extract_metric(facts, tag, form='10-K', unit='USD'):
    """Extract a specific financial metric from companyfacts data.

    Returns a list of dicts with 'end' (date), 'val' (value), and 'fy' (fiscal year).
    """
    gaap = facts.get('facts', {}).get('us-gaap', {})
    concept = gaap.get(tag, {})
    units = concept.get('units', {})
    values = units.get(unit, [])
    # Filter for annual filings only
    annual = [v for v in values if v.get('form') == form]
    # Deduplicate by fiscal year end date, newest first
    seen = set()
    unique = []
    for v in sorted(annual, key=lambda x: x['end'], reverse=True):
        if v['end'] not in seen:
            seen.add(v['end'])
            unique.append(v)
    return unique


def extract_eps(facts, form='10-K'):
    """Extract EPS separately since it uses USD/shares unit."""
    # Same filtering and deduplication as extract_metric, with a different unit
    return extract_metric(facts, 'EarningsPerShareDiluted',
                          form=form, unit='USD/shares')
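One caveat worth checking in your own data: a 10-K can also carry values for shorter periods, such as a fourth quarter, and these may share an end date with the full-year figure. The sketch below shows one possible way to keep only full-year values. It assumes duration entries carry ISO 'start' and 'end' dates; the sample data, the is_full_year helper, and the 350 to 380 day window are illustrative choices, not part of the tutorial code.

```python
from datetime import date

# Hand-written companyfacts fragment: a full-year and a fourth-quarter value
# that share the same period end and both come from a 10-K.
sample_facts = {
    "facts": {"us-gaap": {"Revenues": {"units": {"USD": [
        {"start": "2023-01-01", "end": "2023-12-31", "val": 400, "form": "10-K"},
        {"start": "2023-10-01", "end": "2023-12-31", "val": 110, "form": "10-K"},
        {"start": "2022-01-01", "end": "2022-12-31", "val": 350, "form": "10-K"},
    ]}}}}
}


def is_full_year(entry, min_days=350, max_days=380):
    """True for instant facts (no 'start') or periods of roughly one year."""
    if "start" not in entry:
        return True  # point-in-time items such as Assets have only an 'end'
    start = date.fromisoformat(entry["start"])
    end = date.fromisoformat(entry["end"])
    return min_days <= (end - start).days <= max_days


values = sample_facts["facts"]["us-gaap"]["Revenues"]["units"]["USD"]
full_year = [v for v in values if v["form"] == "10-K" and is_full_year(v)]
print([(v["end"], v["val"]) for v in full_year])
# [('2023-12-31', 400), ('2022-12-31', 350)]
```

Applying a check like this before the deduplication step would stop a quarterly value from standing in for an annual one.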
Step 3: Build the Screening Function
Now we combine everything into a function that screens a list of companies and computes key metrics:
def screen_companies(tickers):
    """Screen a list of companies by financial metrics.

    Returns a pandas DataFrame with key financial data.
    """
    client = SECClient()
    ticker_map = client.get_ticker_map()
    results = []
    for ticker in tickers:
        cik = ticker_map.get(ticker.upper())
        if not cik:
            print(f' Skipping {ticker}: CIK not found')
            continue
        try:
            facts = client.get_company_facts(cik)
            name = facts.get('entityName', ticker)
            # Extract latest annual metrics
            revenue = extract_metric(facts, 'Revenues')
            net_income = extract_metric(facts, 'NetIncomeLoss')
            assets = extract_metric(facts, 'Assets')
            eps = extract_eps(facts)
            # Some companies use RevenueFromContractWithCustomerExcludingAssessedTax
            if not revenue:
                revenue = extract_metric(
                    facts, 'RevenueFromContractWithCustomerExcludingAssessedTax')
            latest_rev = revenue[0]['val'] if revenue else None
            prior_rev = revenue[1]['val'] if len(revenue) > 1 else None
            rev_growth = None
            if latest_rev and prior_rev and prior_rev > 0:
                rev_growth = round((latest_rev - prior_rev) / prior_rev * 100, 1)
            latest_ni = net_income[0]['val'] if net_income else None
            profit_margin = None
            if latest_rev and latest_ni and latest_rev > 0:
                profit_margin = round(latest_ni / latest_rev * 100, 1)
            results.append({
                'Ticker': ticker.upper(),
                'Company': name,
                'Revenue ($B)': round(latest_rev / 1e9, 2) if latest_rev else None,
                'Rev Growth %': rev_growth,
                'Net Income ($B)': round(latest_ni / 1e9, 2) if latest_ni else None,
                'Profit Margin %': profit_margin,
                'EPS': eps[0]['val'] if eps else None,
                'Assets ($B)': round(assets[0]['val'] / 1e9, 2) if assets else None,
                'Period': revenue[0]['end'] if revenue else None,
            })
            print(f' Fetched {ticker}: {name}')
        except Exception as e:
            print(f' Error fetching {ticker}: {e}')
    return pd.DataFrame(results)
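The revenue fallback above only triggers when the Revenues series is completely empty. If a company reported under Revenues in older filings and later switched tags, the older series may still be picked. One possible refinement, sketched here with hand-written data and a hypothetical helper named pick_latest_series, is to extract every candidate tag and keep the one with the most recent period end.

```python
def pick_latest_series(series_by_tag):
    """Return (tag, series) for the series whose newest entry is most recent.

    series_by_tag maps a tag name to a list sorted newest-first, the shape
    extract_metric returns. Empty series are ignored.
    """
    candidates = {tag: s for tag, s in series_by_tag.items() if s}
    if not candidates:
        return None, []
    # ISO dates (YYYY-MM-DD) compare correctly as plain strings.
    tag = max(candidates, key=lambda t: candidates[t][0]["end"])
    return tag, candidates[tag]


# Hand-written data: 'Revenues' stops in 2017, the other tag continues.
series = {
    "Revenues": [{"end": "2017-12-31", "val": 90}],
    "RevenueFromContractWithCustomerExcludingAssessedTax": [
        {"end": "2023-12-31", "val": 150},
        {"end": "2022-12-31", "val": 120},
    ],
}
tag, revenue = pick_latest_series(series)
print(tag, revenue[0]["val"])
# RevenueFromContractWithCustomerExcludingAssessedTax 150
```

Inside screen_companies, the dictionary would be built from two extract_metric calls, one per tag.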
Step 4: Run the Screener
Let us screen a list of major tech and financial companies:
# Define the companies to screen
tickers = [
    'AAPL', 'MSFT', 'GOOGL', 'AMZN', 'META',
    'NVDA', 'TSLA', 'NFLX', 'JPM', 'V',
    'MA', 'CRM', 'ORCL', 'ADBE', 'INTC'
]
print('Screening companies...')
df = screen_companies(tickers)

# Sort by revenue growth (descending)
df_sorted = df.sort_values('Rev Growth %', ascending=False)
print('\n--- Top Companies by Revenue Growth ---')
print(df_sorted[['Ticker', 'Company', 'Revenue ($B)', 'Rev Growth %',
                 'Profit Margin %', 'EPS']].to_string(index=False))

# Filter: profitable companies with >10% revenue growth
high_growth = df[(df['Rev Growth %'] > 10) & (df['Profit Margin %'] > 0)]
print(f'\n--- High-Growth Profitable Companies ({len(high_growth)}) ---')
print(high_growth[['Ticker', 'Revenue ($B)', 'Rev Growth %',
                   'Profit Margin %']].to_string(index=False))
Example Output
--- Top Companies by Revenue Growth ---
Ticker  Company              Revenue ($B)  Rev Growth %  Profit Margin %    EPS
NVDA    NVIDIA Corp.                60.92         122.4             55.8   2.94
META    Meta Platforms Inc.        134.90          22.1             33.4  19.47
MSFT    Microsoft Corp.            236.58          15.7             35.6  12.41
AMZN    Amazon.com Inc.            574.78          11.8              7.8   5.53
AAPL    Apple Inc.                 391.04           2.0             26.3   6.97
...
Step 5: Add Custom Screening Criteria
The real power is adding your own filters. Here are some common screening strategies:
# Value screen: high margins + reasonable growth
value_picks = df[
    (df['Profit Margin %'] > 20) &
    (df['Rev Growth %'] > 5) &
    (df['Revenue ($B)'] > 10)
]
print('Value picks:', value_picks['Ticker'].tolist())

# Growth screen: fastest growing companies
growth_picks = df[df['Rev Growth %'] > 15].sort_values('Rev Growth %', ascending=False)
print('Growth picks:', growth_picks['Ticker'].tolist())

# Size screen: largest by total assets (a balance-sheet measure, not market cap)
largest_by_assets = df.nlargest(5, 'Assets ($B)')
print('Largest by assets:', largest_by_assets['Ticker'].tolist())
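If you find yourself repeating the same boolean filters, one option is to wrap them in a small function. The sketch below assumes the column names produced by screen_companies; the apply_screen helper and the sample rows are hypothetical.

```python
import pandas as pd


def apply_screen(df, min_margin=None, min_growth=None, min_revenue=None):
    """Return rows passing every threshold that is set; None skips a filter."""
    mask = pd.Series(True, index=df.index)
    if min_margin is not None:
        mask = mask & (df['Profit Margin %'] > min_margin)
    if min_growth is not None:
        mask = mask & (df['Rev Growth %'] > min_growth)
    if min_revenue is not None:
        mask = mask & (df['Revenue ($B)'] > min_revenue)
    return df[mask]


# Made-up rows, including one with missing growth data.
sample = pd.DataFrame({
    'Ticker': ['AAA', 'BBB', 'CCC'],
    'Revenue ($B)': [50.0, 8.0, 120.0],
    'Rev Growth %': [12.0, 30.0, None],
    'Profit Margin %': [25.0, 22.0, 40.0],
})
picks = apply_screen(sample, min_margin=20, min_growth=5, min_revenue=10)
print(picks['Ticker'].tolist())  # ['AAA']
```

Rows with missing values drop out of any filter that touches the missing column, because a comparison against NaN evaluates to False. That is why CCC is excluded here despite its high margin.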
Extending the Screener
You can add many more metrics from the SEC XBRL data. Here are some useful XBRL tags to consider:
- StockholdersEquity: Book value (for price-to-book calculations)
- LongTermDebt: Debt levels
- NetCashProvidedByUsedInOperatingActivities: Operating cash flow
- CommonStockSharesOutstanding: Share count (reported in the shares unit, not USD)
- ResearchAndDevelopmentExpense: R&D spending
- Dividends: Dividend payments
See the SEC EDGAR API Guide for a complete list of available XBRL tags and how the companyfacts endpoint works.
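As a rough illustration of how these extra tags could feed new columns, the sketch below derives book value per share and a debt-to-equity ratio. The numbers are made up, and safe_ratio is a hypothetical helper. In practice each input would come from extract_metric, passing unit='shares' for the share count.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when it cannot be computed."""
    if numerator is None or denominator is None or denominator == 0:
        return None
    return numerator / denominator


# Made-up latest annual values (USD, except the share count).
equity = 60_000_000_000          # StockholdersEquity
long_term_debt = 90_000_000_000  # LongTermDebt
shares = 15_000_000_000          # CommonStockSharesOutstanding

book_value_per_share = safe_ratio(equity, shares)
debt_to_equity = safe_ratio(long_term_debt, equity)
print(book_value_per_share, debt_to_equity)  # 4.0 1.5
```

Returning None for missing inputs keeps the derived columns consistent with the rest of the screener, where unavailable metrics are also None.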
Performance Tips
- Cache companyfacts locally: Each response is 100KB-2MB. Save them to disk to avoid re-fetching.
- Use company_tickers.json: Download it once and reuse for all CIK lookups. See our CIK lookup guide.
- Batch carefully: Screening 100+ companies takes 10+ seconds due to rate limits. Cache aggressively.
- Handle missing data: Not all companies report every metric. Always check for None values.
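The caching tip can be sketched with the standard library alone. The load_or_fetch helper below is a hypothetical wrapper: it takes any callable that returns the companyfacts dict (for example, the get_company_facts method from Step 1) and stores one JSON file per CIK. The demonstration uses a stand-in fetcher so it runs without network access.

```python
import json
import tempfile
from pathlib import Path


def load_or_fetch(cik, fetch, cache_dir):
    """Return companyfacts for cik, reading from disk when a copy exists.

    fetch is any callable that takes a CIK and returns a JSON-serializable dict.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"CIK{str(cik).zfill(10)}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    data = fetch(cik)
    path.write_text(json.dumps(data), encoding="utf-8")
    return data


# Stand-in fetcher that records how often it is called.
calls = []


def fake_fetch(cik):
    calls.append(cik)
    return {"entityName": "Example Corp", "facts": {}}


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_fetch("320193", fake_fetch, tmp)
    second = load_or_fetch("320193", fake_fetch, tmp)

print(len(calls))  # 1: the second call was served from disk
```

Cached files never expire in this sketch, so you would need to delete them (or add an age check) to pick up newly filed data.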
FAQ
Can I build a stock screener using only free SEC data?
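Yes. The SEC EDGAR API provides free access to XBRL financial data, including revenue, net income, EPS, assets, liabilities, and cash flow, for companies that file XBRL financial statements with the SEC. No API key or payment is required, though each request must include a User-Agent header that identifies you.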
What Python libraries do I need for this tutorial?
You need requests for HTTP calls and pandas for data analysis. Install them with: pip install requests pandas.
What XBRL tags should I use for a stock screener?
Key tags include Revenues, NetIncomeLoss, EarningsPerShareDiluted, Assets, StockholdersEquity, and NetCashProvidedByUsedInOperatingActivities.
How accurate is SEC EDGAR financial data?
EDGAR XBRL data comes directly from company filings and is highly accurate. However, companies may use different XBRL tags for similar concepts. Always validate against the original filing.
How do I calculate revenue growth from SEC data?
Fetch the Revenues concept from companyfacts, filter for 10-K annual data, sort by date, then calculate: (current_year - prior_year) / prior_year * 100.
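As a worked example of that formula, here is a minimal sketch with made-up revenue figures. The revenue_growth_pct helper is illustrative and mirrors the guard used in the screener (no result when the prior year is missing or not positive).

```python
def revenue_growth_pct(current_year, prior_year):
    """Year-over-year growth in percent, or None when it cannot be computed."""
    if current_year is None or prior_year is None or prior_year <= 0:
        return None
    return round((current_year - prior_year) / prior_year * 100, 1)


# Made-up annual revenue, newest first, in the shape extract_metric returns.
revenue = [
    {"end": "2023-12-31", "val": 120_000_000},
    {"end": "2022-12-31", "val": 100_000_000},
]
growth = revenue_growth_pct(revenue[0]["val"], revenue[1]["val"])
print(growth)  # 20.0
```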
Related Guides
- Free SEC EDGAR API Guide — Complete overview of all EDGAR API endpoints
- SEC CIK Number Lookup Guide — Find any company's CIK instantly
- SEC EDGAR Full-Text Search API — Search the text of every filing
- Download SEC 10-K Filings Programmatically — Python & JavaScript guide
- SEC EDGAR API Rate Limits & Best Practices — Avoid getting blocked