Tutorial Apr 9, 2026

Build a Free Stock Screener with the SEC EDGAR API (Python Tutorial)

Pull real financial data from SEC filings, compare companies by revenue growth and profitability, and build your own screening criteria — all using free data from the EDGAR API and Python.

What We Are Building

By the end of this tutorial, you will have a Python script that:

Fetches XBRL financial data for any list of companies from the SEC EDGAR API
Extracts key metrics: revenue, net income, EPS, and total assets
Calculates revenue growth year-over-year
Filters and ranks companies by your criteria
Outputs a clean table showing the top performers

No API keys. No paid data feeds. Just the free SEC EDGAR API and Python.

Prerequisites

pip install requests pandas

You will also need a CIK number for each company you want to screen. We will handle the ticker-to-CIK conversion automatically.

Step 1: Set Up the SEC API Client

First, create a reusable class for making SEC API requests with proper rate limiting and error handling:

import requests
import pandas as pd
import time

class SECClient:
    """Simple SEC EDGAR API client with rate limiting."""

    BASE_URL = 'https://data.sec.gov'
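    # Placeholder contact: the SEC asks for a User-Agent that identifies you,
    # so replace this with your own app name and email address.
    HEADERS = {'User-Agent': 'StockScreener your-email@example.com'}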

    def __init__(self):
        self.session = requests.Session()
        self.session.headers.update(self.HEADERS)
        self.last_request = 0

    def _rate_limit(self):
        """Ensure we don't exceed 10 requests per second."""
        elapsed = time.time() - self.last_request
        if elapsed < 0.12:
            time.sleep(0.12 - elapsed)
        self.last_request = time.time()

    def get_company_facts(self, cik):
        """Fetch all XBRL financial facts for a company."""
        self._rate_limit()
        cik_padded = str(cik).zfill(10)
        url = f'{self.BASE_URL}/api/xbrl/companyfacts/CIK{cik_padded}.json'
        response = self.session.get(url)
        response.raise_for_status()
        return response.json()

    def get_ticker_map(self):
        """Download the SEC ticker-to-CIK mapping."""
        self._rate_limit()
        url = 'https://www.sec.gov/files/company_tickers.json'
        response = self.session.get(url)
        response.raise_for_status()  # fail loudly instead of parsing an error page
        data = response.json()
        return {v['ticker'].upper(): str(v['cik_str']) for v in data.values()}
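
To make the ticker-to-CIK step concrete, here is a minimal offline sketch of what get_ticker_map does with the downloaded file. The two-row sample is hand-written to mirror the layout of company_tickers.json (the real file has thousands of rows), and build_ticker_map and companyfacts_url are hypothetical helper names used only for illustration.

```python
# Hand-written sample mirroring the layout of company_tickers.json:
# a dict keyed by row index, each value holding cik_str, ticker, and title.
SAMPLE_TICKERS = {
    "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
    "1": {"cik_str": 789019, "ticker": "MSFT", "title": "MICROSOFT CORP"},
}


def build_ticker_map(data):
    """Map upper-case ticker -> CIK string (hypothetical helper)."""
    return {row["ticker"].upper(): str(row["cik_str"]) for row in data.values()}


def companyfacts_url(cik):
    """Build the companyfacts URL; the CIK is zero-padded to 10 digits."""
    return f"https://data.sec.gov/api/xbrl/companyfacts/CIK{str(cik).zfill(10)}.json"


ticker_map = build_ticker_map(SAMPLE_TICKERS)
print(ticker_map["AAPL"])                    # 320193
print(companyfacts_url(ticker_map["AAPL"]))
```

Keeping the CIK as a string and padding it only when the URL is built matches how SECClient handles it.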

Step 2: Extract Financial Metrics from XBRL Data

The companyfacts endpoint returns all XBRL-tagged data for a company. We need to extract specific metrics and filter for annual (10-K) data:

def extract_metric(facts, tag, form='10-K', unit='USD'):
    """Extract a specific financial metric from companyfacts data.

    Returns a list of fact dicts, newest period first, each with 'end' (date),
    'val' (value), and 'fy' (fiscal year).
    """
    gaap = facts.get('facts', {}).get('us-gaap', {})
    concept = gaap.get(tag, {})
    units = concept.get('units', {})
    values = units.get(unit, [])

    # Filter for annual filings only
    annual = [v for v in values if v.get('form') == form]

    # Deduplicate by fiscal year end date
    seen = set()
    unique = []
    for v in sorted(annual, key=lambda x: x['end'], reverse=True):
        if v['end'] not in seen:
            seen.add(v['end'])
            unique.append(v)

    return unique


def extract_eps(facts, form='10-K'):
    """Extract diluted EPS, which is reported in the USD/shares unit."""
    # Same filtering and deduplication as any other metric; only the unit differs.
    return extract_metric(facts, 'EarningsPerShareDiluted', form=form, unit='USD/shares')
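
One caveat with deduplicating by end date alone: a 10-K can also carry shorter-duration values (for example a fourth-quarter figure) that share an end date with the full-year value. The sketch below shows one possible guard. It assumes duration facts carry ISO 'start' and 'end' dates while instant facts such as Assets have no 'start' key, and the 350 to 380 day window is an assumption chosen to tolerate 52- and 53-week fiscal years. The helper name is_annual_period and the sample values are made up for illustration.

```python
from datetime import date


def is_annual_period(fact, min_days=350, max_days=380):
    """Return True for instant facts and for durations of roughly one year.

    Assumes duration facts carry ISO 'start' and 'end' dates and that
    instant facts (balance-sheet items) have no 'start' key.
    """
    if "start" not in fact:
        return True
    days = (date.fromisoformat(fact["end"]) - date.fromisoformat(fact["start"])).days
    return min_days <= days <= max_days


# Made-up values: a quarterly and a full-year fact sharing one end date.
SAMPLE_FACTS = [
    {"start": "2023-10-01", "end": "2023-12-31", "val": 30, "form": "10-K"},
    {"start": "2023-01-01", "end": "2023-12-31", "val": 100, "form": "10-K"},
    {"end": "2023-12-31", "val": 500, "form": "10-K"},  # instant fact
]

annual_only = [f for f in SAMPLE_FACTS if is_annual_period(f)]
print([f["val"] for f in annual_only])  # [100, 500]
```

One way to use it would be to add `and is_annual_period(v)` to the list comprehension that builds `annual` in extract_metric.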

Step 3: Build the Screening Function

Now we combine everything into a function that screens a list of companies and computes key metrics:

def screen_companies(tickers):
    """Screen a list of companies by financial metrics.

    Returns a pandas DataFrame with key financial data.
    """
    client = SECClient()
    ticker_map = client.get_ticker_map()
    results = []

    for ticker in tickers:
        cik = ticker_map.get(ticker.upper())
        if not cik:
            print(f'  Skipping {ticker}: CIK not found')
            continue

        try:
            facts = client.get_company_facts(cik)
            name = facts.get('entityName', ticker)

            # Extract latest annual metrics
            net_income = extract_metric(facts, 'NetIncomeLoss')
            assets = extract_metric(facts, 'Assets')
            eps = extract_eps(facts)

            # Companies report revenue under different tags, and a company may
            # have old data under one tag and current data under another.
            # Use whichever tag has the most recent period end.
            revenue = max(
                (extract_metric(facts, tag) for tag in (
                    'Revenues',
                    'RevenueFromContractWithCustomerExcludingAssessedTax')),
                key=lambda series: series[0]['end'] if series else '',
            )

            latest_rev = revenue[0]['val'] if revenue else None
            prior_rev = revenue[1]['val'] if len(revenue) > 1 else None
            rev_growth = None
            if latest_rev and prior_rev and prior_rev > 0:
                rev_growth = round((latest_rev - prior_rev) / prior_rev * 100, 1)

            latest_ni = net_income[0]['val'] if net_income else None
            profit_margin = None
            if latest_rev and latest_ni and latest_rev > 0:
                profit_margin = round(latest_ni / latest_rev * 100, 1)

            results.append({
                'Ticker': ticker.upper(),
                'Company': name,
                'Revenue ($B)': round(latest_rev / 1e9, 2) if latest_rev else None,
                'Rev Growth %': rev_growth,
                'Net Income ($B)': round(latest_ni / 1e9, 2) if latest_ni else None,
                'Profit Margin %': profit_margin,
                'EPS': eps[0]['val'] if eps else None,
                'Assets ($B)': round(assets[0]['val'] / 1e9, 2) if assets else None,
                'Period': revenue[0]['end'] if revenue else None,
            })
            print(f'  Fetched {ticker}: {name}')

        except Exception as e:
            print(f'  Error fetching {ticker}: {e}')

    return pd.DataFrame(results)
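
The growth and margin arithmetic inside screen_companies can also be written as two small pure functions, which makes it easier to test without network access. This is a sketch with hypothetical helper names (pct_growth, profit_margin). Unlike the truthiness checks in the function above, it uses explicit None checks so that a net income of exactly zero is treated as a real value rather than as missing data.

```python
def pct_growth(latest, prior):
    """Year-over-year growth in percent, or None when it cannot be computed."""
    if latest is None or prior is None or prior <= 0:
        return None
    return round((latest - prior) / prior * 100, 1)


def profit_margin(net_income, revenue):
    """Net income as a percentage of revenue, or None if revenue is missing or not positive."""
    if net_income is None or revenue is None or revenue <= 0:
        return None
    return round(net_income / revenue * 100, 1)


# Made-up figures in USD
print(pct_growth(120e9, 100e9))     # 20.0
print(profit_margin(30e9, 120e9))   # 25.0
print(profit_margin(0, 120e9))      # 0.0 (break-even, not missing)
print(pct_growth(120e9, None))      # None
```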

Step 4: Run the Screener

Let us screen a list of major tech and financial companies:

# Define the companies to screen
tickers = [
    'AAPL', 'MSFT', 'GOOGL', 'AMZN', 'META',
    'NVDA', 'TSLA', 'NFLX', 'JPM', 'V',
    'MA', 'CRM', 'ORCL', 'ADBE', 'INTC'
]

print('Screening companies...')
df = screen_companies(tickers)

# Sort by revenue growth (descending)
df_sorted = df.sort_values('Rev Growth %', ascending=False)

print('\n--- Top Companies by Revenue Growth ---')
print(df_sorted[['Ticker', 'Company', 'Revenue ($B)', 'Rev Growth %',
                  'Profit Margin %', 'EPS']].to_string(index=False))

# Filter: profitable companies with >10% revenue growth
high_growth = df[(df['Rev Growth %'] > 10) & (df['Profit Margin %'] > 0)]
print(f'\n--- High-Growth Profitable Companies ({len(high_growth)}) ---')
print(high_growth[['Ticker', 'Revenue ($B)', 'Rev Growth %',
                    'Profit Margin %']].to_string(index=False))

Example Output

--- Top Companies by Revenue Growth ---
Ticker              Company  Revenue ($B)  Rev Growth %  Profit Margin %    EPS
  NVDA         NVIDIA Corp.         60.92         122.4             55.8   2.94
  META  Meta Platforms Inc.        134.90          22.1             33.4  19.47
  MSFT      Microsoft Corp.        236.58          15.7             35.6  12.41
  AMZN      Amazon.com Inc.        574.78          11.8              7.8   5.53
  AAPL           Apple Inc.        391.04           2.0             26.3   6.97
  ...

Step 5: Add Custom Screening Criteria

The real power is adding your own filters. Here are some common screening strategies:

# Value screen: high margins + reasonable growth
value_picks = df[
    (df['Profit Margin %'] > 20) &
    (df['Rev Growth %'] > 5) &
    (df['Revenue ($B)'] > 10)
]
print('Value picks:', value_picks['Ticker'].tolist())

# Growth screen: fastest growing companies
growth_picks = df[df['Rev Growth %'] > 15].sort_values('Rev Growth %', ascending=False)
print('Growth picks:', growth_picks['Ticker'].tolist())

# Size screen: largest by total assets (a balance-sheet measure, not market cap)
largest_by_assets = df.nlargest(5, 'Assets ($B)')
print('Largest by assets:', largest_by_assets['Ticker'].tolist())
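
If you end up with many screens, the repeated boolean expressions can be folded into one function driven by a dictionary of thresholds. The following is a minimal sketch, not part of the screener above: apply_screen is a hypothetical helper, and the sample rows are made up but use the same column names as the screener's DataFrame.

```python
import pandas as pd


def apply_screen(df, minimums):
    """Keep rows where every listed column is strictly above its threshold.

    Rows with missing values (NaN) in a screened column are dropped, because
    comparisons against NaN evaluate to False.
    """
    mask = pd.Series(True, index=df.index)
    for column, threshold in minimums.items():
        mask &= df[column] > threshold
    return df[mask]


# Made-up sample rows using the screener's column names
sample = pd.DataFrame({
    'Ticker': ['AAA', 'BBB', 'CCC'],
    'Rev Growth %': [12.0, 3.0, None],
    'Profit Margin %': [25.0, 30.0, 40.0],
})

picks = apply_screen(sample, {'Rev Growth %': 5, 'Profit Margin %': 20})
print(picks['Ticker'].tolist())  # ['AAA']
```

Each screen then becomes a plain dictionary, which is easy to store in a config file or pass in from the command line.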

Extending the Screener

You can add many more metrics from the SEC XBRL data. Here are some useful XBRL tags to consider:

StockholdersEquity — Book value (for price-to-book calculations)
LongTermDebt — Debt levels
NetCashProvidedByUsedInOperatingActivities — Operating cash flow
CommonStockSharesOutstanding — Share count (reported in the shares unit, not USD)
ResearchAndDevelopmentExpense — R&D spending
Dividends — Dividends declared (cash actually paid is usually tagged PaymentsOfDividends)

See the SEC EDGAR API Guide for a complete list of available XBRL tags and how the companyfacts endpoint works.
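
As an illustration of how extra tags turn into screening metrics, the sketch below derives return on equity and debt-to-equity from NetIncomeLoss, StockholdersEquity, and LongTermDebt. It runs on a made-up payload laid out like a companyfacts response. latest_annual_value is a hypothetical helper, and not every company reports every tag, so real code needs the None checks shown.

```python
def latest_annual_value(facts, tag, unit='USD', form='10-K'):
    """Return the value with the most recent period end for a us-gaap tag, or None."""
    values = (facts.get('facts', {}).get('us-gaap', {})
                   .get(tag, {}).get('units', {}).get(unit, []))
    annual = [v for v in values if v.get('form') == form]
    if not annual:
        return None
    return max(annual, key=lambda v: v['end'])['val']


# Made-up payload in the companyfacts layout
SAMPLE = {'facts': {'us-gaap': {
    'NetIncomeLoss': {'units': {'USD': [
        {'end': '2022-12-31', 'val': 80, 'form': '10-K'},
        {'end': '2023-12-31', 'val': 100, 'form': '10-K'}]}},
    'StockholdersEquity': {'units': {'USD': [
        {'end': '2023-12-31', 'val': 400, 'form': '10-K'}]}},
    'LongTermDebt': {'units': {'USD': [
        {'end': '2023-12-31', 'val': 200, 'form': '10-K'}]}},
}}}

net_income = latest_annual_value(SAMPLE, 'NetIncomeLoss')
equity = latest_annual_value(SAMPLE, 'StockholdersEquity')
debt = latest_annual_value(SAMPLE, 'LongTermDebt')

# Guard against missing tags and zero equity before dividing
roe = round(net_income / equity * 100, 1) if net_income is not None and equity else None
debt_to_equity = round(debt / equity, 2) if debt is not None and equity else None
print(roe, debt_to_equity)  # 25.0 0.5
```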

Performance Tips

Cache companyfacts locally: Each response is roughly 100 KB to 2 MB. Save responses to disk so repeat runs do not re-fetch them.
Use company_tickers.json: Download it once and reuse for all CIK lookups. See our CIK lookup guide.
Batch carefully: With a 0.12-second delay between requests, screening 100 companies takes at least 12 seconds before download time is counted. Cache aggressively.
Handle missing data: Not all companies report every metric. Always check for None values.
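
The caching tip can be sketched as a small wrapper that checks the disk before calling the API. This is one possible approach under stated assumptions: cached_company_facts is a hypothetical helper, the one-file-per-CIK layout is a choice made for this example, and the demo uses a stand-in fetcher so it runs offline. In real use you would pass SECClient().get_company_facts as the fetch argument.

```python
import json
import tempfile
from pathlib import Path


def cached_company_facts(cik, fetch, cache_dir):
    """Return companyfacts for a CIK, reading from disk when a copy exists.

    `fetch` is any callable that takes a CIK and returns a JSON-serializable dict.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f'CIK{str(cik).zfill(10)}.json'
    if path.exists():
        return json.loads(path.read_text(encoding='utf-8'))
    facts = fetch(cik)
    path.write_text(json.dumps(facts), encoding='utf-8')
    return facts


# Offline demo with a stand-in fetcher that records each call
calls = []


def fake_fetch(cik):
    calls.append(cik)
    return {'cik': int(cik), 'entityName': 'Example Co'}


with tempfile.TemporaryDirectory() as tmp:
    first = cached_company_facts('320193', fake_fetch, tmp)
    second = cached_company_facts('320193', fake_fetch, tmp)

print(len(calls))        # 1: the second call was served from disk
print(first == second)   # True
```

This sketch never expires cached files, so delete them (or add a file-age check) when you want figures from newly filed reports.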

FAQ

Can I build a stock screener using only free SEC data?

Yes. The SEC EDGAR API provides free access to XBRL financial data including revenue, net income, EPS, assets, liabilities, and cash flow for every public company. No API key or payment required.

What Python libraries do I need for this tutorial?

You need requests for HTTP calls and pandas for data analysis. Install them with: pip install requests pandas.

What XBRL tags should I use for a stock screener?

Key tags include Revenues (total revenue), NetIncomeLoss (net income), EarningsPerShareDiluted (EPS), Assets (total assets), StockholdersEquity (book value), and NetCashProvidedByUsedInOperatingActivities (operating cash flow).

How accurate is SEC EDGAR financial data?

EDGAR XBRL data comes directly from company filings and is highly accurate. However, companies may use different XBRL tags for similar concepts. Always validate against the original filing.

How do I calculate revenue growth from SEC data?

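Fetch the Revenues concept from companyfacts (or RevenueFromContractWithCustomerExcludingAssessedTax for companies that report under that tag), filter for 10-K annual data, sort by period end date, then calculate: (current_year - prior_year) / prior_year * 100.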

Related Guides

Free SEC EDGAR API Guide — Complete overview of all EDGAR API endpoints
SEC CIK Number Lookup Guide — Find any company's CIK instantly
SEC EDGAR Full-Text Search API — Search the text of every filing
Download SEC 10-K Filings Programmatically — Python & JavaScript guide
SEC EDGAR API Rate Limits & Best Practices — Avoid getting blocked