Pactolio
Technical Documentation

Data & Analytics Methodology

How Pactolio ingests Form 13F-HR and Form 4 filings from SEC EDGAR, normalizes entities via Central Index Key (CIK), and computes the metrics that appear on every fund, ticker, and insider page.


1. Data Sources

The platform sources all institutional and insider filing data directly from the SEC's Electronic Data Gathering, Analysis, and Retrieval system (EDGAR). Position, transaction, and entity-identity values resolve back to a primary filing identified by its accession number, filer CIK, and reporting date. Reference metadata that the filings themselves do not contain — such as ticker symbols (mapped from the filed CUSIP via OpenFIGI) and sector or industry classifications — is layered on as supplementary context and is always secondary to the SEC filing of record.

  • Form 13F-HR. Quarterly long-position disclosures from institutional investment managers exercising discretion over $100 million or more in Section 13(f) securities.
  • Form 4. Transaction-level disclosures from corporate officers, directors, and beneficial owners of more than 10% of a registered equity class, filed within two business days of the transaction.
  • EDGAR submissions index + company-tickers index. Authoritative source for entity legal names, CIK assignments, and ticker-to-CIK mapping, refreshed daily.
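As an illustration of how a value can be traced back to its filing of record, the following is a minimal sketch rather than Pactolio's actual schema. The `FilingRef` class and its `archive_url` method are hypothetical names. The accession-number layout (10-digit filer or agent ID, 2-digit year, 6-digit sequence) and the Archives folder pattern follow EDGAR's public conventions, and the sample values are made up.

```python
import re
from dataclasses import dataclass
from datetime import date

# EDGAR accession numbers look like 0001234567-24-000123:
# 10-digit filer/agent ID, 2-digit year, 6-digit sequence number.
ACCESSION_RE = re.compile(r"^\d{10}-\d{2}-\d{6}$")


@dataclass(frozen=True)
class FilingRef:
    """Hypothetical provenance key attached to every derived value."""
    accession_number: str
    filer_cik: int
    report_date: date

    def __post_init__(self) -> None:
        # Reject references that cannot point at a real EDGAR filing.
        if not ACCESSION_RE.match(self.accession_number):
            raise ValueError(f"malformed accession number: {self.accession_number!r}")

    def archive_url(self) -> str:
        # Archive folders use the unpadded CIK and the accession number without dashes.
        folder = self.accession_number.replace("-", "")
        return f"https://www.sec.gov/Archives/edgar/data/{self.filer_cik}/{folder}/"


# Made-up identifiers, for illustration only.
ref = FilingRef("0001234567-24-000123", 1234567, date(2024, 3, 31))
print(ref.archive_url())
```

Keeping the three identifiers together in one immutable record means any metric can carry its source reference alongside the number itself.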

2. Pipeline Ingestion & Cadence

Pactolio runs an automated nightly ingest cycle that parses newly accepted EDGAR filings and re-publishes the updated datasets within the following intervals.

Filing source, statutory deadline, and Pactolio processing interval
Data Feed Source | Statutory Filing Requirement | Pactolio Processing Interval
Form 13F-HR / 13F-HR/A | Within 45 days of calendar quarter end | Re-indexed within 24–48 hours of EDGAR release
Form 4 | Within 2 business days of transaction date | Next nightly ingest cycle after filing acceptance (typically within 24 hours)

When a filer amends a prior filing (Form 13F-HR/A or Form 4/A), the amended values overwrite the original on the next ingest run. Each reporting quarter is retained as a separate historical snapshot, and the live per-page views always reflect the most recent accepted state.
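The overwrite rule can be sketched as "latest accepted filing wins, per filer and per quarter". The example below is an assumption-laden illustration, not the production pipeline: `Filing13F` and `live_snapshots` are hypothetical names, and every amendment is treated as a full restatement of the quarter.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Filing13F:
    """Hypothetical, simplified view of one accepted 13F filing."""
    filer_cik: int
    period: date              # reporting quarter end
    accepted_at: datetime     # EDGAR acceptance timestamp
    form_type: str            # "13F-HR" or "13F-HR/A"
    holdings: dict[str, int]  # CUSIP -> reported market value


def live_snapshots(filings: list[Filing13F]) -> dict[tuple[int, date], Filing13F]:
    """Keep one filing per (filer, quarter): the most recently accepted one.

    Assumption: an amendment fully restates the quarter it amends.
    """
    live: dict[tuple[int, date], Filing13F] = {}
    for filing in sorted(filings, key=lambda f: f.accepted_at):
        # A later acceptance overwrites an earlier one for the same key only;
        # other quarters remain untouched as separate snapshots.
        live[(filing.filer_cik, filing.period)] = filing
    return live
```

Because the key includes the reporting period, an amendment to one quarter never disturbs the snapshot of another.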

3. Entity Normalization & CIK Resolution

Raw EDGAR submissions contain inconsistent issuer names, changing ticker strings, and complex parent-subsidiary structures. Pactolio resolves these inconsistencies with three rules:

  • Identifier Mapping: Every position is anchored to the filer's permanent SEC EDGAR Central Index Key (CIK) and the security's CUSIP rather than to a ticker string, avoiding historical ticker collision artifacts.
  • Amended Filing Handling: Restatements (Form 13F-HR/A) automatically overwrite the prior quarterly values in the live dataset, eliminating double-counting anomalies; each reporting quarter remains a distinct historical snapshot.
  • Legal vs. Display Name: For funds whose registered legal name differs from their publicly-known brand, the brand name is preserved as alternateName in the JSON-LD Organization node so both strings resolve to the same canonical entity.
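The identifier rules above can be sketched as follows. This is a minimal illustration under stated assumptions: `normalize_cik`, `resolve`, and the `ALIASES` table are hypothetical, and the fund names and CIK are made up. The 10-digit zero-padded form matches how EDGAR's submissions API addresses entities.

```python
from typing import Optional, Union


def normalize_cik(raw: Union[str, int]) -> str:
    """Return a CIK as the 10-digit zero-padded string used in EDGAR API paths."""
    digits = str(raw).strip().upper().removeprefix("CIK")
    if not (digits.isascii() and digits.isdigit()) or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {raw!r}")
    return digits.zfill(10)


# Hypothetical alias table: every known name variant points at one canonical CIK.
ALIASES = {
    "EXAMPLE CAPITAL MANAGEMENT LLC": normalize_cik(1234567),  # legal name
    "EXAMPLE CAPITAL": normalize_cik(1234567),                 # brand name
}


def resolve(name: str) -> Optional[str]:
    """Map a raw name string to its canonical CIK, or None if it is unknown."""
    key = " ".join(name.upper().split())  # collapse case and whitespace variants
    return ALIASES.get(key)


print(resolve("Example  Capital"))
```

Keying everything on the normalized CIK means a name change or rebrand adds an alias instead of creating a second entity.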

4. Metric Definitions

Portfolio Weight

Reported market value of a position divided by the filer's total reported 13F portfolio market value on the same reporting date. Listed options are excluded from both numerator and denominator to ensure comparability across filers. Recomputed on every quarterly snapshot — never carried forward.
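A minimal sketch of this calculation, assuming each row carries the 13F information table's put/call marker. The `Position` class and `portfolio_weights` function are illustrative names, not Pactolio's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Position:
    cusip: str
    value: float                     # reported market value
    put_call: Optional[str] = None   # "Put" or "Call" for listed options, else None


def portfolio_weights(positions: list[Position]) -> dict[str, float]:
    """Weight of each non-option position against the non-option portfolio total."""
    # Options are dropped first, so they affect neither numerator nor denominator.
    equity = [p for p in positions if p.put_call is None]
    total = sum(p.value for p in equity)
    if total <= 0:
        return {}
    weights: dict[str, float] = {}
    for p in equity:
        # Several rows for the same CUSIP are summed into one weight.
        weights[p.cusip] = weights.get(p.cusip, 0.0) + p.value / total
    return weights
```

For example, a filer reporting 600 in one stock, 400 in another, and 1,000 in call options on the first would show weights of 60% and 40%, because the option row is ignored entirely.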

Distinct Institutional Count (Crowding)

For each security in the most recent available quarter, the number of unique filers reporting a long equity position. Each filer is counted once, deduplicated to a single canonical manager per CIK at aggregation time. This is the primary crowding signal: a $50B fund and a $500M fund each contribute one holder, preventing any single large-AUM manager from distorting the consensus reading.
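The counting rule can be sketched with a set per security, which makes the per-CIK deduplication automatic. The function name and the row shape are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Iterable, Union


def distinct_holder_counts(rows: Iterable[tuple[Union[str, int], str]]) -> dict[str, int]:
    """Count unique filers per security.

    rows: (filer_cik, cusip) pairs for long equity positions in one quarter.
    """
    holders: dict[str, set[int]] = defaultdict(set)
    for filer_cik, cusip in rows:
        # int() collapses "0000000001" and 1 into the same manager;
        # the set ignores repeated rows from one filer.
        holders[cusip].add(int(filer_cik))
    return {cusip: len(ciks) for cusip, ciks in holders.items()}


rows = [("0000000001", "AAA"), (1, "AAA"), (2, "AAA"), (2, "BBB")]
print(distinct_holder_counts(rows))
```

Position size never enters the calculation, which is what keeps a very large manager from counting for more than one holder.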

Median Portfolio Concentration (Crowding)

The median of Portfolio Weight across all filers reporting the security as a long equity position in the relevant quarter. Median — not mean — is used to dampen single-fund concentration effects.
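The effect of using the median can be seen in a small sketch with made-up weights. `median_concentration` is a hypothetical helper built on Python's standard `statistics` module.

```python
from statistics import mean, median


def median_concentration(weights_by_filer: dict[int, float]) -> float:
    """Median Portfolio Weight across all filers holding one security."""
    return median(weights_by_filer.values())


# Made-up weights: four small positions and one highly concentrated fund.
weights = {1: 0.01, 2: 0.02, 3: 0.03, 4: 0.04, 5: 0.40}
print(median_concentration(weights))   # middle value of the five weights
print(mean(list(weights.values())))    # pulled upward by the single 40% holder
```

Here the median stays at the typical holder's 3% weight, while the mean is dragged to roughly 10% by one concentrated fund.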

Peer Overlap

The percentage of a peer fund's 13F equity portfolio (by reported market value) that consists of positions also held by the reference fund in the same reporting quarter. Restricted to Section 13(f) long equity positions.
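A minimal sketch of the overlap calculation, assuming each portfolio is already reduced to long equity positions keyed by CUSIP. The `peer_overlap` function and the sample holdings are illustrative.

```python
def peer_overlap(peer: dict[str, float], reference: dict[str, float]) -> float:
    """Percentage of the peer's reported value held in securities the reference fund also owns.

    Both arguments map CUSIP -> reported market value for one reporting quarter.
    """
    total = sum(peer.values())
    if total <= 0:
        return 0.0
    # Only the peer's values are summed; the reference fund contributes membership only.
    shared = sum(value for cusip, value in peer.items() if cusip in reference)
    return 100.0 * shared / total


peer = {"AAA": 50.0, "BBB": 30.0, "CCC": 20.0}
reference = {"AAA": 1.0, "CCC": 1.0, "DDD": 5.0}
print(peer_overlap(peer, reference))   # share of the peer's book that overlaps
print(peer_overlap(reference, peer))   # the reverse direction gives a different figure
```

The measure is directional: it is weighted by the peer's portfolio, so swapping the two funds generally changes the result.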

Open-Market Insider Purchases / Sales

Form 4 purchase codes, primarily code P (open-market or private purchase), are classified as buys. Sale codes, primarily code S (open-market or private sale) plus code D (disposition to the issuer), are classified as sells. Purchases under code P require the insider to deploy their own capital, so they are treated as the strongest conviction signal. Grants (code A), option exercises (code M), tax withholding (code F), and other compensation-driven transactions are tracked separately and excluded from the conviction signal.
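The classification can be sketched as a lookup on the Form 4 transaction code. This example maps only the codes named above and sends everything else to a separate bucket. The function and constant names are hypothetical, and the production mapping may cover additional codes.

```python
# Assumption: only the codes named in the text are mapped here.
BUY_CODES = {"P"}        # open-market or private purchase
SELL_CODES = {"S", "D"}  # open-market or private sale; disposition to the issuer


def classify(transaction_code: str) -> str:
    """Return "buy", "sell", or "other" for a Form 4 transaction code."""
    code = transaction_code.strip().upper()
    if code in BUY_CODES:
        return "buy"
    if code in SELL_CODES:
        return "sell"
    # Grants (A), option exercises (M), tax withholding (F), and any other code
    # are tracked separately and kept out of the conviction signal.
    return "other"


def conviction_signal(codes: list[str]) -> list[str]:
    """Keep only the classifications that feed the conviction signal."""
    return [label for label in map(classify, codes) if label != "other"]


print(conviction_signal(["P", "A", "S", "M", "F", "D"]))
```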

5. Known Exclusions

  • Short positions (not disclosed on Form 13F-HR by regulation).
  • OTC derivatives, swaps, and bilateral agreements.
  • Holdings outside Section 13(f) securities, including most non-U.S.-listed securities.
  • Form N-PORT mutual fund holdings (not currently ingested).
  • Schedule 13D / 13G activist and 5%+ ownership disclosures (not currently surfaced in fund/ticker views).
  • Positions filed under confidential treatment, until the confidential period expires.

6. Methodology FAQ

What is Form 13F-HR and who is required to file it?

Form 13F-HR is a quarterly disclosure required of institutional investment managers exercising investment discretion over $100 million or more in Section 13(f) securities. Filers must report long positions in Section 13(f) securities — listed equities, equity options, convertibles, ETFs, and warrants — within 45 days following the end of each calendar quarter. Short positions, non-U.S. holdings, and OTC derivatives are not required to be disclosed.

What is Form 4 and what counts as an open-market transaction?

Form 4 is a transaction-level disclosure that corporate insiders — officers, directors, and beneficial owners of more than 10% of a registered class of equity securities — must file within two business days of any change in beneficial ownership. Pactolio classifies purchase codes (primarily code P, open-market or private purchase of a non-derivative or derivative security) as buys and sale codes (primarily code S, open-market or private sale, plus code D, dispositions to the issuer) as sells. Grants (code A), option exercises (code M), and tax-withholding (code F) transactions are tracked separately and excluded from the conviction signal.

How is Portfolio Weight calculated?

Portfolio Weight is the reported market value of a position divided by the filer's total reported 13F portfolio market value on the same reporting date. Listed options are excluded from both numerator and denominator to ensure comparability across filers. Weights are recomputed on every quarterly snapshot rather than carried forward from the prior quarter.

How is Peer Overlap calculated?

Peer Overlap measures the percentage of a peer fund's 13F equity portfolio (by reported market value) that consists of positions also held by the reference fund in the same reporting quarter. Both the numerator and denominator are restricted to Section 13(f) long equity positions.

How are fund names reconciled to a stable entity?

Every fund in the dataset is mapped to its SEC EDGAR Central Index Key (CIK). On pages where the CIK is known, the JSON-LD Organization node sets the SEC legal name (sourced from EDGAR's submissions API) and exposes sameAs links to both EDGAR and, where available, Wikidata. This lets Google's Knowledge Graph and LLM citation engines reconcile a Pactolio page to the same canonical entity referenced on Wikipedia, Crunchbase, and the SEC's own pages.
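A sketch of what such an Organization node might look like when built in code. The `organization_node` helper, the exact `sameAs` URL forms, and all sample values (fund names, CIK, Wikidata ID) are assumptions for illustration and are not copied from Pactolio's pages. `name`, `alternateName`, and `sameAs` are standard schema.org Organization properties.

```python
import json
from typing import Optional


def organization_node(
    legal_name: str,
    cik: int,
    brand_name: Optional[str] = None,
    wikidata_id: Optional[str] = None,
) -> dict:
    """Build a schema.org Organization node keyed to one CIK (illustrative only)."""
    node = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": legal_name,  # SEC legal name
        "sameAs": [
            # Assumed URL form for the entity's EDGAR page.
            f"https://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK={cik:010d}",
        ],
    }
    if brand_name and brand_name != legal_name:
        node["alternateName"] = brand_name  # publicly known brand
    if wikidata_id:
        node["sameAs"].append(f"https://www.wikidata.org/wiki/{wikidata_id}")
    return node


# Made-up fund and identifiers.
node = organization_node(
    "Example Capital Management LLC", 1234567,
    brand_name="Example Capital", wikidata_id="Q0000000",
)
print(json.dumps(node, indent=2))
```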

How frequently is the data refreshed?

Form 13F-HR filings are due within 45 days after the end of each calendar quarter; Pactolio re-ingests and re-publishes them on its next nightly ingest cycle, typically within 24–48 hours of EDGAR publication. Form 4 transactions must be reported within two business days of the transaction date and are reflected on the platform on the next nightly ingest cycle, typically within 24 hours of filing acceptance.

What is excluded from the dataset?

Short positions, OTC derivatives, non-U.S. holdings, mutual fund (Form N-PORT) holdings, and 13G/13D activist disclosures are not currently surfaced in the per-fund or per-ticker views. Confidential treatment requests granted by the SEC are honored — positions filed under confidential cover are excluded until the confidential period expires.

7. Corrections

Errors are inevitable in an ingest-driven platform. If a value on the site disagrees with the underlying SEC filing, please flag it via Contact. Corrections are applied in the next pipeline run and the source filing reference is retained for audit.