Analyst Upgrade Clusters: 5,884 Events, One Persistent Pattern
When one analyst upgrades a stock, the market shrugs. When three analysts upgrade the same stock in the same month, something measurable happens. We tested this across NYSE, NASDAQ, and AMEX from 2019 to 2025 using aggregate rating count data, and found statistically significant abnormal returns at every window out to 63 days.
Contents
- Method
- What We Found
  - Overall Results
  - By Cluster Size
- Why the Pop-Reversal-Recovery Pattern?
- The Practical Screen
- Data Notes
- Academic Foundation
- Limitations
The pattern isn't simple. There's an immediate pop, a sharp reversal, then a longer-term recovery. Understanding which cluster size drives each phase is where the practical insight lives.
Method
Universe: NYSE, NASDAQ, and AMEX-listed stocks with market cap above $1B USD.
Data source: FMP grades_historical table, aggregate analyst rating counts (StrongBuy, Buy, Hold, Sell, StrongSell) per symbol per date, 2019–2025.
Event definition: An upgrade cluster fires when the total bullish count (StrongBuy + Buy) increases by 2 or more between consecutive observations for the same symbol. Observations must be 14–30 days apart. The 14-day minimum gap normalizes observation frequency. FMP recorded daily updates for many symbols in 2022, which would otherwise create artificial cluster spikes.
Categories:
- upgrade_small, delta = 2 (minimum threshold)
- upgrade_medium, delta = 3–4
- upgrade_large, delta ≥ 5 (strongest consensus shift)
- downgrade_cluster, bearish delta ≥ 2
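The event definition and categories above can be sketched in a few lines of Python. This is a minimal illustration, not the production pipeline: the function names, the tuple layout, and the sample observations are all hypothetical, and it handles one symbol at a time.

```python
from datetime import date

def classify_delta(delta):
    """Map a bullish-count change to the upgrade category names used above."""
    if delta >= 5:
        return "upgrade_large"
    if delta >= 3:
        return "upgrade_medium"
    if delta == 2:
        return "upgrade_small"
    return None  # below the cluster threshold

def find_upgrade_clusters(observations, min_gap=14, max_gap=30):
    """observations: (obs_date, strong_buy, buy) tuples for ONE symbol.

    Returns (obs_date, delta, category) for each consecutive pair that
    passes the 14-30 day gap filter and the delta >= 2 threshold.
    """
    rows = sorted(observations)
    events = []
    for (prev_date, prev_sb, prev_b), (cur_date, cur_sb, cur_b) in zip(rows, rows[1:]):
        gap_days = (cur_date - prev_date).days
        if not (min_gap <= gap_days <= max_gap):
            continue  # drops daily snapshots and stale pairs
        delta = (cur_sb + cur_b) - (prev_sb + prev_b)
        category = classify_delta(delta)
        if category is not None:
            events.append((cur_date, delta, category))
    return events

# Made-up observations for a single symbol
sample = [
    (date(2022, 3, 1), 4, 6),    # bullish = 10
    (date(2022, 3, 2), 5, 8),    # bullish = 13, but 1-day gap: skipped
    (date(2022, 3, 20), 6, 10),  # bullish = 16 vs 13, 18-day gap: delta = 3
    (date(2022, 4, 25), 6, 12),  # 36-day gap: skipped
]
events = find_upgrade_clusters(sample)
```

In this toy input only the March 20 observation qualifies, as an upgrade_medium event. The daily snapshot on March 2 is discarded by the gap filter even though the bullish count rose by 3.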
Windows: T+1, T+5, T+21, T+63 trading days post-event.
Benchmark: SPY. CAR = stock return minus SPY return over each window.
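As a rough sketch of that calculation, assuming simple close-to-close returns measured from the event-day close (the base price is an assumption here, and the function names and prices are illustrative only):

```python
def window_return(closes, event_idx, horizon):
    """Simple return from the event-day close to the close `horizon` trading days later."""
    return closes[event_idx + horizon] / closes[event_idx] - 1.0

def abnormal_return(stock_closes, spy_closes, event_idx, horizon):
    """Stock return minus SPY return over the same window."""
    return (window_return(stock_closes, event_idx, horizon)
            - window_return(spy_closes, event_idx, horizon))

# Made-up daily closes, aligned by trading day; index 0 is the event day
stock = [100.0, 102.0, 101.0, 103.0, 104.0, 105.0]
spy = [400.0, 402.0, 404.0, 404.0, 406.0, 408.0]

car_1 = abnormal_return(stock, spy, event_idx=0, horizon=1)  # 2.0% - 0.5%
car_5 = abnormal_return(stock, spy, event_idx=0, horizon=5)  # 5.0% - 2.0%
```

An event with fewer than `horizon` trading days of price data after it cannot be scored at that window, which is one plausible reason the event count shrinks slightly at longer windows.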
Total events: 5,884 upgrade clusters, 1,027 downgrade clusters.
What We Found
Overall Results
| Window | Mean CAR | t-stat | N | Hit Rate |
|---|---|---|---|---|
| T+1 | +0.475% | 11.42** | 5,884 | 53.3% |
| T+5 | -0.971% | -9.42** | 5,880 | 46.0% |
| T+21 | +0.565% | 3.22** | 5,865 | 51.4% |
| T+63 | +0.632% | 2.18* | 5,838 | 48.1% |
** = p<0.01, * = p<0.05
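For readers who want to reproduce the table columns from a list of per-event CARs, here is a minimal sketch. It assumes a plain one-sample t-test of the mean against zero and defines hit rate as the share of events with a positive CAR. Both are assumptions about the methodology, and the input values are made up.

```python
import math
import statistics

def summarize_cars(cars):
    """Mean CAR, one-sample t-stat against zero, and share of positive CARs."""
    n = len(cars)
    mean = statistics.fmean(cars)
    stderr = statistics.stdev(cars) / math.sqrt(n)  # sample std dev / sqrt(N)
    return {
        "n": n,
        "mean": mean,
        "t_stat": mean / stderr,
        "hit_rate": sum(c > 0 for c in cars) / n,
    }

# Made-up per-event CARs, as decimal fractions
toy = [0.01, -0.02, 0.03, 0.00, 0.02, -0.01, 0.04, 0.01]
stats = summarize_cars(toy)
```

With thousands of events, a small mean can still produce a large t-stat because the standard error shrinks with the square root of N. That is how a mean CAR can be highly significant with a hit rate close to 50%.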
The T+1 drift is the most robust signal. +0.475% with a t-stat of 11.42 across 5,884 events isn't noise. The market prices in the new analyst sentiment on day one, but it overshoots.
T+5 is the reversal: -0.971% mean CAR, with a t-stat of -9.42. Within a week, the initial pop is gone and then some. This isn't just drift compression. The market appears to overcorrect.
T+21 and T+63 tell a more interesting story. The CAR recovers to +0.565% and +0.632%, significant at p<0.01 for T+21 and at p<0.05 for T+63. The cluster signal does carry longer-term information; it's just buried under the first-week reversal.
By Cluster Size
The aggregate numbers hide substantial variation across cluster categories.
| Category | N | T+1 | T+5 | T+21 | T+63 |
|---|---|---|---|---|---|
| upgrade_large (delta ≥ 5) | 2,376 | +0.636%** | -1.794%** | +0.077% | +0.876% |
| upgrade_medium (delta 3–4) | 1,762 | +0.382%** | -0.600%** | +1.068%** | +1.184%* |
| upgrade_small (delta 2) | 1,746 | +0.354%** | -0.236% | +0.688%* | -0.239% |
| downgrade_cluster | 1,027 | +0.200%* | -0.339% | +0.795%* | -1.326% |
** = p<0.01, * = p<0.05
Four things stand out.
Large clusters pop hardest, then reverse hardest. Delta ≥ 5 events get +0.636% on day one, the strongest immediate reaction. But by T+5 they're at -1.794%. By T+21 the CAR is essentially zero (+0.077%, not significant). Large clusters fire a strong immediate signal, but the information appears fully priced within a week. Whatever drove multiple analysts to converge rapidly gets absorbed quickly.
Medium clusters have the best persistence. Delta 3–4 events show +1.068% at T+21 (t=3.53, p<0.01) and +1.184% at T+63 (t=2.29, p<0.05). Both are statistically significant. Medium clusters represent genuine consensus shifts: enough analysts to confirm a catalyst, but not so many that the market front-runs the entire move. This is the category with the longest-lasting alpha signal.
Small clusters (delta = 2) are noisy beyond T+21. The minimum threshold fires a real T+1 effect (+0.354%, p<0.01) and a borderline T+21 effect (+0.688%, p<0.05), but T+63 flips negative (-0.239%, not significant). Two analysts upgrading simultaneously is real, but it's the weakest signal. Some fraction are coincidental rather than catalyst-driven.
Downgrade clusters don't behave as expected. The T+1 CAR is +0.200% (positive). That's counterintuitive, but it likely reflects the market having already priced in deterioration before the cluster observation fires. By T+63 the CAR is -1.326%, though not statistically significant at standard thresholds.
Why the Pop-Reversal-Recovery Pattern?
The three-phase pattern has a plausible explanation.
T+1 (pop): When the cluster observation is published, it registers in screening systems. Systematic funds that monitor consensus shifts execute quickly. The initial buying pushes the stock above fair value.
T+5 (reversal): The immediate buyers exit as the stock reaches short-term targets. Mean-reversion traders fade the move. The net result is an overshoot correction.
T+21–T+63 (recovery): The fundamental information in the cluster, that multiple analysts identified a real catalyst, works its way into institutional positioning over weeks. Analysts who upgraded need to publish updates, hold conferences, and field client calls. Portfolio managers run their own due diligence. Capital flows happen gradually.
This timeline is consistent with the finding of Barber et al. (2001) that consensus shifts predict returns at longer horizons, not just immediately.
The Practical Screen
This query finds current upgrade clusters on US exchanges:
```sql
WITH lagged AS (
    SELECT
        symbol,
        CAST(date AS DATE) AS obs_date,
        -- Cast before any arithmetic: the rating columns are UINT16 in parquet
        CAST(analystRatingsStrongBuy AS INTEGER) + CAST(analystRatingsBuy AS INTEGER)
            AS bullish_count,
        CAST(analystRatingsSell AS INTEGER) + CAST(analystRatingsStrongSell AS INTEGER)
            AS bearish_count,
        LAG(CAST(analystRatingsStrongBuy AS INTEGER) + CAST(analystRatingsBuy AS INTEGER))
            OVER (PARTITION BY symbol ORDER BY date) AS prev_bullish,
        LAG(CAST(date AS DATE))
            OVER (PARTITION BY symbol ORDER BY date) AS prev_date
    FROM grades_historical
    -- WHERE runs before LAG, so scan 60 days: an event inside the 30-day
    -- screen window can have its prior observation up to 30 days earlier
    WHERE CAST(date AS DATE) >= CURRENT_DATE - INTERVAL '60' DAY
),
clusters AS (
    SELECT
        symbol,
        obs_date,
        bullish_count,
        bearish_count,
        bullish_count - prev_bullish AS upgrade_delta
    FROM lagged
    WHERE prev_bullish IS NOT NULL
      -- Keep only events observed in the last 30 days
      AND obs_date >= CURRENT_DATE - INTERVAL '30' DAY
      -- Gap filter: consecutive observations must be 14-30 days apart
      AND (obs_date - prev_date) BETWEEN 14 AND 30
      AND bullish_count - prev_bullish >= 2
)
SELECT
    c.symbol,
    c.obs_date,
    c.upgrade_delta,
    c.bullish_count,
    c.bearish_count,
    ROUND(k.marketCap / 1e9, 1) AS mktcap_bn
FROM clusters c
JOIN profile p ON c.symbol = p.symbol
JOIN key_metrics k ON c.symbol = k.symbol AND k.period = 'FY'
WHERE p.exchange IN ('NYSE', 'NASDAQ', 'AMEX')
  AND k.marketCap > 1000000000
-- One row per symbol: latest fiscal-year market cap, most recent cluster
QUALIFY ROW_NUMBER() OVER (PARTITION BY c.symbol ORDER BY k.date DESC, c.obs_date DESC) = 1
ORDER BY c.upgrade_delta DESC, c.obs_date DESC
LIMIT 30
```
Note on the UINT16 cast: The analystRatings* columns in grades_historical are stored as unsigned 16-bit integers in parquet. Computing deltas directly causes underflow when counts decrease (e.g., 3 - 5 on UINT16 = 65,534, not -2). Always CAST AS INTEGER before delta computation.
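The wraparound is easy to reproduce outside the warehouse. The sketch below mimics 16-bit unsigned subtraction with a bit mask; `uint16_sub` is a hypothetical helper written for illustration, not part of any library.

```python
def uint16_sub(a, b):
    """Subtract the way a wrapping unsigned 16-bit integer would (modulo 2**16)."""
    return (a - b) & 0xFFFF

wrapped = uint16_sub(3, 5)  # 65534: looks like a huge "increase"
signed = 3 - 5              # -2: the delta we actually want
```

Some engines may raise an out-of-range error instead of wrapping silently. Either way, casting to a signed integer first avoids the problem.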
The 14–30 day gap filter is non-negotiable for live screening. Without it, any period where FMP recorded daily updates (2022 in particular) produces false cluster signals from consecutive-day observation pairs.
Data Notes
Effective period: 2019–2025. The grades_historical table has fewer than 200 symbols with data before 2019. Analysis before 2019 lacks statistical power.
2022 concentration. 56% of US events fall in 2022, the year FMP recorded daily or near-daily rating snapshots for many symbols. The 14-day minimum gap filter eliminates most spurious clusters from that year, but some concentration remains. The 2022 events that pass the gap filter represent genuine consensus shifts and are included in the analysis. We acknowledge that 2022 market conditions (post-pandemic rotation, Fed rate cycle beginning) may have increased cluster frequency independently of the data artifact.
Year-by-year distribution:
| Year | Upgrade Events | Large Clusters |
|---|---|---|
| 2019 | 57 | 15 |
| 2020 | 1,309 | 526 |
| 2021 | 488 | 144 |
| 2022 | 3,303 | 1,631 |
| 2023 | 77 | 32 |
| 2024 | 185 | 20 |
| 2025 | 465 | 8 |
2020's spike reflects COVID-era rating activity as analysts rapidly reassessed companies across multiple sectors simultaneously. The 2025 data is partial (YTD to backtest cutoff).
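The concentration figures can be checked directly from the table. A short sketch using only the counts shown above:

```python
# Counts copied from the year-by-year table
upgrade_events = {2019: 57, 2020: 1309, 2021: 488, 2022: 3303,
                  2023: 77, 2024: 185, 2025: 465}
large_clusters = {2019: 15, 2020: 526, 2021: 144, 2022: 1631,
                  2023: 32, 2024: 20, 2025: 8}

total_events = sum(upgrade_events.values())  # 5,884 upgrade clusters
total_large = sum(large_clusters.values())   # 2,376 large clusters

share_2022 = upgrade_events[2022] / total_events       # about 56%
large_share_2022 = large_clusters[2022] / total_large  # about 69%
```

Large clusters are even more concentrated in 2022 than upgrade events overall, which is worth keeping in mind when reading the upgrade_large row of the cluster-size table.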
Academic Foundation
Womack (1996) documented that analyst upgrades to Strong Buy generate significant positive abnormal returns persisting for months after the recommendation change. Barber et al. (2001) extended this: changes in consensus recommendations, not individual calls, are the stronger predictor. When multiple analysts independently reach the same conclusion in a short window, they're likely responding to the same fundamental catalyst.
Our findings partially confirm this. The medium cluster signal (+1.068% T+21, p<0.01) is the clearest evidence of persistence. The large cluster result shows that the strongest consensus shifts generate the strongest immediate reaction but compress fastest, consistent with markets becoming more efficient at processing high-signal events.
Limitations
Short effective history. Seven years (2019–2025) limits statistical reliability compared to the 25-year datasets available for price-based strategies. The results are robust within this window but may not generalize across full market cycles.
2022 data concentration. More than half of events come from one year. This is disclosed but not fully resolved. Future datasets with longer pre-2022 history will clarify whether the patterns hold outside this period.
No individual analyst timing. The grades_historical table gives aggregate counts, not individual analyst action timestamps. A bullish count increase of 2 could mean two upgrades on the same day or two upgrades 13 days apart. We can't distinguish between these cases inside the 14–30 day gap between observations.
Market cap floor. The $1B minimum excludes small-cap and micro-cap stocks where analyst clustering might behave differently.
Run It Yourself
Explore the data behind this analysis on Ceta Research. Query our financial data warehouse with SQL, build custom screens, and run your own backtests across 70,000+ stocks on 20 exchanges.
Data: Ceta Research / FMP warehouse. Event study uses grades_historical + stock_eod + key_metrics tables. Market cap filter >$1B USD. Abnormal returns computed vs SPY benchmark. 2019–2025. Past performance doesn't guarantee future results. This is research content, not investment advice.
Methodology: Womack (1996) and Barber, Lehavy, McNichols & Trueman (2001).