noteefy
recreationalcoder

Case Study: Noteefy Analytics

Noteefy Analytics Overview Dashboard

Noteefy wanted to turn their admin analytics experience into a stronger sales, retention, and operator visibility tool. I replaced a fragmented MongoDB Charts implementation with a custom analytics architecture built on Snowflake, Python FastAPI, and React.

The existing analytics experience was built from a combination of MongoDB Charts and lightweight React components that embedded those charts into the admin application. This worked well enough for early reporting, but it became harder to scale as Noteefy needed to support larger multi-course operators, more product lines, and more sophisticated reporting.

I worked closely with Noteefy's COO, CTO, and Product leads to gather requirements and define the scope of the reimplementation. The goal was to make analytics a central part of the admin experience. The dashboard needed to load meaningful lifetime metrics for large operators, support universal filtering, help internal teams show the value of Noteefy to prospective customers, and give operators a clearer reason to return to the admin application.

One of the most important product decisions was making analytics the homepage of the admin application. That was not just a navigation change. It reflected the role analytics needed to play in the product. Noteefy wanted operators to see value as soon as they logged in. They also wanted internal teams to use the dashboard to impress potential customers, support renewals, and increase engagement with the admin application.

The dashboard was designed to tell a story quickly:

The value Noteefy created
Where that value came from
How activity changed over time
What operators should pay attention to next

That product framing influenced both the backend and frontend work. The goal was not to expose every possible metric. The goal was to organize the most important metrics into a clear, persuasive, and reusable experience.

I approached this challenge with the following plan:

  1. Replace the MongoDB Charts-based reporting experience with a custom analytics architecture
  2. Move the most important metric calculations into Snowflake using stored procedures
  3. Model the data around Noteefy's product value stories: recovered revenue, protected rounds, golfer engagement, search demand, communication demand, and operator activity
  4. Build a Python FastAPI microservice to connect the admin application to Snowflake through clean analytics APIs
  5. Rebuild the React dashboard experience with reusable components, universal filtering, MUI X Charts, consistent layouts, and stronger visual hierarchy
  6. Add PDF report downloads so internal teams and operators could turn filtered dashboard views into shareable reports
  7. Optimize lifetime metric loading for the largest multi-course operators across the Platform, Waitlist, and Confirm dashboards

The final result was a faster, more flexible analytics system that became the default homepage experience for the Noteefy admin application.

Tech Stack

I arrived at the following stack for the implementation:

  1. MongoDB as the operational data source.
  2. Estuary for syncing MongoDB data into Snowflake.
  3. Snowflake for analytics storage, querying, and stored procedure execution.
  4. SQL stored procedures for dashboard-specific analytics payloads
  5. Python FastAPI for the analytics microservice.
  6. React for the admin dashboard frontend.
  7. MUI X Charts for frontend visualizations.
  8. dom-to-image

    and pdfmake for client-side PDF report generation
  9. Custom frontend logic for report formatting, anonymization, and interactive element replacement

The most important technical decision was treating the dashboard as a full analytics product, not just a frontend charting project.

Backend Architecture

The new architecture centered around Snowflake, with MongoDB data replicated into Snowflake using Estuary. A Python FastAPI microservice sat between the frontend and Snowflake, providing a stable API boundary with request validation, response formatting, and error handling so the React dashboard never needed to know how the analytics layer worked. This created a cleaner separation of concerns. Snowflake handled the heavy analytics work. FastAPI provided a stable application boundary. React handled filtering, layout, visualization, reporting, and user experience.

Throughout implementation, I gathered context and feedback from Noteefy's Engineering team and joined their daily scrums to stay aligned on data availability, API contracts, and integration timing.

Snowflake

Moving analytics into Snowflake gave Noteefy a dedicated environment where expensive reporting queries could run without relying directly on the operational MongoDB database. Once the data was available in Snowflake, the next challenge was deciding how to expose it to the dashboard.

Views were not flexible enough because the dashboard needed to pass in dynamic parameters like start date, end date, and selected course IDs. Snowflake functions were also not the right fit. Scalar functions were useful for small reusable transformations, but they could not represent an entire dashboard response. Table functions were closer, but still awkward because each dashboard needed to return multiple kinds of data at once: KPI values, chart series, nested objects, product-specific metric groups, and report-ready structures.

Stored procedures gave me the flexibility I needed. They allowed me to accept parameters, run multi-step SQL logic, reuse intermediate result sets, build structured response objects, and return dashboard-ready payloads. Instead of duplicating calculations across MongoDB Charts, API routes, and React components, the most important business logic lived in one analytics layer.

Testing the stored procedures was its own challenge. Unit testing SQL logic in Snowflake was not straightforward, so I built integration tests that ran the full procedures against known datasets and validated the response payloads. This approach fit cleanly into the CI/CD workflow and gave me confidence that metric calculations stayed correct as the procedures evolved.

Data Modeling

A key challenge was deciding how the stored procedure responses and API payloads should be structured. I decided early on that the response shape should match all the way from Snowflake through FastAPI to the React frontend, so that each layer could pass the data through without restructuring it.

Each product got its own response contract. Waitlist focused on demand and recovered opportunity: searches, golfer signups, recaptured revenue, and referrer attribution. Confirm focused on operational visibility and booking protection: text confirmations, communication demand, and rounds recaptured. Platform had its own set of metrics that brought the larger story together across products, communicating overall value without double-counting activity or flattening product-specific metrics into something less meaningful.

On the Platform overview page, the stored procedure actually fetches all product metrics, but the dashboard only surfaces the top-level KPIs for each product in side-by-side panes.

Platform Performance and Product Analytics KPIsplatform overview with product-specific metric panes

Metric Calculations

Many of the metrics already existed in some form through MongoDB Charts, but they lacked standardization. I worked with Product to define the exact metric definitions and with Engineering to validate the formulas against the underlying data.

That standardization process surfaced real complexity. The hardest problem was deduplication across products. Waitlist and Confirm were not independent data sources. They were part of the same user flow, but without explicit linkage in the data. A golfer who found a tee time through Waitlist might later confirm that same booking through Confirm. Both products would record activity for the same round, but there was no shared booking ID connecting them. At the same time, a different golfer might book the same tee time through a regular channel and also confirm through Confirm. The data was stored in nested documents, and the challenge was ensuring that overlapping bookings were not double-counted while still correctly summing spend across distinct golfers for the same tee time.

That kind of detail mattered because the dashboard was being used to make real business claims. A polished chart was not enough. The metrics had to be credible.

Backend Architecture

Performance Optimization

Performance was the hardest technical part of the project.

The target metric was clear: load lifetime metrics for the largest multi-course operator on each dashboard.

In the original MongoDB Charts implementation, individual charts loaded independently. Some metrics loaded in a few seconds, while others could take multiple minutes depending on the chart and the size of the operator.

After the first pass of the Snowflake stored procedure implementation, the dashboard experience was more unified, but it still was not fast enough. The initial stored procedure performance was roughly 5 seconds per course, which meant the query still scaled too linearly as more courses were added.

Because the source data came from MongoDB, many of the records synced into Snowflake retained their document structure with nested arrays and embedded objects. Lateral FLATTEN was the natural way to work with this data in SQL, and the first implementation leaned on it heavily. That approach was correct for getting the metrics right, but it created unnecessary performance problems that compounded at scale.

The optimization work involved many similar changes across the stored procedures. Here are a few representative examples.

Eliminating LATERAL FLATTEN on Course ID Arrays

The biggest single improvement came from how the query filtered subscriptions by selected courses. The MongoDB documents stored course associations as an array of IDs. The original query flattened that array to join against the selected courses:

-- Before: flatten the course_ids array, then join to filter
SELECT s.search_id, s.email, sc.value::STRING AS course_id, s.booking_price
FROM subscriptions s,
  LATERAL FLATTEN(input => s.course_ids) sc
  JOIN selected_courses ci ON sc.value::STRING = ci.course_id
WHERE ARRAYS_OVERLAP(s.course_ids, :course_ids);

Every subscription row with N course IDs became N rows before the join could filter anything. For a large operator with millions of subscription records, this created an enormous intermediate result set.

After analyzing the data, I discovered that each waitlist entry currently only had one course associated with it. The array field existed in the schema, but it was never actually populated with multiple values. That meant the flatten was doing expensive work for no practical reason. The fix was to join directly against a scalar course ID field that existed alongside the array:

-- After: join on the scalar field, skip the flatten entirely
SELECT s.search_id, s.email, s."course_data/course_id"::STRING AS course_id, s.booking_price
FROM subscriptions s
  JOIN selected_courses ci ON s."course_data/course_id"::STRING = ci.course_id;

That removed the LATERAL FLATTEN, the ARRAYS_OVERLAP pre-filter, and the row multiplication entirely. For courses with millions of subscription rows, this was typically a 5-10x reduction in intermediate row count.

Single-Pass Referrer Bucketing

The original implementation used two separate CTEs to calculate referrer metrics, one for search counts and one for revenue, each scanning the same table with a WHERE referrer IN (...) filter:

-- Before: two separate scans with filtered referrer lists
referrer_searches AS (
  SELECT referrer, COUNT(*) AS searches
  FROM subscriptions_all
  WHERE referrer IN ('widget', 'calendar', 'map', 'app')
  GROUP BY referrer
),
referrer_revenue AS (
  SELECT referrer, SUM(booking_price) AS revenue
  FROM subscriptions_all
  WHERE referrer IN ('widget', 'calendar', 'map', 'app')
  GROUP BY referrer
)

This scanned the table twice and silently dropped any traffic that did not match the explicit referrer list. The rewrite used a single CASE expression to bucket all referrers in one pass:

-- After: one scan, one group, nothing dropped
SELECT
  CASE
    WHEN referrer IN ('widget') THEN 'WIDGET'
    WHEN referrer IN ('calendar') THEN 'EMBEDDED_CALENDAR'
    WHEN referrer IN ('map', 'app') THEN 'APP'
    ELSE 'DIRECT'
  END AS referrer_bucket,
  COUNT(*) AS searches,
  SUM(booking_price) AS revenue
FROM subscriptions_all
GROUP BY referrer_bucket;

This cut the table scans in half and also fixed a data accuracy bug where unrecognized referrer values were being silently excluded from the totals.

Replacing ROW_NUMBER() with COUNT(DISTINCT)

The original query used a window function to find each user's first subscription per course. This was intentional: the metric at the time was "new golfer cohort in window", which required identifying the earliest entry per user to determine when they first appeared.

-- Before: partition sort to find first subscription per user
users_first AS (
  SELECT *, ROW_NUMBER() OVER (
    PARTITION BY course_id, email_norm ORDER BY created_at
  ) AS rn
  FROM subscriptions_all
)
SELECT course_id, COUNT(*) AS unique_users
FROM users_first WHERE rn = 1
GROUP BY course_id;

That forced a full partition sort over the entire subscription set. During the reimplementation, the product requirement changed from "new golfer cohort in window" to "unique users active in window". That shift meant the query no longer needed to know when a user first appeared, just whether they appeared. That collapsed the logic to a simple distinct count:

-- After: hash-distinct, no sort required
SELECT course_id, COUNT(DISTINCT email_norm) AS unique_users
FROM subscriptions_all
GROUP BY course_id;

Snowflake handles COUNT(DISTINCT) efficiently through its MPP hash-distinct execution, which avoids the sort overhead entirely.

The Principle

The exact production queries involved more metric branches and nested output structures, but the principle across all three optimizations was the same.

Instead of this pattern:

  1. Flatten nested document data early
  2. Multiply rows across courses, bookings, referrers, and events
  3. Repeat similar logic in multiple CTEs scanning the same tables
  4. Use expensive window functions for deduplication
  5. Re-aggregate inflated result sets later

I moved toward this pattern:

  1. Join on scalar fields instead of flattening arrays where possible
  2. Combine related metrics into single-pass aggregations
  3. Use hash-based distinct operations instead of window sorts
  4. Reuse smaller intermediate datasets
  5. Assemble the nested dashboard objects after the expensive aggregation work had already been reduced

For cases where lateral flattening was still genuinely needed, I worked with the team to move that work out of the stored procedures entirely. By leveraging the Estuary data pipeline and writing trigger functions in Snowflake, I could flatten the nested document structures once at sync time rather than on every dashboard request. That way the stored procedures could query against pre-flattened tables without paying the flattening cost at read time.

Now, instead of repeating heavy calculations for each course and metric, the stored procedures could filter the relevant operator data once, aggregate it in grouped passes, and shape the results for the dashboard.

This made the dashboards usable for the workflows they were designed to support: sales demos, customer success reporting, operator review, and admin homepage visibility. The integration tests I had built for the stored procedures were critical here. Each optimization pass changed how intermediate results were computed, and the tests ensured that the final metric values stayed accurate through every iteration.

Performance Results

The largest benchmark was lifetime metric loading for the largest multi-course operator. Before the reimplementation, individual MongoDB Charts loaded independently and could range from a few seconds to multiple minutes depending on the metric. After the reimplementation, the major dashboards loaded through coordinated Snowflake stored procedures, giving Noteefy a more predictable and usable analytics experience for large operators.

Largest MCO Lifetime Dashboard Load Times

Platform~12 seconds
Waitlist~20 seconds
Confirm~8 seconds
Performance Optimization

React Dashboard Reimplementation

The frontend was rebuilt as a custom React dashboard experience instead of a collection of embedded MongoDB Charts.

This gave me more control over layout, filtering, visual hierarchy, loading states, chart behavior, report generation, and product consistency.

The new dashboard experience included:

  • KPI cards for headline metrics
  • Product-specific dashboard sections
  • Universal date and course filtering
  • Reusable chart and table components
  • MUI X Charts for visualizations
  • Loading and empty states
  • A unified structure across Platform, Waitlist, Confirm, and AI Assistant
  • PDF report downloads
  • A homepage-style admin experience that immediately showed operators the value Noteefy was creating

The AI Assistant dashboard was already implemented differently, so that work was more about bringing it into the same admin UI structure. Platform, Waitlist, and Confirm were the primary analytics reimplementation targets.

Platform

The Platform dashboard opened with headline KPI cards for total golfers served and total economic impact, then broke down product-specific metrics for Waitlist, Confirm, and AI Assistant side by side.

Platform Performance and Product Analytics KPIsplatform performance and product analytics

Waitlist

The Waitlist dashboard surfaced revenue recaptured, engaged users, and searches created as top-level KPIs, with deeper views into confirmed booking revenue by lead time, search demand across time slots, referrer attribution, and golfer engagement metrics.

Waitlist KPIs and Confirmed Bookings Revenue Chartwaitlist KPIs and confirmed bookings revenue by lead time
Waitlist Demand Crystal Ball Chartssearch demand crystal ball, 0-30 and 30-90 day windows
Waitlist Traffic by Referrer and Engagement Metricstraffic by referrer and engagement metrics

Confirm

The Confirm dashboard led with revenue recaptured, rounds recaptured, and confirmations sent, then provided communication demand breakdowns by booking status and cancellation patterns by days before tee time.

Confirm KPIs and Communication Crystal Ball Chartconfirm KPIs and communication crystal ball
Confirm Cancellations by Days Before Tee Time and Engagement Metricscancellations by days before tee time and engagement metrics

AI Assistant

The AI Assistant dashboard tracked conversation volume, session duration, calls saved, and topic distribution, along with session ratings and booking engagement metrics.

AI Assistant Conversations, Topics Discussed, and Engagement Metricsconversations, topics discussed, and engagement metrics
AI Assistant Session Ratingssession ratings

Universal Filtering

Universal filtering was one of the most important UX improvements.

Instead of having each chart behave like its own separate report, the dashboard let users select a date range and course group once, then see the entire dashboard update around that context.

Universal Analytics Filtering with Multi-Course Operator, Course, and Date Selectionuniversal filtering: operator, course, and date selection

This was especially important for multi-course operators. A user could view one course, a subset of courses, or the full operator account without switching between disconnected reports.

It also made internal reporting easier. Sales and customer success teams could prepare a focused view for a customer conversation, then export that view into a report.

The backend complexity was hidden from the user. Different metrics used different event timestamps, but the UI presented one simple filtering experience.

That made the dashboard feel much more like a product and less like a reporting tool.

PDF Report Downloads

The PDF report feature turned the dashboard into a sales and customer success asset.

Reports were generated on the client using a custom function built with dom-to-image, pdfmake, and additional export logic.

That custom logic handled details like:

  • Capturing dashboard sections as report-ready images
  • Hiding or replacing interactive elements that did not belong in a PDF
  • Preserving the selected filter context
  • Anonymizing sensitive data when needed
  • Formatting the report so it could be shared outside the live admin interface

This mattered because Noteefy wanted the dashboard to support both internal teams and external operators. The live dashboard helped users explore the data, while the PDF report helped turn that data into a shareable business review.

Here are some example reports generated from the dashboard:

Frontend & Product

Results

The reimplementation gave Noteefy a stronger analytics foundation and a much more useful admin homepage.

The biggest wins were:

Page visit comparison showing new analytics pages peaking at 200 weekly pageviews vs 21 for the old admin analyticsNew analytics pages: 200 weekly pageviews vs. old admin analytics: 21

The page visit data confirmed that the new dashboard was not just a better reporting tool. It changed how users engaged with the admin application. The old analytics pages faded out naturally as operators and internal teams adopted the new experience as their default.

The result was a dashboard that did not just report data. It helped Noteefy communicate product value.

We hired Neel to help us with some AI and data initiatives. We needed somebody who could solve problems for us really quickly and that's exactly what he was able to do. What really impressed me with Neel was his ability to ramp up on our code base and gain context so quickly. He's very much a self starter and was able to take control and lead the project we needed.

Dathan Wong
Dathan WongCo-Founder, CTO, Noteefy

Conclusion

This project was a great example of combining backend architecture, data modeling, frontend development, and product strategy into one focused reimplementation.

Noteefy needed more than better charts. They needed an analytics experience that could scale to large multi-course operators, support internal sales and customer success workflows, and give operators a clearer reason to engage with the admin application.

By moving the analytics layer into Snowflake stored procedures, connecting it through a Python FastAPI microservice, rebuilding the frontend in React, adding universal filtering, using MUI X Charts for visualizations, and supporting PDF report downloads, I turned a fragmented reporting experience into a central part of the product.

The final result was a faster, more flexible, and more persuasive analytics dashboard that helped Noteefy tell a clearer story about recovered revenue, protected rounds, golfer engagement, and operator value.