Industry: Web3 / Blockchain
Service: Databricks Genie Implementation
Refine and Expand High‑Intent Marketing Audiences in Real Time, at Scale
Stack: Google Cloud, BigQuery, Dataproc, Vertex AI, Spark
March 24, 2026
The Challenge
With hundreds of millions of users, marketing teams at digital wallet platforms face a major personalization challenge. Each user holds a unique mix of tokens and NFTs (blockchain-based digital assets) and shows different behavior patterns. Adding to the complexity, a single wallet can contain more than 20,000 distinct data points. Together, these factors put audience segmentation and identification far beyond the reach of simple filtering approaches and make the underlying technical problem hard.
The client wanted an AI system that could take the marketing team’s segment criteria from the CRM, identify similar wallets, and instantly generate a target audience for marketing initiatives.
The Product
A real-time recommendation engine was designed and deployed on Google Cloud, using a ScaNN index for similarity matching across 65 million wallets.
The system pulls wallet data from BigQuery, including token holdings, NFT assets, and behavioral signals, and passes it through a dimensionality reduction model that compresses a feature space of 100,000 dimensions into a compact, meaningful representation. This reduced feature set is then indexed using Vertex AI’s Matching Engine, allowing the system to query any wallet ID and return its nearest neighbors instantly.
The output is a target group of wallets that share similar characteristics, which can be used directly for personalized recommendations or audience segmentation.
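The flow described above (reduce the feature space, then look up nearest neighbors by wallet ID) can be illustrated with a minimal sketch. It assumes PCA as the reduction step and swaps the ScaNN / Matching Engine index for an exact brute-force cosine search in NumPy, so it shows the idea rather than the deployed system. The function names, the random toy data, and the wallet IDs are all hypothetical.

```python
import numpy as np

def fit_pca(features, n_components):
    """Fit PCA via SVD; return the feature mean and the top principal directions."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def embed(features, mean, components):
    """Project onto the principal directions and L2-normalize each row."""
    projected = (features - mean) @ components.T
    norms = np.linalg.norm(projected, axis=1, keepdims=True)
    return projected / np.maximum(norms, 1e-12)

def nearest_wallets(embeddings, wallet_ids, query_id, k=5):
    """Exact top-k by cosine similarity, excluding the query wallet itself."""
    idx = wallet_ids.index(query_id)
    scores = embeddings @ embeddings[idx]  # rows are unit length, so this is cosine
    scores[idx] = -np.inf                  # never return the query as its own neighbor
    top = np.argsort(-scores)[:k]
    return [(wallet_ids[i], float(scores[i])) for i in top]

# Toy stand-in for wallet features (assumption: 200 wallets, 50 raw features).
rng = np.random.default_rng(0)
features = rng.random((200, 50))
wallet_ids = [f"wallet_{i}" for i in range(200)]

mean, components = fit_pca(features, n_components=8)
embeddings = embed(features, mean, components)
neighbors = nearest_wallets(embeddings, wallet_ids, "wallet_0", k=5)
```

At production scale the brute-force search would be replaced by an approximate index, which trades a small amount of recall for much lower query latency.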
Technical Challenges Worth Noting
Sparse vector problem
Wallets with high-value but narrow portfolios pulled recommendations toward other high-value wallets, skewing results. We addressed this by analyzing PCA feature distributions and adjusting the distance calculations to produce more balanced, relevant matches.
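The exact distance adjustment is not described here. One plausible approach, shown as a sketch under that assumption, is to inspect the spread of each PCA component and rescale every component to unit variance before computing distances, so that a single high-magnitude component cannot dominate the match. The function names and toy data are hypothetical.

```python
import numpy as np

def component_spread(embeddings):
    """Per-component standard deviation, used to spot dominating components."""
    return embeddings.std(axis=0)

def standardized_distances(embeddings, query_index):
    """Euclidean distances after scaling every component to unit variance."""
    scale = np.maximum(component_spread(embeddings), 1e-12)
    scaled = embeddings / scale
    return np.linalg.norm(scaled - scaled[query_index], axis=1)

# Toy data (assumption): component 0 has a far larger spread than the others,
# mimicking a value-driven component that swamps everything else.
rng = np.random.default_rng(1)
embeddings = rng.normal(size=(100, 4)) * np.array([1000.0, 1.0, 1.0, 1.0])

spread = component_spread(embeddings)
raw = np.linalg.norm(embeddings - embeddings[0], axis=1)  # dominated by component 0
balanced = standardized_distances(embeddings, 0)          # all components weigh equally
```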
Scaling from prototype to production
The initial proof of concept ran on a single compute node. Moving to Spark allowed the pipeline to scale and handle the full dataset efficiently, making the system production-ready rather than a demo.
End-to-end verification
Before scaling up, we ran the full pipeline on a sample dataset to validate accuracy at each step. These validations surfaced issues early and gave the client much more confidence in the results before committing to full deployment.
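How accuracy was measured at each step is not stated. A common check for a nearest-neighbor index, sketched here as an assumption, is recall@k: on a sample small enough for exact search, compare the neighbors returned by the approximate index with the exact ones. The function name and the sample wallet IDs are hypothetical.

```python
def recall_at_k(exact, approx, k):
    """Mean fraction of the exact top-k neighbors that the approximate index also returns."""
    total = 0.0
    for wallet_id, true_neighbors in exact.items():
        expected = set(true_neighbors[:k])
        returned = set(approx.get(wallet_id, [])[:k])
        total += len(expected & returned) / k
    return total / len(exact)

# Hypothetical sample: exact results from brute force, approximate results from the index.
exact = {"w1": ["w2", "w3", "w4"], "w2": ["w1", "w3", "w5"]}
approx = {"w1": ["w2", "w4", "w9"], "w2": ["w1", "w3", "w5"]}

recall = recall_at_k(exact, approx, k=3)  # w1 matches 2 of 3, w2 matches 3 of 3
```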
The Result
The deployed engine can query any wallet in the database and return a ranked list of its nearest neighbors in real time, enabling dynamic audience generation at a scale that was not possible before.
Marketing and product teams can now generate highly targeted audiences based on wallet behavior and holdings rather than broad demographic segments or manual curation. These segments are accurate, live, and configurable at massive scale—enabling campaigns to reach the right users with far less manual work and in a fraction of the time. The result is improved conversion and ROI, reduced costs, and full end‑to‑end data governance.
Turn your data into high-intent audiences—instantly.
Let’s build a real-time AI engine tailored to your product and scale.

