Summary

System architect and AI infrastructure engineer. Lead scientist on a custom JEPA Transformer Encoder–World Model architecture for market panel data. Defines system architecture, technical direction, and infrastructure standards for agent systems and the surrounding foundation model platform at Kuona, and provides architectural guidance across engineering and data science teams.

Track record of replacing fragmented, per-client systems with generalized platforms that compound over time — in agent execution, distributed ML training (500+ servers), and forecasting infrastructure. Architectures defined for two major product lines have become organizational standards followed across teams. Fulbright–García Robles Scholar; M.S. Data Science from NYU; dual degrees from UNAM.

Technical Scope

  • Agent Systems Architecture — System design for multi-agent execution, orchestration, context management, plan validation, and evaluation infrastructure.
  • Distributed ML Infrastructure — Distributed training platforms, custom task orchestration, heterogeneous compute management, and horizontal scalability across 500+ servers.
  • Foundation Model Architecture & Infrastructure — Custom JEPA Transformer Encoder–World Model architecture and training objective design; composable ETL pipelines, scalable inference infrastructure, modular per-client adaptation, and generalized forecasting systems built around it.
  • Platform & Infrastructure Engineering — Reproducible environments, ephemeral infrastructure, development-to-production consistency, and standardized deployment workflows.
  • Architectural Leadership & Technical Direction — Cross-team architectural guidance, system design review, and infrastructure standards definition across engineering and data science teams.

Education

New York University 2023 – 2025
M.S. Data Science, Center for Data Science, Courant Institute of Mathematical Sciences
GPA: 3.88 / 4.0
Fulbright–García Robles Scholar (2022)
Course Assistant, DS-GA 1012 Natural Language Understanding and Computational Semantics (Prof. Tal Linzen, Spring 2025)
National Autonomous University of Mexico (UNAM) 2016 – 2022
B.Eng. Computer Engineering — School of Engineering GPA: 9.18 / 10
B.Sc. Data Science — Institute of Research in Applied Math & Systems (IIMAS) GPA: 9.44 / 10
Specialization Certificate in Finance — School of Accounting & Administration GPA: 9.57 / 10
Honors: UNAM PAPIIT Research Scholarship Grantee — IN100719 (2020), IA104720 (2020–2021)

Products & Systems

ExpertAI

Agentic Platform • 2024–Present

Multi-agent analytical platform for autonomous analysis over tabular data. 90% task success rate; reduces cross-silo analytical queries from hours to minutes. Used by global retailers and manufacturers across 5 database backends.

Foundation Model for Market Panel Data

Research • 2025–Present

Custom JEPA Transformer Encoder–World Model architecture — an energy-based, joint-embedding predictive model designed from the ground up for market panel data. Learns compact representations of unobservable market dynamics and is the single foundation underlying all forecasting, intervention simulation, and causal probing tasks across the platform. Lead scientist on the architecture itself and on its training objective; the surrounding infrastructure supports immediate inference via modular per-client adapters without retraining.

Grafa

Knowledge Graph Engine • 2024

Knowledge graph engine giving LLMs structured memory over heterogeneous documents. Sub-second federated retrieval, ~95% recall, and significantly reduced token consumption.

KQuery

NL-to-SQL Engine • 2024

Natural-language-to-SQL engine across five database backends. ~95% SQL generation accuracy, sub-minute cross-source query generation.

KNER

Entity Resolution Pipeline • 2024

Multi-agent entity resolution across all company entity systems (products, geographies, promotions, brands, ontological concepts). 10–20 seconds including disambiguation.

ExpertAI Visualization System

AI Visualization Pipeline • 2024–Present

Agent-driven data visualization that autonomously selects chart types, maps visual encodings, and renders interactive charts. Eliminates manual BI configuration for non-technical users.

UserAI

Evaluation Harness • 2025

Autonomous evaluation harness driving multi-turn agent sessions end-to-end. CI-integrated regression detection for failure modes invisible in single-turn testing.

Perfect Order

Demand Forecasting Platform • 2022–2024

Unified distributed ML training platform replacing fragmented per-client systems. 500+ servers, thousands of model training jobs weekly. Pre-modeling data governance checks catch data errors upstream, saving hundreds of GPU hours that would otherwise be wasted on training.

Oort

Task Orchestration System • 2022–2024

Custom task orchestration system for heterogeneous distributed computing with arbitrary horizontal scalability and a UNIX-like process control interface for remote task management.

BM-Forecasting

Forecasting Library • 2022–2024

Time series forecasting library with transparent multi-horizon predictions, automatic hyperparameter search, and native ensemble model support.

Experience

Kuona — System Architect, Forecasting & Foundation Model Architecture 2025 – Present
Platform Architecture (USA) • Led team of 7
  • Lead scientist on a custom JEPA Transformer Encoder–World Model architecture for market panel data — designed the model itself: encoder topology, predictor, joint-embedding objective with self-supervised and semi-supervised targets, and energy-based training formulation. Learns compact representations of unobservable market dynamics, and serves as the single world model underlying all forecasting, intervention simulation, and causal probing tasks across the platform.
  • Defined the system decomposition and architectural structure for Kuona's forecasting platform, redesigning it from client-specific pipelines into the generalized infrastructure that serves this foundation model. This architecture has been followed by the data science team for approximately two years.
  • Decoupled system components into composable ETL pipelines, eliminating hardcoded client-specific workflows. Promoted anomaly detection and outlier tracking to first-class persistent training signals across datasets. Removed architectural bottlenecks and standardized infrastructure across forecasting systems.
  • Designed generalized inference infrastructure supporting immediate forecasting without per-client retraining, with modular per-client adapters eliminating 1–12 hour retraining cycles. Reduced onboarding time for new client data from days to hours.
Kuona — AI Lead & Agent Systems Architect 2024 – Present
Agent Systems Architecture (USA) • Led team of 4
  • Defined the ExpertAI system architecture, now the required foundation for all agent initiatives at Kuona: every agent-based system follows its execution model, infrastructure, and architectural structure.
  • Other teams build on top of this architecture, relying on the execution, orchestration, context management, plan validation, evaluation, and resilience patterns established in ExpertAI.
  • Took ExpertAI from concept to production in under a year: a multi-agent analytical platform with a 90% task success rate that cuts cross-silo queries from hours to minutes across 5 database backends. Full-stack: Python/FastAPI + Python/TypeScript/Django backend, React frontend, WebSocket real-time communication, RabbitMQ/Kombu messaging, AWS EKS deployment.
  • Designed the evaluation and observability infrastructure for agent systems: agents represented as directed graphs with automated failure attribution to specific nodes, component ranking by downstream harm, and deterministic replay for counterfactual experiments.
  • Designed a formal tool-planning framework in which agents generate multi-step plans verified via PDDL-style preconditions and effects, with bypass auditing that explicitly records safety exception decisions for post-hoc analysis.
  • Implemented deterministic plan reuse and semantic grounding mechanisms, enabling consistent execution, controlled plan adaptation, and resolution of ambiguous inputs into executable specifications.
  • Built Grafa: knowledge graph engine providing LLMs structured memory over heterogeneous documents. Sub-second federated retrieval, ~95% recall. Multi-tenant isolation with hook-based extensibility. Deployed as shared infrastructure across agent products.
  • Built KQuery: NL-to-SQL engine across five database backends, ~95% SQL accuracy. Plans and verifies the full query hierarchy upfront, then runs all queries in parallel as dependencies resolve. Generation rules stored as data in the knowledge graph, scoped by platform, company, or user.
  • Built KNER: multi-agent entity resolution pipeline resolving free-text queries against all company entity systems — products, geographies, promotions, brands, ontological concepts — in 10–20 seconds including disambiguation. Implemented as nested LangGraph state machines with four specialized agents.
  • Designed UserAI, an autonomous evaluation harness driving multi-turn agent sessions end-to-end for CI-integrated regression detection of failure modes invisible in single-turn testing.
Sketchpro.ai — Research Scientist Intern 2024
USA
  • Built the curation pipeline over Recap-DataComp-1B, producing a domain-curated dataset for SDXL training. Two-stage filter: metadata-only dimension/aspect-ratio screen, then MobileCLIP content filtering on hundreds of parallel Modal CPU workers with iteratively tuned positive/negative label sets, manually verified across ~120k images. Outputs bucketed into SDXL's native training resolutions and tagged with setting (indoor/outdoor) for targeted fine-tunes.
  • Built a watermark detection cascade — domain blocklist, cheap high-recall detector, GPT-4o gate on survivors — keeping API cost bounded while labeling watermark presence as a per-sample field, so watermarked data could either be filtered or conditioned on at training time.
  • Designed the full training pipeline for a custom SDXL ControlNet for domain-specific controllable generation. Decomposed scenes into structured conditions — depth, scribble, line art, albedo, structural edges, shadows — with routing logic that selects the appropriate condition set per input class (depth for 3D/interior renders, scribble for hand-drawn sketches, precise line art otherwise) and toggleable color preservation and structure-locking based on user intent.
  • Ran A/B testing and automatic prompt optimization with DSPy across image generation pipelines, tuning prompts against quality and task-adherence metrics to systematically improve output over hand-written baselines.
  • Instrumented end-to-end OpenTelemetry observability spanning frontend, backend services, and ML inference servers, giving unified traces across user actions, orchestration, and model calls.
Kuona — Data Scientist & ML Engineer 2022 – 2024
Distributed ML Infrastructure & Platform Engineering (Mexico)
  • Replaced Kuona's fragmented per-client training-on-demand systems with a unified, generalized distributed ML training platform — deprecating legacy systems entirely and deploying a single platform spanning 500+ servers, supporting thousands of model training jobs weekly, with orchestration infrastructure, training pipelines, and monitoring systems.
  • The replacement reduced onboarding time for new configurations and clients from weeks to days, improved system scalability and reliability, and improved forecasting accuracy through standardized training infrastructure.
  • Designed and implemented a custom task orchestration system with native support for heterogeneous distributed computing, arbitrary horizontal scalability, and a UNIX-like process control interface for remote task management.
  • Led a 3-person team as lead architect in the redesign of Kuona's Perfect Order demand forecasting system. Introduced pre-modeling data governance checks that caught data errors upstream, saving hundreds of GPU hours that would otherwise have been wasted on training.
  • Established engineering infrastructure and practices adopted across engineering and data science teams: eliminated development-to-production drift, implemented ephemeral environments and reproducible infrastructure, and standardized workflows and deployment environments across production ML systems.
  • Designed and implemented a time series forecasting library with transparent multi-horizon predictions, automatic hyperparameter search, and native ensemble model support.

Skills & Technologies

Programming Languages
Python Julia C/C++ CUDA Elixir
ML & Data
PyTorch TensorFlow Flux.jl Spark LangGraph LLMs/Prompting
Databases
SQL Neo4j Redis MongoDB
Systems & Infrastructure
Multi-Agent Systems Knowledge Graphs Kubernetes EKS AWS (Advanced) Azure Databricks Prometheus

Personal Projects

BakePy

Open Source • 2022

Python library for creating good-looking reports programmatically without templates or complex layout systems. Automatically transforms Python objects (Matplotlib figures, Pandas DataFrames) into HTML using Bootstrap 5's grid for layout. MIT licensed.

Academic Projects

AIMPAC Refactoring

Research • 2021–2022

Led a multidisciplinary group that built a Julia replacement for AIMPAC, a software suite for describing the quantum structure of molecules. The new implementation is a parallel, GPU-ready replacement for the original Fortran code. Collaboration between UNAM's IIMAS and School of Chemistry.

PSO Supply Chain Optimization

Research • 2021

Parallel implementation for supply chain optimization on Nvidia GPUs using CUDA and Julia. Binary particle swarm optimization algorithm with conservation-of-flow constraints and LP-based cost evaluation.

Academic Collaboration Prediction

Research • 2020–2021

Predicted future academic collaborations from topological data, improving on the results of Hasan et al. (2006) by building a reduced feature vector via SVD on the collaboration network's adjacency matrix.

Publications & Preprints

Agarwal, V., Manasson, J., Garrido Czacki, M., & Sucholutsky, I. (2025)
ICLR 2025 Re-Align Workshop (Poster)
Proposed and built a pipeline for extracting hierarchical latent representations from olfactory bulb imagery; repurposed medically finetuned SAM2 for generalization, selected encoder-level features, and applied optimal transport to align and compare latent geometry across subjects.

Awards & Honors

1st Place — Energy-Based Modeling Competition (NYU, Yann LeCun; 1st of 53). Compact ~20K-parameter physics-informed model outperformed ~3M-parameter ViT approaches via structural inductive biases. 2025
Fulbright–García Robles Scholar 2022
UNAM PAPIIT Research Scholarship — Project IA104720 (MCMC Methods for Large-Scale Linear Systems) 2020–2021
1st Place — UNAM School of Engineering VLSI Design Competition 2020
UNAM PAPIIT Research Scholarship — Project IN100719 (Predictive Models Applied to Graphs and Text) 2020
2nd Place — First UNAM Impulse to Innovation Contest 2018
Telmex Foundation Scholarship for Academic Excellence 2017
UNAM Data Science B.Sc. Alumni Association — Founding Member
UNAM Data Science B.Sc. Academic Council — First Class Student Representative

Languages

Fulbright Scholar from Mexico to the USA. Multicultural background reflected in daily working fluency across languages and professional contexts.

Spanish Native
English Fluent
Japanese Intermediate
Chinese Beginner