Summary

System architect and AI infrastructure engineer. Defines system architecture, technical direction, and infrastructure standards for agent systems and foundation model infrastructure at Kuona, providing architectural guidance across engineering and data science teams.

Track record of replacing fragmented, per-client systems with generalized platforms that compound over time — in agent execution, distributed ML training (500+ servers), and forecasting infrastructure. Architectures defined for two major product lines have become organizational standards followed across teams. Fulbright–García Robles Scholar; M.S. Data Science from NYU; dual degrees from UNAM.

Technical Scope

  • Agent Systems Architecture — System design for multi-agent execution, orchestration, context management, plan validation, and evaluation infrastructure.
  • Distributed ML Infrastructure — Distributed training platforms, custom task orchestration, heterogeneous compute management, and horizontal scalability across 500+ servers.
  • Foundation Model Infrastructure — Composable ETL pipelines, scalable inference infrastructure, modular per-client adaptation, and generalized forecasting systems.
  • Platform & Infrastructure Engineering — Reproducible environments, ephemeral infrastructure, development-to-production consistency, and standardized deployment workflows.
  • Architectural Leadership & Technical Direction — Cross-team architectural guidance, system design review, and infrastructure standards definition across engineering and data science teams.

Education

New York University 2023 – 2025
M.S. Data Science, Center for Data Science, Courant Institute of Mathematical Sciences
GPA: 3.88 / 4.0

Fulbright–García Robles Scholar (2022).

National Autonomous University of Mexico (UNAM) 2016 – 2022
B.Eng. Computer Engineering — School of Engineering GPA: 9.18 / 10
B.Sc. Data Science — Institute of Research in Applied Math & Systems (IIMAS) GPA: 9.44 / 10
Specialization Certificate in Finance — School of Accounting & Administration GPA: 9.57 / 10
Honors: UNAM PAPIIT Research Scholarship Grantee — IN100719 (2020), IA104720 (2020–2021)

Products & Systems

ExpertAI

Agentic Platform • 2024–Present

Multi-agent analytical platform for autonomous analysis over tabular data. 90% task success rate; reduces cross-silo analytical queries from hours to minutes. Used by global retailers and manufacturers across 5 database backends.

Foundation Model for Market Panel Data

Research • 2025–Present

World model for market panel data with JEPA-style training objective. Modular per-client adapters eliminate hours-long retraining cycles.

Grafa

Knowledge Graph Engine • 2024

Knowledge graph engine giving LLMs structured memory over heterogeneous documents. Sub-second federated retrieval, ~95% recall, and a significant reduction in token consumption.

KQuery

NL-to-SQL Engine • 2024

Natural-language-to-SQL engine across five database backends. ~95% SQL generation accuracy, sub-minute cross-source query generation.

KNER

Entity Resolution Pipeline • 2024

Multi-agent entity resolution across all company entity systems: products, geographies, promotions, brands, and ontological concepts. Resolves a query in 10–20 seconds, including disambiguation.

ExpertAI Visualization System

AI Visualization Pipeline • 2024–Present

Agent-driven data visualization that autonomously selects chart types, maps visual encodings, and renders interactive charts. Eliminates manual BI configuration for non-technical users.

UserAI

Evaluation Harness • 2025

Autonomous evaluation harness driving multi-turn agent sessions end-to-end. CI-integrated regression detection for failure modes invisible in single-turn testing.

Perfect Order

Demand Forecasting Platform • 2022–2024

Unified distributed ML training platform replacing fragmented per-client systems. 500+ servers, thousands of model training jobs weekly. Pre-modeling data governance checks catch data errors upstream, saving hundreds of GPU hours in wasted training.

Oort

Task Orchestration System • 2022–2024

Custom task orchestration system for heterogeneous distributed computing with arbitrary horizontal scalability and a UNIX-like process control interface for remote task management.

BM-Forecasting

Forecasting Library • 2022–2024

Time series forecasting library with transparent multi-horizon predictions, automatic hyperparameter search, and native ensemble model support.

Experience

Kuona — System Architect, Forecasting & Foundation Model Infrastructure 2025 – Present
Platform Architecture (USA) • Led team of 7
  • Defined the system decomposition and architectural structure for Kuona's forecasting platform, redesigning it from client-specific pipelines into generalized, reusable infrastructure. This architecture has been followed by the data science team for approximately two years and serves as the basis for the company's foundation model development.
  • Decoupled system components into composable ETL pipelines, eliminating hardcoded client-specific workflows. Promoted anomaly detection and outlier tracking to first-class persistent training signals across datasets. Removed architectural bottlenecks and standardized infrastructure across forecasting systems.
  • Designed generalized inference infrastructure supporting immediate forecasting without per-client retraining, with modular per-client adapters eliminating 1–12 hour retraining cycles. Reduced onboarding time for new client data from days to hours. Real-time forecasting capabilities enabled by this architecture are in active development.
  • Leading the design of a JEPA-style energy-based foundation model on this platform — learning compact representations of unobservable market dynamics via synthetic supervision, supporting multi-horizon forecasting, intervention simulation, and causal probing.
Kuona — AI Lead & Agent Systems Architect 2024 – Present
Agent Systems Architecture (USA) • Led team of 4
  • Defined the ExpertAI system architecture, now the required foundation for all agent initiatives at Kuona: every agent-based system follows its execution model, infrastructure, and architectural structure.
  • Other teams build on this architecture and depend on the orchestration, context management, plan validation, evaluation, and resilience patterns established in ExpertAI.
  • Took ExpertAI from concept to production in under a year: a multi-agent analytical platform with a 90% task success rate that reduces cross-silo queries from hours to minutes across 5 database backends. Full stack: Python/FastAPI + Python/TypeScript/Django backend, React frontend, WebSocket real-time communication, RabbitMQ/Kombu messaging, AWS EKS deployment.
  • Designed the evaluation and observability infrastructure for agent systems: agents represented as directed graphs with automated failure attribution to specific nodes, component ranking by downstream harm, and deterministic replay for counterfactual experiments.
  • Designed formal tool-planning framework where agents generate multi-step plans verified via PDDL-style preconditions and effects, with bypass auditing that explicitly records safety exception decisions for post-hoc analysis.
  • Implemented deterministic plan reuse and semantic grounding mechanisms, enabling consistent execution, controlled plan adaptation, and resolution of ambiguous inputs into executable specifications.
  • Built Grafa: knowledge graph engine providing LLMs structured memory over heterogeneous documents. Sub-second federated retrieval, ~95% recall. Multi-tenant isolation with hook-based extensibility. Deployed as shared infrastructure across agent products.
  • Built KQuery: NL-to-SQL engine across five database backends, ~95% SQL accuracy. Plans and verifies the full query hierarchy upfront, then runs all queries in parallel as dependencies resolve. Generation rules stored as data in the knowledge graph, scoped by platform, company, or user.
  • Built KNER: multi-agent entity resolution pipeline resolving free-text queries against all company entity systems — products, geographies, promotions, brands, ontological concepts — in 10–20 seconds including disambiguation. Implemented as nested LangGraph state machines with four specialized agents.
  • Designed UserAI, an autonomous evaluation harness driving multi-turn agent sessions end-to-end for CI-integrated regression detection of failure modes invisible in single-turn testing.
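
As an illustration of the PDDL-style plan verification mentioned in the tool-planning bullet above, the following is a minimal sketch, not the production framework. It assumes a plan is a list of actions, each with symbolic preconditions and add/delete effects, and checks the plan by simulating it against an initial set of facts. The `Action` class, `validate_plan` function, tool names, and fact names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A tool call described by the facts it requires and the facts it changes."""
    name: str
    preconditions: frozenset[str]
    add_effects: frozenset[str]
    del_effects: frozenset[str] = frozenset()


def validate_plan(initial_state, plan):
    """Simulate the plan symbolically.

    Returns (ok, failing_step_index, missing_facts). The plan is valid when
    every action's preconditions hold in the state produced by earlier steps.
    """
    state = set(initial_state)
    for index, action in enumerate(plan):
        missing = action.preconditions - state
        if missing:
            return False, index, set(missing)
        # Apply effects: remove deleted facts first, then add new ones.
        state -= action.del_effects
        state |= action.add_effects
    return True, None, set()


# Hypothetical tools for an analytical agent (names are assumptions).
resolve = Action("resolve_entities", frozenset({"question"}), frozenset({"entities_resolved"}))
generate = Action("generate_sql", frozenset({"entities_resolved"}), frozenset({"sql_ready"}))
execute = Action("run_query", frozenset({"sql_ready"}), frozenset({"rows_available"}))

print(validate_plan({"question"}, [resolve, generate, execute]))
print(validate_plan({"question"}, [generate, execute]))
```

The first plan passes. The second is rejected at step 0 because `entities_resolved` is never established, which is the kind of error this style of check catches before any tool runs.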
Sketchpro.ai — Research Scientist Intern 2024
USA
  • Built vision-LLM quality monitoring systems and automated data acquisition pipelines over CommonCrawl, producing a ~700 GB high-quality dataset. Designed the full training pipeline for a custom SDXL ControlNet for conditional generation.
Kuona — Data Scientist & ML Engineer 2022 – 2024
Distributed ML Infrastructure & Platform Engineering (Mexico)
  • Replaced Kuona's fragmented per-client training-on-demand systems with a unified, generalized distributed ML training platform — deprecating legacy systems entirely and deploying a single platform spanning 500+ servers, supporting thousands of model training jobs weekly, with orchestration infrastructure, training pipelines, and monitoring systems.
  • The replacement reduced onboarding time for new configurations and clients from weeks to days, improved system scalability and reliability, and improved forecasting accuracy through standardized training infrastructure.
  • Designed and implemented a custom task orchestration system with native support for heterogeneous distributed computing, arbitrary horizontal scalability, and a UNIX-like process control interface for remote task management.
  • Led a 3-person team as lead architect in the redesign of Kuona's Perfect Order demand forecasting system. Introduced pre-modeling data governance checks that caught data errors upstream, saving hundreds of GPU hours in wasted training.
  • Established engineering infrastructure and practices adopted across engineering and data science teams: eliminated development-to-production drift, implemented ephemeral environments and reproducible infrastructure, and standardized workflows and deployment environments across production ML systems.
  • Designed and implemented a time series forecasting library with transparent multi-horizon predictions, automatic hyperparameter search, and native ensemble model support.

Skills & Technologies

Programming Languages
Python Julia C/C++ CUDA Elixir
ML & Data
PyTorch TensorFlow Flux.jl Spark LangGraph LLMs/Prompting
Databases
SQL Neo4j Redis MongoDB
Systems & Infrastructure
Multi-Agent Systems Knowledge Graphs Kubernetes EKS AWS (Advanced) Azure Databricks Prometheus

Personal Projects

BakePy

Open Source • 2022

Python library for creating good-looking reports programmatically without templates or complex layout systems. Automatically transforms Python objects (Matplotlib figures, Pandas DataFrames) into HTML using Bootstrap 5's grid for layout. MIT licensed.

Academic Projects

AIMPAC Refactoring

Research • 2021–2022

Led a multidisciplinary group that built a Julia language replacement for the AIMPAC software suite for describing the quantum structure of molecules. Parallel, GPU-ready replacement for the original Fortran code. Collaboration between UNAM's IIMAS and School of Chemistry.

PSO Supply Chain Optimization

Research • 2021

Parallel implementation for supply chain optimization on Nvidia GPUs using CUDA and Julia. Binary particle swarm optimization algorithm with conservation-of-flow constraints and LP-based cost evaluation.
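
The binary particle swarm update named above can be illustrated with a small, sequential, CPU-only sketch. It is not the project's CUDA/Julia implementation. It uses the standard sigmoid transfer rule for binary PSO, and `toy_cost` is a hypothetical stand-in for the LP-based cost evaluation: it picks which facilities to open and penalizes unmet demand, ignoring flow conservation. All names, data, and parameter values are assumptions.

```python
import math
import random

# Hypothetical toy data: four candidate facilities and one demand figure.
CAPACITY = [6, 5, 4, 3]
FIXED_COST = [9, 6, 5, 2]
DEMAND = 9


def toy_cost(bits):
    """Fixed cost of opened facilities plus a heavy penalty per unit of unmet demand."""
    capacity = sum(c for c, b in zip(CAPACITY, bits) if b)
    shortfall = max(0, DEMAND - capacity)
    return sum(f for f, b in zip(FIXED_COST, bits) if b) + 100 * shortfall


def binary_pso(cost, n_bits, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Minimize `cost` over bit vectors; returns (best_bits, best_cost)."""
    rng = random.Random(seed)
    xs = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vs = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_cost = [cost(x) for x in xs]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                v = (w * vs[i][d]
                     + c1 * r1 * (pbest[i][d] - xs[i][d])
                     + c2 * r2 * (gbest[d] - xs[i][d]))
                vs[i][d] = max(-v_max, min(v_max, v))  # clamp velocity
                # Sigmoid transfer: velocity sets the probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-vs[i][d]))
                xs[i][d] = 1 if rng.random() < prob_one else 0
            c = cost(xs[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = xs[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = xs[i][:], c
    return gbest, gbest_cost


best, best_cost = binary_pso(toy_cost, n_bits=len(CAPACITY))
print(best, best_cost)
```

Each particle's cost evaluation is independent of the others within an iteration, which is the property that makes this algorithm a natural fit for GPU parallelism.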

Academic Collaboration Prediction

Research • 2020–2021

Predicted future academic collaborations from topological data, improving on the results of Hasan et al. (2006) with a reduced feature vector obtained via SVD on the collaboration network's adjacency matrix.

Publications & Preprints

Agarwal, V., Manasson, J., Garrido Czacki, M., & Sucholutsky, I. (2025)
ICLR 2025 Re-Align Workshop (Poster)
Proposed and built a pipeline for extracting hierarchical latent representations from olfactory bulb imagery; repurposed medically finetuned SAM2 for generalization, selected encoder-level features, and applied optimal transport to align and compare latent geometry across subjects.

Awards & Honors

1st Place — Energy-Based Modeling Competition (NYU, Yann LeCun; 1st of 53). Compact ~20K-parameter physics-informed model outperformed ~3M-parameter ViT approaches via structural inductive biases. 2025
Fulbright–García Robles Scholar 2022
UNAM PAPIIT Research Scholarship — Project IA104720 (MCMC Methods for Large-Scale Linear Systems) 2020–2021
1st Place — UNAM School of Engineering VLSI Design Competition 2020
UNAM PAPIIT Research Scholarship — Project IN100719 (Predictive Models Applied to Graphs and Text) 2020
2nd Place — First UNAM Impulse to Innovation Contest 2018
Telmex Foundation Scholarship for Academic Excellence 2017
UNAM Data Science B.Sc. Alumni Association — Founding Member
UNAM Data Science B.Sc. Academic Council — First Class Student Representative

Languages

Fulbright Scholar from Mexico to the USA. Multicultural background reflected in daily working fluency across languages and professional contexts.

Spanish Native
English Fluent
Japanese Intermediate
Chinese Beginner