Early-career SRE turned ML Engineer working on tools and AI/ML infrastructure. Currently contributing to Apple’s MLX.
GitHub ↗
manuel@villanuev.com ↗
X / Twitter ↗
Substack ↗
Experience
CloudOps Analyst (AI & Security)
Pragma
September 2024 - Present
Hybrid, Panama City
- Developed ETL and RAG pipelines for 1,000+ internal documents, boosting retrieval ranking quality (nDCG +18%).
- Led AI red teaming for a GRC platform with clients in over 110 countries, achieving a 50% increase in successful guardrail activations during adversarial attacks.
- Reduced Time To First Token by 64.7% for Legal & Compliance Applications.
- Designed an internal platform for generating labeled datasets used for supervised learning.
Site Reliability Engineer, Intern
BAC
January 2024 - August 2024
Panama City
- Implemented SLOs/SLIs, managing over 1,600 L1-L6 monitoring metrics for infrastructure and services, resulting in improved system reliability.
- Achieved a 50% reduction in alerts and a 75.2% reduction in down hosts over 6 months by optimizing monitoring strategies.
- Resolved misconfiguration-related SNMPv3 issues in multiple VMware ESXi clusters, enhancing network stability and performance.
- Created 5 observability dashboards for the NOC to visualize data center and VPN traffic.
DevOps Engineer, Intern
Triangulum S.A.
January 2023 - February 2023
San José, Costa Rica
- Implemented AI best practices for DevOps/SRE, which decreased Time To Production (TTP) by 20%.
- Took part in incident response for DoS attacks.
- Refactored K8s clusters from Docker to Podman.
Projects
Apple MLX ↗
• Contributor adding CUDA support to MLX, working on the FFT CUDA backend.
• Modified mx.sort() behavior in the CPU and Metal backends to match IEEE-754-based NaN sorting behavior.
• Added quality-of-life improvements such as #2689 (einsum error message improvement) and #2690 (fixed type annotations in stubs).
Apple Silicon RDMA Simulation Testbed
• Deployed a K3s cluster on LimaVM on an M4 Mini, using it to run AI/ML workloads.
• Enabled Soft-RoCE to simulate RDMA behavior and set up Cilium Hubble for eBPF-enabled network observability.
Gear9 ↗
• Designed a digital pit wall that works with the F1 23 video game, streaming the game data via UDP.
• Created a data streaming pipeline hosted on AWS for real-time data during races.
• Data is collected and stored in Prometheus, with all data visualization in Grafana.
Publications
Guide for Preparing and Responding to Deepfake Events ↗
(Co-authored with OWASP Members)
Defined deepfake incident-response playbooks with detection, containment, eradication, and recovery workflows for fraud, impersonation, and misinformation scenarios.
GenAI Red Teaming Guide ↗
(Co-authored with OWASP Members)
Designed an adversarial testing blueprint (Model, Implementation, System, Runtime) integrated into CI/CD to fortify AI services against attacks. Second most downloaded OWASP Top 10 for LLMs white paper.
LLM & GenAI Data Security Best Practices ↗
(Co-authored with OWASP Members)
Defined defense measures including layered encryption, SIEM/XDR monitoring, secure data flows for agentic LLMs, and governance frameworks to ensure integrity and reliability of AI data pipelines.
Education
Technological University of Panama
BSc. in Computer Science & Systems Engineering
Skills
Systems, Security & Observability
- Linux
- macOS
- eBPF
- DNS
- BGP
- PKI
- RoCE, InfiniBand
- ELK Stack
- Prometheus, Grafana
- Datadog
Programming & Automation
- C++
- CMake
- Python
- FastAPI
- Go
- Swift
- SQL
- Terraform
- Ansible
- Jenkins
Cloud & Container Orchestration
- AWS (EC2, S3, IAM, Bedrock, SageMaker)
- GCP (BigQuery, Vertex AI)
- Azure (AI Foundry, AzureML)
- Docker
- Podman
- Kubernetes
- Cilium
- FluxCD
AI/ML & Data Platforms
- PyTorch
- MLX
- CUDA
- RAG Pipelines
- Model Quantization
- LoRA / QLoRA
- RLHF
- Apache Spark
- Databricks
Certifications
NVIDIA Certified Professional - AI Infrastructure (NCP-AII) ↗
AWS Certified AI Practitioner (Early Adopter) ↗
Advanced Deep Learning Specialist (IBM) ↗
AI Red Teaming Professional (AIRTP+) (LearnPrompting) ↗
NASA TOPS-T ScienceCore AI/ML in Space Biology (NASA Ames Research Center)