Manuel Villanueva

ML Engineer

About

Early Career SRE turned ML Engineer working on Tools and AI/ML Infrastructure. Currently contributing to Apple’s MLX.

GitHub

manuel@villanuev.com

X / Twitter

LinkedIn

Substack

Experience

CloudOps Analyst (AI & Security)

Pragma

September 2024 - Present

Hybrid, Panama City

  • Developed ETL and RAG pipelines for 1,000+ internal documents, boosting retrieval ranking quality (nDCG +18%).
  • Led AI Red Teaming for a GRC Platform with clients in over 110 countries, achieving a 50% increase in successful guardrail activations during adversarial attacks.
  • Reduced Time To First Token by 64.7% for Legal & Compliance Applications.
  • Designed an internal platform for generating Labeled Datasets used for Supervised Learning.

Site Reliability Engineer, Intern

BAC

January 2024 - August 2024

Panama City

  • Implemented SLOs/SLIs, managing over 1,600 L1-L6 monitoring metrics for infrastructure and services, resulting in improved system reliability.
  • Achieved a 50% reduction in alerts and a 75.2% reduction in down hosts over 6 months by optimizing monitoring strategies.
  • Resolved SNMPv3 misconfiguration issues across multiple VMware ESXi clusters, enhancing network stability and performance.
  • Created 5 Observability Dashboards for the NOC to visualize Data Center and VPN traffic.

DevOps Engineer, Intern

Triangulum S.A.

January 2023 - February 2023

San José, Costa Rica

  • Implemented AI best practices for DevOps/SRE, which decreased Time To Production (TTP) by 20%.
  • Took part in Incident Response for DoS attacks.
  • Refactored K8s Clusters from Docker to Podman.

Projects

Apple MLX

• Contributor adding CUDA support to MLX, working on the FFT CUDA Backend.

• Modified mx.sort() behavior in CPU and Metal Backends to match NaN Sorting behavior based on IEEE-754.

• Added QoL Improvements such as #2689 (Einsum Error Msg Improvement) and #2690 (Fixed Type Annotations in Stubs).

Apple Silicon RDMA Simulation Testbed

• Deployed a K3s Cluster on LimaVM on an M4 Mini, using it to run AI/ML Workloads.

• Enabled Soft-RoCE to simulate RDMA behavior and set up Cilium Hubble for eBPF-enabled network observability.

Gear9

• Designed a Digital Pit Wall that works with the F1 23 video game, streaming game data via UDP.

• Created a Data Streaming Pipeline hosted on AWS for real-time data during racing.

• Collected and stored data in Prometheus, with all Data Visualization in Grafana.

Publications

Guide for Preparing and Responding to Deepfake Events

(Co-authored with OWASP Members)

Defined deepfake incident-response playbooks with detection, containment, eradication, and recovery workflows for fraud, impersonation, and misinformation scenarios.

GenAI Red Teaming Guide

(Co-authored with OWASP Members)

Designed an adversarial testing blueprint (Model, Implementation, System, Runtime) integrated into CI/CD to fortify AI services against attacks. Second most downloaded OWASP Top 10 for LLMs white paper.

LLM & GenAI Data Security Best Practices

(Co-authored with OWASP Members)

Defined defense measures including layered encryption, SIEM/XDR monitoring, secure data flows for agentic LLMs, and governance frameworks to ensure integrity and reliability of AI data pipelines.

Education

Technological University of Panama

BSc. in Computer Science & Systems Engineering

Skills

Systems, Security & Observability

  • Linux
  • macOS
  • eBPF
  • DNS
  • BGP
  • PKI
  • RoCE, InfiniBand
  • ELK Stack
  • Prometheus, Grafana
  • Datadog

Programming & Automation

  • C++
  • CMake
  • Python
  • FastAPI
  • Go
  • Swift
  • SQL
  • Terraform
  • Ansible
  • Jenkins

Cloud & Container Orchestration

  • AWS (EC2, S3, IAM, Bedrock, SageMaker)
  • GCP (BigQuery, Vertex AI)
  • Azure (AI Foundry, AzureML)
  • Docker
  • Podman
  • Kubernetes
  • Cilium
  • FluxCD

AI/ML & Data Platforms

  • PyTorch
  • MLX
  • CUDA
  • RAG Pipelines
  • Model Quantization
  • LoRA / QLoRA
  • RLHF
  • Apache Spark
  • Databricks
