Xavier Murias

Infrastructure & AI Engineer · Singapore

[email protected] · juanako.ai · github.com/fblgit · huggingface.co/fblgit · linkedin.com/in/xmm-sg

About

I've been in infrastructure since 2003 — through sysadmin, security, virtualization, containers, SRE, platform, and now AI. The work kept changing names; the shape of the problem didn't. Whatever I'm good at now, I owe to the next thing always being harder than the last.

These days I run platform at Xendit — seven thousand services across dozens of clusters, the kind of fleet where you measure success in things that didn't happen. Together with my team, we keep iterating and advancing our stack: materializing our own utopia day by day.

I independently released models with UNA and MGS — post-training methods I developed and applied across multiple transformer architectures, reaching #1 on the HuggingFace LLM Leaderboard several times: TheBeagle, Juanako, miniClaus, and Cybertron, which was served for nearly two years in Cloudflare Workers AI. Each iteration sharpened my intuition for deep neural networks, so I kept building AI.

I read more than I write, run more experiments than I publish, and contribute upstream when the fix belongs in the project itself, not a private fork. The systems I'm proudest of are the ones nobody notices, because they just keep running.

Currently

Head of Infrastructure and Science · Xendit · since 2022

Languages

Spanish (Native) · English (Fluent) · Italian (Intermediate) · Chinese (Conversational)

Stack

Kubernetes · Helm · Terraform · AWS · GCP · Cloudflare · Go · Python · PyTorch · Transformers · ArgoCD · Argo Rollouts · Atlantis · GitOps · Datadog · Cilium · KEDA · CDN · CI/CD

Work Preferences

Remote · Hybrid · TZ: CET · SGT · AU

Relocation

EU · US · AU · SG

Education

Massachusetts Institute of Technology · 2018 – 2018

6.00.1x — Introduction to Computer Science and Programming Using Python

Introduction to Computer Science and Programming Using Python, and Introduction to Computational Thinking and Data Science.

University of Washington · 2017 – 2018

Essentials of CyberSecurity — Professional Certificate

Four-course UWx track (CYB001x–CYB004x): cybersecurity foundations, the CISO's view, defensive toolkit, and career-path framing.

Liceo Scientifico · 1995 – 2000

Computer and Information Sciences

Certifications

Mensa International

Mensa · May 1994

LFS158x — Introduction to Kubernetes

The Linux Foundation · Jun 2018

Certified Linux Administrator (LPIC-1)

Linux Professional Institute (LPI) · Sep 2009

Security+

CompTIA · Mar 2007

Check Point Certified Security Administrator (CCSA)

Check Point Software · Jan 2008

IBM Certified Advanced Technical Expert — Power Systems with AIX

IBM · Aug 2004

Red Hat Certified Engineer (RHCE)

Red Hat · Feb 2004

SUNSA

Sun Microsystems · May 2003

CHFI — Computer Hacking Forensic Investigator

EC-Council · Feb 2007

Certified Ethical Hacker (CEH)

EC-Council · Dec 2006

Experience

Head of Infrastructure and Science · Xendit · Singapore

08.2022 - Present

Spearheaded a comprehensive overhaul of infrastructure and engineering culture at Xendit, managing 7,000+ services across dozens of Kubernetes clusters and thousands of nodes. Delivered unprecedented efficiency, 7-figure cost savings, and a transition from a toil-focused squad to a high-performing SRE engineering unit.

Platform Engineering & Orchestration

Fleet Re-Architecture: Completely reimagined and rebuilt the Kubernetes fleet, eliminating technical debt and implementing a true Active/Active Multi-Cluster architecture for hyper-distributed workloads.
Advanced GitOps & Control Plane: Engineered a custom ArgoCD Macro-Scale Framework and plugin, enabling a DRY, layered YAML structure to manage thousands of deployments across a distributed fleet with infinite-scale design.
Lifecycle & Scaling: Achieved zero-downtime, zero-error-rate EKS lifecycle management. Leveraged Karpenter for high-performance node provisioning and KEDA to drive extreme cost efficiency, enabling services to scale to zero during low-demand periods.
Deployment & Governance: Standardized canary deployments via Argo Rollouts and orchestrated complex QA/CI/CD pipelines with Argo Workflows, all governed by automated Kyverno policy enforcement.

Networking & Traffic Engineering

Multi-Cluster Connectivity: Deployed Cilium MultiCluster Mesh to provide seamless, secure, and observable connectivity across the global service landscape.
Traffic Steering & DR: Designed a fault-tolerant, transparent DR strategy via a Split DNS Horizon on multi-region/multi-AZ topologies. Developed an advanced DNS hierarchy for weighted blue/green load balancing between CDNs, clusters, and regions.
Edge Migration: Executed a flawless, zero-downtime migration from Imperva to Cloudflare using a Multi-CDN topology.
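
Weighted blue/green steering of this kind comes down to answering each lookup with a target chosen in proportion to its weight. The sketch below is a minimal Python illustration of that idea only. The function name `pick_target` and the blue/green weights are hypothetical and are not taken from the actual DNS setup.

```python
import random
from collections import Counter


def pick_target(weights: dict, rng: random.Random) -> str:
    """Return one target name, chosen with probability proportional to its weight.

    `weights` maps a hypothetical target name (a CDN, cluster, or region)
    to a non-negative weight; a weight of 0 drains that target completely.
    """
    targets = list(weights)
    return rng.choices(targets, weights=[weights[t] for t in targets], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    # Illustrative 90/10 split while "green" is being warmed up.
    counts = Counter(pick_target({"blue": 90, "green": 10}, rng) for _ in range(10_000))
    print(counts)
```

Shifting traffic is then a matter of changing the weights over time, for example 90/10, then 50/50, then 0/100.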

Reliability & Cost Engineering

SLA & Incident Management: Elevated platform availability from 99.98% to 99.999% SLA, maintaining a near two-year record of zero team-led incidents. Elevated RCA and Post-Mortems to mission-critical status.
Strategic Cost Optimization: Delivered consistent 20% YoY cost reductions through a dual-track strategy:
- Architectural Efficiency: Implementing scaling-to-zero (KEDA), right-sizing via Karpenter, and optimized compute architectures (arm64/amd64).
- Financial Engineering: Orchestrating complex capacity planning involving Spot instances, Reserved Instances, and Savings Plans.
Data Layer Resilience: Managed high-availability, self-hosted data layers including YugabyteDB, MongoDB, and PostgreSQL within the Kubernetes ecosystem.
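
To put the availability figures above in perspective, they can be converted into a yearly downtime budget. This is a small worked calculation, assuming a 365-day year. The helper name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year allowed at the given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    for pct in (99.98, 99.999):
        print(f"{pct}% -> {downtime_budget_minutes(pct):.2f} min/year")
```

Under that assumption, 99.98% allows roughly 105 minutes of downtime per year, while 99.999% allows a little over 5 minutes.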

Wins: 99.999% uptime · Pure IaaC self-service · 7-figure cost efficiency

Reports: 15
Squads: SRE, Data, Security
Skills: EKS, ArgoCD, GitOps, Terraform, Cloudflare, Split DNS, Multi-CDN

AI/ML Lead (Contract) · Candy.AI · Remote

03.2024 - 08.2024

Single-handedly revolutionized a startup's AI capabilities, scaling from nascent to millions of users.

Platform & Orchestration

Fleet rebuild: from a handful of Azure VM GPUs to a fully orchestrated, hyperscale GKE on GCP environment with NVIDIA GPU autoscaling, supporting multi-GPU and diverse generative workloads
CloudNative IaC discipline: Terraform, Helm, ArgoCD, and GitOps
Just-in-time launch: delivered the GKE fleet ahead of media and TV exposure — fleet held the resulting ramp from hundreds of thousands to millions of users at 99.99% uptime
Dev → prod gating: containerized AI workflows and inference for consistent deployment and controlled promotion across environments

Inference & Generative Workloads

Data pipelines: engineered the foundations that fed training and inference workflows
'SuperBooga' inference broker: an event-driven bus fronting GPU inference — nodes pull from a pub/sub queue and serve one request at a time per GPU, avoiding the performance hit a single GPU takes from concurrent inference
GPU & instance right-sizing: through performance and load-testing, matched adequate GPU and instance family to each workload while keeping a cost-efficient footprint
Distributed weights & adapters: intra-cluster storage kept in sync with the upstream source and fanned out to inference containers on demand, accelerating cold-start times and cutting storage costs
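
The pull-based pattern described for the inference broker (workers pull from a queue and each GPU serves one request at a time) can be sketched with the Python standard library. This is an illustration of the pattern only, not the SuperBooga code: an in-process `queue.Queue` stands in for the pub/sub bus, and `run_inference`, `gpu_worker`, and `serve` are hypothetical names.

```python
import queue
import threading


def run_inference(gpu_id: int, prompt: str) -> str:
    """Hypothetical stand-in for the real model call pinned to one GPU."""
    return f"gpu{gpu_id}:{prompt.upper()}"


def gpu_worker(gpu_id: int, jobs: queue.Queue, results: dict) -> None:
    """Pull one job at a time, so this GPU never serves two requests concurrently."""
    while True:
        job = jobs.get()  # blocks until a request is available
        if job is None:  # sentinel: shut this worker down
            jobs.task_done()
            return
        job_id, prompt = job
        results[job_id] = run_inference(gpu_id, prompt)
        jobs.task_done()


def serve(prompts: list, num_gpus: int = 2) -> dict:
    """Fan requests out to one worker per GPU through a shared queue."""
    jobs: queue.Queue = queue.Queue()
    results: dict = {}
    workers = [
        threading.Thread(target=gpu_worker, args=(gpu_id, jobs, results))
        for gpu_id in range(num_gpus)
    ]
    for worker in workers:
        worker.start()
    for job_id, prompt in enumerate(prompts):
        jobs.put((job_id, prompt))
    for _ in workers:
        jobs.put(None)  # one shutdown sentinel per worker
    jobs.join()
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    print(serve(["hello", "world", "again"]))
```

Because workers pull rather than being pushed to, a busy GPU simply does not take the next request, and backlog accumulates in the queue instead of on the device.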

ML Engineering & Practice

End-to-end product delivery: translated aspirational & conceptual product requirements into shipped features — both the supporting software and custom-trained PyTorch/Transformers models
Performance discipline: custom stress-testing for LLM and Stable Diffusion, anchored in a K6-driven load-testing and observability culture

Wins: Scale from thousands to millions · Built from scratch: pure IaaC · GKE GPU cost-efficient fleets

Reports: 5
Squads: AI, Infrastructure
Skills: GCP, GKE, Terraform, Helm, ArgoCD, GitOps, K6, PyTorch/Transformers, StableDiffusion, LLM, Pub/Sub

Principal DevOps Engineer · foodpanda · Singapore

01.2021 - 08.2022

First Principal Engineer of its kind at foodpanda — running APAC infrastructure not just for foodpanda but across the entire Delivery Hero group. Each cluster a self-contained Local Business Unit: a country or region running its own business, independently. Large-scale computing of containerised environments, driven by IaaC on high-availability and high-concurrency systems.

Platform & Orchestration

Kubernetes at large scale, concurrency, resilience — thousands of distributed services, thousands of nodes, tens of thousands of ingress resources under management
Cluster lifecycle on AWS/EKS: Terraform-imported the existing clusters into a blue/green blueprint, then spun new clusters inside the same VPC/networking — communicating internally and presenting as a single perimeter entity: a "metacluster"
Macro-scale ArgoCD ecosystem: a custom plugin, a shared chart, and a layered DRY footprint repo. Reproducible infrastructure built on meta-modular abstractions, so the group could stand up new Regions and Countries fast and safely
ArgoCD and ArgoRollouts (Design, Deployment, Customisation, Workshops)
Kubernetes Tailoring (MPA, Controllers, Advanced Scheduling, Affinity, etc.)
Self-service GitOps absorbed the bulk of the Jira Service Desk hot-request types — engineers spent their time reviewing PRs instead of crafting them

Reliability, Cost & Resilience

Observability: Datadog as the observability stack across the fleet
Cost engineering: introduced Spot and hybrid capacity, with on-demand fallback to tolerate instance-exhaustion events
GameDays: twice-yearly drills exercising failure scenarios — AZ-outage availability among them

People & Practice

Taking care of a small (7), talented APAC SRE squad — zero attrition and zero self-inflicted incidents during my tenure
Mentorship & upskilling: invested in team certifications (Terraform especially); the modules they shipped reached a level the team had not produced before, with continuous support so no engineer flew alone
Platform engineering at group scale: the macro-scale ArgoCD ecosystem and self-service GitOps enabled developer teams across many Local Business Units to ship to production fast and safely
Hackathons: drove the hackathon program with deliberate themes — disabilities and accessibility among them, surfacing features like ingredient-to-condition filtering (diabetes, allergies, G6PD, gluten) and visual-impairment support, several of which shipped to production
Tooling & Automation (Python, Terraform, Go, JS, HTML, CSS, Helm, Kustomize)
Infrastructure Security and Hardening (Topology & Perimetrical)

Mainstream Contributions

argo-rollouts — advanced canary across multiple rollout providers, so north-south and east-west traffic splits can be routed independently
kubernetes/ingress-nginx — incremental update capacity for fleets running tens of thousands of ingress resources, plus controller-runtime observability (timings, counts)
Atlantis — hardened the admin portal, and added proper SEMVER support so Terraform modules absorb minor tf-binary updates automatically across vast IaaC

Share is Care

Wins: Macro-scale ArgoCD · Multiple mainstream contributions · Self-service IaaC state · APAC LBU enablement

Reports: 7
Squads: APAC SRE
Skills: Kubernetes, ArgoCD, ArgoRollouts, ArgoWorkflows, Terraform, Python, Go, Helm, Kustomize, ingress-nginx, Atlantis, AWS, EKS, Datadog

SRE Manager (Principal Engineer) · Prudential PACS · Singapore

06.2019 - 01.2021

Drove a 100-year-old regulated financial institution from on-prem and early DevOps into a cloud-native, containerised operating model in almost two years. End-to-end PMO ownership delivered structural savings across cloud, licensing, and augmentation:

Leading a team of nine across SRE and Architecture squads — zero attrition during my tenure, and a materially shortened SLA turnaround on infrastructure tasks, driven by upskilling the team through certifications (CKA, CKAD, Terraform Associate, etc.)
OnPrem to Cloud Migrations Expertise — Planning, Architecture, End-to-End Execution
Containerisation and migration of legacy platforms — moved IBM WebSphere and JBoss workloads off OpenShift and VMs into AKS, containers only
Evolved Jenkins to behave like Drone-CI — declarative-YAML pipelines
TRM-aligned SDLC — crafted Prudential's SDLC covering DevOps, Agile, Pipelines, Artifact Lineage, and CAB, aligned with the MAS Technology Risk Management (TRM) Guidelines 2021 (Monetary Authority of Singapore)
COVID WFH enablement — played a crucial role making sure thousands of employees could work from home safely as the pandemic started
Automation Specialist (CI/CD, Terraform, Python, DevSecOps)
Resilient stateful workloads — introduced Kafka self-hosted via the Confluent Kubernetes Operator and Stolon-managed self-hosted Postgres clusters; production StatefulSets with CSI-backed persistence
Production-grade MVP on AKS — OpenFaaS + Kafka + Cassandra, with autoscaling and full instrumentation through ElasticSearch + Prometheus + Grafana
Performance and production incident troubleshooting
Solutions Architecture — re-engineering and new platforms toward cloud-native and containerised, while maintaining a cost-efficient footprint

Wins: OnPrem to cloud-native · Full DevOps implementation · Complete containerization · MAS TRM 2021-aligned SDLC

Reports: 9
Squads: SRE, Architecture
Skills: Kubernetes, OCP, Terraform, Python, Prometheus, Grafana, EFK, Kafka, Zookeeper, OpenFaaS, AKS, Jenkins, Cassandra, Stolon

VP of Reliability Engineering · DBS Bank · Singapore

02.2019 - 07.2019

First SRE on board, helping the organization understand and implement SRE practices on top of legacy structures and systems:

Toil reduction by Python automation
Implementation of Monitoring platform with Prometheus and evangelisation of better monitoring practices
Definition of SLO, Error budget, Monitoring Dashboards
Development of Prometheus exporters for databases and applications
Helping the organization understand what SRE is and how to implement SRE practices

Wins: Observability Platform · Tooling and Toil Reduction · SLO/ErrorBudgets compositions

Skills: Python, Prometheus, Grafana

Senior Site Reliability Engineer · Cloudflare · Singapore

06.2018 - 02.2019

First line of response in the largest worldwide CDN, 10k+ physical servers in 200+ locations:

Troubleshooting production issues on Kafka, ZK, K8s, Mesos, Ceph and diverse complex systems
Developing and maintaining SaltStack reactors, modules, states and solutions for IaaC and automation
Root cause analysis for escalated support issues in a complex stack environment
Mitigation and troubleshooting of large DDoS attacks, performance issues and production incidents on Linux systems

Wins: SaltStack champion · Edge LifeCycle automation · Large scale DDoS mitigation

Skills: Kafka, Zookeeper, Kubernetes, Mesos, Ceph, SaltStack, Linux

Infrastructure Architect SRE/DevOps · Wipro · Hong Kong

08.2015 - 06.2018

Provided a complete consulting service for container projects:

Creation of topology & diagrams, execution timelines
Hands-on creation of images and logical split of large applications into microservices
Monitoring & reporting platforms and deploying clusters of Kubernetes, CoreOS, Rancher, Swarm, Kettle from scratch
Implementation of diverse SDS (Software Defined Storage) solutions like ScaleIO, Ceph, Nutanix
Troubleshooting production performance issues in complex stacks
Defined DevOps CAMS standards and values to accelerate the SDLC
Real CI/CD implementation with CloudFormation, CodeDeploy for large enterprises

Wins: K8s/Rancher/Swarm from scratch · SDS (ScaleIO, Ceph, Nutanix) · DevOps CAMS standards

Skills: Kubernetes, Docker, Rancher, CoreOS, Swarm, Ceph, ScaleIO, Nutanix, CloudFormation, CodeDeploy

Senior Infrastructure Architect · Ping An Insurance · Beijing, China

09.2012 - 06.2015

Engineering process for design, build, and implementation of a high availability/disaster recovery infrastructure model for Tier-1 applications across two datacenters:

Consistently delivered survey documentation packages ahead of schedule
Prepared thorough checklists, reports, and Visio drawings documenting circuit and network equipment
Elaboration of technological roadmaps for production environments
Pivotal role in the SDLC CI/CD on the operations side, automating with Puppet or Chef
Coordinated across multiple teams to complete assigned tasks and projects driven by business needs
Troubleshooting production performance issues and suggesting architecture changes

Wins: HA/DR across two datacenters · CI/CD automation (Puppet/Chef) · Tech roadmaps for production

Skills: Puppet, Chef, VMware, Linux, High Availability, Disaster Recovery

Chinese Student & Freelance · Independent · China

04.2010 - 04.2012

Career break for language studies and cultural immersion:

Chinese Mandarin Student
English Student
Independent Freelance consulting
Traveler across China

Infrastructure & CyberSecurity Architect · Endesa · Madrid, Spain

03.2008 - 02.2010

Defining Networking, Unix and Security architecture Roadmap:

Supervised the implementation and evolution of new provider solutions
Defined best practices and security requirements
Assisted the production outsourcing provider in troubleshooting production performance issues
Provided documentation and analytical evidence to improve performance, stability, or scalability
Designed laboratory and test scenarios on bare-metal and virtual environments

Wins: Security architecture roadmap · Lab & test scenario design

Skills: Unix, Networking, Security Architecture, VMware, Bare-metal

Network Security Engineer · Telefonica · Madrid, Spain

03.2007 - 03.2008

Security and Network Engineering:

In-depth application vulnerability scans and scheduling of automatic scans
Tracking & Assessment of new vulnerabilities/risks and their impact on our infrastructure
Administration and daily operation of IDS/IPS devices, reporting, mitigation and escalation
Intrusion tests, perimeter tests and related intrusion techniques
Administration of Check Point Firewall-1 National Cores
Migration & Update of Firewall-1 on IPSO & x86 Hardware
Migration of old circuits from Nortel devices into modern IP solutions

Wins: FW-1 National Cores admin · Nortel to IP migration · Vulnerability assessment

Skills: Check Point Firewall-1, IDS/IPS, IPSO, Vulnerability Scanning, Nortel

CyberSecurity Engineer · UCM University · Madrid, Spain

10.2006 - 03.2007

Security Operations and Research:

Tracking & Assessment of new vulnerabilities/risks and their impact on our infrastructure
Provided countermeasures, suggested fixes, and reporting to the relevant operational unit
Intrusion tests, perimeter tests and related intrusion techniques
DDoS Mitigation, in-depth inspection of packets, Pattern discovery and IPS countermeasures
Forensic analysis of detected intrusions, covering the method used and mitigations against recurrence
Daily operation and monitoring of the event-correlation platform, design of new rules and security triggers

Wins: DDoS mitigation & forensics · IPS pattern discovery · SIEM rules design

Skills: DDoS Mitigation, Forensics, IPS, Packet Analysis, SIEM

Senior Systems Engineer · Orange · Madrid, Spain

03.2006 - 10.2006

Before the Orange acquisition, at Ya.Com:

Planning and execution of infrastructure for new Linux projects
Hardening of Linux & Windows operating systems
Administration of company distributed storage with NetApp filers
Migration into centralised user governance of Linux systems with Kerberos AD
Tracking & Assessment of new vulnerabilities/risks
Administration of ISP-side security for housing customers with FortiGate firewalls
Administration of company core firewalls with StoneGate and Firewall-1 (IPSO)
Developed a centralised Tripwire-like platform from scratch (Python & Expect/TCL)

Wins: Custom tripwire platform (Python) · OS hardening specialist · NetApp/Kerberos integration

Skills: Linux, Python, Expect/TCL, NetApp, FortiGate, StoneGate, Kerberos AD

Systems Engineer · Endesa · Madrid, Spain

02.2004 - 03.2006

Unix and Virtualization:

Participating in the migrations of Bare-metal AIX & Linux to Virtualised environments with ESXi & vCenter
Design and Implementation of new software solutions
Administration of AIX LPAR pSeries big computing servers
Administration of company storage, FastT and Hitachi with McData and Brocade fibre switches
Tuning and troubleshooting of platforms under development
Patching and Updating production environments, new platforms deployments

Wins: Bare-metal to ESXi migrations · AIX pSeries administration · SAN storage (McData/Brocade)

Skills: AIX, Linux, VMware ESXi, vCenter, FastT, Hitachi, McData, Brocade

Systems Operations · Telefonica · Madrid, Spain

07.2003 - 02.2004

Infrastructure Operations:

Administration of Apache and Application Servers, performance troubleshooting, hardening and daily maintenance
Elaboration of monitoring scripts
Administration of corporate DNS, LDAP, TACACS+ and DHCP servers

Skills: Apache, DNS, LDAP, TACACS+, DHCP

Technical Support Specialist · Orange · Madrid, Spain

02.2003 - 07.2003

Before the acquisition by Orange, at Ya.Com:

Platform troubleshooting and escalation of incidents
Maintenance of platforms, patching, user management
Support for RIMA Network circuits (ADSL)
Administration of Radius ACL & Users

Skills: RADIUS, ADSL, Linux

Selected Work

01 · Leaderboard Achievement

8x HuggingFace #1 Champion

2023-2024

Eight #1 positions on the HuggingFace Open LLM Leaderboard, competing with major tech firms and AI labs using original post-training techniques.

Eight #1 positions on the HuggingFace Open LLM Leaderboard across both v1 and v2 eras, competing with models from major tech firms and AI labs using original post-training techniques (UNA and MGS) applied systematically across different base architectures. Competed against 70B models with 7B, and maintained contamination-free benchmarks.

★ 8x #1 HuggingFace Open LLM Leaderboard

8 separate #1 positions across 2023-2024
Displaced Intel's neural-chat from #1 (Nov 2023)
#8 ALL SIZES with Cybertron v2 — 7B competing against 70B+
#1 across ALL model sizes with TheBeagle (Jan 2024)
Contamination-free verification with 5-gram analysis
Consistent results across Mistral, Intel, Yi/Smaug, and Qwen bases
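
A 5-gram contamination check of the kind mentioned above is usually an overlap test between training text and benchmark text. The sketch below shows that general idea under simple assumptions (lowercased, whitespace-split tokens). It is not the exact procedure used for these releases, and the function names are illustrative.

```python
def ngrams(text: str, n: int = 5) -> set:
    """Return the set of word n-grams in `text` (lowercased, whitespace-tokenized)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(train_text: str, benchmark_text: str, n: int = 5) -> float:
    """Fraction of the benchmark's n-grams that also appear in the training text."""
    bench = ngrams(benchmark_text, n)
    if not bench:
        return 0.0  # benchmark text shorter than n tokens: nothing to compare
    return len(bench & ngrams(train_text, n)) / len(bench)


if __name__ == "__main__":
    train = "the quick brown fox jumps over the lazy dog"
    bench = "quick brown fox jumps over a fence"
    print(f"5-gram overlap: {overlap_ratio(train, bench):.3f}")
```

A ratio near zero suggests the benchmark text does not appear in the training data, while a high ratio flags a sample for closer inspection.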

Total #1 Positions: 8
Leaderboard Eras: v1 & v2
Model Sizes: 1.5B to 34B
Base Architectures: 4+

Client: Independent — Juanako.AI
Role: Sole author
Duration: 2023-2024
Team: Solo
Tags: #1 Leaderboard, LLM, Post-Training, UNA, MGS, Open Source

02 · Open Source Contributions

Enterprise Infrastructure Contributions

2021-2025

7+ PRs merged into mainline Kubernetes, Argo, Atlantis, and SurfSense — addressing real production-scale problems.

Contributions to mainstream infrastructure projects used in enterprise deployments. Kubernetes ingress-nginx, Argo Rollouts, Atlantis, SurfSense. All PRs merged into mainline repositories, addressing production-scale problems.

★ Merged PRs in Kubernetes, Argo, Atlantis & More

7+ PRs merged across 4 mainstream projects
All contributions merged into mainline repositories
Focus on large-scale deployment challenges

Projects: 4+
PRs Merged: 7+
Scope: Enterprise-grade
Status: Merged to mainline

Client: Open Source Community
Role: Contributor
Duration: 2021-2025
Team: Solo contributions
Tags: Kubernetes, Argo, Atlantis, Open Source, Infrastructure, GitOps

03 · Neural Image Codec & Visual Tokenizer

UNAVision

2024-Present

Compact neural vision codec and visual tokenizer — 16:1 spatial compression at 97.69% fidelity, under 150K trainable parameters, batches 6× 40MP images on a single RTX 4090.

UNAVision is a compact neural vision codec and visual tokenizer. It compresses arbitrary RGB imagery into a dense latent at a fixed 16:1 spatial ratio and reconstructs at 1–4% fidelity loss — and the loss shrinks as resolution grows (inverse of typical codecs). I can batch 6x 40MP images on a single RTX 4090. Under 150K trainable parameters. 100% codebook utilization (zero dead codes). Dual continuous/discrete bottleneck on same weights with <0.10% gap.

★ 150K Params, 40MP Batching, 97.69% Fidelity

16:1 spatial compression ratio
97.69% average reconstruction fidelity
Under 150K trainable parameters
Batches 6x 40MP images on single RTX 4090
Loss decreases with resolution (inverse of typical codecs)
Dual continuous/discrete bottleneck

Spatial Compression: 16:1
Avg Fidelity: 97.69%
Peak Fidelity: 99.42%
Parameters: <150K

Client: Independent · Eval repo public
Role: Sole author · Architecture, training, evals
Duration: Ongoing
Team: Solo
Tags: Vision, Image Codec, Visual Tokenizer, VAE Alternative, Compression

04 · Research · Embedding-Geometry NLP

HarEmb

2025

Classification, retrieval, and NLP tasks from LLM embedding geometry — only lightweight forward pass components, 28x faster than conventional transformers.

HarEmb performs classification, retrieval, and NLP tasks by exploiting the geometry of LLM embedding matrices. Results achieved using Qwen2.5-0.5B, a very small model — demonstrating that embedding geometry carries significant semantic information even at minimal scale. Lightweight components run 28x faster than conventional transformers.

★ 93% Classification, 28x Faster Inference

Only lightweight forward pass components
Retrieval extension with MRR@10 >0.9
Throughput: thousands of samples per second
Exploits embedding geometry with lightweight components
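
For readers unfamiliar with the retrieval metric quoted above, MRR@10 averages the reciprocal rank of the first relevant result within the top 10, scoring a query 0 when none appears. The sketch below shows the standard definition on made-up document IDs. It is not HarEmb's evaluation code.

```python
def mrr_at_k(rankings: list, relevant: list, k: int = 10) -> float:
    """Mean reciprocal rank of the first relevant document within the top k.

    `rankings[i]` is the ranked list of document IDs returned for query i,
    and `relevant[i]` is the set of IDs judged relevant for that query.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for ranking, rel in zip(rankings, relevant):
        for rank, doc_id in enumerate(ranking[:k], start=1):
            if doc_id in rel:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings)


if __name__ == "__main__":
    rankings = [["d1", "d2", "d3"], ["d9", "d4"], ["d7"]]
    relevant = [{"d1"}, {"d4"}, {"d0"}]
    print(mrr_at_k(rankings, relevant))  # (1 + 1/2 + 0) / 3
```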

AG News: 93.16%
Emotion: 90.75%
IMDB: 86.01%
SST-2: 83.72%
MS MARCO MRR@10: 0.941
Speedup: 28x

Client: Independent — Author-attested
Role: Sole author
Duration: 2025
Team: Solo
Tags: Embeddings, Efficient Inference, NLP, Content Moderation, RAG

05 · Production Deployment

Cloudflare Workers AI

2023-2025

Cybertron 7B v2 hosted as a first-party model in Cloudflare's Workers AI catalog — the only third-party fine-tune in the lineup, served at the edge for nearly two years.

Cloudflare hosted Cybertron 7B v2 on their global Workers AI inference platform as a first-party model — served at the edge with OpenAI-compatible endpoints, a 15,000-token context window, and a public playground. The only third-party fine-tune in their catalog under an independent-developer namespace. Hosted for nearly two years.

★ Only Independent Developer in Cloudflare's AI Catalog

First-party model in Cloudflare's curated catalog
Only independent developer namespace in Workers AI
Nearly two years of production hosting

Deployment Duration: ~2 years
Context Window: 15,000 tokens
Model ID: @cf/fblgit/una-cybertron-7b-v2-bf16
API: OpenAI-compatible

Client: Cloudflare — Independent Developer
Role: Model author
Duration: Dec 2023 - Oct 2025
Team: Solo
Tags: Cloudflare, Edge Deployment, Production, Workers AI, Global Scale

06 · SFT/RLHF Technique

UNA — Uniform Neural Alignment

2023-2024

An auxiliary loss-based architecture patch for HuggingFace Transformers, applied during SFT/RLHF. 18 public releases across multiple base models, with multiple #1 leaderboard positions.

UNA is Uniform Neural Alignment — a transformers architecture change introducing an auxiliary loss, applied as a patch to HuggingFace Transformers models. Operates during SFT and RLHF training. Applicable to attention layers, MLP layers, or both. Memory intensive but compatible with LoRA. Training data does not need to be novel, but must not have been previously overfitted. Applied across Mistral, Intel, Yi/Smaug, Qwen2.5, LLaMA 1 & 2, Pythia, and Luxa architectures.

★ 18 Public Releases, Multiple #1 Positions

Consistent positive delta over base models
Multiple #1 leaderboard positions
Applicable to different network layers

Public Releases: 18
Base Architectures: 4+
Model Sizes: 1.5B to 34B
#1 Positions: Multiple

Client: Independent — Juanako.AI
Role: Sole author · Method, training, releases
Duration: 2023 & 2024
Team: Solo
Tags: transformers, deepspeed, accelerate, axolotl, torch, wandb, sft, rlhf, distributed-training

07 · Agentic Ecosystem · Spec-Driven Engineering

DIL — Domain Intent Language

2025

A spec-driven agentic ecosystem for long-horizon engineering on enterprise brown-field code.

Agentic coding on greenfield demos is easy. Doing it on the kind of code a business actually runs on — years of history, multiple owners, no authoritative map, and a context window that runs out before the work does — is where most agentic workflows fall apart. DIL exists to make that second case tractable.

★ Every task lands as a reviewed spec before it lands as code.

Proven Scale: 100K+ LOC
Host Agents: Claude Code · Codex · Kiro
Surfaces: Language · Server · UI · MCP · CLI
Review Model: SWE Approval Gates

Client: Independent — Ecosystem project
Role: Creator · Ecosystem author
Duration: 2025
Team: Solo
Tags: Agentic, Spec-Driven, MCP, GraphRAG, Brown-Field, Enterprise, Claude Code, Codex, Kiro

08 · SFT/RLHF Regularization

MGS — MultiGumbelSampling

2024-2025

Regularization technique using Gumbel-sampled noise during SFT/RLHF. Combines with UNA (UNAMGS) for additive performance gains.

MGS is MultiGumbelSampling — a regularization technique introducing Gumbel-sampled noise across signal paths during SFT/RLHF training. Combinable with UNA (UNAMGS releases) for additive performance gains.
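
As background on the noise source only: standard Gumbel(0, 1) noise can be drawn by inverse-transform sampling, g = -log(-log(u)) with u uniform on (0, 1). The sketch below shows that sampling step with the Python standard library. How MGS places and scales the noise is not described here, so the additive perturbation and all names are illustrative assumptions rather than the MGS implementation.

```python
import math
import random


def sample_gumbel(rng: random.Random, eps: float = 1e-12) -> float:
    """Draw one Gumbel(0, 1) sample via inverse-transform sampling."""
    u = min(max(rng.random(), eps), 1.0 - eps)  # keep u strictly inside (0, 1)
    return -math.log(-math.log(u))


def add_gumbel_noise(values: list, scale: float, rng: random.Random) -> list:
    """Illustrative assumption: perturb each value with scaled Gumbel noise."""
    return [v + scale * sample_gumbel(rng) for v in values]


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [sample_gumbel(rng) for _ in range(20_000)]
    # The mean of Gumbel(0, 1) is the Euler-Mascheroni constant, about 0.5772.
    print(sum(samples) / len(samples))
    print(add_gumbel_noise([0.1, 0.2, 0.3], scale=0.05, rng=rng))
```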

★ Compatible with UNA for Additive Gains

Compatible with UNA — UNAMGS combines both
Operates on different network paths than UNA
First public release: Oct 2024

Public Releases: 5
UNAMGS Releases: 4
Model Sizes: 1.5B to 7B
#1 Positions: Multiple

Client: Independent — Juanako.AI
Role: Sole author
Duration: 2024-2025
Team: Solo
Tags: transformers, deepspeed, accelerate, axolotl, torch, wandb, sft, rlhf, distributed-training

09 · Parameter-Efficient Adaptation

SingleMoM

2024

Exploratory parameter-efficient adaptation that competes with LoRA on GLUE at <0.25M trainable params, with zero-overhead expert switching at inference.

SingleMoM is an exploratory parameter-efficient adaptation approach that competes with LoRA on GLUE benchmarks at a fraction of the trainable parameter cost, while enabling zero-overhead expert switching at inference. Early experiments are encouraging — there's room to better understand its expressiveness, behavior across domains, and potential extensions (e.g. image adapters). SFT experiments on RoBERTa reproduce the LoRA paper's evaluation setup. RLHF track on LLaMA-3 explored per-expert datasets across language, conversational style, formatting, text-to-SQL, and structured output, with experts being combinable at inference (e.g. German × humanlike experts produced the fblgit/german-humanlike-clean-1k dataset).

★ Promising Early Results — More Research Underway

Competes with LoRA on GLUE at a fraction of trainable params
Zero-overhead expert switching at inference
Experts can be combined (e.g. German × humanlike → real dataset output)
Tested under SFT, DPO, and PPO setups
Open directions: expressiveness, cross-domain behavior, image adapters

Client: Independent — Author-attested
Role: Sole author
Duration: 2024
Team: Solo
Tags: transformers, torch, wandb, sft, rlhf, lora-alternative, parameter-efficient

10 · Agent Workbench · MIT Open Source

ClaudeBench

2025

Redis-first, event-driven workbench with swarm intelligence for long-running Claude coding sessions. JSONRPC + WebSocket + MCP. Open source under MIT.

A Redis-first, event-driven workbench with swarm intelligence for decomposing complex tasks into specialist-assigned subtasks. Features JSONRPC 2.0 + WebSocket communication, MCP integration, and a React dashboard with Kanban. Architecture anticipated Anthropic's published long-running-agent harness pattern.

★ Anticipated Anthropic's Harness Pattern

Redis-first coordination with direct primitives
Swarm intelligence for task decomposition
Event-driven with JSONRPC 2.0 + WebSocket
MCP integration from day one
React dashboard with Kanban
579+ commits at time of writing

Client: Open source (MIT)
Role: Creator & sole maintainer
Duration: Ongoing since 2025
Team: Solo
Tags: Agent, Claude, MCP, Redis, Task Management, Swarm Intelligence

11 · Agentic Distributed Trace Simulation

eLLMulator

2025

Each source file becomes an autonomous Claude agent communicating via MCP — surfaces contract mismatches and assumption bugs through OpenTelemetry traces.

Traditional distributed tracing shows what happened at runtime but can't reason about intent or surface contract mismatches. eLLMulator takes a different approach: LLM agents become your software components. Each agent studies its assigned source file, then interacts with other agents via synchronous MCP tool calls that mirror real function calls. The call graph emerges naturally from code control flow, producing traces that capture not just what happened, but why each component behaved as it did.
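The idea of a call graph emerging from synchronous calls, with each span carrying a reason as well as an event, can be sketched in a few lines. This is a toy under stated assumptions, not eLLMulator's implementation: plain Python functions stand in for the LLM agents and their MCP tool calls, the `Tracer` and `Span` classes are hypothetical, and nothing is exported to OpenTelemetry.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    component: str          # source file the "agent" represents
    call: str               # function being simulated
    parent: Optional[int]   # index of the calling span, None for the root
    rationale: str = ""     # the "why", which runtime tracing cannot record


@dataclass
class Tracer:
    spans: List[Span] = field(default_factory=list)
    _stack: List[int] = field(default_factory=list)

    @contextmanager
    def span(self, component: str, call: str):
        parent = self._stack[-1] if self._stack else None
        self.spans.append(Span(component, call, parent))
        idx = len(self.spans) - 1
        self._stack.append(idx)
        try:
            yield self.spans[idx]
        finally:
            self._stack.pop()


tracer = Tracer()


def pricing_get_price(sku: str) -> float:
    with tracer.span("pricing.py", "get_price") as s:
        s.rationale = "assumes sku is already upper-cased by the caller"
        return 10.0 if sku == "ABC" else 0.0


def checkout_total(sku: str, qty: int) -> float:
    with tracer.span("checkout.py", "total") as s:
        s.rationale = "passes sku through unchanged"
        return pricing_get_price(sku) * qty


checkout_total("abc", 2)  # lower-case sku silently prices at 0.0
for i, s in enumerate(tracer.spans):
    print(i, s.parent, s.component, s.call, "|", s.rationale)
```

Reading the two rationales side by side exposes the contract mismatch: the callee assumes normalisation that the caller never performs.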

★ Open Source · Claude Agent SDK + MCP

Source files become autonomous Claude agents
Agent communication mirrors real function calls via MCP
Five finding types including contract mismatches and assumption bugs
Three trace modes: Full, Targeted, and Lens
OpenTelemetry export to standard observability platforms

Finding Types: 5 | Trace Modes: 3 | MCP Servers: 2 | License: Open Source

Client: Open source | Role: Creator | Duration: 2025 | Team: Solo | Tags: Claude Agent SDK, MCP, OpenTelemetry, Code Analysis, Distributed Tracing, Open Source

12 · Agentic Portfolio · Single-User · Free

Juanako — This Site

2026

This site. An agentic portfolio that doubles as a private career copilot — single-user, free, local-first by default.

Most portfolio sites are passive — a scroll of work. Most AI chatbots are ungrounded — they hallucinate away from whatever they're supposed to be about. Most job-search tools pick a side — they serve recruiters or candidates, rarely both. This site refuses those defaults. The visitor-facing agent is strictly grounded in what's on record, drives the interface rather than just describing it, and can produce a printable match report scoped to a recruiter's job description. A local-only companion surface turns the same product into a private career copilot — application tracking, candid notes that never leave the user's machine, and a suite of generators for the moments that matter before, during, and after a job hunt.

★ Dual-audience agent, one codebase, no shared accounts.

Agent drives the interface — navigation, highlighting, deep-dives happen through real actions, not claimed ones
Seven printable deliverable templates spanning the full application arc
Applications act as durable containers — JD plus every generated artefact pinned to the role
Candid data stays on the user's machine; nothing personal is hosted or shared
Generation reshapes style and wording, never substance — every claim traces to the source knowledge; voice rules cut hype and flattery without inventing facts
Meaningful emphasis on security — surface isolation, session integrity, clear public/private boundaries

Deliverable Templates: 7 | Audiences: Visitor + Candidate | Hosting: Self-Hostable · Free | Privacy Model: Local-First

Client: Independent | Role: Sole author | Duration: 2026 · Ongoing | Team: Solo | Tags: Agentic, Grounded LLM, Single-User, Free, Local-First, Dual-Audience, Portfolio, Career Copilot

13 · Custom Datasets

Training Datasets

2023-2025

Five custom datasets across math, knowledge, and RLHF — used in #1 leaderboard models and SingleMoM expert composition experiments.

Custom datasets built for targeted training experiments. The simple-math family explores minimal arithmetic corpora for reasoning under SFT and DPO. Tree of Knowledge introduces symbolic knowledge structuring. The german-humanlike pair demonstrates downstream artifacts produced by composing SingleMoM RLHF experts.

★ 5 Public Datasets · Powering #1 Models & RLHF Experiments

Public Datasets: 5 | Largest: 800K rows | Used in #1 Models: Yes | Coverage: Math · Knowledge · Style

Client: Independent (Juanako.AI) | Role: Dataset author | Duration: 2023-2025 | Team: Solo | Tags: dataset, huggingface, synthetic-data, rlhf, dpo, sft, data-engineering

14 · ML Engineering

10,000+ Tracked Experiments

2023-2025

Over 10,000 documented experiments in Weights & Biases — sweeps, ablations, and training runs underpinning every published technique.

Over 10,000 experiments tracked in Weights & Biases — sweeps, ablations, hyperparameter searches, and training runs. Each technique developed (UNA, MGS, SingleMoM, HarEmb, UNAVision) came from methodical experimentation across documented training runs.

★ 10,000+ WandB Tracked Experiments

10,000+ total tracked experiments
Systematic hyperparameter sweeps
Architecture ablation studies
Training dynamics analysis
Reproducible experiment tracking
Cross-technique comparison studies

Total Experiments: 10,000+ | Techniques Developed: 5+ | Tracking Platform: Weights & Biases | Methodology: Systematic

Client: Independent (Juanako.AI) | Role: Sole researcher | Duration: Ongoing | Team: Solo | Tags: MLOps, Experiment Tracking, WandB, Ablations, Systematic Research

15 · Track Record · Tinkering · Hobby

Engineering — Receipts Since the Early 2000s

2000s–Present

Two decades of building in public — from glFTPd community tools in C/TCL/SQL in the early 2000s, to performance-first Docker images in 2016, to neural-net debuggers, admission mutators, and smart-home IoT today.

Before AI became the headline, the craft was already there. Contributions to the glFTPd scene in the early 2000s in C, TCL, and SQL: networking primitives, sitebot tooling, and utilities. Docker images on the public registry since 2016, chasing scale, observability, and performance: MariaDB MaxScale, DBNinja, Rundeck, Cacti, and an HHVM repo-build image that packaged Facebook's high-performance PHP runtime into a container years before containerizing performance-first PHP was common. Today's tinkering continues in the same spirit: neural-net visualization, model-weight similarity analysis, declarative Jira, Kubernetes admission mutators driven by live Prometheus signals, Home Assistant + ESP smart-home glue, and ARM64/CUDA ports shared with the community. All of this is hobby work; on the job, I deliver more and better.

★ 20+ Years Shipping — glFTPd, Docker, K8s, ML, IoT

glFTPd community contributions in C, TCL, and SQL since the early 2000s
Docker Hub publisher since 2016 — performance and observability focus
HHVM repo-build image (2016–2018) — top-performance PHP in a container before it was common
Neural-net debugger (transviz) with time-travel replay of training sessions
Kubernetes admission mutator driven by live Prometheus metrics (nemutator)
Home Assistant + ESP smart-home with OTP-gated physical access

Years Shipping: 20+ | Docker Hub Since: 2016 | Public Repos: Dozens | Domains: Net · Perf · ML · IoT

Client: Independent (Community) | Role: Author / Contributor | Duration: Early 2000s – Ongoing | Team: Solo | Tags: C, TCL, SQL, Docker, Kubernetes, IoT, Open Source, Hobby