Full record · everything, expanded

The complete record.

Every role's full responsibilities and every case study's full body — on a single page. The same content the modals render, served in long form so you can scan, search, or hand the URL to someone else.

01 · Person

Xavier Murias

I've been in infrastructure since 2003 — through sysadmin, security, virtualization, containers, SRE, platform, and now AI. The work kept changing names; the shape of the problem didn't. Whatever I'm good at now, I owe to the next thing always being harder than the last.

These days I run platform at Xendit — seven thousand services across dozens of clusters, the kind of fleet where you measure success in things that didn't happen. Together with my team, we keep iterating and advancing our stack: materializing our own utopia day by day.

I independently released models with UNA and MGS — post-training methods I developed and applied across multiple transformer architectures, reaching #1 on the HuggingFace LLM Leaderboard several times: TheBeagle, Juanako, miniClaus, and Cybertron, which was served for nearly two years in Cloudflare Workers AI. Each iteration sharpened my intuition over deep neural networks, so I kept building AI.

I read more than I write, run more experiments than I publish, and contribute upstream when the fix belongs in the project itself, not a private fork. The systems I'm proudest of are the ones nobody notices, because they just keep running.

02 · Education & Certifications

13 entries

Academic and accredited.

Education

Massachusetts Institute of Technology
6.00.1x — Introduction to Computer Science and Programming Using Python
2018 – 2018
Introduction to Computer Science and Programming Using Python, and Introduction to Computational Thinking and Data Science.
MITx 6.00.1x Certificate
University of Washington
Essentials of CyberSecurity — Professional Certificate
2017 – 2018
Four-course UWx track (CYB001x–CYB004x): cybersecurity foundations, the CISO's view, defensive toolkit, and career-path framing.
Professional Certificate UWx CyberSecurity
Liceo Scientifico
Computer and Information Sciences
1995 – 2000

Certifications

Mensa International
Mensa · May 1994
LFS158x — Introduction to Kubernetes
The Linux Foundation · Jun 2018
Certified Linux Administrator (LPIC-1)
Linux Professional Institute (LPI) · Sep 2009
Security+
CompTIA · Mar 2007
Check Point Certified Security Administrator (CCSA)
Check Point Software · Jan 2008
IBM Certified Advanced Technical Expert — Power Systems with AIX
IBM · Aug 2004
Red Hat Certified Engineer (RHCE)
Red Hat · Feb 2004
SUNSA
Sun Microsystems · May 2003
CHFI — Computer Hacking Forensic Investigator
EC-Council · Feb 2007
Certified Ethical Hacker (CEH)
EC-Council · Dec 2006

03 · Experience

16 entries

Every role, fully expanded.

08.2022 - PresentXenditSingapore

Head of Infrastructure and Science

Xendit · Singapore

Reports15Squads

SREDataSecurity

Wins

99.999% uptimePure IaaC self-service7-figures cost efficiency

Stack

EKSArgoCDGitOpsTerraformCloudflareSplit DNSMulti-CDN

Spearheaded a comprehensive overhaul of infrastructure and engineering culture at Xendit, managing 7,000+ services across dozens of Kubernetes clusters and thousands of nodes. Delivered unprecedented efficiency, 7-figure cost savings, and a transition from a toil-focused squad to a high-performing SRE engineering unit.

Platform Engineering & Orchestration

Fleet Re-Architecture: Completely reimagined and rebuilt the Kubernetes fleet, eliminating technical debt and implementing a true Active/Active Multi-Cluster architecture for hyper-distributed workloads.
Advanced GitOps & Control Plane: Engineered a custom ArgoCD Macro-Scale Framework and plugin, enabling a dry, layered YAML structure to manage thousands of deployments across a distributed fleet with infinite-scale design.
Lifecycle & Scaling: Achieved zero-downtime, zero-error-rate EKS lifecycle management. Leveraged Karpenter for high-performance node provisioning and KEDA to drive extreme cost efficiency, enabling services to scale to zero during low-demand periods.
Deployment & Governance: Standardized canary deployments via Argo Rollouts and orchestrated complex QA/CI/CD pipelines with Argo Workflows, all governed by automated Kyverno policy enforcement.

Networking & Traffic Engineering

Multi-Cluster Connectivity: Deployed Cilium MultiCluster Mesh to provide seamless, secure, and observable connectivity across the global service landscape.
Traffic Steering & DR: Designed a fault-tolerant, transparent DR strategy via a Split DNS Horizon on multi-region/multi-AZ topologies. Developed an advanced DNS hierarchy for weighted blue/green load balancing between CDNs, clusters, and regions.
Edge Migration: Executed a flawless, zero-downtime migration from Imperva to Cloudflare using a Multi-CDN topology.

Reliability & Cost Engineering

SLA & Incident Management: Elevated platform availability from 99.98% to 99.999% SLA, maintaining a near two-year record of zero team-led incidents. Elevated RCA and Post-Mortems to mission-critical status.
Strategic Cost Optimization: Delivered consistent 20% YoY cost reductions through a dual-track strategy:
- Architectural Efficiency: Implementing scaling-to-zero (KEDA), right-sizing via Karpenter, and optimized compute architectures (arm64/amd64).
- Financial Engineering: Orchestrating complex capacity planning involving Spot instances, Reserved Instances, and Savings Plans.
Data Layer Resilience: Managed high-availability, self-hosted data layers including YugabyteDB, MongoDB, and PostgreSQL within the Kubernetes ecosystem.

03.2024 - 08.2024Candy.AIRemote

AI/ML Lead (Contract)

Candy.AI · Remote

Reports5Squads

AIInfrastructure

Wins

Scale from thousands to millionsBuilt from scratch: pure IaaCGKE GPU cost-efficient fleets

Stack

GCPGKETerraformHelmArgoCDGitOpsK6PyTorch/TransformersStableDiffusionLLMPub/Sub

Single-handedly revolutionized a startup's AI capabilities, scaling from nascent to millions of users.

Platform & Orchestration

Fleet rebuild: from a handful of Azure VM GPUs to a fully orchestrated, hyperscale GKE on GCP environment with NVIDIA GPU autoscaling, supporting multi-GPU and diverse generative workloads
CloudNative IaC discipline: Terraform, Helm, ArgoCD, and GitOps
Just-in-time launch: delivered the GKE fleet ahead of media and TV exposure — fleet held the resulting ramp from hundreds of thousands to millions of users at 99.99% uptime
Dev → prod gating: containerized AI workflows and inference for consistent deployment and controlled promotion across environments

Inference & Generative Workloads

Data pipelines: engineered the foundations that fed training and inference workflows
'SuperBooga' inference broker: an event-driven bus fronting GPU inference — nodes pull from a pub/sub queue and serve one request at a time per GPU, avoiding the performance hit a single GPU takes from concurrent inference
GPU & instance right-sizing: through performance and load-testing, matched adequate GPU and instance family to each workload while keeping a cost-efficient footprint
Distributed weights & adapters: intra-cluster storage kept in sync with the upstream source and fanned out to inference containers on demand, accelerating cold-start times and cutting storage costs

ML Engineering & Practice

End-to-end product delivery: translated aspirational & conceptual product requirements into shipped features — both the supporting software and custom-trained PyTorch/Transformers models
Performance discipline: custom stress-testing for LLM and Stable Diffusion, anchored in a K6-driven load-testing and observability culture

01.2021 - 08.2022foodpandaSingapore

Principal DevOps Engineer

foodpanda · Singapore

Reports7Squads

APAC SRE

Wins

Macro-scale ArgoCDMultiple mainstream contributionsSelf-service IaaC stateAPAC LBU enablement

Stack

KubernetesArgoCDArgoRolloutsArgoWorkflowsTerraformPythonGoHelmKustomizeingress-nginxAtlantisAWSEKSDatadog

First Principal Engineer of its kind at foodpanda — running APAC infrastructure not just for foodpanda but across the entire Delivery Hero group. Each cluster a self-contained Local Business Unit: a country or region running its own business, independently. Large-scale computing of containerised environments, driven by IaaC on high-availability and high-concurrency systems.

Platform & Orchestration

Kubernetes at large scale, concurrency, resilience — thousands of distributed services, thousands of nodes, tens of thousands of ingress resources under management
Cluster lifecycle on AWS/EKS: Terraform-imported the existing clusters into a blue/green blueprint, then spun new clusters inside the same VPC/networking — communicating internally and presenting as a single perimeter entity: a "metacluster"
Macro-scale ArgoCD ecosystem: a custom plugin, a shared chart, and a layered DRY footprint repo. Reproducible infrastructure built on meta-modular abstractions, so the group could stand up new Regions and Countries fast and safe
ArgoCD and ArgoRollouts (Design, Deployment, Customisation, Workshops)
Kubernetes Tailoring (MPA, Controllers, Advanced Scheduling, Affinity, etc.)
Self-service GitOps absorbed the bulk of the Jira Service Desk hot-request types — engineers spent their time reviewing PRs instead of crafting them

Reliability, Cost & Resilience

Observability: Datadog as the observability stack across the fleet
Cost engineering: introduced Spot and hybrid capacity, with on-demand fallback to tolerate instance-exhaustion events
GameDays: twice-yearly drills exercising failure scenarios — AZ-outage availability among them

People & Practice

Taking care of a small (7), talented APAC SRE squad — zero attrition and zero self-inflicted incidents during my tenure
Mentorship & upskilling: invested in team certifications (Terraform especially); the modules they shipped reached a level the team had not produced before, with continuous support so no engineer flew alone
Platform engineering at group scale: the macro-scale ArgoCD ecosystem and self-service GitOps enabled developer teams across many Local Business Units to ship to production fast and safely
Hackathons: drove the hackathon program with deliberate themes — disabilities and accessibility among them, surfacing features like ingredient-to-condition filtering (diabetes, allergies, G6PD, gluten) and visual-impairment support, several of which shipped to production
Tooling & Automation (Python, Terraform, Go, JS, HTML, CSS, Helm, Kustomize)
Infrastructure Security and Hardening (Topology & Perimetrical)

Mainstream Contributions

argo-rollouts — advanced canary across multiple rollout providers, so north-south and east-west traffic splits can be routed independently
kubernetes/ingress-nginx — incremental update capacity for fleets running tens of thousands of ingress resources, plus controller-runtime observability (timings, counts)
Atlantis — hardened the admin portal, and added proper SEMVER support so Terraform modules absorb minor tf-binary updates automatically across vast IaaC

Share is Care

06.2019 - 01.2021Prudential PACSSingapore

SRE Manager (Principal Engineer)

Prudential PACS · Singapore

Reports9Squads

SREArchitecture

Wins

OnPrem to cloud-nativeFull DevOps implementationComplete containerizationMAS TRM 2021-aligned SDLC

Stack

KubernetesOCPTerraformPythonPrometheusGrafanaEFKKafkaZookeeperOpenFaaSAKSJenkinsCassandraStolon

Drove a 100-year-old regulated financial institution from on-prem and early DevOps into a cloud-native, containerised operating model in almost two years. End-to-end PMO ownership delivered structural savings across cloud, licensing, and augmentation:

Leading a team of nine across SRE and Architecture squads — zero attrition during my tenure, and a materially shortened SLA turnaround on infrastructure tasks, driven by upskilling the team through certifications (CKA, CKAD, Terraform Associate, etc.)
OnPrem to Cloud Migrations Expertise — Planning, Architecture, End-to-End Execution
Containerisation and migration of legacy platforms — moved IBM WebSphere and JBoss workloads off OpenShift and VMs into AKS, containers only
Evolved Jenkins to behave like Drone-CI — declarative-YAML pipelines
TRM-aligned SDLC — crafted Prudential's SDLC covering DevOps, Agile, Pipelines, Artifact Lineage, and CAB, aligned with the MAS TRM 2021 Act (Monetary Authority of Singapore)
COVID WFH enablement — played a crucial role making sure thousands of employees could work from home safely as the pandemic started
Automation Specialist (CI/CD, Terraform, Python, DevSecOps)
Resilient stateful workloads — introduced Kafka self-hosted via the Confluent Kubernetes Operator and Stolon-managed self-hosted Postgres clusters; production StatefulSets with CSI-backed persistence
Production-grade MVP on AKS — OpenFaaS + Kafka + Cassandra, with autoscaling and full instrumentation through ElasticSearch + Prometheus + Grafana
Performance and production incident troubleshooting
Solutions Architecture — re-engineering and new platforms toward cloud-native and containerised, while maintaining a cost-efficient footprint

02.2019 - 07.2019DBS BankSingapore

VP of Reliability Engineering

DBS Bank · Singapore

Wins

Observability PlatformTooling and Toil ReductionSLO/ErrorBudgets compositions

Stack

PythonPrometheusGrafana

First SRE on board helping the organization understand and implement SRE practices on a legacy structure and systems:

Toil reduction by Python automation
Implementation of Monitoring platform with Prometheus and evangelisation of better monitoring practices
Definition of SLO, Error budget, Monitoring Dashboards
Development of prometheus exporters for databases and applications
Helping the organization understand what is SRE and how to implement SRE practices

06.2018 - 02.2019CloudflareSingapore

Senior Site Reliability Engineer

Cloudflare · Singapore

Wins

SaltStack championEdge LifeCycle automationLarge scale DDoS mitigation

Stack

KafkaZookeeperKubernetesMesosCephSaltStackLinux

First line of response in the largest worldwide CDN, over 10k+ physical servers in 200+ locations:

Troubleshooting production issues on Kafka, ZK, K8, Mesos, Ceph and diverse complex systems
Developing and maintaining SaltStack reactors, modules, states and solutions for IaaC and automation
Root cause analysis for escalated support issues in a complex stack environment
Mitigation and troubleshooting of large DDoS attacks, performance issues and production incidents on linux systems

08.2015 - 06.2018WiproHong Kong

Infrastructure Architect SRE/DevOps

Wipro · Hong Kong

Wins

K8s/Rancher/Swarm from scratchSDS (ScaleIO, Ceph, Nutanix)DevOps CAMS standards

Stack

KubernetesDockerRancherCoreOSSwarmCephScaleIONutanixCloudFormationCodeDeploy

Provide a complete consulting service for containers projects:

Creation of topology & diagrams, execution timelines
Hands-on creation images and logical split of large applications into microservices
Monitoring & reporting platforms and deploying clusters of Kubernetes, CoreOS, Rancher, Swarm, Kettle from scratch
Implementation of diverse SDS (Software Defined Storage) solutions like ScaleIO, Ceph, Nutanix
Troubleshoot production performance issues in complex stacks
Defined DevOps CAMS standards and values to accelerate the SDLC
Real CD/CI implementation with CloudFormation, CodeDeploy for large enterprises

09.2012 - 06.2015Ping An InsuranceBeijing, China

Senior Infrastructure Architect

Ping An Insurance · Beijing, China

Wins

HA/DR across two datacentersCD/CI automation (Puppet/Chef)Tech roadmaps for production

Stack

PuppetChefVMwareLinuxHigh AvailabilityDisaster Recovery

Engineering process for design, build, and implementation of high availability/disaster recovery infrastructure model for Tier-1 applications across two datacenter:

Consistently delivered survey documentation packages ahead of schedule
Prepared thorough checklists, reports, and Visio drawings documenting circuit and network equipment
Elaboration of technological roadmaps for production environments
Pivotal role in the SDLC CD/CI at the operational side, automating with Puppet or Chef
Coordinated across multiple teams to complete assigned tasks and projects driven by business needs
Troubleshoot production performance issues and suggest architecture changes

04.2010 - 04.2012IndependentChina

Chinese Student & Freelance

Independent · China

Career break for language studies and cultural immersion:

Chinese Mandarin Student
English Student
Independent Freelance consulting
Traveler across China

03.2008 - 02.2010EndesaMadrid, Spain

Infrastructure & CyberSecurity Architect

Endesa · Madrid, Spain

Wins

Security architecture roadmapLab & test scenario design

Stack

UnixNetworkingSecurity ArchitectureVMwareBare-metal

Defining Networking, Unix and Security architecture Roadmap:

Supervise the implementation and evolution of new providers solutions
Define best practices and security requirements
Assisting production outsourcing in troubleshooting production performance issues
Provide documentation, analytical evidences in order to improve performance, stability or scalability
Design of Laboratory and test scenarios under bare-metal, virtual environments

03.2007 - 03.2008TelefonicaMadrid, Spain

Network Security Engineer

Telefonica · Madrid, Spain

Wins

FW-1 National Cores adminNortel to IP migrationVulnerability assessment

Stack

Checkpoint Firewall-1IDS/IPSIPSOVulnerability ScanningNortel

Security and Network Engineering:

Performing in-depth application vulnerabilities scans, scheduling of automatic scans
Tracking & Assessment of new vulnerabilities/risks and the impact in our infrastructure
Administration and daily operation of IDS/IPS devices, reporting, mitigation and escalation
Intrusion tests, Perimetrical tests and related intrusion technics
Administration of Checkpoint Firewall-1 National Cores
Migration & Update of Firewall-1 on IPSO & x86 Hardware
Migration of old circuits from Nortel devices into modern IP solutions

10.2006 - 03.2007UCM UniversityMadrid, Spain

CyberSecurity Engineer

UCM University · Madrid, Spain

Wins

DDoS mitigation & forensicsIPS pattern discoverySIEM rules design

Stack

DDoS MitigationForensicsIPSPacket AnalysisSIEM

Security Operations and Research:

Tracking & Assessment of new vulnerabilities/risks and the impact in our infrastructure
Provide countermeasures, suggested fix and reporting to relevant operational unit
Intrusion tests, Perimetrical tests and related intrusion technics
DDoS Mitigation, in-depth inspection of packets, Pattern discovery and IPS countermeasures
Provide Forensic Analysis of detected intrusions, used method and mitigation for recurrence
Daily operation and monitoring of correlational events platform, design of new rules and security triggers

03.2006 - 10.2006OrangeMadrid, Spain

Senior Systems Engineer

Orange · Madrid, Spain

Wins

Custom tripwire platform (Python)OS hardening specialistNetApp/Kerberos integration

Stack

LinuxPythonExpect/TCLNetAppFortiGateStoneGateKerberos AD

Before Orange acquisition, in Ya.Com:

Planning and execution of new projects infrastructure in the Linux field
Hardening of Operating Systems Linux & Windows
Administration of company distributed storage with NetApp filers
Migration into centralised user governance of Linux systems with Kerberos AD
Tracking & Assessment of new vulnerabilities/risks
Administration of Housing customers ISP side security with FortiGate Firewalls
Administration of company core firewalls with StoneGate and Firewall-1 (IPSO)
Develop centralised tripwire alike platform from scratch (Python & Expect/TCL)

02.2004 - 03.2006EndesaMadrid, Spain

Systems Engineer

Endesa · Madrid, Spain

Wins

Bare-metal to ESXi migrationsAIX pSeries administrationSAN storage (McData/Brocade)

Stack

AIXLinuxVMware ESXivCenterFastTHitachiMcDataBrocade

Unix and Virtualization:

Participating in the migrations of Bare-metal AIX & Linux to Virtualised environments with ESXi & vCenter
Design and Implementation of new software solutions
Administration of AIX LPAR pSeries big computing servers
Administration of company storage, FastT and Hitachi with McData and Brocade fibre switches
Tuning and Troubleshooting of platforms in developing
Patching and Updating production environments, new platforms deployments

07.2003 - 02.2004TelefonicaMadrid, Spain

Systems Operations

Telefonica · Madrid, Spain

Stack

ApacheDNSLDAPTACACS+DHCP

Infrastructure Operations:

Administration of Apache and Application Servers, performance troubleshooting, hardening and daily maintenance
Elaboration of monitoring scripts
Administration of corporative DNS, LDAP, TACACS+ and DHCP servers

02.2003 - 07.2003OrangeMadrid, Spain

Technical Support Specialist

Orange · Madrid, Spain

Stack

RADIUSADSLLinux

Before acquired by Orange, in Ya.Com:

Platform troubleshooting and escalation of incidents
Maintenance of platforms, patching, users management
Support for RIMA Network circuits (ADSL)
Administration of Radius ACL & Users

04 · Portfolio

15 entries

Every case study, fully expanded.

01 · 2023-2024Leaderboard

8x HuggingFace #1 Champion

Multiple #1 positions across two leaderboard eras

★8x #1 HuggingFace Open LLM Leaderboard

Eight #1 positions on the HuggingFace Open LLM Leaderboard across both v1 and v2 eras, competing with models from major tech firms and AI labs using original post-training techniques (UNA and MGS) applied systematically across different base architectures. Competed against 70B models with 7B, and maintained contamination-free benchmarks.

HuggingFace Profile

Client: Independent — Juanako.AI
Role: Sole author
Duration: 2023-2024
Team: Solo

Outcomes

Total #1 Positions

v1 & v2

Leaderboard Eras

1.5B to 34B

Model Sizes

Base Architectures

Highlights

•8 separate #1 positions across 2023-2024
•Displaced Intel's neural-chat from #1 (Nov 2023)
•#8 ALL SIZES with Cybertron v2 — 7B competing against 70B+
•#1 across ALL model sizes with TheBeagle (Jan 2024)
•Contamination-free verification with 5-gram analysis
•Consistent results across Mistral, Intel, Yi/Smaug, and Qwen bases

Model Releases

Juanako 7B UNA

#1 7B — displaced Intel neural-chat

MiniClaus 1.5B UNAMGS

#1 7-8B, contamination-free

21-Nov-2024

HuggingFace

The Track Record

8 separate #1 positions across 2023-2024. Displaced Intel's neural-chat from #1 in November 2023. Reached #8 ALL SIZES with Cybertron v2 — a 7B model competing against 70B+ models.

#1 across ALL model sizes with TheBeagle in January 2024. Contamination-free verification with 5-gram analysis. Consistent results across Mistral, Intel, Yi/Smaug, and Qwen bases.

#1 LeaderboardLLMPost-TrainingUNAMGSOpen Source

02 · 2021-2025Open Source

Enterprise Infrastructure Contributions

Merged contributions to Kubernetes, Argo, and Atlantis

★Merged PRs in Kubernetes, Argo, Atlantis & More

Contributions to mainstream infrastructure projects used in enterprise deployments. Kubernetes ingress-nginx, Argo Rollouts, Atlantis, SurfSense. All PRs merged into mainline repositories, addressing production-scale problems.

Client: Open Source Community
Role: Contributor
Duration: 2021-2025
Team: Solo contributions

Outcomes

Projects

PRs Merged

Enterprise-grade

Scope

Merged to mainline

Status

Highlights

•7+ PRs merged across 4 mainstream projects
•All contributions merged into mainline repositories
•Focus on large-scale deployment challenges

Pull Requests

kubernetes/ingress-nginx #7711

Nov 2021

Added AdmissionController metrics

No observability into admission controller performance — couldn't track timing, rendered ingresses, or config sizes

View PR

kubernetes/ingress-nginx #7514

Sep 2021

--disable-full-test flag for admission controller

Test duration scales with cluster size — 2000 ingresses = 150-200MB config files and ~20s delays impacting container resources

View PR

argoproj/argo-rollouts #1472

Oct 2021

Multiple TrafficRoutingReconciler support

Only the first traffic routing definition was used when multiple were specified — couldn't run multiple traffic managers in parallel

View PR

runatlantis/atlantis #1777

Oct 2021

BasicAuth Support for Atlantis ServeHTTP

No built-in authentication for Atlantis web UI — required external ingress auth layers

View PR

runatlantis/atlantis #1776

Nov 2022

Terraform version detection with >= and ~> specifiers

Atlantis failed to detect Terraform versions using >= or ~> operators in required_version

View PR

MODSetter/SurfSense #122

Jun 2025

GitHub Actions Docker publish workflow

No automated container builds — orgs had to fork/build the application themselves

View PR

MODSetter/SurfSense #117

May 2025

Slack rate limiting & GitHub Repos ORG filtering

HTTP 429 errors in large Slack workspaces; GitHub connector only showed user-owned repos, not org repos

View PR

Impact

Kubernetes ingress-nginx — PRs #7711, #7514: Added AdmissionController metrics and --disable-full-test flag for large-scale deployments.

Argo Rollouts — PR #1472: Multiple TrafficRoutingReconciler support enabling NGINX + SMI simultaneously.

Atlantis — PRs #1777, #1776: BasicAuth support and Terraform version detection with >= and ~> specifiers.

KubernetesArgoAtlantisOpen SourceInfrastructureGitOps

03 · 2024-PresentResearch

UNAVision

Neural Image Codec & Visual Tokenizer

★150K Params, 40MP Batching, 97.69% Fidelity

UNAVision is a compact neural vision codec and visual tokenizer. It compresses arbitrary RGB imagery into a dense latent at a fixed 16:1 spatial ratio and reconstructs at 1–4% fidelity loss — and the loss shrinks as resolution grows (inverse of typical codecs). I can batch 6x 40MP images on a single RTX 4090. Under 150K trainable parameters. 100% codebook utilization (zero dead codes). Dual continuous/discrete bottleneck on same weights with <0.10% gap.

Evaluation Repository

Client: Independent · Eval repo public
Role: Sole author · Architecture, training, evals
Duration: Ongoing
Team: Solo

Outcomes

16:1

Spatial Compression

97.69%

Avg Fidelity

99.42%

Peak Fidelity

<150K

Parameters

Highlights

•16:1 spatial compression ratio
•97.69% average reconstruction fidelity
•Under 150K trainable parameters
•Batches 6x 40MP images on single RTX 4090
•Loss decreases with resolution (inverse of typical codecs)
•Dual continuous/discrete bottleneck
•100% codebook utilization (zero dead codes)
•UNA Audio prototype also developed

Reconstruction Fidelity · Drag to Compare

Wildlife · 2560px

⇄

Original

Reconstructed

What it is

A compact neural vision codec and visual tokenizer. Compresses arbitrary RGB imagery into a dense, well-structured latent at a fixed 16:1 spatial ratio and reconstructs at 1–4% fidelity loss on natural imagery.

Loss shrinks as input grows: 4–6K photos land in the 1–2% band; 40 MP cases hold there comfortably. 100% active visual vocabulary utilization — zero dead codes.

Memory envelope

A batch of half a dozen 40 MP images fits in a single forward pass on one RTX 4090 — no tiling, no sharding, no gradient checkpointing acrobatics, no OOM.

Possible because activation memory is dominated by the 16:1 bottleneck and the network sits under 150K trainable parameters.

VisionImage CodecVisual TokenizerVAE AlternativeCompression

04 · 2025Research

HarEmb

Classification, Retrieval, and NLP from Embeddings Geometry

★93% Classification, 28x Faster Inference

HarEmb performs classification, retrieval, and NLP tasks by exploiting the geometry of LLM embedding matrices. Results achieved using Qwen2.5-0.5B, a very small model — demonstrating that embeddings geometry carries significant semantic information even at minimal scale. Lightweight components run 28x faster than conventional transformers.

Client: Independent — Author-attested
Role: Sole author
Duration: 2025
Team: Solo

Outcomes

93.16%

AG News

90.75%

Emotion

86.01%

IMDB

83.72%

SST-2

0.941

MS MARCO MRR@10

28x

Speedup

<150M

Total Network

<20M

Trainable Params

Highlights

•Only lightweight forward pass components
•Retrieval extension with MRR@10 >0.9
•Throughput: thousands of samples per second
•Exploits embeddings geometry with lightweight components

EmbeddingsEfficient InferenceNLPContent ModerationRAG

05 · 2023-2025Production

Cloudflare Workers AI

Global Edge Deployment for ~2 Years

★Only Independent Developer in Cloudflare's AI Catalog

Cloudflare hosted Cybertron 7B v2 on their global Workers AI inference platform as a first-party model — served at the edge with OpenAI-compatible endpoints, a 15,000-token context window, and a public playground. The only third-party fine-tune in their catalog under an independent-developer namespace. Hosted for nearly two years.

Client: Cloudflare — Independent Developer
Role: Model author
Duration: Dec 2023 - Oct 2025
Team: Solo

Outcomes

~2 years

Deployment Duration

15,000 tokens

Context Window

@cf/fblgit/una-cybertron-7b-v2-bf16

Model ID

OpenAI-compatible

API

Highlights

•First-party model in Cloudflare's curated catalog
•Only independent developer namespace in Workers AI
•Nearly two years of production hosting

Global Edge Deployment

Global edge deployment across Cloudflare's network with OpenAI-compatible API endpoints. 15,000-token context window with public playground available.

Only independent developer fine-tune in catalog. Approximately 2 years in production from December 2023 to October 2025.

CloudflareEdge DeploymentProductionWorkers AIGlobal Scale

06 · 2023-2024Technique

UNA — Uniform Neural Alignment

Yet-Unpublished LLM SFT/RLHF Technique

★8 Public Releases, Multiple #1 Positions

UNA is Uniform Neural Alignment — a transformers architecture change introducing an auxiliary loss, applied as a patch to HuggingFace Transformers models. Operates during SFT and RLHF training. Applicable to attention layers, MLP layers, or both. Memory intensive but compatible with LoRA. Training data does not need to be novel, but must not have been previously overfitted. Applied across Mistral, Intel, Yi/Smaug, Qwen2.5, LLaMA 1 & 2, Pythia, and Luxa architectures.

All UNA Models

Client: Independent — Juanako.AI
Role: Sole author · Method, training, releases
Duration: 2023 & 2024
Team: Solo

Outcomes

Public Releases

Base Architectures

1.5B to 34B

Model Sizes

Multiple

#1 Positions

Highlights

•Consistent positive delta over base models
•Multiple #1 leaderboard positions
•Applicable to different network layers

Selected Releases & Leaderboard Positions

9 of 18 total releases

Date	Model	Size	Base	1.5B	3B	7B	34B	70B+
28-Nov-2023	Juanako 7B UNA	7B	Mistral	·	·	🏆	·	·
02-Dec-2023	Cybertron v1	7B	Mistral	·	·	🏆	·	·
05-Dec-2023	Cybertron v2	7B	Mistral	·	·	🏆	🏆	🏆
09-Dec-2023	Xaberius 34B v1beta	34B	Yi	·	·	·	🏆	🏆
11-Jan-2024	UNA-TheBeagle v1	7B	Mistral	·	·	🏆	🏆	🏆
04-Feb-2024	UNA-SimpleSmaug 34B	34B	Yi/Smaug	·	·	·	🏆	·
30-Oct-2024	Cybertron v4 MGS	7B	Qwen2.5	·	·	🏆	·	·
07-Nov-2024	MiniClaus UNAMGS	1.5B	Qwen2.5	🏆	·	·	·	·
21-Nov-2024	Cybertron v4 UNAMGS	7B	Qwen2.5	·	·	🏆	·	·

🏆 marks the size tier reached on the leaderboard

transformersdeepspeedaccelerateaxolotltorchwandbsftrhlfdistributed-training

07 · 2025Ecosystem

DIL — Domain Intent Language

Where the agent, the spec, and the codebase share one surface.

★Every task lands as a reviewed spec before it lands as code.

Agentic coding on greenfield demos is easy. Doing it on the kind of code a business actually runs on — years of history, multiple owners, no authoritative map, and a context window that runs out before the work does — is where most agentic workflows fall apart. DIL exists to make that second case tractable.

Client: Independent — Ecosystem project
Role: Creator · Ecosystem author
Duration: 2025
Team: Solo

Outcomes

100K+ LOC

Proven Scale

Claude Code · Codex · Kiro

Host Agents

Language · Server · UI · MCP · CLI

Surfaces

SWE Approval Gates

Review Model

Screenshots

The substrate

DIL is three things welded into one surface: a spec layer the agent authors against, a graph of the project's structure and relationships, and an agent integration that reaches into the host coding tool — Claude Code, Codex, or Kiro — through MCP. The server hosts all of it (database, web UI, MCP endpoints), and a CLI sits alongside for humans who prefer the terminal.

The loop

An agent onboards a project fly-solo — crawling, building the graph, and registering itself without supervision, while a human watches progress through the CLI or the UI. From there, every task runs the same shape: the agent produces a DIL-SPEC through a workflow pipeline, the human reviews and approves at the gates built into the flow, and implementation proceeds against the approved spec.

During the work, the agent searches and reasons across the graph, the spec layer, and the source code in a single query — the three surfaces are one. When code inevitably drifts away from the spec, the ecosystem self-heals, either through a direct command or as a native step inside the SWE workflow. Skills, SubAgents, and Commands extend the reach inside the host agent, so the integration isn't a thin adapter — it's first-class behavior.

Lineage

DIL is what Tree-of-Knowledge symbolic tuning becomes when you push it into the software-engineering domain and make the symbolic structure load-bearing, not academic.

AgenticSpec-DrivenMCPGraphRAGBrown-FieldEnterpriseClaude CodeCodexKiro

08 · 2024-2025Technique

MGS — MultiGumbelSampling

Yet-Unpublished LLM Regularization Technique

★Compatible with UNA for Additive Gains

MGS is MultiGumbelSampling — a regularization technique introducing Gumbel-sampled noise across signal paths during SFT/RLHF training. Combinable with UNA (UNAMGS releases) for additive performance gains.

All MGS Models

Client: Independent — Juanako.AI
Role: Sole author
Duration: 2024-2025
Team: Solo

Outcomes

Public Releases

UNAMGS Releases

1.5B to 7B

Model Sizes

Multiple

#1 Positions

Highlights

•Compatible with UNA — UNAMGS combines both
•Operates on different network paths than UNA
•First public release: Oct 2024

Selected Releases & Leaderboard Positions

Date	Model	Size	Base	1.5B	3B	7B	34B	70B+
30-Oct-2024	Cybertron v4 MGS	7B	Qwen2.5	·	·	🏆	·	·
07-Nov-2024	MiniClaus UNAMGS	1.5B	Qwen2.5	🏆	·	·	·	·
21-Nov-2024	Cybertron v4 UNAMGS	7B	Qwen2.5	·	·	🏆	·	·
04-Nov-2024	Pancho v1 3B UNAMGS	3B	Qwen2.5	·	🏆	·	·	·
03-Feb-2025	MiniClaus UNAMGS GRPO	1.5B	Qwen2.5	🏆	·	·	·	·

🏆 marks the size tier reached on the leaderboard

transformersdeepspeedaccelerateaxolotltorchwandbsftrhlfdistributed-training

09 · 2024Research

SingleMoM

Exploratory Parameter-Efficient Adaptation

★Promising Early Results — More Research Underway

SingleMoM is an exploratory parameter-efficient adaptation approach that competes with LoRA on GLUE benchmarks at a fraction of the trainable parameter cost, while enabling zero-overhead expert switching at inference. Early experiments are encouraging — there's room to better understand its expressiveness, behavior across domains, and potential extensions (e.g. image adapters). SFT experiments on RoBERTa reproduce the LoRA paper's evaluation setup. RLHF track on LLaMA-3 explored per-expert datasets across language, conversational style, formatting, text-to-SQL, and structured output, with experts being combinable at inference (e.g. German × humanlike experts produced the fblgit/german-humanlike-clean-1k dataset).

Client: Independent — Author-attested
Role: Sole author
Duration: 2024
Team: Solo

GLUE Benchmark — RoBERTa-large

Method	Trainable Params	CoLA (MCC)	SST-2	QQP	QNLI	MRPC
Full Fine-Tuning	355M	68.0	96.4	92.2	94.7	90.9
LoRA	0.8M	68.2 ± 1.9	96.2 ± 0.5	91.6 ± 0.1	94.9 ± 0.3	90.9 ± 1.2
SingleMoM	<0.25M	67.5–68.8	96.0–96.3	90.7–91.0	94.5–94.7	89.5–90.4

Baselines from LoRA paper (arxiv 2106.09685, Table 2)SingleMoM scores reported as ranges across runs.

Highlights

•Competes with LoRA on GLUE at a fraction of trainable params
•Zero-overhead expert switching at inference
•Experts can be combined (e.g. German × humanlike → real dataset output)
•Tested under SFT, DPO, and PPO setups
•Open directions: expressiveness, cross-domain behavior, image adapters

transformerstorchwandbsftrlhflora-alternativeparameter-efficient

10 · 2025Tooling

ClaudeBench

Claude Code Best Friend

★Anticipated Anthropic's Harness Pattern

A Redis-first, event-driven workbench with swarm intelligence for decomposing complex tasks into specialist-assigned subtasks. Features JSONRPC 2.0 + WebSocket communication, MCP integration, and React dashboard with Kanban. Architecture anticipated Anthropic's published long-running-agent harness pattern.

Client: Open source (MIT)
Role: Creator & sole maintainer
Duration: Ongoing since 2025
Team: Solo

Highlights

•Redis-first coordination with direct primitives
•Swarm intelligence for task decomposition
•Event-driven with JSONRPC 2.0 + WebSocket
•MCP integration from day one
•React dashboard with Kanban
•579+ commits at time of writing

Screenshots

Why I built it

When you're running long coding sessions with Claude as the executor, you run out of context window before you run out of work, and the next session starts blind.

I solved that my way: a Redis-first, event-driven workbench with swarm intelligence for decomposing complex tasks into specialist-assigned subtasks — observable through a real-time dashboard.

Timeline

On 26-Nov-2025 Anthropic published 'Effective harnesses for long-running agents'. ClaudeBench was released approximately 8 weeks prior.

The architectural pattern ClaudeBench implements aligns with the concepts later published in that document.

AgentClaudeMCPRedisTask ManagementSwarm Intelligence

11 · 2025Tooling

eLLMulator

Agentic Distributed Trace Simulation

★Open Source · Claude Agent SDK + MCP

Traditional distributed tracing shows what happened at runtime but can't reason about intent or surface contract mismatches. eLLMulator takes a different approach: LLM agents become your software components. Each agent studies its assigned source file, then interacts with other agents via synchronous MCP tool calls that mirror real function calls. The call graph emerges naturally from code control flow, producing traces that capture not just what happened, but why each component behaved as it did.

GitHub Repository

Client: Open source
Role: Creator
Duration: 2025
Team: Solo

Outcomes

Finding Types

Trace Modes

MCP Servers

Open Source

License

Highlights

•Source files become autonomous Claude agents
•Agent communication mirrors real function calls via MCP
•Five finding types including contract mismatches and assumption bugs
•Three trace modes: Full, Targeted, and Lens
•OpenTelemetry export to standard observability platforms

Screenshots

The approach

Each source file becomes an autonomous Claude agent. Agent-to-agent communication via MCP tool calls mirrors real function calls.

Five finding types: contract mismatches, assumption bugs, missing error paths, dead spots, unexpected calls. Three trace modes: Full, Targeted, and Lens.

Infrastructure

OpenTelemetry export to Jaeger, Tempo, or Honeycomb. Smart entry point detection from natural language scenarios.

Dependency graph (Starmap) with SCC clustering. Multi-layer guardrails: cycle detection, depth limiting, rate limiting, circuit breakers.

Claude Agent SDKMCPOpenTelemetryCode AnalysisDistributed TracingOpen Source

12 · 2026Product

Juanako — This Site

A portfolio that responds.

★Dual-audience agent, one codebase, no shared accounts.

Most portfolio sites are passive — a scroll of work. Most AI chatbots are ungrounded — they hallucinate away from whatever they're supposed to be about. Most job-search tools pick a side — they serve recruiters or candidates, rarely both. This site refuses those defaults. The visitor-facing agent is strictly grounded in what's on record, drives the interface rather than just describing it, and can produce a printable match report scoped to a recruiter's job description. A local-only companion surface turns the same product into a private career copilot — application tracking, candid notes that never leave the user's machine, and a suite of generators for the moments that matter before, during, and after a job hunt.

Client: Independent
Role: Sole author
Duration: 2026 · Ongoing
Team: Solo

Outcomes

Deliverable Templates

Visitor + Candidate

Audiences

Self-Hostable · Free

Hosting

Local-First

Privacy Model

Highlights

•Agent drives the interface — navigation, highlighting, deep-dives happen through real actions, not claimed ones
•Seven printable deliverable templates spanning the full application arc
•Applications act as durable containers — JD plus every generated artefact pinned to the role
•Candid data stays on the user's machine; nothing personal is hosted or shared
•Generation reshapes style and wording, never substance — every claim traces to the source knowledge; voice rules cut hype and flattery without inventing facts
•Meaningful emphasis on security — surface isolation, session integrity, clear public/private boundaries

What visitors see

A portfolio that responds. The agent doesn't just answer questions — it navigates between pages, highlights the case studies and roles that map to a visitor's interest, opens deep-dives in a single step, and refuses to speculate outside what the site actually holds.

Recruiters can hand it a job description (dropped as a file or pasted as a URL) and get back a printable match report grounded in real work, real numbers, and honest gaps. One click, a new tab, a PDF ready to save.

The design stance

Grounded over clever. The agent is constrained to what exists on record; tailoring means picking depth from the source material, not inventing facts. Generation operates on **style and wording** — phrasing, cadence, register, ordering — never on substance: every claim, metric, project, and timeline traces back to the underlying knowledge. The voice rules cut hype, flattery, and unearned superlatives out of the first draft, but they reshape *how* the truth is told, not the truth itself. Security gets meaningful emphasis — strict isolation between the public and private surfaces, integrity checks on conversation history, and candid data that never crosses the network. One codebase, two audiences, no shared accounts, no SaaS layer, no tenants. Single user, free to run, self-hostable.

AgenticGrounded LLMSingle-UserFreeLocal-FirstDual-AudiencePortfolioCareer Copilot

13 · 2023-2025Datasets

Training Datasets

Datasets for Training, Reasoning, and RLHF

★5 Public Datasets · Powering #1 Models & RLHF Experiments

Custom datasets built for targeted training experiments. The simple-math family explores minimal arithmetic corpora for reasoning under SFT and DPO. Tree of Knowledge introduces symbolic knowledge structuring. The german-humanlike pair demonstrates downstream artifacts produced by composing SingleMoM RLHF experts.

Client: Independent — Juanako.AI
Role: Dataset author
Duration: 2023-2025
Team: Solo

Outcomes

Public Datasets

800K rows

Largest

Yes

Used in #1 Models

Math · Knowledge · Style

Coverage

Datasets

Name	Date	Size	Type	Purpose / Used In
simple-math	20-Jan-2024	779K rows	SFT	Arithmetic reasoning · SimpleSmaug #1 34B
simple-math-DPO	27-Jan-2024	800K rows	DPO	Arithmetic preference training
Tree of Knowledge	24-May-2023	~5MB	Symbolic	Knowledge structure · Cybertron 7B v1/v2
german-humanlike-clean-1k	26-Mar-2025	856 rows	RLHF	SingleMoM expert composition (curated)
german-humanlike-large	26-Mar-2025	10.8K rows	RLHF	SingleMoM expert composition (full)

datasethuggingfacesynthetic-datarlhfdposftdata-engineering

14 · 2023-2025Engineering

10,000+ Tracked Experiments

★10,000+ WandB Tracked Experiments

Over 10,000 experiments tracked in Weights & Biases — sweeps, ablations, hyperparameter searches, and training runs. Each technique developed (UNA, MGS, SingleMoM, HarEmb, UNAVision) came from methodical experimentation across documented training runs.

Client: Independent — Juanako.AI
Role: Sole researcher
Duration: Ongoing
Team: Solo

Outcomes

10,000+

Total Experiments

Techniques Developed

Weights & Biases

Tracking Platform

Systematic

Methodology

Highlights

•10,000+ total tracked experiments
•Systematic hyperparameter sweeps
•Architecture ablation studies
•Training dynamics analysis
•Reproducible experiment tracking
•Cross-technique comparison studies

Systematic ML Engineering

10,000+ total tracked experiments with systematic hyperparameter sweeps and architecture ablation studies.

Training dynamics analysis with reproducible experiment tracking. Cross-technique comparison studies across all developed methods.

MLOpsExperiment TrackingWandBAblationsSystematic Research

15 · 2000s–PresentHobby · Receipts

Engineering — Receipts Since the Early 2000s

Two Decades of Building in Public

★20+ Years Shipping — glFTPd, Docker, K8s, ML, IoT

Before AI became the headline, the craft was already there. Contributions to the glFTPd scene in the early 2000s in C, TCL, and SQL — networking primitives, sitebot tooling, and utilities. Docker images on the public registry since 2016, chasing scale, observability, and performance: MariaDB MaxScale, DBNinja, Rundeck, Cacti, and an HHVM repo-build image that packaged Facebook's top-performance PHP runtime into a container years before containerization of perf-first PHP was common. Today's tinkering continues in the same spirit — neural-net visualization, model-weight similarity analysis, declarative Jira, Kubernetes admission mutators driven by live Prometheus signals, Home Assistant + ESP smart-home glue, and ARM64/CUDA ports shared with the community. All hobby. At the job, I deliver more and better.

Client: Independent — Community
Role: Author / Contributor
Duration: Early 2000s – Ongoing
Team: Solo

Outcomes

20+

Years Shipping

2016

Docker Hub Since

Dozens

Public Repos

Net · Perf · ML · IoT

Domains

Highlights

•glFTPd community contributions in C, TCL, and SQL since the early 2000s
•Docker Hub publisher since 2016 — performance and observability focus
•HHVM repo-build image (2016–2018) — top-performance PHP in a container before it was common
•Neural-net debugger (transviz) with time-travel replay of training sessions
•Kubernetes admission mutator driven by live Prometheus metrics (nemutator)
•Home Assistant + ESP smart-home with OTP-gated physical access
•ARM64/CUDA Viseron NVR port shared to save others the build time
•All hobby — at work, delivers more and better

Timeline

Early 2000s

glFTPd community — contributions in C (networking), TCL (sitebots and tooling), and SQL (utilities). Archived at grandis.nu/glftpd/Mr_V/.

2016

Docker Hub publishing begins — fblgit/maxscale-docker, fblgit/dbninja, fblgit/rundeck, fblgit/cacti. Early affinity for scale, observability, and performance.

2016–2018

fblgit/hhvm-repo-build — Facebook's top-performance PHP runtime (HHVM) packaged into a container, iterated across 2016, 2017, and 2018. Containerization applied to performance-first workloads before it was standard.

Jan 2021

fblgit/jarvis-iot-hassio — Home Assistant + ESP firmware + Tuya smart-home integration with ESP-powered smart gate and OTP-based access flow.

Feb 2021

fblgit/nemutator — Kubernetes admission mutation webhook that rewrites pod specs (resources, labels, env, images, selectors) from live Prometheus metrics, with Redis-backed mutation logs for rollback.

Sep 2021

fblgit/jira_as_a_code — declarative YAML planning for Jira: epics, tasks, SOP templates, per-environment iteration.

May 2022

fblgit/viseron-arm64-cuda — ARM64 + CUDA port of the Viseron self-hosted NVR, shared with the community.

Feb 2024

fblgit/model-similarity — cosine-similarity analysis across transformer weights with CSV and interactive HTML reports, quantifying how close a fine-tune is to its base.

Feb 2025

fblgit/transviz — real-time neural-net visualization and debugging: tensor inspection, conditional breakpoints, training metrics, and time-travel replay of captured training sessions.

Oct 2025

fblgit/agentool — public release. Meta-framework for type-safe, composable AI workflows on top of pydantic-ai, with a three-layer architecture and state-driven execution.

The long game

Contributions to the glFTPd community in the early 2000s — C for networking primitives, TCL for sitebot/tooling, SQL for backing stores. Mirror archived at grandis.nu/glftpd/Mr_V/.

Docker images published on the public registry since 2016 — always chasing scale, observability, and performance. MaxScale, DBNinja, Rundeck, Cacti. The HHVM repo-build image packaged Facebook's top-performance PHP runtime into a container across 2016, 2017, and 2018 — applying containerization to performance-first workloads well before it was common practice.

Still tinkering

Prototypes released as they mature: transviz (real-time neural-net visualization with tensor inspection and time-travel replay of training sessions), model-similarity (cosine-similarity analysis of transformer weights with interactive HTML reports), agentool (meta-framework for type-safe, composable AI workflows on top of pydantic-ai).

Concepts and tools: jira_as_a_code (declarative YAML planning — epics, tasks, SOP templates, per-env iteration), nemutator (Kubernetes admission mutation webhook that rewrites pod specs from live Prometheus metrics without redeploy).

Hardware and community saves: jarvis-iot-hassio (Home Assistant + ESP firmware + Tuya smart-home integration with ESP-powered smart gate and OTP-based physical access), viseron-arm64-cuda (ARM64 + CUDA port of the Viseron NVR shared to save others the porting time).

The common thread

Every one of these is hobby. Weekend tinkering, personal itches, things that would have helped me if someone else had built them — so I built and published them instead.

At my job I deliver more and better. Same engineering instinct, wound tighter.

CTCLSQLDockerKubernetesIoTOpen SourceHobby

05 · Contact

Reach out.

Direct line to Xavier: [email protected].

Or via: juanako.ai · github.com/fblgit · huggingface.co/fblgit · linkedin.com/in/xmm-sg.

Prefer the structured forms?