Full record · everything, expanded

The complete record.

Every role's full responsibilities and every case study's full body — on a single page. The same content the modals render, served in long form so you can scan, search, or hand the URL to someone else.

01 · Person

Xavier Murias

I've been in infrastructure since 2003 — through sysadmin, security, virtualization, containers, SRE, platform, and now AI. The work kept changing names; the shape of the problem didn't. Whatever I'm good at now, I owe to the next thing always being harder than the last.

These days I run platform at Xendit — seven thousand services across dozens of clusters, the kind of fleet where you measure success in things that didn't happen. Together with my team, we keep iterating and advancing our stack: materializing our own utopia day by day.

I independently released models with UNA and MGS — post-training methods I developed and applied across multiple transformer architectures, reaching #1 on the HuggingFace LLM Leaderboard several times: TheBeagle, Juanako, miniClaus, and Cybertron, which was served for nearly two years in Cloudflare Workers AI. Each iteration sharpened my intuition over deep neural networks, so I kept building AI.

I read more than I write, run more experiments than I publish, and contribute upstream when the fix belongs in the project itself, not a private fork. The systems I'm proudest of are the ones nobody notices, because they just keep running.

02 · Education & Certifications

13 entries

Academic and accredited.

Education

  • Massachusetts Institute of Technology
    Massachusetts Institute of Technology
    6.00.1x — Introduction to Computer Science and Programming Using Python
    2018 – 2018

    Introduction to Computer Science and Programming Using Python, and Introduction to Computational Thinking and Data Science.

    MITx 6.00.1x Certificate
  • University of Washington
    University of Washington
    Essentials of CyberSecurity — Professional Certificate
    2017 – 2018

    Four-course UWx track (CYB001x–CYB004x): cybersecurity foundations, the CISO's view, defensive toolkit, and career-path framing.

    Professional Certificate UWx CyberSecurity
  • Liceo Scientifico
    Liceo Scientifico
    Computer and Information Sciences
    1995 – 2000

Certifications

  • Mensa International
    Mensa International
    Mensa · May 1994
  • LFS158x — Introduction to Kubernetes
    LFS158x — Introduction to Kubernetes
    The Linux Foundation · Jun 2018
  • Certified Linux Administrator (LPIC-1)
    Certified Linux Administrator (LPIC-1)
    Linux Professional Institute (LPI) · Sep 2009
  • Security+
    Security+
    CompTIA · Mar 2007
  • Check Point Certified Security Administrator (CCSA)
    Check Point Certified Security Administrator (CCSA)
    Check Point Software · Jan 2008
  • IBM Certified Advanced Technical Expert — Power Systems with AIX
    IBM Certified Advanced Technical Expert — Power Systems with AIX
    IBM · Aug 2004
  • Red Hat Certified Engineer (RHCE)
    Red Hat Certified Engineer (RHCE)
    Red Hat · Feb 2004
  • SUNSA
    SUNSA
    Sun Microsystems · May 2003
  • CHFI — Computer Hacking Forensic Investigator
    CHFI — Computer Hacking Forensic Investigator
    EC-Council · Feb 2007
  • Certified Ethical Hacker (CEH)
    Certified Ethical Hacker (CEH)
    EC-Council · Dec 2006

03 · Experience

16 entries

Every role, fully expanded.

08.2022 - PresentXenditSingapore
Xendit logo

Head of Infrastructure and Science

Xendit · Singapore

Reports15Squads
SREDataSecurity
Wins
99.999% uptimePure IaaC self-service7-figures cost efficiency
Stack
EKSArgoCDGitOpsTerraformCloudflareSplit DNSMulti-CDN

Spearheaded a comprehensive overhaul of infrastructure and engineering culture at Xendit, managing 7,000+ services across dozens of Kubernetes clusters and thousands of nodes. Delivered unprecedented efficiency, 7-figure cost savings, and a transition from a toil-focused squad to a high-performing SRE engineering unit.

Platform Engineering & Orchestration

  • Fleet Re-Architecture: Completely reimagined and rebuilt the Kubernetes fleet, eliminating technical debt and implementing a true Active/Active Multi-Cluster architecture for hyper-distributed workloads.
  • Advanced GitOps & Control Plane: Engineered a custom ArgoCD Macro-Scale Framework and plugin, enabling a dry, layered YAML structure to manage thousands of deployments across a distributed fleet with infinite-scale design.
  • Lifecycle & Scaling: Achieved zero-downtime, zero-error-rate EKS lifecycle management. Leveraged Karpenter for high-performance node provisioning and KEDA to drive extreme cost efficiency, enabling services to scale to zero during low-demand periods.
  • Deployment & Governance: Standardized canary deployments via Argo Rollouts and orchestrated complex QA/CI/CD pipelines with Argo Workflows, all governed by automated Kyverno policy enforcement.

Networking & Traffic Engineering

  • Multi-Cluster Connectivity: Deployed Cilium MultiCluster Mesh to provide seamless, secure, and observable connectivity across the global service landscape.
  • Traffic Steering & DR: Designed a fault-tolerant, transparent DR strategy via a Split DNS Horizon on multi-region/multi-AZ topologies. Developed an advanced DNS hierarchy for weighted blue/green load balancing between CDNs, clusters, and regions.
  • Edge Migration: Executed a flawless, zero-downtime migration from Imperva to Cloudflare using a Multi-CDN topology.

Reliability & Cost Engineering

  • SLA & Incident Management: Elevated platform availability from 99.98% to 99.999% SLA, maintaining a near two-year record of zero team-led incidents. Elevated RCA and Post-Mortems to mission-critical status.
  • Strategic Cost Optimization: Delivered consistent 20% YoY cost reductions through a dual-track strategy:
    • Architectural Efficiency: Implementing scaling-to-zero (KEDA), right-sizing via Karpenter, and optimized compute architectures (arm64/amd64).
    • Financial Engineering: Orchestrating complex capacity planning involving Spot instances, Reserved Instances, and Savings Plans.
  • Data Layer Resilience: Managed high-availability, self-hosted data layers including YugabyteDB, MongoDB, and PostgreSQL within the Kubernetes ecosystem.
03.2024 - 08.2024Candy.AIRemote
Candy.AI logo

AI/ML Lead (Contract)

Candy.AI · Remote

Reports5Squads
AIInfrastructure
Wins
Scale from thousands to millionsBuilt from scratch: pure IaaCGKE GPU cost-efficient fleets
Stack
GCPGKETerraformHelmArgoCDGitOpsK6PyTorch/TransformersStableDiffusionLLMPub/Sub

Single-handedly revolutionized a startup's AI capabilities, scaling from nascent to millions of users.

Platform & Orchestration

  • Fleet rebuild: from a handful of Azure VM GPUs to a fully orchestrated, hyperscale GKE on GCP environment with NVIDIA GPU autoscaling, supporting multi-GPU and diverse generative workloads
  • CloudNative IaC discipline: Terraform, Helm, ArgoCD, and GitOps
  • Just-in-time launch: delivered the GKE fleet ahead of media and TV exposure — fleet held the resulting ramp from hundreds of thousands to millions of users at 99.99% uptime
  • Dev → prod gating: containerized AI workflows and inference for consistent deployment and controlled promotion across environments

Inference & Generative Workloads

  • Data pipelines: engineered the foundations that fed training and inference workflows
  • 'SuperBooga' inference broker: an event-driven bus fronting GPU inference — nodes pull from a pub/sub queue and serve one request at a time per GPU, avoiding the performance hit a single GPU takes from concurrent inference
  • GPU & instance right-sizing: through performance and load-testing, matched adequate GPU and instance family to each workload while keeping a cost-efficient footprint
  • Distributed weights & adapters: intra-cluster storage kept in sync with the upstream source and fanned out to inference containers on demand, accelerating cold-start times and cutting storage costs

ML Engineering & Practice

  • End-to-end product delivery: translated aspirational & conceptual product requirements into shipped features — both the supporting software and custom-trained PyTorch/Transformers models
  • Performance discipline: custom stress-testing for LLM and Stable Diffusion, anchored in a K6-driven load-testing and observability culture
01.2021 - 08.2022foodpandaSingapore
foodpanda logo

Principal DevOps Engineer

foodpanda · Singapore

Reports7Squads
APAC SRE
Wins
Macro-scale ArgoCDMultiple mainstream contributionsSelf-service IaaC stateAPAC LBU enablement
Stack
KubernetesArgoCDArgoRolloutsArgoWorkflowsTerraformPythonGoHelmKustomizeingress-nginxAtlantisAWSEKSDatadog

First Principal Engineer of its kind at foodpanda — running APAC infrastructure not just for foodpanda but across the entire Delivery Hero group. Each cluster a self-contained Local Business Unit: a country or region running its own business, independently. Large-scale computing of containerised environments, driven by IaaC on high-availability and high-concurrency systems.

Platform & Orchestration

  • Kubernetes at large scale, concurrency, resilience — thousands of distributed services, thousands of nodes, tens of thousands of ingress resources under management
  • Cluster lifecycle on AWS/EKS: Terraform-imported the existing clusters into a blue/green blueprint, then spun new clusters inside the same VPC/networking — communicating internally and presenting as a single perimeter entity: a "metacluster"
  • Macro-scale ArgoCD ecosystem: a custom plugin, a shared chart, and a layered DRY footprint repo. Reproducible infrastructure built on meta-modular abstractions, so the group could stand up new Regions and Countries fast and safe
  • ArgoCD and ArgoRollouts (Design, Deployment, Customisation, Workshops)
  • Kubernetes Tailoring (MPA, Controllers, Advanced Scheduling, Affinity, etc.)
  • Self-service GitOps absorbed the bulk of the Jira Service Desk hot-request types — engineers spent their time reviewing PRs instead of crafting them

Reliability, Cost & Resilience

  • Observability: Datadog as the observability stack across the fleet
  • Cost engineering: introduced Spot and hybrid capacity, with on-demand fallback to tolerate instance-exhaustion events
  • GameDays: twice-yearly drills exercising failure scenarios — AZ-outage availability among them

People & Practice

  • Taking care of a small (7), talented APAC SRE squad — zero attrition and zero self-inflicted incidents during my tenure
  • Mentorship & upskilling: invested in team certifications (Terraform especially); the modules they shipped reached a level the team had not produced before, with continuous support so no engineer flew alone
  • Platform engineering at group scale: the macro-scale ArgoCD ecosystem and self-service GitOps enabled developer teams across many Local Business Units to ship to production fast and safely
  • Hackathons: drove the hackathon program with deliberate themes — disabilities and accessibility among them, surfacing features like ingredient-to-condition filtering (diabetes, allergies, G6PD, gluten) and visual-impairment support, several of which shipped to production
  • Tooling & Automation (Python, Terraform, Go, JS, HTML, CSS, Helm, Kustomize)
  • Infrastructure Security and Hardening (Topology & Perimetrical)

Mainstream Contributions

  • argo-rollouts — advanced canary across multiple rollout providers, so north-south and east-west traffic splits can be routed independently
  • kubernetes/ingress-nginx — incremental update capacity for fleets running tens of thousands of ingress resources, plus controller-runtime observability (timings, counts)
  • Atlantis — hardened the admin portal, and added proper SEMVER support so Terraform modules absorb minor tf-binary updates automatically across vast IaaC

Share is Care

06.2019 - 01.2021Prudential PACSSingapore
Prudential PACS logo

SRE Manager (Principal Engineer)

Prudential PACS · Singapore

Reports9Squads
SREArchitecture
Wins
OnPrem to cloud-nativeFull DevOps implementationComplete containerizationMAS TRM 2021-aligned SDLC
Stack
KubernetesOCPTerraformPythonPrometheusGrafanaEFKKafkaZookeeperOpenFaaSAKSJenkinsCassandraStolon

Drove a 100-year-old regulated financial institution from on-prem and early DevOps into a cloud-native, containerised operating model in almost two years. End-to-end PMO ownership delivered structural savings across cloud, licensing, and augmentation:

  • Leading a team of nine across SRE and Architecture squads — zero attrition during my tenure, and a materially shortened SLA turnaround on infrastructure tasks, driven by upskilling the team through certifications (CKA, CKAD, Terraform Associate, etc.)
  • OnPrem to Cloud Migrations Expertise — Planning, Architecture, End-to-End Execution
  • Containerisation and migration of legacy platforms — moved IBM WebSphere and JBoss workloads off OpenShift and VMs into AKS, containers only
  • Evolved Jenkins to behave like Drone-CI — declarative-YAML pipelines
  • TRM-aligned SDLC — crafted Prudential's SDLC covering DevOps, Agile, Pipelines, Artifact Lineage, and CAB, aligned with the MAS TRM 2021 Act (Monetary Authority of Singapore)
  • COVID WFH enablement — played a crucial role making sure thousands of employees could work from home safely as the pandemic started
  • Automation Specialist (CI/CD, Terraform, Python, DevSecOps)
  • Resilient stateful workloads — introduced Kafka self-hosted via the Confluent Kubernetes Operator and Stolon-managed self-hosted Postgres clusters; production StatefulSets with CSI-backed persistence
  • Production-grade MVP on AKSOpenFaaS + Kafka + Cassandra, with autoscaling and full instrumentation through ElasticSearch + Prometheus + Grafana
  • Performance and production incident troubleshooting
  • Solutions Architecture — re-engineering and new platforms toward cloud-native and containerised, while maintaining a cost-efficient footprint
02.2019 - 07.2019DBS BankSingapore
DBS Bank logo

VP of Reliability Engineering

DBS Bank · Singapore

Wins
Observability PlatformTooling and Toil ReductionSLO/ErrorBudgets compositions
Stack
PythonPrometheusGrafana

First SRE on board helping the organization understand and implement SRE practices on a legacy structure and systems:

  • Toil reduction by Python automation
  • Implementation of Monitoring platform with Prometheus and evangelisation of better monitoring practices
  • Definition of SLO, Error budget, Monitoring Dashboards
  • Development of prometheus exporters for databases and applications
  • Helping the organization understand what is SRE and how to implement SRE practices
06.2018 - 02.2019CloudflareSingapore
Cloudflare logo

Senior Site Reliability Engineer

Cloudflare · Singapore

Wins
SaltStack championEdge LifeCycle automationLarge scale DDoS mitigation
Stack
KafkaZookeeperKubernetesMesosCephSaltStackLinux

First line of response in the largest worldwide CDN, over 10k+ physical servers in 200+ locations:

  • Troubleshooting production issues on Kafka, ZK, K8, Mesos, Ceph and diverse complex systems
  • Developing and maintaining SaltStack reactors, modules, states and solutions for IaaC and automation
  • Root cause analysis for escalated support issues in a complex stack environment
  • Mitigation and troubleshooting of large DDoS attacks, performance issues and production incidents on linux systems
08.2015 - 06.2018WiproHong Kong
Wipro logo

Infrastructure Architect SRE/DevOps

Wipro · Hong Kong

Wins
K8s/Rancher/Swarm from scratchSDS (ScaleIO, Ceph, Nutanix)DevOps CAMS standards
Stack
KubernetesDockerRancherCoreOSSwarmCephScaleIONutanixCloudFormationCodeDeploy

Provide a complete consulting service for containers projects:

  • Creation of topology & diagrams, execution timelines
  • Hands-on creation images and logical split of large applications into microservices
  • Monitoring & reporting platforms and deploying clusters of Kubernetes, CoreOS, Rancher, Swarm, Kettle from scratch
  • Implementation of diverse SDS (Software Defined Storage) solutions like ScaleIO, Ceph, Nutanix
  • Troubleshoot production performance issues in complex stacks
  • Defined DevOps CAMS standards and values to accelerate the SDLC
  • Real CD/CI implementation with CloudFormation, CodeDeploy for large enterprises
09.2012 - 06.2015Ping An InsuranceBeijing, China
Ping An Insurance logo

Senior Infrastructure Architect

Ping An Insurance · Beijing, China

Wins
HA/DR across two datacentersCD/CI automation (Puppet/Chef)Tech roadmaps for production
Stack
PuppetChefVMwareLinuxHigh AvailabilityDisaster Recovery

Engineering process for design, build, and implementation of high availability/disaster recovery infrastructure model for Tier-1 applications across two datacenter:

  • Consistently delivered survey documentation packages ahead of schedule
  • Prepared thorough checklists, reports, and Visio drawings documenting circuit and network equipment
  • Elaboration of technological roadmaps for production environments
  • Pivotal role in the SDLC CD/CI at the operational side, automating with Puppet or Chef
  • Coordinated across multiple teams to complete assigned tasks and projects driven by business needs
  • Troubleshoot production performance issues and suggest architecture changes
04.2010 - 04.2012IndependentChina

Chinese Student & Freelance

Independent · China

Career break for language studies and cultural immersion:

  • Chinese Mandarin Student
  • English Student
  • Independent Freelance consulting
  • Traveler across China
03.2008 - 02.2010EndesaMadrid, Spain
Endesa logo

Infrastructure & CyberSecurity Architect

Endesa · Madrid, Spain

Wins
Security architecture roadmapLab & test scenario design
Stack
UnixNetworkingSecurity ArchitectureVMwareBare-metal

Defining Networking, Unix and Security architecture Roadmap:

  • Supervise the implementation and evolution of new providers solutions
  • Define best practices and security requirements
  • Assisting production outsourcing in troubleshooting production performance issues
  • Provide documentation, analytical evidences in order to improve performance, stability or scalability
  • Design of Laboratory and test scenarios under bare-metal, virtual environments
03.2007 - 03.2008TelefonicaMadrid, Spain
Telefonica logo

Network Security Engineer

Telefonica · Madrid, Spain

Wins
FW-1 National Cores adminNortel to IP migrationVulnerability assessment
Stack
Checkpoint Firewall-1IDS/IPSIPSOVulnerability ScanningNortel

Security and Network Engineering:

  • Performing in-depth application vulnerabilities scans, scheduling of automatic scans
  • Tracking & Assessment of new vulnerabilities/risks and the impact in our infrastructure
  • Administration and daily operation of IDS/IPS devices, reporting, mitigation and escalation
  • Intrusion tests, Perimetrical tests and related intrusion technics
  • Administration of Checkpoint Firewall-1 National Cores
  • Migration & Update of Firewall-1 on IPSO & x86 Hardware
  • Migration of old circuits from Nortel devices into modern IP solutions
10.2006 - 03.2007UCM UniversityMadrid, Spain
UCM University logo

CyberSecurity Engineer

UCM University · Madrid, Spain

Wins
DDoS mitigation & forensicsIPS pattern discoverySIEM rules design
Stack
DDoS MitigationForensicsIPSPacket AnalysisSIEM

Security Operations and Research:

  • Tracking & Assessment of new vulnerabilities/risks and the impact in our infrastructure
  • Provide countermeasures, suggested fix and reporting to relevant operational unit
  • Intrusion tests, Perimetrical tests and related intrusion technics
  • DDoS Mitigation, in-depth inspection of packets, Pattern discovery and IPS countermeasures
  • Provide Forensic Analysis of detected intrusions, used method and mitigation for recurrence
  • Daily operation and monitoring of correlational events platform, design of new rules and security triggers
03.2006 - 10.2006OrangeMadrid, Spain
Orange logo

Senior Systems Engineer

Orange · Madrid, Spain

Wins
Custom tripwire platform (Python)OS hardening specialistNetApp/Kerberos integration
Stack
LinuxPythonExpect/TCLNetAppFortiGateStoneGateKerberos AD

Before Orange acquisition, in Ya.Com:

  • Planning and execution of new projects infrastructure in the Linux field
  • Hardening of Operating Systems Linux & Windows
  • Administration of company distributed storage with NetApp filers
  • Migration into centralised user governance of Linux systems with Kerberos AD
  • Tracking & Assessment of new vulnerabilities/risks
  • Administration of Housing customers ISP side security with FortiGate Firewalls
  • Administration of company core firewalls with StoneGate and Firewall-1 (IPSO)
  • Develop centralised tripwire alike platform from scratch (Python & Expect/TCL)
02.2004 - 03.2006EndesaMadrid, Spain
Endesa logo

Systems Engineer

Endesa · Madrid, Spain

Wins
Bare-metal to ESXi migrationsAIX pSeries administrationSAN storage (McData/Brocade)
Stack
AIXLinuxVMware ESXivCenterFastTHitachiMcDataBrocade

Unix and Virtualization:

  • Participating in the migrations of Bare-metal AIX & Linux to Virtualised environments with ESXi & vCenter
  • Design and Implementation of new software solutions
  • Administration of AIX LPAR pSeries big computing servers
  • Administration of company storage, FastT and Hitachi with McData and Brocade fibre switches
  • Tuning and Troubleshooting of platforms in developing
  • Patching and Updating production environments, new platforms deployments
07.2003 - 02.2004TelefonicaMadrid, Spain
Telefonica logo

Systems Operations

Telefonica · Madrid, Spain

Stack
ApacheDNSLDAPTACACS+DHCP

Infrastructure Operations:

  • Administration of Apache and Application Servers, performance troubleshooting, hardening and daily maintenance
  • Elaboration of monitoring scripts
  • Administration of corporative DNS, LDAP, TACACS+ and DHCP servers
02.2003 - 07.2003OrangeMadrid, Spain
Orange logo

Technical Support Specialist

Orange · Madrid, Spain

Stack
RADIUSADSLLinux

Before acquired by Orange, in Ya.Com:

  • Platform troubleshooting and escalation of incidents
  • Maintenance of platforms, patching, users management
  • Support for RIMA Network circuits (ADSL)
  • Administration of Radius ACL & Users

04 · Portfolio

15 entries

Every case study, fully expanded.

01 · 2023-2024Leaderboard

8x HuggingFace #1 Champion

Multiple #1 positions across two leaderboard eras

8x #1 HuggingFace Open LLM Leaderboard

Eight #1 positions on the HuggingFace Open LLM Leaderboard across both v1 and v2 eras, competing with models from major tech firms and AI labs using original post-training techniques (UNA and MGS) applied systematically across different base architectures. Competed against 70B models with 7B, and maintained contamination-free benchmarks.

Client
Independent — Juanako.AI
Role
Sole author
Duration
2023-2024
Team
Solo
Outcomes
8
Total #1 Positions
v1 & v2
Leaderboard Eras
1.5B to 34B
Model Sizes
4+
Base Architectures
Highlights
  • 8 separate #1 positions across 2023-2024
  • Displaced Intel's neural-chat from #1 (Nov 2023)
  • #8 ALL SIZES with Cybertron v2 — 7B competing against 70B+
  • #1 across ALL model sizes with TheBeagle (Jan 2024)
  • Contamination-free verification with 5-gram analysis
  • Consistent results across Mistral, Intel, Yi/Smaug, and Qwen bases
01

The Track Record

8 separate #1 positions across 2023-2024. Displaced Intel's neural-chat from #1 in November 2023. Reached #8 ALL SIZES with Cybertron v2 — a 7B model competing against 70B+ models.

#1 across ALL model sizes with TheBeagle in January 2024. Contamination-free verification with 5-gram analysis. Consistent results across Mistral, Intel, Yi/Smaug, and Qwen bases.

#1 LeaderboardLLMPost-TrainingUNAMGSOpen Source
02 · 2021-2025Open Source

Enterprise Infrastructure Contributions

Merged contributions to Kubernetes, Argo, and Atlantis

Merged PRs in Kubernetes, Argo, Atlantis & More

Contributions to mainstream infrastructure projects used in enterprise deployments. Kubernetes ingress-nginx, Argo Rollouts, Atlantis, SurfSense. All PRs merged into mainline repositories, addressing production-scale problems.

Client
Open Source Community
Role
Contributor
Duration
2021-2025
Team
Solo contributions
Outcomes
4+
Projects
7+
PRs Merged
Enterprise-grade
Scope
Merged to mainline
Status
Highlights
  • 7+ PRs merged across 4 mainstream projects
  • All contributions merged into mainline repositories
  • Focus on large-scale deployment challenges
01

Impact

Kubernetes ingress-nginx — PRs #7711, #7514: Added AdmissionController metrics and --disable-full-test flag for large-scale deployments.

Argo Rollouts — PR #1472: Multiple TrafficRoutingReconciler support enabling NGINX + SMI simultaneously.

Atlantis — PRs #1777, #1776: BasicAuth support and Terraform version detection with >= and ~> specifiers.

KubernetesArgoAtlantisOpen SourceInfrastructureGitOps
03 · 2024-PresentResearch

UNAVision

Neural Image Codec & Visual Tokenizer

150K Params, 40MP Batching, 97.69% Fidelity

UNAVision is a compact neural vision codec and visual tokenizer. It compresses arbitrary RGB imagery into a dense latent at a fixed 16:1 spatial ratio and reconstructs at 1–4% fidelity loss — and the loss shrinks as resolution grows (inverse of typical codecs). I can batch 6x 40MP images on a single RTX 4090. Under 150K trainable parameters. 100% codebook utilization (zero dead codes). Dual continuous/discrete bottleneck on same weights with <0.10% gap.

Client
Independent · Eval repo public
Role
Sole author · Architecture, training, evals
Duration
Ongoing
Team
Solo
Outcomes
16:1
Spatial Compression
97.69%
Avg Fidelity
99.42%
Peak Fidelity
<150K
Parameters
Highlights
  • 16:1 spatial compression ratio
  • 97.69% average reconstruction fidelity
  • Under 150K trainable parameters
  • Batches 6x 40MP images on single RTX 4090
  • Loss decreases with resolution (inverse of typical codecs)
  • Dual continuous/discrete bottleneck
  • 100% codebook utilization (zero dead codes)
  • UNA Audio prototype also developed
Reconstruction Fidelity · Drag to Compare
Wildlife · 2560px
ReconstructedOriginal
Original
Reconstructed
01

What it is

A compact neural vision codec and visual tokenizer. Compresses arbitrary RGB imagery into a dense, well-structured latent at a fixed 16:1 spatial ratio and reconstructs at 1–4% fidelity loss on natural imagery.

Loss shrinks as input grows: 4–6K photos land in the 1–2% band; 40 MP cases hold there comfortably. 100% active visual vocabulary utilization — zero dead codes.

02

Memory envelope

A batch of half a dozen 40 MP images fits in a single forward pass on one RTX 4090 — no tiling, no sharding, no gradient checkpointing acrobatics, no OOM.

Possible because activation memory is dominated by the 16:1 bottleneck and the network sits under 150K trainable parameters.

VisionImage CodecVisual TokenizerVAE AlternativeCompression
04 · 2025Research

HarEmb

Classification, Retrieval, and NLP from Embeddings Geometry

93% Classification, 28x Faster Inference

HarEmb performs classification, retrieval, and NLP tasks by exploiting the geometry of LLM embedding matrices. Results achieved using Qwen2.5-0.5B, a very small model — demonstrating that embeddings geometry carries significant semantic information even at minimal scale. Lightweight components run 28x faster than conventional transformers.

Client
Independent — Author-attested
Role
Sole author
Duration
2025
Team
Solo
Outcomes
93.16%
AG News
90.75%
Emotion
86.01%
IMDB
83.72%
SST-2
0.941
MS MARCO MRR@10
28x
Speedup
<150M
Total Network
<20M
Trainable Params
Highlights
  • Only lightweight forward pass components
  • Retrieval extension with MRR@10 >0.9
  • Throughput: thousands of samples per second
  • Exploits embeddings geometry with lightweight components
EmbeddingsEfficient InferenceNLPContent ModerationRAG
05 · 2023-2025Production

Cloudflare Workers AI

Global Edge Deployment for ~2 Years

Only Independent Developer in Cloudflare's AI Catalog

Cloudflare hosted Cybertron 7B v2 on their global Workers AI inference platform as a first-party model — served at the edge with OpenAI-compatible endpoints, a 15,000-token context window, and a public playground. The only third-party fine-tune in their catalog under an independent-developer namespace. Hosted for nearly two years.

Client
Cloudflare — Independent Developer
Role
Model author
Duration
Dec 2023 - Oct 2025
Team
Solo
Outcomes
~2 years
Deployment Duration
15,000 tokens
Context Window
@cf/fblgit/una-cybertron-7b-v2-bf16
Model ID
OpenAI-compatible
API
Highlights
  • First-party model in Cloudflare's curated catalog
  • Only independent developer namespace in Workers AI
  • Nearly two years of production hosting
01

Global Edge Deployment

Global edge deployment across Cloudflare's network with OpenAI-compatible API endpoints. 15,000-token context window with public playground available.

Only independent developer fine-tune in catalog. Approximately 2 years in production from December 2023 to October 2025.

CloudflareEdge DeploymentProductionWorkers AIGlobal Scale
06 · 2023-2024Technique

UNA — Uniform Neural Alignment

Yet-Unpublished LLM SFT/RLHF Technique

8 Public Releases, Multiple #1 Positions

UNA is Uniform Neural Alignment — a transformers architecture change introducing an auxiliary loss, applied as a patch to HuggingFace Transformers models. Operates during SFT and RLHF training. Applicable to attention layers, MLP layers, or both. Memory intensive but compatible with LoRA. Training data does not need to be novel, but must not have been previously overfitted. Applied across Mistral, Intel, Yi/Smaug, Qwen2.5, LLaMA 1 & 2, Pythia, and Luxa architectures.

Client
Independent — Juanako.AI
Role
Sole author · Method, training, releases
Duration
2023 & 2024
Team
Solo
Outcomes
18
Public Releases
4+
Base Architectures
1.5B to 34B
Model Sizes
Multiple
#1 Positions
Highlights
  • Consistent positive delta over base models
  • Multiple #1 leaderboard positions
  • Applicable to different network layers
Selected Releases & Leaderboard Positions
9 of 18 total releases
DateModelSizeBase1.5B3B7B34B70B+
28-Nov-2023Juanako 7B UNA7BMistral··🏆··
02-Dec-2023Cybertron v17BMistral··🏆··
05-Dec-2023Cybertron v27BMistral··🏆🏆🏆
09-Dec-2023Xaberius 34B v1beta34BYi···🏆🏆
11-Jan-2024UNA-TheBeagle v17BMistral··🏆🏆🏆
04-Feb-2024UNA-SimpleSmaug 34B34BYi/Smaug···🏆·
30-Oct-2024Cybertron v4 MGS7BQwen2.5··🏆··
07-Nov-2024MiniClaus UNAMGS1.5BQwen2.5🏆····
21-Nov-2024Cybertron v4 UNAMGS7BQwen2.5··🏆··
🏆 marks the size tier reached on the leaderboard
transformersdeepspeedaccelerateaxolotltorchwandbsftrhlfdistributed-training
07 · 2025Ecosystem

DIL — Domain Intent Language

Where the agent, the spec, and the codebase share one surface.

Every task lands as a reviewed spec before it lands as code.

Agentic coding on greenfield demos is easy. Doing it on the kind of code a business actually runs on — years of history, multiple owners, no authoritative map, and a context window that runs out before the work does — is where most agentic workflows fall apart. DIL exists to make that second case tractable.

Client
Independent — Ecosystem project
Role
Creator · Ecosystem author
Duration
2025
Team
Solo
Outcomes
100K+ LOC
Proven Scale
Claude Code · Codex · Kiro
Host Agents
Language · Server · UI · MCP · CLI
Surfaces
SWE Approval Gates
Review Model
Screenshots
01

The substrate

DIL is three things welded into one surface: a spec layer the agent authors against, a graph of the project's structure and relationships, and an agent integration that reaches into the host coding tool — Claude Code, Codex, or Kiro — through MCP. The server hosts all of it (database, web UI, MCP endpoints), and a CLI sits alongside for humans who prefer the terminal.

02

The loop

An agent onboards a project fly-solo — crawling, building the graph, and registering itself without supervision, while a human watches progress through the CLI or the UI. From there, every task runs the same shape: the agent produces a DIL-SPEC through a workflow pipeline, the human reviews and approves at the gates built into the flow, and implementation proceeds against the approved spec.

During the work, the agent searches and reasons across the graph, the spec layer, and the source code in a single query — the three surfaces are one. When code inevitably drifts away from the spec, the ecosystem self-heals, either through a direct command or as a native step inside the SWE workflow. Skills, SubAgents, and Commands extend the reach inside the host agent, so the integration isn't a thin adapter — it's first-class behavior.

03

Lineage

DIL is what Tree-of-Knowledge symbolic tuning becomes when you push it into the software-engineering domain and make the symbolic structure load-bearing, not academic.

AgenticSpec-DrivenMCPGraphRAGBrown-FieldEnterpriseClaude CodeCodexKiro
08 · 2024-2025Technique

MGS — MultiGumbelSampling

Yet-Unpublished LLM Regularization Technique

Compatible with UNA for Additive Gains

MGS is MultiGumbelSampling — a regularization technique introducing Gumbel-sampled noise across signal paths during SFT/RLHF training. Combinable with UNA (UNAMGS releases) for additive performance gains.

Client
Independent — Juanako.AI
Role
Sole author
Duration
2024-2025
Team
Solo
Outcomes
5
Public Releases
4
UNAMGS Releases
1.5B to 7B
Model Sizes
Multiple
#1 Positions
Highlights
  • Compatible with UNA — UNAMGS combines both
  • Operates on different network paths than UNA
  • First public release: Oct 2024
Selected Releases & Leaderboard Positions
DateModelSizeBase1.5B3B7B34B70B+
30-Oct-2024Cybertron v4 MGS7BQwen2.5··🏆··
07-Nov-2024MiniClaus UNAMGS1.5BQwen2.5🏆····
21-Nov-2024Cybertron v4 UNAMGS7BQwen2.5··🏆··
04-Nov-2024Pancho v1 3B UNAMGS3BQwen2.5·🏆···
03-Feb-2025MiniClaus UNAMGS GRPO1.5BQwen2.5🏆····
🏆 marks the size tier reached on the leaderboard
transformersdeepspeedaccelerateaxolotltorchwandbsftrhlfdistributed-training
09 · 2024Research

SingleMoM

Exploratory Parameter-Efficient Adaptation

Promising Early Results — More Research Underway

SingleMoM is an exploratory parameter-efficient adaptation approach that competes with LoRA on GLUE benchmarks at a fraction of the trainable parameter cost, while enabling zero-overhead expert switching at inference. Early experiments are encouraging — there's room to better understand its expressiveness, behavior across domains, and potential extensions (e.g. image adapters). SFT experiments on RoBERTa reproduce the LoRA paper's evaluation setup. RLHF track on LLaMA-3 explored per-expert datasets across language, conversational style, formatting, text-to-SQL, and structured output, with experts being combinable at inference (e.g. German × humanlike experts produced the fblgit/german-humanlike-clean-1k dataset).

Client
Independent — Author-attested
Role
Sole author
Duration
2024
Team
Solo
GLUE Benchmark — RoBERTa-large
MethodTrainable ParamsCoLA (MCC)SST-2QQPQNLIMRPC
Full Fine-Tuning355M68.096.492.294.790.9
LoRA0.8M68.2 ± 1.996.2 ± 0.591.6 ± 0.194.9 ± 0.390.9 ± 1.2
SingleMoM<0.25M67.5–68.896.0–96.390.7–91.094.5–94.789.5–90.4
Baselines from LoRA paper (arxiv 2106.09685, Table 2)SingleMoM scores reported as ranges across runs.
Highlights
  • Competes with LoRA on GLUE at a fraction of trainable params
  • Zero-overhead expert switching at inference
  • Experts can be combined (e.g. German × humanlike → real dataset output)
  • Tested under SFT, DPO, and PPO setups
  • Open directions: expressiveness, cross-domain behavior, image adapters
transformerstorchwandbsftrlhflora-alternativeparameter-efficient
10 · 2025Tooling

ClaudeBench

Claude Code Best Friend

Anticipated Anthropic's Harness Pattern

A Redis-first, event-driven workbench with swarm intelligence for decomposing complex tasks into specialist-assigned subtasks. Features JSONRPC 2.0 + WebSocket communication, MCP integration, and React dashboard with Kanban. Architecture anticipated Anthropic's published long-running-agent harness pattern.

Client
Open source (MIT)
Role
Creator & sole maintainer
Duration
Ongoing since 2025
Team
Solo
Highlights
  • Redis-first coordination with direct primitives
  • Swarm intelligence for task decomposition
  • Event-driven with JSONRPC 2.0 + WebSocket
  • MCP integration from day one
  • React dashboard with Kanban
  • 579+ commits at time of writing
Screenshots
01

Why I built it

When you're running long coding sessions with Claude as the executor, you run out of context window before you run out of work, and the next session starts blind.

I solved that my way: a Redis-first, event-driven workbench with swarm intelligence for decomposing complex tasks into specialist-assigned subtasks — observable through a real-time dashboard.

02

Timeline

On 26-Nov-2025 Anthropic published 'Effective harnesses for long-running agents'. ClaudeBench was released approximately 8 weeks prior.

The architectural pattern ClaudeBench implements aligns with the concepts later published in that document.

AgentClaudeMCPRedisTask ManagementSwarm Intelligence
11 · 2025Tooling

eLLMulator

Agentic Distributed Trace Simulation

Open Source · Claude Agent SDK + MCP

Traditional distributed tracing shows what happened at runtime but can't reason about intent or surface contract mismatches. eLLMulator takes a different approach: LLM agents become your software components. Each agent studies its assigned source file, then interacts with other agents via synchronous MCP tool calls that mirror real function calls. The call graph emerges naturally from code control flow, producing traces that capture not just what happened, but why each component behaved as it did.

Client
Open source
Role
Creator
Duration
2025
Team
Solo
Outcomes
5
Finding Types
3
Trace Modes
2
MCP Servers
Open Source
License
Highlights
  • Source files become autonomous Claude agents
  • Agent communication mirrors real function calls via MCP
  • Five finding types including contract mismatches and assumption bugs
  • Three trace modes: Full, Targeted, and Lens
  • OpenTelemetry export to standard observability platforms
Screenshots
01

The approach

Each source file becomes an autonomous Claude agent. Agent-to-agent communication via MCP tool calls mirrors real function calls.

Five finding types: contract mismatches, assumption bugs, missing error paths, dead spots, unexpected calls. Three trace modes: Full, Targeted, and Lens.

02

Infrastructure

OpenTelemetry export to Jaeger, Tempo, or Honeycomb. Smart entry point detection from natural language scenarios.

Dependency graph (Starmap) with SCC clustering. Multi-layer guardrails: cycle detection, depth limiting, rate limiting, circuit breakers.

Claude Agent SDKMCPOpenTelemetryCode AnalysisDistributed TracingOpen Source
12 · 2026Product

Juanako — This Site

A portfolio that responds.

Dual-audience agent, one codebase, no shared accounts.

Most portfolio sites are passive — a scroll of work. Most AI chatbots are ungrounded — they hallucinate away from whatever they're supposed to be about. Most job-search tools pick a side — they serve recruiters or candidates, rarely both. This site refuses those defaults. The visitor-facing agent is strictly grounded in what's on record, drives the interface rather than just describing it, and can produce a printable match report scoped to a recruiter's job description. A local-only companion surface turns the same product into a private career copilot — application tracking, candid notes that never leave the user's machine, and a suite of generators for the moments that matter before, during, and after a job hunt.

Client
Independent
Role
Sole author
Duration
2026 · Ongoing
Team
Solo
Outcomes
7
Deliverable Templates
Visitor + Candidate
Audiences
Self-Hostable · Free
Hosting
Local-First
Privacy Model
Highlights
  • Agent drives the interface — navigation, highlighting, deep-dives happen through real actions, not claimed ones
  • Seven printable deliverable templates spanning the full application arc
  • Applications act as durable containers — JD plus every generated artefact pinned to the role
  • Candid data stays on the user's machine; nothing personal is hosted or shared
  • Generation reshapes style and wording, never substance — every claim traces to the source knowledge; voice rules cut hype and flattery without inventing facts
  • Meaningful emphasis on security — surface isolation, session integrity, clear public/private boundaries
01

What visitors see

A portfolio that responds. The agent doesn't just answer questions — it navigates between pages, highlights the case studies and roles that map to a visitor's interest, opens deep-dives in a single step, and refuses to speculate outside what the site actually holds.

Recruiters can hand it a job description (dropped as a file or pasted as a URL) and get back a printable match report grounded in real work, real numbers, and honest gaps. One click, a new tab, a PDF ready to save.

02

The design stance

Grounded over clever. The agent is constrained to what exists on record; tailoring means picking depth from the source material, not inventing facts. Generation operates on **style and wording** — phrasing, cadence, register, ordering — never on substance: every claim, metric, project, and timeline traces back to the underlying knowledge. The voice rules cut hype, flattery, and unearned superlatives out of the first draft, but they reshape *how* the truth is told, not the truth itself. Security gets meaningful emphasis — strict isolation between the public and private surfaces, integrity checks on conversation history, and candid data that never crosses the network. One codebase, two audiences, no shared accounts, no SaaS layer, no tenants. Single user, free to run, self-hostable.

AgenticGrounded LLMSingle-UserFreeLocal-FirstDual-AudiencePortfolioCareer Copilot
13 · 2023-2025Datasets

Training Datasets

Datasets for Training, Reasoning, and RLHF

5 Public Datasets · Powering #1 Models & RLHF Experiments

Custom datasets built for targeted training experiments. The simple-math family explores minimal arithmetic corpora for reasoning under SFT and DPO. Tree of Knowledge introduces symbolic knowledge structuring. The german-humanlike pair demonstrates downstream artifacts produced by composing SingleMoM RLHF experts.

Client
Independent — Juanako.AI
Role
Dataset author
Duration
2023-2025
Team
Solo
Outcomes
5
Public Datasets
800K rows
Largest
Yes
Used in #1 Models
Math · Knowledge · Style
Coverage
Datasets
NameDateSizeTypePurpose / Used In
simple-math20-Jan-2024779K rowsSFTArithmetic reasoning · SimpleSmaug #1 34B
simple-math-DPO27-Jan-2024800K rowsDPOArithmetic preference training
Tree of Knowledge24-May-2023~5MBSymbolicKnowledge structure · Cybertron 7B v1/v2
german-humanlike-clean-1k26-Mar-2025856 rowsRLHFSingleMoM expert composition (curated)
german-humanlike-large26-Mar-202510.8K rowsRLHFSingleMoM expert composition (full)
datasethuggingfacesynthetic-datarlhfdposftdata-engineering
14 · 2023-2025Engineering

10,000+ Tracked Experiments

10,000+ WandB Tracked Experiments

Over 10,000 experiments tracked in Weights & Biases — sweeps, ablations, hyperparameter searches, and training runs. Each technique developed (UNA, MGS, SingleMoM, HarEmb, UNAVision) came from methodical experimentation across documented training runs.

Client
Independent — Juanako.AI
Role
Sole researcher
Duration
Ongoing
Team
Solo
Outcomes
10,000+
Total Experiments
5+
Techniques Developed
Weights & Biases
Tracking Platform
Systematic
Methodology
Highlights
  • 10,000+ total tracked experiments
  • Systematic hyperparameter sweeps
  • Architecture ablation studies
  • Training dynamics analysis
  • Reproducible experiment tracking
  • Cross-technique comparison studies
01

Systematic ML Engineering

10,000+ total tracked experiments with systematic hyperparameter sweeps and architecture ablation studies.

Training dynamics analysis with reproducible experiment tracking. Cross-technique comparison studies across all developed methods.

MLOpsExperiment TrackingWandBAblationsSystematic Research
15 · 2000s–PresentHobby · Receipts

Engineering — Receipts Since the Early 2000s

Two Decades of Building in Public

20+ Years Shipping — glFTPd, Docker, K8s, ML, IoT

Before AI became the headline, the craft was already there. Contributions to the glFTPd scene in the early 2000s in C, TCL, and SQL — networking primitives, sitebot tooling, and utilities. Docker images on the public registry since 2016, chasing scale, observability, and performance: MariaDB MaxScale, DBNinja, Rundeck, Cacti, and an HHVM repo-build image that packaged Facebook's top-performance PHP runtime into a container years before containerization of perf-first PHP was common. Today's tinkering continues in the same spirit — neural-net visualization, model-weight similarity analysis, declarative Jira, Kubernetes admission mutators driven by live Prometheus signals, Home Assistant + ESP smart-home glue, and ARM64/CUDA ports shared with the community. All hobby. At the job, I deliver more and better.

Client
Independent — Community
Role
Author / Contributor
Duration
Early 2000s – Ongoing
Team
Solo
Outcomes
20+
Years Shipping
2016
Docker Hub Since
Dozens
Public Repos
Net · Perf · ML · IoT
Domains
Highlights
  • glFTPd community contributions in C, TCL, and SQL since the early 2000s
  • Docker Hub publisher since 2016 — performance and observability focus
  • HHVM repo-build image (2016–2018) — top-performance PHP in a container before it was common
  • Neural-net debugger (transviz) with time-travel replay of training sessions
  • Kubernetes admission mutator driven by live Prometheus metrics (nemutator)
  • Home Assistant + ESP smart-home with OTP-gated physical access
  • ARM64/CUDA Viseron NVR port shared to save others the build time
  • All hobby — at work, delivers more and better
Timeline
Early 2000s
glFTPd community — contributions in C (networking), TCL (sitebots and tooling), and SQL (utilities). Archived at grandis.nu/glftpd/Mr_V/.
2016
Docker Hub publishing begins — fblgit/maxscale-docker, fblgit/dbninja, fblgit/rundeck, fblgit/cacti. Early affinity for scale, observability, and performance.
2016–2018
fblgit/hhvm-repo-build — Facebook's top-performance PHP runtime (HHVM) packaged into a container, iterated across 2016, 2017, and 2018. Containerization applied to performance-first workloads before it was standard.
Jan 2021
fblgit/jarvis-iot-hassio — Home Assistant + ESP firmware + Tuya smart-home integration with ESP-powered smart gate and OTP-based access flow.
Feb 2021
fblgit/nemutator — Kubernetes admission mutation webhook that rewrites pod specs (resources, labels, env, images, selectors) from live Prometheus metrics, with Redis-backed mutation logs for rollback.
Sep 2021
fblgit/jira_as_a_code — declarative YAML planning for Jira: epics, tasks, SOP templates, per-environment iteration.
May 2022
fblgit/viseron-arm64-cuda — ARM64 + CUDA port of the Viseron self-hosted NVR, shared with the community.
Feb 2024
fblgit/model-similarity — cosine-similarity analysis across transformer weights with CSV and interactive HTML reports, quantifying how close a fine-tune is to its base.
Feb 2025
fblgit/transviz — real-time neural-net visualization and debugging: tensor inspection, conditional breakpoints, training metrics, and time-travel replay of captured training sessions.
Oct 2025
fblgit/agentool — public release. Meta-framework for type-safe, composable AI workflows on top of pydantic-ai, with a three-layer architecture and state-driven execution.
01

The long game

Contributions to the glFTPd community in the early 2000s — C for networking primitives, TCL for sitebot/tooling, SQL for backing stores. Mirror archived at grandis.nu/glftpd/Mr_V/.

Docker images published on the public registry since 2016 — always chasing scale, observability, and performance. MaxScale, DBNinja, Rundeck, Cacti. The HHVM repo-build image packaged Facebook's top-performance PHP runtime into a container across 2016, 2017, and 2018 — applying containerization to performance-first workloads well before it was common practice.

02

Still tinkering

Prototypes released as they mature: transviz (real-time neural-net visualization with tensor inspection and time-travel replay of training sessions), model-similarity (cosine-similarity analysis of transformer weights with interactive HTML reports), agentool (meta-framework for type-safe, composable AI workflows on top of pydantic-ai).

Concepts and tools: jira_as_a_code (declarative YAML planning — epics, tasks, SOP templates, per-env iteration), nemutator (Kubernetes admission mutation webhook that rewrites pod specs from live Prometheus metrics without redeploy).

Hardware and community saves: jarvis-iot-hassio (Home Assistant + ESP firmware + Tuya smart-home integration with ESP-powered smart gate and OTP-based physical access), viseron-arm64-cuda (ARM64 + CUDA port of the Viseron NVR shared to save others the porting time).

03

The common thread

Every one of these is hobby. Weekend tinkering, personal itches, things that would have helped me if someone else had built them — so I built and published them instead.

At my job I deliver more and better. Same engineering instinct, wound tighter.

CTCLSQLDockerKubernetesIoTOpen SourceHobby