Ron Jailall

Software Engineering Manager & AI Architect

Raleigh, NC | (608) 332-8605

rojailal@gmail.com

https://ironj.github.io/

Professional Profile

Software Engineering Manager & AI Architect with over 15 years of experience leading distributed engineering teams and delivering scalable, cloud-native AI solutions. Expert in bridging the gap between research and production, with deep hands-on experience in Generative AI, LLM infrastructure, and Computer Vision. Proven track record of architecting high-throughput data pipelines on AWS/GCP, optimizing model inference for edge/cloud (TensorRT/ONNX), and mentoring engineering talent to drive product strategy.

Technical Skills

AI/ML & GenAI: TensorFlow 2/Keras, PyTorch, TensorRT, ONNX, RAG Pipelines, Stable Diffusion, LLMs (Llama/Megatron), Computer Vision (MODNet/MobileNetV2).
Cloud & Infra: AWS (SageMaker, Lambda, ECS), GCP (Vertex AI), Docker, Kubernetes, Terraform, GitLab CI/CD.
Languages: Python, C++, C#, JavaScript/TypeScript (React), CUDA, SQL.
Leadership: Agile/Scrum Management, Technical Strategy, Cross-Functional Team Leadership, Mentorship, Product Roadmap Definition.

Selected Technical Talks

Hyperfast AI: Rethinking Design for 1000 tokens/s

AI Tinkerers Raleigh, Dec 2025
  • Presented on hyperfast inference systems (Cerebras) and the fundamental shift in AI application design required when inference speed crosses critical thresholds.

Apple's On-Device VLM: The Future of Multimodal AI

Conference Talk, Sep 2025
  • Technical deep dive into the future of on-device multimodal AI, focusing on systems that understand images, video, and physical properties alongside text.

Professional Experience

ML Engineering Consultant / Technical Lead

2024 – Present
Remote

Providing high-level technical leadership and hands-on engineering for diverse clients, focusing on scalable AI infrastructure and Generative AI product development.

  • Custom Vision Language Model & Infrastructure (Matte Model Project): Architected a CPU-efficient human portrait matting system (MODNet-style/MobileNetV2) in TensorFlow 2/Keras, replacing legacy SDKs in an Electron + React application.
  • Designed end-to-end MLOps pipelines on GCP Vertex AI for custom training jobs, dataset management (P3M-10k/Adobe), and seamless ONNX export.
  • Implemented CPU-focused inference strategies utilizing ONNX Runtime, enabling real-time alpha matte generation and video compositing with minimal latency.
  • Generative AI & RAG Architecture: Designed and implemented real-time RAG-based inference pipelines to power dynamic, personalized product recommendation engines.
  • Migrated NVIDIA Riva/Triton microservices to AWS, re-architecting the deployment for enhanced scalability and high availability.
  • Performance Optimization & Edge AI: Optimized Computer Vision models for embedded NVIDIA Jetson platforms using TensorRT and ONNX, achieving significant reductions in inference latency.
  • Engineered a cloud-based Stable Diffusion video pipeline for real-time webcam re-rendering and accelerated thumbnail generation.
  • Strategic Leadership: Advise client executive teams on AI integration strategies, identifying opportunities to leverage LLMs and diffusion models for competitive advantage.

Lead Engineer, AI R&D

2023 – 2024
Vidable.ai | Remote

Led the R&D engineering team in evaluating and productizing cutting-edge Generative AI tools.

  • Team Leadership & Strategy: Collaborated with PhD researchers to translate academic findings into production-ready features. Led weekly cross-company technical forums to align engineering and product teams on the latest AI developments.
  • Infrastructure Design: Built robust CI/CD pipelines using Terraform, GitLab, and AWS to deploy ML inference servers, supporting a microservices architecture on Docker and Kubernetes.
  • Model Optimization: Modified C/C++ codebases (llama.cpp & Stable Diffusion Turbo) to suit specific business use cases, optimizing performance for Python-based API endpoints (FastAPI/Uvicorn).
  • Product Innovation: Prototyped and deployed React-based demos powered by LLMs and diffusion models, directly influencing the product roadmap and UX design.

Lead Engineer

2014 – 2023
Sonic Foundry | Remote

Progressed from Engineer to Lead Engineer, managing critical data pipelines and spearheading the transition to cloud-native technologies.

  • Scale & Reliability: Architected and supported data pipelines serving the company's five largest enterprise customers (100k+ end users).
  • Cloud Transformation: Designed and built an AWS cloud-native Archive utility (Lambda, Batch, Terraform) that was adopted as a core product, generating millions in revenue.
  • AI Leadership: Founded and led an internal AI/ML reading group (growing to ~25% of the company) and initiated company-wide AI hackathons to foster innovation.
  • LLM Implementation: Designed the inference architecture for a Zoom/conferencing plugin and developed pipelines to deploy GGML Llama models to AWS serverless infrastructure.

Technology Specialist

2006 – 2014
NC State University
  • Led AV/Room Design Engineering for 200+ learning spaces.
  • Developed novel embedded control systems and accessibility UIs; awarded the OIT Award for Excellence.

Recent Projects & Hackathons

Education & Certifications

NC State University | Electrical & Computer Engineering (Completed 75 Credit Hours)
Coursera Verified Certificates: Neural Networks for Machine Learning (Hinton), Image and Video Processing.