Sumit Pokharel

Software engineer and independent machine learning researcher, driven by a desire to contribute to breakthroughs in AI research and ML engineering and to make the world a better place.

Professional Experience

Rakuten Group, Inc., Software Developer - Frontend
Tokyo, Japan
  • Developing performance-optimized checkout interfaces for 7 critical Ichiba pages in an upcoming revamp, implementing modern accessibility standards for the 50M+ user base.
  • Architected and built 20+ reusable React components for the new checkout system, creating a unified component library that will power 200K+ daily transactions post-launch.
  • Architected and implemented the complete contactless delivery (okihai) system over 2 months, engineering 7 purpose-built components to optimize no-contact delivery workflows on the order confirmation page.
Best Path Research, Machine Learning Intern
Tokyo, Japan
  • Delivered a Python script that streamlines converting digital text into handwritten images and matching hex codes to retrieve images from a directory of 1+ million handwritten character images.
  • Advanced the development of an application that corrects distortions in Japanese receipt images by synthesizing a dataset of artificially altered images and training an open-source model to rectify them.

Projects

  • Implemented a complete sequence-to-sequence neural network from first principles, faithfully reproducing the seminal Sutskever et al. (2014) architecture for machine translation.
  • Built with an LSTM encoder-decoder, beam search decoding, and CUDA acceleration; includes pre-trained weights from my own custom training pipeline.
  • Ongoing project replicating foundational ML research papers through complete from-scratch implementations to master transformer architectures and modern LLM designs.
  • Implemented the original Transformer, GPT-2, LLaMA-2/3, and Mistral 7B with faithful attention mechanisms and positional encodings, continuously expanding to new architectures.
  • Built a reverse-mode automatic differentiation engine from scratch using NumPy, replicating the core computational graph and gradient calculation mechanisms that power PyTorch.
  • Implemented dynamic computation graphs, backpropagation algorithms, and gradient accumulation to support fully connected neural network training.
  • Developed a command-line AI assistant using Cohere's Command-A model that provides intelligent code assistance directly in the terminal with natural language interaction.
  • Built with streaming response handling, advanced markdown rendering, and context-aware workspace analysis that understands project structure and file relationships.

Technical Skills

Programming Languages: Python, TypeScript, JavaScript, HTML, CSS

Architectures & Concepts: Transformer, LLM Architectures (LLaMA, Mistral, GPT-2, gpt-oss, nanoVLM, etc.), Seq2Seq (LSTM), GRPO, Reverse-Mode Autograd, Tinygrad

Frameworks & Tools: PyTorch, NumPy, React.js, Next.js, Git, Vim

Languages

English (fluent) · Nepali (native) · Japanese (proficient, JLPT N2 certified)

Education

Ritsumeikan Asia Pacific University, Bachelor's Degree in Business Administration
Beppu, Japan

CGPA 3.65