Sumit Pokharel

Software Engineer and independent Machine Learning Researcher, driven to contribute to breakthroughs in AI research and ML engineering that make the world a better place.

Professional Experience

Rakuten Group, Inc., Software Developer - Frontend
Tokyo, Japan
  • Revamping the UI across 7 critical checkout pages for Ichiba, Japan's largest e-commerce platform with 50M+ monthly users.
  • Architected and delivered 10+ high-performance React components supporting 200K+ daily transactions across Ichiba's checkout flow.
  • Engineered, over a 4-month project, an API service that generates AI-powered product descriptions from images and merchant inputs, with regeneration and translation features, designed to streamline merchant workflows across the e-commerce platform.
Best Path Research, Machine Learning Intern
Tokyo, Japan
  • Delivered a Python script that streamlines converting digital text into handwritten images, matching hex codes to retrieve images from a directory of 1M+ handwritten character images.
  • Advanced the development of an application that corrects distortions in Japanese receipt images by synthesizing a dataset of artificially distorted images and training an open-source model to rectify them.

Projects

  • An ongoing project where I learn deep learning concepts and LLM architectures by reimplementing papers and architectures from scratch.
  • Completed so far: the original Transformer paper, GPT-2, LLaMA-2, LLaMA-3, Mistral 7B.
  • A simple reverse-mode automatic differentiation engine, modeled on the one used in PyTorch, that I built with NumPy to understand the core principles.
  • Features: automatic differentiation via backward passes, common arithmetic operations, gradient computation and broadcasting, and a fully connected neural network layer.
  • A terminal-based AI assistant powered by Cohere's Command-A model that I was inspired to build after using Anthropic's Claude Code.
  • Features: interactive terminal interface, live streaming responses, advanced markdown support, workspace awareness (understands directory structure and available files), and more.
  • A full-stack chat application that I built with Next.js, FastAPI, and the Gemini 2.0 Flash model.
  • Features a modern UI, conversation history, and customizable AI settings.

Technical Skills

Languages: Python, JavaScript, TypeScript

Frameworks/Libraries: PyTorch, NumPy, Tinygrad, React.js, Next.js

Other: LLM Architectures, Git, Vim

Languages

English (fluent) · Nepali (native) · Japanese (proficient, JLPT N2 certified)

Education

Ritsumeikan Asia Pacific University, Bachelor's Degree in Business Administration
Beppu, Japan

CGPA 3.65