# Grayhat Career: Software Engineer Intern (AI Engineer)

> Route-specific LLM context for the public career detail page.

## Page

- [Career Detail](https://grayhat-company-site.grayhatstudio.workers.dev/careers/HR-OPN-2026-0004)
- [Careers Index](https://grayhat-company-site.grayhatstudio.workers.dev/careers)
- [Careers LLM Context](https://grayhat-company-site.grayhatstudio.workers.dev/careers/llms.txt)
- [Career Full LLM Context](https://grayhat-company-site.grayhatstudio.workers.dev/careers/HR-OPN-2026-0004/llms-full.txt)
- [Career API](https://grayhat-company-site.grayhatstudio.workers.dev/api/public/v1/jobs/HR-OPN-2026-0004)
- [Jobs API](https://grayhat-company-site.grayhatstudio.workers.dev/api/public/v1/jobs)

## Role Snapshot

- Job ID: `HR-OPN-2026-0004`
- Status: Open
- Department: Engineering
- Company: Grayhat
- Planned vacancies: 0
- Remaining vacancies based on current snapshot: 0
- Created at: 2026-05-26 12:40:25.744689

## Page Content

- Engineering @ Grayhat

Think you can design, prototype, and ship games and digital products alongside designers and engineers in tight 2–3 week cycles without slowing down? The stepping stone for 80% of engineers at Grayhat.

## **The Role**

We're building an AI infrastructure from the ground up, and this internship is a founding piece of that. You'll work directly on our self-hosted LLM infrastructure, solving real problems around model performance, usability, and adoption. If you're excited about the nuts and bolts of running LLMs in production (not just prompting them), this is your role.

## **What You'll Do**

- Dig into bottlenecks in our self-hosted LLM stack around latency, throughput, and hardware utilization.
- Build internal tools and interfaces that make our AI stack easier for non-technical team members to actually use.
- Experiment with model optimization techniques like quantization, pruning, and batching strategies, all within our hardware constraints.
- Identify gaps in how our tools are being used and build lightweight integrations or wrappers to close them.
- Keep clear engineering notes so your work is reproducible and the next person can pick up where you left off.
- Talk to the broader engineering team, understand their AI needs, and build toward them.

## **What We're Looking For**

- Solid Python skills and comfort working in a Linux/server environment.
- A real understanding of how transformer-based LLMs work, not just how to prompt them.
- Some exposure to LLM serving frameworks like Ollama, vLLM, llama.cpp, or TGI.
- Genuine interest in model optimization. You know what quantization means and you're not afraid to get into it.
- A builder's mindset. You don't wait for perfect tooling; you make do and move forward.
- Second-year, third-year, or final-year student (CS, AI, Data Science, or related), or a fresh graduate.
- We prefer onsite candidates for this role.

## **Nice to Have**

- Experience with on-prem or edge AI deployments, not just cloud APIs.
- Familiarity with FastAPI or similar frameworks for wrapping model endpoints.
- Prior work with ONNX, llama.cpp, or hardware-specific inference optimization.
- Some understanding of RAG pipelines or agent frameworks like LangChain or LangGraph.

## **What You'll Get**

- A monthly stipend. This is a paid internship.
- Hands-on experience running and optimizing production LLM infrastructure. This is not a "call the API and move on" role.
- The chance to directly shape how an entire studio uses AI tooling.
- Mentorship from engineers who've built across the full product stack.
- Work that genuinely stands out in a portfolio. Self-hosted AI infrastructure experience is rare at this level.
- A potential full-time offer if you stand out.

## Jobs API Filters

- `q`
- `status`
- `department`
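
As a rough sketch, the filters above could be passed as query-string parameters to the Jobs API URL listed under Page. This assumes each filter name maps directly to a query parameter and that the endpoint returns JSON; neither the accepted values nor the response schema are documented here, so the helper names `build_jobs_url` and `fetch_jobs` and the example values are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Jobs API URL as listed in the Page section of this document.
JOBS_API = "https://grayhat-company-site.grayhatstudio.workers.dev/api/public/v1/jobs"


def build_jobs_url(q=None, status=None, department=None):
    """Build a Jobs API URL, assuming each filter is a plain query parameter."""
    filters = {"q": q, "status": status, "department": department}
    # Drop filters that were not provided so the query string stays minimal.
    params = {name: value for name, value in filters.items() if value is not None}
    return f"{JOBS_API}?{urlencode(params)}" if params else JOBS_API


def fetch_jobs(**filters):
    """Fetch and decode the response (assumption: the endpoint returns JSON)."""
    with urlopen(build_jobs_url(**filters), timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Example values are guesses taken from the Role Snapshot
    # (Status: Open, Department: Engineering); accepted values may differ.
    print(build_jobs_url(q="AI Engineer", status="Open", department="Engineering"))
```

Building the URL separately from the network call keeps the filter handling easy to check without contacting the API.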