Job Description
We are at the forefront of the AI revolution, building the intelligent systems that will define the year 2026 and beyond. Nexus Future Labs is seeking a visionary Senior Generative AI Engineer to join our elite engineering team. You will be responsible for designing, training, and deploying state-of-the-art Large Language Models (LLMs) and multimodal AI systems that solve complex real-world problems.
In this role, you will work closely with product leaders and researchers to translate cutting-edge academic research into scalable, production-ready software. If you are passionate about the future of AI, have a deep understanding of machine learning architectures, and want to leave a lasting impact on how humans interact with machines, we want to hear from you.
Responsibilities
- Design, implement, and optimize Generative AI models (LLMs, GANs, Diffusion models) for high-scale production environments.
- Develop and fine-tune foundation models using PyTorch and TensorFlow to improve accuracy, latency, and cost-efficiency.
- Build Retrieval-Augmented Generation (RAG) pipelines backed by vector databases to ground model responses in relevant context and reduce hallucinations.
- Collaborate with cross-functional teams to integrate AI capabilities into consumer-facing products and enterprise solutions.
- Establish best practices for model monitoring, evaluation, and ethical AI deployment.
- Research and prototype novel architectures to stay ahead of industry trends.
Qualifications
- Master's or PhD in Computer Science, Machine Learning, or a related field (or equivalent practical experience).
- 5+ years of professional experience in software engineering with a strong focus on AI/ML.
- Deep proficiency in Python and major ML frameworks (PyTorch, TensorFlow, JAX).
- Extensive experience with NLP libraries (Hugging Face, NLTK, spaCy) and LLM APIs (OpenAI, Anthropic, Cohere).
- Strong understanding of distributed systems, cloud architecture (AWS/Azure/GCP), and containerization (Docker/Kubernetes).
- Proven track record of deploying models that serve high-concurrency workloads at low latency.