Jobiglo

AI Evaluation Engineer – Design Real‑World Benchmark Tasks

Gramian Consulting · Kenya

New · Contract · Remote · Mid-level · 🇬🇧 English
backend engineering, infrastructure, DevOps, data systems, MLOps, cybersecurity, platform engineering, terminal, CLI, automation, developer tooling, AI systems, LLMs, benchmarking, evaluation frameworks

Job description

About the role

Gramian Consultancy is seeking an AI Evaluation Engineer to design and implement realistic, terminal‑based benchmark tasks that assess how AI systems handle complex debugging, operational failures, and multi‑step problem‑solving scenarios. The role is fully remote and can be performed full‑time or part‑time.

Key responsibilities

  • Design realistic terminal‑based benchmark tasks for AI evaluation systems.
  • Create deep debugging and investigation scenarios that reflect production environments.
  • Develop specifications involving infrastructure, pipelines, and operational failures.
  • Write clear solution approaches and deterministic evaluation criteria.
  • Identify edge cases, failure modes, and system constraints.
  • Design multi‑step reasoning challenges across complex technical environments.
  • Collaborate with reviewers and researchers to refine benchmark quality and validation logic.

Required profile

  • 3 to 10 years of experience in software engineering or a related technical domain.
  • Strong analytical, debugging, and systems‑reasoning abilities.
  • Good understanding of system architecture, dependencies, and operational processes.
  • Experience with terminal, CLI, automation, or developer‑tooling workflows.
  • Exposure to AI systems, large language models, or evaluation frameworks is a plus.

Required skills

  • Backend engineering
  • Infrastructure
  • DevOps
  • Data systems
  • MLOps
  • Cybersecurity
  • Platform engineering
  • Terminal / CLI
  • Automation
  • Developer tooling
  • AI systems
  • Large language models (LLMs)
  • Benchmarking
  • Evaluation frameworks

Frequently asked questions

The recruiter has not publicly disclosed the salary. You can apply and negotiate directly with Gramian Consulting.
Click "Apply now" at the top of the page. You can upload your CV in one click; Jobiglo automatically extracts your information and applies on your behalf.
The position offered is a contract role based in Kenya.

Published 3 days ago

Expires 1 month from now

9 views · 0 applications
