Jobiglo

AI Evaluation Engineer – Design Real‑World Benchmark Tasks

Gramian Consulting · Egypt

Contract · Remote · Mid-level · 🇬🇧 English

Job description

About the role

Gramian Consultancy seeks an AI Evaluation Engineer to create realistic, terminal‑based benchmark tasks that assess how large language models reason through debugging, operational failures, and complex multi‑step workflows. The role is fully remote and can be performed full‑time or part‑time over a five‑week contract.

Key responsibilities

  • Design technically deep debugging and investigation scenarios for AI evaluation systems.
  • Develop task specifications that involve infrastructure, pipelines, and operational failure modes.
  • Write clear solution approaches and deterministic evaluation criteria.
  • Identify realistic edge cases, failure modes, and system constraints.
  • Craft multi‑step reasoning challenges across complex technical environments.
  • Collaborate with reviewers and researchers to refine benchmark quality and validation logic.

Required profile

  • 3‑10 years of experience in software engineering or related technical domains.
  • Strong analytical, debugging, and systems‑reasoning abilities.
  • Good understanding of system architecture, dependencies, and operational processes.
  • Experience with terminal, CLI, automation, or developer‑tooling workflows.
  • Exposure to AI systems, LLMs, benchmarking, or evaluation frameworks is a plus.

Required skills

  • Backend engineering
  • Infrastructure
  • DevOps
  • Data systems
  • MLOps
  • Cybersecurity
  • Platform engineering
  • Terminal / CLI
  • Automation
  • Developer tooling
  • AI systems
  • Large language models (LLMs)
  • Benchmarking
  • Evaluation frameworks

Frequently asked questions

The recruiter has not publicly disclosed the salary. You can apply and negotiate directly with Gramian Consulting.
Click "Apply now" at the top of the page. You can upload your CV in one click; Jobiglo automatically extracts your information and applies for you.
The position offered is a contract role based in Egypt.

Published 7 hours ago

Expires 1 month from now

6 views · 0 applications
