Prompt Evaluation (Bilingual) - Either Italian or German or French at Swanktek, Inc (Expired)

Job Title: Prompt Evaluation (Bilingual)

Job Location: Remote

Employment Type: Part Time - Ongoing Projects

Work Hours: 4-5 hours a day (20-25 hours weekly)

Language required: Italian, German, or French

There will be two online tests: an LLM assessment test and a Pearson test.

Job Summary

We are looking for linguistically and culturally aware professionals to support the evaluation and enhancement of multilingual prompt-response datasets for large language models (LLMs). This role involves rubric design, evaluation of translations and model outputs, prompt creation, and red teaming focused on identifying and surfacing cultural nuances and biases in LLM behavior.

Key Responsibilities

Rubric Definition & Prompt Evaluation
Update rubric definitions with region/language-specific examples to ensure cultural and linguistic relevance.
Identify the need for additional rubrics tailored to specific languages or regional contexts.
Review prompts translated from English into the target language and revise where translations appear unnatural or inaccurate.
Write thoughtful prompts that test the cultural awareness of LLMs.
Rate prompt-response pairs using a standardized, rubric-based evaluation template and provide detailed justifications to support the findings.
Document problematic outputs and annotate them with clear explanations of rubric violations or cultural insensitivities.

Required Qualifications

Native proficiency in the target language and deep familiarity with cultural norms in the corresponding region.
Experience in LLM evaluation, content moderation, or linguistic QA preferred.
Strong attention to detail with the ability to identify subtle issues in language use, tone, and cultural references.
Comfortable working in spreadsheets and evaluation templates.
Bachelor’s degree.
Prior experience with prompt engineering or LLM testing.
Familiarity with tools such as Gemini, ChatGPT or similar LLM platforms.
Ability to clearly articulate the reasoning behind rubric ratings or prompt edits.

What You Will Do: Categories & Sample Focus Areas

1. Language Comprehension

Grammar, tone detection, sentence correction, synonym selection

2. Prompt Engineering

Designing or evaluating prompts for LLMs to generate accurate, safe, and relevant outputs

3. Content Moderation

Identifying unsafe or biased content, applying filters, and flagging sensitive material

4. Summarization & Paraphrasing

Rewriting AI-generated text for clarity, conciseness, and neutrality

5. Bias & Factuality Checks

Spotting hallucinations, misinformation, or inappropriate responses from LLMs

6. Multiple Choice & Scenario-Based

Choosing best responses, interpreting ambiguous language, or applying judgment in edge cases

Job Type: Part-time

Pay: $18.00 - $20.00 per hour

