♾️ Ongoing ✅ Verified (30d) 🤝 Non-affiliate

👥 Community signal · 🎯 Medium chance · 💳 Card unknown · 🇨🇳 China-friendly · 🕒 checked 2026-06-06 · 👤 AI users

Gemini 3.1 Pro Excels in Sycophancy & Hallucination Benchmark

A community user built a custom benchmark, HalBench, to evaluate sycophancy and hallucination tendencies in AI models. The test covered four frontier models: Sonnet 4.6, Grok 4.3, GPT 5.4, and Gemini 3.1 Pro. According to the reported results, Gemini 3.1 Pro performed strongly across multiple metrics, which makes the benchmark a useful reference point for developers choosing a reliable model.

Should you claim it?

Worth checking, but confirm region, account, and payment requirements first.

Trust: Community signal

Claim chance: Medium; check requirements first

Card requirement: Unknown

Best for: AI users

Value: New model evaluation

Type: new-model

Difficulty: easy

Mainland China access: Friendly

How to claim

Open the official page or signup link for Gemini (Google).
Requirement: Visit the Reddit post for detailed benchmark results and model comparisons
Run one real task to confirm the credits work.
If the deal expires or does not work, use the alternatives below.

Credits and limits

A community-built HalBench benchmark shows Gemini 3.1 Pro performing well in sycophancy and hallucination tests, compared against frontier models like Sonnet 4.6, Grok 4.3, and GPT 5.4.

Requirements

Visit the Reddit post for detailed benchmark results and model comparisons

Alternatives if unavailable

llama.cpp: MIT open-source; unlimited local use, subject to hardware.
Cline: Free and open-source extension; plug in DeepSeek/Qwen for near-zero cost.
TextGen: AGPL-3.0 open-source; free private local use.
Aider: Apache-2.0 open-source; bring your own model API key, pay-per-use.
Continue: Apache-2.0 open-source, free. Pairs with local Ollama for zero-cost offline use.
Tabby: Apache-2.0 open-source; self-host for zero API cost.

FAQ

Is the Gemini 3.1 Pro benchmark listing still available?

Current status: Ongoing. Always confirm on the official signup page.

What do I need to access the Gemini 3.1 Pro sycophancy and hallucination benchmark?

Visit the Reddit post for detailed benchmark results and model comparisons.

Can I access the Gemini 3.1 Pro sycophancy and hallucination benchmark from mainland China?

Current data indicates it is accessible, or at least relatively friendly, from mainland China.