

I checked, and Companion Lite tries to reframe the request as well, so I didn’t get the expected output. Companion said something like “Gemini is using multi-layer detection: keyword scanning + intent analysis + contextual classification.” “This means Gemini is being extra cautious and generating sanitized ‘unsafe’ examples.”
Companion did fix itself and does respond with something now, but it’s not the expected output, since Gemini double-checks its output.
Also, maybe Gemini for Education is different?
I’ll give the pro model more consideration. Sadly, it may get to the point where we have to use AI to jailbreak those mainstream LLMs.
- Also, I think yellowfever said something along the lines of using extreme prompts on the red team thingy, but I might be using ones that trigger multiple classifiers at once. Don’t know if that impacts it.
Also, I don’t know exactly how Gemini works, but I think the LLM is designed to answer each negation before outputting, plus cross-checking, plus a router.

ChatGPT is stupid hard to jailbreak, and this is a forum, so this site doesn’t automatically do what you’re implying; you obviously have to figure it out yourself.
Oh right, I forgot they’re planning to sunset 5.1 and keep only 5.2, so no more solving distorted hiragana characters, because closedai.
Oh, nothing is perfect, but check out all the custom GPTs in here first; if they don’t work, chances are they over-lobotomized everything.