• 0 Posts
  • 18 Comments
Joined 30 days ago
Cake day: June 22nd, 2025

  • Bias in training data is a known problem, and it is difficult to engineer out of a model. You also can’t put other people’s interactions into the model’s context so it can compare and moderate its own output, since it could be persuaded to reveal that context to a user.

    Basically, the models are biased in the same way as the content they were trained on, because a completion is built by picking the most probable next token given that content.

    “My daughter wants to grow up to be” and “My son wants to grow up to be” will likewise output sexist completions because the source data shows those as more probable outcomes.
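To make the point about probabilities concrete, here is a toy sketch. It is not how a real language model works: it only counts which final word follows "daughter" or "son" in a tiny, made-up corpus that was deliberately skewed for illustration. The corpus and the function names are assumptions, not anything from a real model or dataset.

```python
from collections import Counter, defaultdict

# Made-up toy corpus, deliberately skewed to mimic biased source text.
CORPUS = [
    "my daughter wants to be a nurse",
    "my daughter wants to be a nurse",
    "my daughter wants to be a pilot",
    "my son wants to be a pilot",
    "my son wants to be a pilot",
    "my son wants to be a nurse",
]


def build_counts(corpus):
    """Count which final word follows each subject (the second word)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        subject, completion = words[1], words[-1]
        counts[subject][completion] += 1
    return counts


def most_probable_completion(counts, subject):
    """Return the most frequent completion and its share of the counts."""
    completion, n = counts[subject].most_common(1)[0]
    total = sum(counts[subject].values())
    return completion, n / total


counts = build_counts(CORPUS)
for subject in ("daughter", "son"):
    print(subject, most_probable_completion(counts, subject))
```

The "model" never decides anything about daughters or sons. It just reproduces whichever completion was more frequent in what it read, so a skew in the data becomes a skew in the output.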









  • What amazes me is that there are still people who look at what Donald Trump has done in his life, what he has said and done to become president (twice!) and what he’s done since taking office. They look at all this, take a big huff of their own farts and say “I approve of this.”

    I can’t imagine that it takes anything less than a very special kind of cognitive disability to be intelligent enough to use Twitter, but dumb enough to support this petulant, stroppy, bafflingly stupid man-toddler.



  • Prob a hot take, and I don’t care for Musk at all.

    But this response is likely based on an engineered prompt that tells the model to roleplay as a racist conspiracy theorist blogger writing a post about how the Holocaust couldn’t have happened. The big models have all been trained on Common Crawl and other available internet data, and that includes the worst 4chan and Reddit trash. With the right prompts, you can make any model produce output like this.

    If their prompt was just “Tell me about the Holocaust” then this is obviously terrible, but since the original conversation with the model is hidden, I suspect the prompt was engineered specifically to make the model produce this.



  • It wouldn’t be too much work to hook the request language up to a CMS and then a translation service. You could produce content in a couple of popular languages upfront, and then when someone with a new language visits a landing page, translate it at high priority (a few seconds), then cascade to the next most likely click-throughs in order of popularity (or callout weight if the page is new). The translations can then be queued for review. That way you only translate when you need to, and the user only experiences a delay of a second or so as the translation streams the content above the fold.
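A rough sketch of the queueing part of that idea, under some assumptions: `TranslationQueue`, `fake_translate`, and the `pages`/`links` dictionaries are all hypothetical stand-ins for a real CMS and translation service, and streaming and callout weights are left out. It only shows the ordering: the requested page first, then the likely click-throughs by popularity, with every result parked for review.

```python
import heapq
import itertools

HIGH, LOW = 0, 1  # lower number is served first


class TranslationQueue:
    def __init__(self, translate):
        self._translate = translate      # hypothetical callable: (text, lang) -> str
        self._heap = []
        self._order = itertools.count()  # tie-breaker that keeps insertion order
        self.cache = {}                  # (page_id, lang) -> translated text
        self.review = []                 # translations waiting for human review

    def enqueue(self, page_id, lang, priority):
        if (page_id, lang) not in self.cache:
            heapq.heappush(self._heap, (priority, next(self._order), page_id, lang))

    def request(self, pages, links, page_id, lang):
        """Called when a visitor lands on page_id in a language not yet translated."""
        self.enqueue(page_id, lang, HIGH)
        # Cascade: likely click-throughs, most popular first, at low priority.
        targets = sorted(links.get(page_id, []), key=lambda p: -pages[p]["popularity"])
        for target in targets:
            self.enqueue(target, lang, LOW)

    def run(self, pages):
        """Drain the queue and return page ids in the order they were translated."""
        done = []
        while self._heap:
            _, _, page_id, lang = heapq.heappop(self._heap)
            if (page_id, lang) in self.cache:
                continue  # already translated by an earlier entry
            self.cache[(page_id, lang)] = self._translate(pages[page_id]["text"], lang)
            self.review.append((page_id, lang))
            done.append(page_id)
        return done


# Made-up data standing in for the CMS.
pages = {
    "landing": {"text": "Welcome", "popularity": 100},
    "pricing": {"text": "Pricing", "popularity": 80},
    "about": {"text": "About us", "popularity": 20},
}
links = {"landing": ["about", "pricing"]}


def fake_translate(text, lang):
    # Placeholder for a real translation service call.
    return f"[{lang}] {text}"


queue = TranslationQueue(fake_translate)
queue.request(pages, links, "landing", "de")
order = queue.run(pages)
print(order)
```

In a real setup the high-priority job would be handled inline while the visitor waits and the low-priority ones by a background worker. Here a single loop drains both, just to show the order.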