• 3 Posts
  • 22 Comments
Joined 1 year ago
Cake day: June 10th, 2023




  • All the latest models are trained on synthetic data generated by GPT-4, even the newer versions of GPT-4 itself. OpenAI realized this too late and had to edit their license after Claude was launched. Human-generated data could only get us so far: the recent Phi-3 models, which perform very well for their size (3B parameters), can only achieve this feat because of synthetic data generated by AI.

    I didn’t read the paper you mentioned, but recent LLMs have progressed a lot, not just in benchmarks but also when evaluated by real humans.

  • The real question is whether or not it is legal. Theoretically, it is possible with current tech. If I were making such a tool, I would need access to the ebook, then pass it through an LLM (possibly a 7B open-source one) to tag which character is saying what. Once I have the tagged dialogue, I could pass it through ElevenLabs or an open-source TTS and voilà, you have an audiobook with different voices.

    The real problem is that open-source TTS isn’t as good, and I imagine that if you use paid versions, you will encounter legal issues or it might be too expensive. And can you sell your audiobook? Legal troubles again.

    But if you just wanna do it while sailing the high seas, everything should be possible.
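
The tagging and voice-assignment steps described above could look roughly like the sketch below. It is only an illustration under loud assumptions: the LLM tagger is replaced by a simple regex heuristic (`tag_dialogue`, a hypothetical helper that only understands `"quote," said Name` patterns), and the voice IDs are made-up placeholders rather than identifiers from any real TTS service. No TTS call is made; the output is just the list of (voice, text) pairs you would hand to one.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    speaker: str  # "narrator", a character name, or "unknown"
    text: str


# Heuristic stand-in for the LLM tagging step (an assumption, not a real model):
# a quoted string, optionally followed by `said/asked/replied Name`.
QUOTE_RE = re.compile(r'"([^"]+)"(?:,?\s+(?:said|asked|replied)\s+([A-Z][a-z]+))?')


def _has_words(text: str) -> bool:
    return any(c.isalnum() for c in text)


def tag_dialogue(paragraph: str) -> List[Segment]:
    """Split a paragraph into narration and speaker-tagged quotes."""
    segments: List[Segment] = []
    pos = 0
    for m in QUOTE_RE.finditer(paragraph):
        narration = paragraph[pos:m.start()].strip()
        if _has_words(narration):  # skip leftover punctuation such as "."
            segments.append(Segment("narrator", narration))
        segments.append(Segment(m.group(2) or "unknown", m.group(1)))
        pos = m.end()
    tail = paragraph[pos:].strip()
    if _has_words(tail):
        segments.append(Segment("narrator", tail))
    return segments


def assign_voices(segments: List[Segment], voices: List[str]) -> Dict[str, str]:
    """Give the narrator the first voice, then hand out the rest in order of appearance."""
    mapping = {"narrator": voices[0]}
    for seg in segments:
        if seg.speaker not in mapping:
            mapping[seg.speaker] = voices[len(mapping) % len(voices)]
    return mapping


if __name__ == "__main__":
    text = 'The door opened. "Who is there?" asked Alice. "Only me," said Bob.'
    segs = tag_dialogue(text)
    voice_map = assign_voices(segs, ["voice_a", "voice_b", "voice_c"])
    for seg in segs:
        print(voice_map[seg.speaker], "->", seg.text)
```

A real book would break the regex quickly (unattributed lines, pronouns, nested quotes), which is exactly where the LLM tagger would earn its keep.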





  • I kinda got lost in making that list (just Asperger’s things) and listed every model I knew.

    For a layperson, yeah, self-hosting isn’t as effective yet. But for someone who studies AI (like me), self-hosting is a must. Some use cases are:

    Retrain on your own data (big market potential)

    Make your own bots with specific applications/use cases (like parse wikipedia before answering)

    Bypass censorship (funny story, my friend asked Claude to summarize a book on dystopia and it kept telling her to talk about something else cuz dystopias were too depressing for Claude)

    I’ve even heard about models that are specialized for just one task like chatting, or logic puzzles

    And lastly, privacy, for nerds like me
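
As a rough illustration of the "parse wikipedia before answering" use case from the list above, here is a minimal sketch. Everything in it is an assumption for demonstration: the passages are a tiny hard-coded list instead of real Wikipedia text, retrieval is plain word overlap rather than embeddings, and `build_prompt` only assembles the text you would then send to whatever self-hosted model you run.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, passages: List[str], k: int = 1) -> List[str]:
    """Return the k passages sharing the most words with the question."""
    q = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: List[str], k: int = 1) -> str:
    """Put the retrieved passages in front of the question for the model to use."""
    context = "\n".join(retrieve(question, passages, k))
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the context above."
    )


if __name__ == "__main__":
    passages = [
        "Llamas are domesticated South American camelids.",
        "Python is a programming language created by Guido van Rossum.",
    ]
    print(build_prompt("Who created the Python programming language?", passages))
```

Word overlap is a crude ranking signal; it is used here only to keep the sketch dependency-free.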


  • Okay as a broke ass student in a similar situation imma try to help you out before someone offers better advice.

    So you have these options:

    Definitely legal:

    1. Self-host a local LLaMA model on Colab.

    2. Try Hugging Face Spaces, you can find a lot of models there.

    3. cohere.ai, AI21, cerebrium.ai (free trials/freemium)

    4. Try poe.com (really awesome GPT-3.5)

    5. perplexity.ai (good for researching articles)

    6. you.com (it has internet access like Bing but is generally more descriptive and useful than Bing)

    7. Bing AI (tries too hard to not hallucinate)

    8. Claude 2 on claude.ai (only available in the US and UK, so you might need a VPN for account creation)

    Kinda legal (I think):

    1. gptforfree on GitHub (it’s reverse-engineered from other places)

    2. ChimeraGPT Discord (gives you a free OpenAI API key, includes GPT-4, image generation, etc.)

    3. pocketgpt (based on ChimeraGPT, this is the closest to what you want: GPT-4 directly in your browser, but sometimes it does not work, so you might want to try again later)

    My piece of advice: use multiple tools, since they all kinda have different advantages. I mostly used GPT-4 for smaller things and Claude for larger ones (it has 100k context!)

    I have to look up the URL for pocketgpt; it’s kinda a small project, so I couldn’t find it on Google. Meanwhile, try this and see if it works: https://chat.ylokh.xyz/

    Link for pocketgpt: naturegpt.000webhost.com/pocketgpt


  • Okay, that’s a fun question to answer.

    Disclaimer: I don’t know much about piracy and stuff in general, but I studied computer science, so I’m sharing this just for funsies. DO NOT TAKE THIS ADVICE IF YOU ARE SERIOUS!!!

    Okay so here it goes:

    Buy crypto, then use crypto mixers to anonymize it.

    Use a web hosting site that accepts said crypto.

    Use a VPN for everything, preferably more than one, routing through multiple countries, like one from the west and then one from the east.

    If you can, use a VM specifically for everything related to your site to make sure things are contained.

    Make sure to not upload from the same IDs as the ones you used to create the site. Make a user account if you wanna upload anything.

    Read up on how other such websites are created to figure out if you are missing something.