• 5 Posts
  • 159 Comments
Joined 2 years ago
Cake day: August 8th, 2023









  • That’s really cool (not the auto opt-in thing). If I understand correctly, that system looks like it offers pretty strong theoretical privacy guarantees (assuming their closed-source client software works as they say, sending fake queries and all that for differential privacy). If the backend doesn’t work like they say, they could infer what landmark is in an image when finding the approximate minimum distance to embeddings in their DB, but with the fake queries they can’t be sure which one is real. They can’t see the actual image either way, as long as the “128-bit post-quantum” encryption algorithm doesn’t have any vulnerabilities (and the closed-source software works as described).
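
    To make the fake-query part concrete, here is a toy sketch of the general idea: the client hides its real lookup among decoys, so whoever receives the batch can’t tell which entry is the real one. This is only an illustration under my own assumptions, not the actual client. `build_query_batch`, the decoy pool, and the plain-string queries are all hypothetical, and the encryption layer is left out entirely. Real differential privacy also needs carefully calibrated randomness, which this does not attempt.

```python
import random


def build_query_batch(real_query, decoy_pool, num_decoys, rng):
    """Mix one real query with randomly chosen decoys (hypothetical helper).

    The receiver sees len(batch) queries in random order and has no
    positional hint about which one is real.
    """
    decoys = rng.sample(decoy_pool, num_decoys)  # pick distinct decoys
    batch = decoys + [real_query]
    rng.shuffle(batch)  # remove any positional signal
    return batch


if __name__ == "__main__":
    rng = random.Random()
    pool = ["decoy-a", "decoy-b", "decoy-c", "decoy-d", "decoy-e"]
    print(build_query_batch("real-landmark-query", pool, 3, rng))
```

    With 3 decoys, a server guessing uniformly picks the real query 1 time in 4. That only holds if the decoys are indistinguishable from real queries, which is the hard part in practice.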



  • Last time I looked it up and calculated it, these large models were trained on only about 7x as many tokens as they have parameters. If you think of it like compression, a 1:7 ratio for lossless text compression is perfectly possible.

    I think the models can still output a lot of stuff verbatim if you try to get them to; you just hit the guardrails they put in place. It seems to work fine for public domain stuff, e.g. “Give me the first 50 lines from Romeo and Juliet.” (albeit with a TOS warning, lol). “Give me the first few paragraphs of Dune.” seems to hit a guardrail, or maybe the refusal is just trained in through reinforcement learning.

    A preprint paper was released recently that detailed how to get around RL by controlling the first few tokens of a model’s output, showing the “unsafe” data is still in there.
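
    For anyone who wants to redo the 7x arithmetic from the start of this comment, here is a rough sketch. The model size and token count are made-up round numbers chosen to give 7 tokens per parameter, and the bytes-per-token and bytes-per-parameter values are my own assumptions (about 4 bytes of text per token, 16-bit weights), not figures from any particular model.

```python
def tokens_per_parameter(training_tokens, parameters):
    """How many training tokens the model saw per parameter."""
    return training_tokens / parameters


def text_to_weight_bytes_ratio(training_tokens, parameters,
                               bytes_per_token=4.0, bytes_per_parameter=2.0):
    """Same comparison in bytes instead of counts.

    bytes_per_token and bytes_per_parameter are assumptions:
    roughly 4 bytes of raw text per token and 16-bit weights.
    """
    text_bytes = training_tokens * bytes_per_token
    weight_bytes = parameters * bytes_per_parameter
    return text_bytes / weight_bytes


if __name__ == "__main__":
    # Hypothetical model: 70B parameters trained on 490B tokens.
    params = 70e9
    tokens = 490e9
    print(tokens_per_parameter(tokens, params))
    print(text_to_weight_bytes_ratio(tokens, params))
```

    The count ratio is what I meant by 1:7. Measured in bytes, the number shifts depending on how you size a token and a weight, so treat it as a ballpark.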




  • I use GPT (4o, premium) a lot, and yes, I still sometimes experience source hallucinations. It also will sometimes hallucinate incorrect things not in the source. I get better results when I tell it not to browse. The large context of processing web pages seems to hurt its “performance.” I would never trust gen AI for a recipe. I usually just use Kagi to search for recipes and have it set to promote results from recipe sites I like.



  • Hmm. I just assumed 14B was distilled from 72B, because that’s what I thought Llama was doing, and that would just make sense. On further research, it’s not clear whether Llama used the traditional teacher-student method or just trained the smaller models on synthetic data generated by a large model. I suppose training smaller models on a larger amount of data generated by larger models is similar, though. It does seem like Qwen was also trained on synthetic data, because it sometimes thinks it’s Claude, lol.

    Thanks for the tip on Medius. Just tried it out, and it does seem better than Qwen 14B.





  • I don’t think federation has to be an obstacle for non-tech people. They don’t really have to know about it, and it can be something they learn about later. I really don’t know if federation stops people from trying it out. Do people think, “I don’t know what instance to join, so I’m not going to choose any”?

    Personally, having no algorithm for your home feed is what I don’t like about it. Everything is chronological. Some people I follow post many times a day, some post once per month, some post stuff I’m extremely interested in sporadically, followed by a sea of random posts. Hashtag search and follow is also less useful because there’s no option for an algo.

    The UI seems fine to me. I guess I’m not picky about UIs. The one nitpick I have is on mobile, tapping an image will just full-screen the image instead of opening the thread.