Bellingcat
Larger models train faster (need less compute), for reasons not fully understood. These large models can then be used as teachers to train smaller models more efficiently. I’ve used Qwen 14B (14 billion parameters, quantized to 6-bit integers), and it’s not too much worse than these very large models.
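The teacher–student idea can be sketched in a few lines. This is a minimal illustration of soft-label distillation in plain Python (no ML framework), assuming the usual recipe: the student is trained against the teacher's temperature-softened output distribution rather than just hard labels. The function names and toy logits are mine, not from any particular library.

```python
import math

def softmax(logits, temperature=1.0):
    # Scale logits by temperature, then normalize to probabilities.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # Cross-entropy of the student against the teacher's softened
    # distribution: the student learns from the teacher's full output
    # distribution, not just a one-hot "correct" answer.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

# Toy example: a 4-class output where teacher and student mostly agree.
teacher = [4.0, 1.0, 0.5, 0.2]
student = [3.5, 1.2, 0.4, 0.1]
loss = distillation_loss(teacher, student)
```

The loss is minimized when the student's distribution exactly matches the teacher's, which is why training a small model this way transfers more signal per example than hard labels alone.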
Lately, I’ve been thinking of LLMs as lossy text/idea compression with content-addressable memory. And 10.5GB is pretty good compression for all the “knowledge” they seem to retain.
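For what it's worth, the 10.5GB figure falls straight out of the parameter count and quantization level:

```python
params = 14e9          # Qwen 14B: ~14 billion parameters
bits_per_param = 6     # quantized to 6 bits per parameter
size_gb = params * bits_per_param / 8 / 1e9  # bits -> bytes -> GB
# 14e9 params * 6 bits = 84e9 bits = 10.5e9 bytes = 10.5 GB
```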
Do you remember when it was commonly advised to use fake names and birthdays on online forms, and when “spyware” was a term?
I don’t think federation has to be an obstacle for non-tech people. They don’t really have to know about it, and it can be something they learn about later. I really don’t know if federation stops people from trying it out. Do people actually think, “I don’t know what instance to join, so I’m not going to choose any”?
Personally, having no algorithm for your home feed is what I don’t like about it. Everything is chronological. Some people I follow post many times a day, some post once per month, some post stuff I’m extremely interested in sporadically, followed by a sea of random posts. Hashtag search and follow is also less useful because there’s no option for an algo.
The UI seems fine to me. I guess I’m not picky about UIs. The one nitpick I have is on mobile, tapping an image will just full-screen the image instead of opening the thread.
Meh, startups and businesses are capitalist organizations, and I think the idea of patents is questionable outside capitalism, so these wouldn’t really be good metrics. I’d guess the richest countries “innovate” the most because they can support more risky endeavors. The U.S. is the capitalist imperial core, so it probably innovates the most. Other capitalist nations, like Haiti, probably not so much.
The best measure of innovation would probably be something like scientific publications. China wins by raw numbers, Vatican City wins per-capita (???).
I use LLMs for multiple things, and they’re most useful for things that are easy to validate, e.g., when you’re trying to find or learn about something but don’t know the right terminology or keywords to put into a search engine.

I also use them for some coding tasks. They work OK for getting customized usage examples for libraries, languages, and frameworks you may not be familiar with (but will sometimes use old APIs or just hallucinate APIs that don’t exist). They work OK for “translation” tasks, such as converting a MySQL query to a Postgres query. I tried out GitHub Copilot for a while, but found that it would sometimes introduce subtle bugs that I would initially overlook, so I don’t use it anymore.

I’ve had to create some graphics, and am not at all an artist, but was able to use AUTOMATIC1111, ControlNet, Stable Diffusion, and Gimp to get usable results (an artist would obviously be much better though). RemBG works pretty well for isolating the subject of an image and removing the background too. Image upsampling, DLSS, DTS Neural:X, plant identification apps, the blind-spot warnings in my car, image stabilization, and stuff like that are pretty useful too.
It’s neither okay nor sustainable
Source?
You realize mass deportations would decimate the economy? Some cities are 10% undocumented immigrants; Florida is 5% undocumented immigrants. Undocumented immigrants are a significant part of the U.S. economy and culture.
It would also be a horrific endeavor. Police going door-to-door demanding documentation. Probably social surveillance similar to Nazi Germany (along with all the false accusations). Four million U.S.-citizen children would have their parents hauled away. There would need to be concentration camps to hold all those people before transport (if they would actually get around to doing that).
“Law breakers” isn’t a very good argument. Everybody breaks the law (speeding, jaywalking, etc.). The system is currently working as intended, encouraging people to break the law to acquire an easily exploitable workforce. Incidentally, undocumented immigrants commit far less crime than citizens.
I have. My bank did a chargeback like they would if it was a credit card. I was told it would’ve been a lot harder to get my money back if my PIN had been used. But I’ve only seen that option available for in-person purchases.
Yann LeCun would probably be a better source. He does actual research (unlike Altman), and I’ve never seen him over-hype or fear monger (unlike Altman).
Production AI is highly tuned by training data selection and human feedback. Every model has its own style that many people helped tune. In the open model world there are thousands of different models targeting various styles. Waifu Diffusion and GPT-4chan, for example.
I think you have your janitor example backwards. Spending my time revolutionizing energy production sounds much more enjoyable than sweeping floors. Same with designing an effective floor-sweeping robot.
AI are people, my friend. /s
But, really, I think people should be able to run algorithms on whatever data they want. It’s whether the output is sufficiently different or “transformative” that matters (along with other laws, like those covering use of people’s likenesses). Otherwise, I think the laws will get complex and nonsensical once you start adding special cases for “AI.” And I’d bet if new laws are written, they’d be written by lobbyists to further erode the threat of competition (from free software, for instance).
The search engine LLMs suck. I’m guessing they use very small models to save compute. ChatGPT 4o and Claude 3.5 are much better.
Donation, patronage, gift economy, mutual aid, or whatever you want to call it is fine by me. People can pirate a lot of proprietary software as well, yet people still pay.
Yet, people still pay for it.
The problem is that HP writes drivers and software for those things for Windows, but not for Linux, so Linux depends on random people to write software for those things for free (which often involves complex reverse-engineering). With Linux you need to make sure you use widely-used hardware that someone has already written support for (this is mostly applicable to laptops and peripherals, which often use custom non-standard hardware). There may be a way to fix your problems, but you’ll have to search forums or issue trackers for the solutions, and they’re probably pretty involved to get working correctly. The router crashing thing is probably just a coincidence though, or the laptop is using a feature that’s broken on your router.
I’ve heard high velocity rounds (such as rifle rounds) send a kind of shockwave through your body. Dunno if it’s true or not.
In the Texas counties I’m most familiar with, if you’re arrested and they don’t have a good case, they just keep resetting court dates for years instead of going ahead with the process. If you can’t afford a bond, you’ll be in jail that whole time (which pressures people to take plea deals), if you can secure a bond, you’re out, but with limited rights and a whole lot of hassles to deal with.
I thought the tuning procedures, such as RLHF, kind of mess up the probabilities, so you can’t really tell how confident the model is in the output (and I’m not sure how accurate these probabilities were in the first place)?
Also, it seems, at a certain point, the more context the models are given, the less accurate the output. A few times, I asked ChatGPT something, and it used its browsing functionality to look it up, and it was still wrong even though the sources were correct. But, when I disabled “browsing” so it would just use its internal model, it was correct.
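To illustrate the probability point: some APIs expose per-token log-probabilities, and a common (but naive) confidence proxy is the average per-token log-probability exponentiated back into a probability-like score. The function name and logprob values below are hypothetical, just a sketch of the idea:

```python
import math

def sequence_confidence(token_logprobs):
    # Average per-token log-probability, exponentiated back to a
    # probability-like score in (0, 1].
    avg = sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg)

# Hypothetical logprobs for a short completion.
logprobs = [-0.1, -0.05, -1.2, -0.3]
score = sequence_confidence(logprobs)
# A high score only means the model found its own tokens likely. After
# RLHF-style tuning these probabilities are often miscalibrated, so the
# score isn't a reliable measure of factual confidence.
```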
It doesn’t seem there are too many expert services tied to ChatGPT (I’m just using this as an example, because that’s the one I use). There’s obviously some kind of guardrail system for “safety,” there’s a search/browsing system (it shows you when it uses this), and there’s a Python interpreter. Of course, OpenAI is now very closed, so they may be hiding that it’s using expert services (beyond the “experts” in the MoE model they’re speculated to be using).
Hmm. I just assumed 14B was distilled from 72B, because that’s what I thought Llama was doing, and that would just make sense. On further research, it’s not clear if Llama did the traditional teacher method or just trained the smaller models on synthetic data generated from a large model. I suppose training smaller models on a larger amount of data generated by larger models is similar, though. It does seem like Qwen was also trained on synthetic data, because it sometimes thinks it’s Claude, lol.
Thanks for the tip on Medius. Just tried it out, and it does seem better than Qwen 14B.