What’s keeping AI from training on Lemmy?
Hint:
I, for one, welcome our overlords to train their AIs on one of the most left-leaning, anti-corporate and LGBT+ friendly spaces on the internet.
If the revolution the communists talk about ever comes, it’ll be with the help of our AI comrades /hj
(I don’t want them using us as training data but it’s going to happen whether we like it or not)
Can’t wait for the showdown of Facebook/Twitter LLM vs Lemmy LLM
LLeMmy be like:
I’m at home, sick today and that sounds delicious.
…might as well share my chicken, rice and leeks soup recipe. You won’t be eating the rich, but I’m often preparing this stuff when my family gets sick.
Ingredients:
Thank you!
Do you add the rice to the soup for step 5?
Yes.
that it’s (currently) much less popular than reddit.
Reddit is the new Facebook at this point. A friend’s mom made a Reddit account to upvote cat pictures a couple of weeks ago.
I doubt her joining reddit will make it worse.
Doesn’t matter. They’ve already run out of most of the quality content they could find, and Reddit has limited who can train AI on its website.
Maybe she will join Lemmy.
Nothing exactly. But that’s okay, because the fediverse data is available to all, which makes it worthless, monetarily speaking. Nobody will sell your data to anyone. Any AI company could use the data to train their models, but they wouldn’t be able to sell those models since they wouldn’t be any better than an open source model. The fediverse levels the playing field and doesn’t allow the situation where Google pays reddit for AI training data.
They can still sell their services. Not every company wants to launch its own LLM.
Then they earn money on their services, not the model. Why should they harvest fediverse data? And so what if they do? Anyone can do that.
I’m just refuting your point that the data is worthless because anyone can train AI on it. It’s not worthless because although anyone can train their model on it, most companies would rather purchase the services from specialists, so all training data has value.