
#aitraining


In the middle of a small project I was doing with GPT a few weeks ago (failing to convert sheet music to MIDI), I had a good chat with it about ethics, the singularity, the stealing of the libraries and works of humanity, and right-wing bias training of AI models.

It was quite a good chat, so I decided to post it.

At the moment, I am quite optimistic about AI's future with humanity, but I am sure some billionaire-owned corporation will come along and fuck that up somehow.

superhighwayman.com/2025/a-cha

#AI #GPT #Ethics

The thesis that each unauthorized use of a copyrighted work amounts to a lost sale is going down the drain...

"At times, it sounded like the case was the authors’ to lose, with Chhabria noting that Meta was “destined to fail” if the plaintiffs could prove that Meta’s tools created similar works that cratered how much money they could make from their work. But Chhabria also stressed that he was unconvinced the authors would be able to show the necessary evidence. When he turned to the authors’ legal team, led by high-profile attorney David Boies, Chhabria repeatedly asked whether the plaintiffs could actually substantiate accusations that Meta’s AI tools were likely to hurt their commercial prospects. “It seems like you’re asking me to speculate that the market for Sarah Silverman’s memoir will be affected,” he told Boies. “It’s not obvious to me that is the case.”

When defendants invoke the fair use doctrine, the burden of proof shifts to them to demonstrate that their use of copyrighted works is legal. Boies stressed this point during the hearing, but Chhabria remained skeptical that the authors’ legal team would be able to successfully argue that Meta could plausibly crater their sales. He also appeared lukewarm about whether Meta’s decision to download books from places like LibGen was as central to the fair use issue as the plaintiffs argued it was. “It seems kind of messed up,” he said. “The question, as the courts tell us over and over again, is not whether something is messed up but whether it’s copyright infringement.”

A ruling in the Kadrey case could play a pivotal role in the outcomes of the ongoing legal battles over generative AI and copyright."

wired.com/story/meta-lawsuit-c

WIRED · A Judge Says Meta’s AI Copyright Case Is About ‘the Next Taylor Swift’ · By Kate Knibbs

Judge in #Meta case weighs key question for #AI #copyright lawsuits

A federal judge in SF will hear arguments on Thurs from Meta & a group of #authors in the 1st court hearing over a pivotal question in high-stakes copyright cases over #AItraining.

Judge Vince Chhabria will consider Meta's request for a pretrial ruling that it made "#FairUse" of books from writers including Junot Diaz & comedian Sarah Silverman to train its #LLaMa #LLM.

#law #tech #IntellectualProperty
reuters.com/legal/litigation/j

"As artificial intelligence continues to encroach into the media business, it seems the only options are to join ’em (as we saw this week with The Washington Post) or try to beat ’em, as we’re seeing today with Ziff Davis. Ziff Davis is one of the largest publishers in the U.S. with more than 45 sites globally, including IGN, CNET, Mashable, LifeHacker, and more. The publisher is now suing OpenAI over copyright infringement, claiming the company used Ziff Davis content to train its models and generate responses through ChatGPT, per The New York Times.

“OpenAI seeks to move fast and break things on the assumption that the federal courts will not be able to effectively redress content owners’ sometimes existential concerns before it is too late,” the lawsuit reads (via Reuters). “OpenAI has intentionally and relentlessly reproduced exact copies and created derivatives of Ziff Davis Works without Ziff Davis’s authorization. … OpenAI has taken each of these steps knowing that they violate Ziff Davis’s intellectual property rights and the law.” The suit reportedly seeks hundreds of millions of dollars in damages."

avclub.com/openai-sued-ziff-da

PressGazette: ‘Unsustainable status quo’: AI companies and publishers respond to Govt copyright consultation. “The UK Government’s proposal to allow AI companies to automatically train their models on online content unless the rightsholder specifically opts out has been described as ‘unworkable’. A range of responses to the Government consultation on its proposed change to the existing […]

https://rbfirehose.com/2025/04/19/unsustainable-status-quo-ai-companies-and-publishers-respond-to-govt-copyright-consultation-pressgazette/

!!!!! F*ck off Meta !!!!! Meta announced today that it will soon begin training its AI models on content from adult European users on its social media platforms Facebook and Instagram. The content used for AI training includes posts and comments from adult users, as well as questions and queries from interactions with the Meta AI assistant.

"Finally, AI can fact-check itself. One large language model-based chatbot can now trace its outputs to the exact original data sources that informed them.

Developed by the Allen Institute for Artificial Intelligence (Ai2), OLMoTrace, a new feature in the Ai2 Playground, pinpoints data sources behind text responses from any model in the OLMo (Open Language Model) project.

OLMoTrace identifies the exact pre-training document behind a response — including full, direct quote matches. It also provides source links. To do so, the underlying technology uses a process called “exact-match search” or “string matching.”

“We introduced OLMoTrace to help people understand why LLMs say the things they do from the lens of their training data,” Jiacheng Liu, a University of Washington Ph.D. candidate and Ai2 researcher, told The New Stack.

“By showing that a lot of things generated by LLMs are traceable back to their training data, we are opening up the black boxes of how LLMs work, increasing transparency and our trust in them,” he added.

To date, no other chatbot on the market provides the ability to trace a model’s response back to specific sources used within its training data. This makes the news a big stride for AI visibility and transparency."

thenewstack.io/llms-can-now-tr

The New Stack · Breakthrough: LLM Traces Outputs to Specific Training Data · Ai2’s OLMoTrace uses string matching to reveal the exact sources behind chatbot responses
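The "exact-match search" the article describes can be illustrated with a toy sketch: index every n-word window of a training corpus, then flag spans of a model's response that appear verbatim in some document. (The corpus, document IDs, and function names below are hypothetical; OLMoTrace's real system indexes trillions of tokens with far more efficient data structures.)

```python
# Toy illustration of exact-match tracing, not OLMoTrace's actual implementation.
def build_ngram_index(corpus, n=5):
    """Map every n-word window in the corpus to the documents containing it."""
    index = {}
    for doc_id, text in corpus.items():
        words = text.split()
        for i in range(len(words) - n + 1):
            index.setdefault(tuple(words[i:i + n]), set()).add(doc_id)
    return index

def trace_response(response, index, n=5):
    """Return (phrase, source_docs) for each n-word span of the response
    that appears verbatim in the indexed training corpus."""
    words = response.split()
    matches = []
    for i in range(len(words) - n + 1):
        key = tuple(words[i:i + n])
        if key in index:
            matches.append((" ".join(key), sorted(index[key])))
    return matches

corpus = {
    "doc-1": "the quick brown fox jumps over the lazy dog",
    "doc-2": "a completely unrelated training document about transit",
}
index = build_ngram_index(corpus)
print(trace_response("watch the quick brown fox jumps over fences", index))
# Two 5-word spans of the response match doc-1 verbatim.
```

The real feature presumably also merges overlapping matches and links to the source documents; this sketch only shows the core idea of tracing output text back to exact training-data matches.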

Big tech companies want total control but opt-out should be the way to go:

"OpenAI and Google have rejected the government’s preferred approach to solve the dispute about artificial intelligence and copyright.

In February almost every UK daily newspaper gave over its front page and website to a campaign to stop tech giants from exploiting the creative industries.

The government’s plan, which has prompted protests from leading figures in the arts, is to amend copyright law to allow developers to train their AI models on publicly available content for commercial use without consent from rights holders, unless they opt out.

However, OpenAI has called for a broader copyright exemption for AI, rejecting the opt-out model."

thetimes.com/uk/technology-uk/

The Times · AI giants reject government’s approach to solving copyright row · By Georgia Lambert
#AI #GenerativeAI #UK

"Now consider the chatbot therapist: what are its privacy safeguards? Well, the companies may make some promises about what they will and won't do with the transcripts of your AI sessions, but they are lying. Of course they're lying! AI companies lie about what their technology can do (of course). They lie about what their technologies will do. They lie about money. But most of all, they lie about data.

There is no subject on which AI companies have been more consistently, flagrantly, grotesquely dishonest than training data. When it comes to getting more data, AI companies will lie, cheat and steal in ways that would seem hacky if you wrote them into fiction, like they were pulp-novel dope fiends:
(...)
But it's not just people struggling with their mental health who shouldn't be sharing sensitive data with chatbots – it's everyone. All those business applications that AI companies are pushing, the kind where you entrust an AI with your firm's most commercially sensitive data? Are you crazy? These companies will not only leak that data, they'll sell it to your competition. Hell, Microsoft already does this with Office365 analytics:
(...)
These companies lie all the time about everything, but the thing they lie most about is how they handle sensitive data. It's wild that anyone has to be reminded of this. Letting AI companies handle your sensitive data is like turning arsonists loose in your library with a can of gasoline, a book of matches, and a pinky-promise that this time, they won't set anything on fire."

pluralistic.net/2025/04/01/doc

pluralistic.net · Pluralistic: Anyone who trusts an AI therapist needs their head examined (01 Apr 2025) – Daily links from Cory Doctorow

The Conversation: Africa’s data workers are being exploited by foreign tech firms – 4 ways to protect them. “Since 2015, we have been studying the central role of African data workers in building and maintaining artificial intelligence (AI) systems, acting as ‘data janitors’. Our research found that companies rarely acknowledge the use of human workers in AI value chains, thus they […]

https://rbfirehose.com/2025/04/01/the-conversation-africas-data-workers-are-being-exploited-by-foreign-tech-firms-4-ways-to-protect-them/

Emboldened by #Trump, A.I. Companies Lobby for Fewer Rules

President Trump at the White House in January with, from left, Oracle’s chairman, Larry Ellison; SoftBank’s chief executive, Masayoshi Son; and OpenAI’s chief executive, Sam Altman.
#ai #privacy #openai #softbank #oracle #aitraining #training

nytimes.com/2025/03/24/technol

The New York Times · Emboldened by Trump, A.I. Companies Lobby for Fewer Rules · By Cecilia Kang

Fast Company: Hollywood warns about AI industry’s push to change copyright law. “A who’s who of musicians, actors, directors, and more have teamed up to sound the alarm as AI leaders including OpenAI and Google argue that they shouldn’t have to pay copyright holders for AI training material. In an open letter, submitted to the White House Office of Science and Technology, more than 400 […]

https://rbfirehose.com/2025/03/20/fast-company-hollywood-warns-about-ai-industrys-push-to-change-copyright-law/