urbanists.social is one of the many independent Mastodon servers you can use to participate in the fediverse.

#deepseekr1


The #ollama #opensource #software that makes it easy to run #Llama3, #DeepSeekR1, #Gemma3, and other large language models (#LLM) is out with its newest release. Ollama wraps the llama.cpp back-end for running a variety of LLMs and integrates conveniently with other desktop software.
The new ollama 0.6.2 release features support for the #AMD #StrixHalo, a.k.a. #RyzenAI Max+, laptop / SFF desktop SoC.
phoronix.com/news/ollama-0.6.2

www.phoronix.com · Ollama 0.6.2 Released With Support For AMD Strix Halo

"On Thursday, mobile security company NowSecure reported that the app sends sensitive data over unencrypted channels, making the data readable to anyone who can monitor the traffic. More sophisticated attackers could also tamper with the data while it's in transit.
(...)
What’s more, the data is sent to servers that are controlled by ByteDance, the Chinese company that owns TikTok. While some of that data is properly encrypted using transport layer security, once it's decrypted on the ByteDance-controlled servers, it can be cross-referenced with user data collected elsewhere to identify specific users and potentially track queries and other usage.
(...)
A NowSecure audit of the app has found other behaviors that researchers found potentially concerning. For instance, the app uses a symmetric encryption scheme known as 3DES or triple DES. The scheme was deprecated by NIST following research in 2016 that showed it could be broken in practical attacks to decrypt web and VPN traffic. Another concern is that the symmetric keys, which are identical for every iOS user, are hardcoded into the app and stored on the device.

The app is “not equipped or willing to provide basic security protections of your data and identity,” NowSecure co-founder Andrew Hoog told Ars. “There are fundamental security practices that are not being observed, either intentionally or unintentionally. In the end, it puts your and your company’s data and identity at risk.”"
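The hardcoded-key flaw NowSecure describes can be sketched in a few lines. This is a hypothetical stdlib-only toy (a SHA-256 XOR keystream standing in for 3DES, with made-up key bytes), not the app's actual code; it only illustrates the structural problem: when every install ships the same symmetric key, extracting it once from any copy of the binary decrypts every user's traffic.

```python
import hashlib

# Hypothetical stand-in for a key baked into a shipped app binary,
# identical across all installs (as NowSecure reported for the iOS app).
HARDCODED_KEY = b"same-for-every-install"

def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Toy XOR-keystream cipher (NOT 3DES) used only to show the flaw.

    XOR is its own inverse, so calling this twice with the same key
    round-trips the data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Anyone who pulls HARDCODED_KEY out of one copy of the app can now
# read ciphertext produced by every other copy:
ct = toy_encrypt(b"user query: ...", HARDCODED_KEY)
assert toy_encrypt(ct, HARDCODED_KEY) == b"user query: ..."
```

The per-algorithm details (3DES vs. anything else) almost don't matter here; a key that ships inside the app provides obfuscation, not confidentiality.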

arstechnica.com/security/2025/

Ars Technica · DeepSeek iOS app sends data unencrypted to ByteDance-controlled servers · By Dan Goodin
Curious about downloading #AI model weights?

We usually download these pre-trained models from sites like Hugging Face (as GGUF files) or through tools like Ollama or vLLM. The companies that build these models tell you: if you want to run them locally, just download their apps or Python scripts, run a command, and it pulls the weights for you. Great, nice and easy.

But what if the servers are down, or blocked by a government, or otherwise unreachable? Is anyone out there BitTorrenting model weights and parameters like #DeepSeekR1?
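If weights were mirrored over BitTorrent, the trust question shifts from "is the server up?" to "are these the same bytes the publisher released?" — and a checksum pinned from the official release page answers that regardless of who seeded the file. A minimal Python sketch; the filename and digest you would pass in are placeholders, not real DeepSeek artifacts:

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so multi-GB weight files fit in RAM."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_weights(path: Path, expected_hex: str) -> bool:
    """Compare a mirrored/torrented file against the publisher's digest."""
    return sha256_of(path) == expected_hex.lower()
```

With a verified digest in hand, it makes no difference whether the bytes arrived from the official CDN, a stranger's seedbox, or a USB stick.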

#tech #DeepSeek #LLM

"The Hangzhou-based company’s decision to release a low-cost, open-sourced AI model, alongside detailed disclosure of its training methods, means that everyone, from researchers in São Paulo to start-ups in Stockholm and doctors in Nairobi, can access state-of-the-art AI at little to no cost. 

Within the Chinese start-up sector a chain reaction is taking place. New AI applications are being created. Competition is going to become more fierce. Risk appetite from early-stage venture investment is increasing. DeepSeek’s decision to pursue an open-source AI model is inspiring and putting pressure on others to do the same. The first to react was Alibaba’s Qwen team, which released Qwen2.5 as open source last month on the eve of Chinese new year. 

This is a remarkable change. After the US start-up OpenAI released its generative AI model ChatGPT in late 2022, the global digital economy was edging towards control by a handful of tech giants. These players chase scale over efficiency — building ever-larger models that demand staggering compute, energy and capital while guarding their training methods as trade secrets. 

Centralised, closed models create a dangerous feedback loop. The more data they amass, the more powerful they become, further marginalising anyone outside their gates. For consumers this means large fees, surrendered data and watching AI’s future unfold without meaningful participation."

ft.com/content/3549cc33-e04d-4

Financial Times · DeepSeek’s success will undermine the US-China tech war · By Jen Zhu Scott

"DeepSeek has commoditized the Large Language Model, publishing both the source code and the guide to building your own. Whether or not someone chooses to pay DeepSeek is largely irrelevant — someone else will take what it’s created and build their own, or people will start running their own DeepSeek instances renting GPUs from one of the various cloud computing firms.

While NVIDIA will find other ways to make money — Jensen Huang always does — it's going to be a hard sell for any hyperscaler to justify spending billions more on GPUs to markets that now know that near-identical models can be built for a fraction of the cost with older hardware. Why do you need Blackwell? The narrative of "this is the only way to build powerful models" no longer holds water, and the only other selling point it has is "what if the Chinese do something?"

Well, the Chinese did something, and they've now proven that they can not only compete with American AI companies, but do so in such an effective way that they can effectively crash the market.

It still isn't clear if these models are going to be profitable — as discussed, it's unclear who funds DeepSeek and whether its current pricing is sustainable — but they are likely going to be a damn sight more profitable than anything OpenAI is flogging. After all, OpenAI loses money on every transaction — even its $200-a-month "ChatGPT Pro" subscription. And if OpenAI cuts its prices to compete with DeepSeek, its losses will only deepen."

wheresyoured.at/deep-impact/

Ed Zitron's Where's Your Ed At · Deep Impact