I must admit I don't know enough about RISC-V performance. How is it on battery life? That's the one reason no (mass-produced) phone or tablet will ever be made with x86, so unless RISC-V is on par with (or better than) ARM, it won't succeed.
ARM is an older reduced instruction set computer (RISC) design too, influenced by the same Berkeley RISC research. There are not a lot of differences here; x86 could even be better. American companies are mostly run by incompetent misers that extract value through exploitation instead of innovating on the edge and the future. Intel has crashed and burned because it failed to keep pace with the competition. Much of the newer x86 stuff is RISC-like wrappers around CISC instructions under the hood, to loosely quote others at places like Linux Plumbers Conference talks.
ARM costs a fortune in royalties. RISC-V removes those royalties and creates an entire ecosystem for companies to independently sell their own IP blocks, instead of places like Intel using this space for manipulative exploitation through vendor lock-in. If China invests in RISC-V, it will antiquate the entire West within 5-10 years' time, similar to what they did with electric vehicles against Western privateer pirate-capitalist incompetence.
I think it's actually the opposite. The actual execution units tend to be more RISC-like but the "public" interfaces are CISC to allow backwards compatibility. Otherwise, they would have to publish new developer docs for every microcode update or generational change.
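To make that concrete, here is a rough, purely illustrative C sketch (the types and function are invented for the example, not any real decoder): it models how a single memory-operand x86 instruction like `add [addr], reg` conceptually splits into separate load / ALU / store micro-ops that the execution units actually schedule.

```c
#include <stdint.h>
#include <stdio.h>

/* Purely illustrative model of micro-op decode; names and numbers are
 * invented, not any real CPU's internals. */
typedef enum { UOP_LOAD, UOP_ALU_ADD, UOP_STORE } uop_kind;

typedef struct {
    uop_kind kind;
    int      dst;   /* internal (renamed) register number */
    int      src;
    uint64_t addr;  /* memory address for load/store uops */
} uop;

/* Decode the CISC-style instruction "add [addr], reg" into three
 * simpler, RISC-like micro-ops. */
static int decode_add_mem_reg(uint64_t addr, int reg, uop out[3]) {
    out[0] = (uop){ UOP_LOAD,    /*dst=*/100, /*src=*/-1,  addr }; /* tmp <- [addr]    */
    out[1] = (uop){ UOP_ALU_ADD, /*dst=*/100, /*src=*/reg, 0    }; /* tmp <- tmp + reg */
    out[2] = (uop){ UOP_STORE,   /*dst=*/-1,  /*src=*/100, addr }; /* [addr] <- tmp    */
    return 3;
}

int main(void) {
    uop uops[3];
    int n = decode_add_mem_reg(0x1000, 5, uops);
    for (int i = 0; i < n; i++)
        printf("uop %d: kind=%d dst=%d src=%d addr=0x%llx\n",
               i, uops[i].kind, uops[i].dst, uops[i].src,
               (unsigned long long)uops[i].addr);
    return 0;
}
```

Software only ever sees the one `add` instruction; the temporary register, the three micro-ops, and their scheduling all stay behind that public-facing interface.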
Not necessarily a bad strategy, but it definitely results in greater complexity over time to translate between the "external" and "internal" architectures, and it also creates challenges in really tuning the interface between hardware and software because of the abstraction layer.
You caught me. I meant this, but was thinking backwards from the bottom up. Like building the logic and registers required to satisfy the CISC instruction.
This mental space is my "thar be dragons and wizards" space on the edge of my comprehension and curiosity. The pipelines involved in executing a complex instruction like an AVX-512 load of a 512-bit word, while two logical cores are multithreading with cache prediction, all within the DRAM bus width limitations, to run tensor math, are baffling to me.
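For what it's worth, from the software side the 512-bit part looks tame: a handful of intrinsics, with all the scary scheduling happening underneath. A minimal C sketch, assuming an AVX-512F-capable CPU and a compiler flag like `-mavx512f`:

```c
#include <immintrin.h>  /* AVX-512 intrinsics */
#include <stddef.h>

/* y[i] += a * x[i], 16 floats (512 bits) per iteration.
 * Assumes n is a multiple of 16 and pointers are 64-byte aligned. */
void saxpy_avx512(size_t n, float a, const float *x, float *y) {
    __m512 va = _mm512_set1_ps(a);              /* broadcast a into all 16 lanes */
    for (size_t i = 0; i < n; i += 16) {
        __m512 vx = _mm512_load_ps(x + i);      /* one 512-bit load   */
        __m512 vy = _mm512_load_ps(y + i);      /* one 512-bit load   */
        vy = _mm512_fmadd_ps(va, vx, vy);       /* fused multiply-add */
        _mm512_store_ps(y + i, vy);             /* one 512-bit store  */
    }
}
```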
I barely understood the Chips and Cheese article explaining how the primary bottleneck for running LLMs on a CPU is the L2-to-L1 cache bus throughput. Conceptually that makes sense, but thinking in terms of the actual hardware, I can't answer, "why aren't AI models packaged and processed in blocks specifically sized for this cache bus limitation?" If my cache bus is the limiting factor, splitting a core into two logical threads seems like asinine stupidity that poisons the cache. And why an OS CPU scheduler is not equipped to automatically detect or flag tensor math and isolate those threads from kernel interrupts is beyond me.
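For reference, "blocks specifically sized for the cache" is essentially what tuned GEMM/BLAS kernels already do through tiling; the blocking lives in the math kernel rather than in the model file. A hedged C sketch of the idea, where the tile size of 64 is an arbitrary stand-in that a real library would derive from measured cache sizes:

```c
#include <stddef.h>

#define TILE 64  /* placeholder tile size; real libraries tune this to the
                    L1/L2 sizes and bus widths being discussed here */

/* C[n][n] += A[n][n] * B[n][n], processed tile by tile so each block of
 * A and B stays hot in cache while it is being reused. Assumes n is a
 * multiple of TILE to keep the sketch short. */
void matmul_blocked(size_t n, const float *A, const float *B, float *C) {
    for (size_t ii = 0; ii < n; ii += TILE)
        for (size_t kk = 0; kk < n; kk += TILE)
            for (size_t jj = 0; jj < n; jj += TILE)
                for (size_t i = ii; i < ii + TILE; i++)
                    for (size_t k = kk; k < kk + TILE; k++) {
                        float a = A[i * n + k];
                        for (size_t j = jj; j < jj + TILE; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```

As for isolating threads from interrupts: on Linux that is still mostly a manual job today, done with `sched_setaffinity(2)`/`taskset` plus boot parameters like `isolcpus` and `nohz_full`, rather than the scheduler detecting tensor math on its own.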
Adding a layer to that and saying all of this is RISC cosplaying as CISC is my mental party clown cum serial killer... "but... but... it is 1 instruction..."
Yeah. I'm from more of a SysAdmin/DevOps/(kinda)SWE background, so I tend to think of it in a similar manner to APIs. The x86_64 CISC registers are like a public API, and the ??? RISC-y registers are like an internal API that may or may not even be accessible outside of intra-die communication.
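In C terms that analogy maps onto the opaque-handle idiom: callers compile against a small, stable "public" surface, and the representation behind it can be reorganized freely, roughly like architectural registers versus whatever renamed physical registers the core really uses. The names below are invented for the sketch:

```c
/* counter.h: the stable "public API" the outside world compiles against */
typedef struct counter counter;      /* opaque: layout is not exposed     */
counter *counter_new(void);
void     counter_add(counter *c, long v);
long     counter_get(const counter *c);

/* counter.c: the "internal architecture"; it can be reorganized at will
 * (split fields, add caching, change representation) without changing
 * any code that only uses the header above. */
#include <stdlib.h>
#include <stdio.h>
struct counter { long total; };

counter *counter_new(void)               { return calloc(1, sizeof(counter)); }
void     counter_add(counter *c, long v) { c->total += v; }
long     counter_get(const counter *c)   { return c->total; }

int main(void) {
    counter *c = counter_new();
    counter_add(c, 42);
    printf("%ld\n", counter_get(c));   /* prints 42 */
    free(c);
    return 0;
}
```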
Very similar to where I'm at. I've finally gotten my AuADHD brain to get Vivado set up for my Zynq dev board, and I think I finally have everything I need to try to unbrick my Fomu (it doesn't have a hard USB controller, so I have to use a pogo pin jig to try to load a basic USB softcore that will allow it to be programmed normally).
Mind sharing that article?
I think it's like the above way of thinking of it as APIs, but I could be entirely incorrect. I don't think I am, though. Because the registers that programs interact with are standardized, those probably are "actual" x86, in that they are expected to handle x86 instructions in the spec-defined manner. Past those externally addressable registers is just a black box that does the work to allow the registers to act in an expected manner. Some of that black box must also include programmable logic to allow microcode to be a thing.
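The microcode point can be pictured the same way: conceptually it is a patchable table mapping each public opcode to the internal micro-op sequence used to implement it, so behavior can be fixed without touching the externally addressable registers. A toy C sketch, with every name and number invented for illustration:

```c
#include <stdio.h>

/* Toy model: microcode as a patchable table from public opcode to an
 * internal micro-op sequence. Nothing here reflects a real CPU. */
enum { MAX_UOPS = 4 };

typedef struct {
    const char *uops[MAX_UOPS];  /* human-readable micro-op names */
    int         count;
} ucode_entry;

/* "ROM" defaults shipped with the core. */
static ucode_entry ucode_table[] = {
    [0x01] = { { "load tmp", "add tmp", "store tmp" }, 3 },  /* add [mem], reg */
    [0x0F] = { { "slow-path helper" }, 1 },                  /* some complex op */
};

/* A microcode update just replaces a table entry; the opcode that
 * software sees never changes. */
static void apply_microcode_update(int opcode, ucode_entry fixed) {
    ucode_table[opcode] = fixed;
}

int main(void) {
    printf("opcode 0x0F uses %d uop(s)\n", ucode_table[0x0F].count);
    apply_microcode_update(0x0F,
        (ucode_entry){ { "load tmp", "checked helper", "fence" }, 3 });
    printf("after patch: %d uop(s)\n", ucode_table[0x0F].count);
    return 0;
}
```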
It's a crazy and magical side of technology.
Oh wow, so we are in kinda similar places but from vastly different paths and capabilities. Back before I was disabled I was a rather extreme outlier of a car enthusiast: I painted (owned), ported, and machined professionally. I was really good with carburetors, but had a chance to get some specially made direct-injection race heads with mechanical injector ports in the combustion chamber... I knew some of the Hilborn guys... real edgy race stuff. I was looking at building a supercharged motor with a mini blower and a very custom open source MegaSquirt fuel injection setup using a bunch of hacked parts from some junkyard Mercedes direct-injection Bosch diesel cars. I had no idea how complex computing and microcontrollers are, but I figured it couldn't be much worse than how I had figured out all the automotive systems and mechanics. After I was disabled 11 years ago, riding a bicycle to work while the heads were off of my Camaro, I got into Arduino and just trying to figure out how to build sensors and gauges. I never fully recovered from the broken neck and back, but I am still chipping away at compute. Naturally, I started with a mix of digital functionality and interfacing with analog.
From this perspective, I don't really like API-like interfaces. I often have trouble wrapping my head around them; I want to know what is actually happening under the hood. I have a ton of discrete logic for breadboards and have built stuff like Ben Eater's breadboard computer. At one point I played with CPLDs in Quartus. I have an iCE40 around but have only barely gotten the open source toolchain running before losing interest and moving on to other stuff. I prefer something like FlashForth or MicroPython running on a microcontroller so that I am independent of some proprietary IDE nonsense. But I am primarily a Maker and prefer fabrication or CAD over programming. I struggle with managing complexity and with the advanced algorithms I would know if I had a formal CS background.
So from that perspective, what I find baffling about RISC under CISC is specifically the timing involved. Your API mindset is likely handwaving this as a black box, but I am in this box. Like, I understand how there should be a pipeline of steps involved for the complex instruction to happen. What I do not understand is the reason or mechanisms that separate CISC from RISC in this pipeline. If my goal is to do A..E, and A-B and C-D are RISC instructions, I have a ton of questions. Like, why is there still any divide at all for x86 if direct emulation is just a translation and subdivision into a couple of instructions? Or how is the timing of this RISC translation as efficient as if the logic were built as an integrated monolith? How could that ever be more efficient? Is this incompetent cost cutting, a backwards-compatibility constraint, or some fundamental issue with the topology, like RLC issues with the required real estate on the die?
As far as the Chips and Cheese article, if I recall correctly, that was saved once upon a time in Infinity on my last phone, but Infinity got locked by the dev. The Reddit post link would have been a month or two before June of 2023, but your search is as good as mine. I'm pretty good at reading and remembering the abstract bits of info I found useful, but I'm not great about saving citations, so take it as water-cooler hearsay if you like. It was said in good faith with no attempt to intentionally mislead.