Onihikage

joined 2 years ago
[–] Onihikage@beehaw.org 2 points 1 week ago (1 children)

I appreciate the links, but these are all about how to efficiently process an audio sample for a signal of choice.

Your stumbling block seemed to be that you didn't understand how it was possible, so I was trying to explain that, but I may have done a poor job of emphasizing why the technique I described matters. When you said this in a previous comment:

I do think that they’re not just throwing away the other fish, but putting them into specific baskets.

That was a misunderstanding of how the technology works. With a keyword spotter (KWS), which all smartphone assistants use to detect their activation phrases, they aren't catching any "other fish" in the first place, so there's nothing to put into "specific baskets".

To borrow your analogy of catching fish, a full speech detection model is like casting a large net and dragging it behind a ship, catching absolutely everything and identifying all the fish/words so you can do things with them. Relative to a KWS, it's very energy intensive. One is not likely to spend that amount of energy just to throw back most of the fish. Smart TVs, cars, and Alexa devices can all potentially use this method continuously, because for them the energy usage from constantly listening with a full model is not an issue. For those devices, your concern that they might put everything other than the keyword into different baskets is perfectly valid.

A smartphone, to save battery, will be using a KWS, which is like baiting a trap with pheromones only released by a specific species of fish. When those fish happen to swim nearby, they smell the pheromones and go into the trap. You check the trap periodically, and when you find the fish in there, you pull them out with a very small net. You've expended far less effort to catch only the fish you care about without catching anything else.

To use yet another analogy, a KWS is like a tourist who has gotten separated from their guide in a foreign country where they don't know the local language. They try to ask locals for help but can't understand anything, until a local says the name of the tour group, which the tourist recognizes, and they're able to follow that person back to their group. That's exactly what a KWS system experiences: it hears complete nonsense and gibberish until the key phrase pops out of the noise, which it understands clearly.

This is what we mean when we say that yes, your phone is listening constantly for the keyword, but the part that's listening cannot transcribe your conversations until you or someone says the keyword that wakes up the full assistant.

My question is, how often is audio sampled from the vicinity to allow such processing to happen.

Given the near-immediate response of “Hey Google”, I would guess once or twice a second.

Yes, KWS systems generally keep a rolling buffer of audio a few seconds long, and scan it a few times a second to see if it contains the key phrase.
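If it helps to see it, here's a rough Python sketch of that rolling-buffer loop. This is not how any real assistant is implemented. The sample rate, buffer length, scan rate, and the `detector` function are all assumptions I made up for illustration, and `detector` is just a stand-in for the tiny KWS model.

```python
from collections import deque

# All of these numbers are assumptions for illustration only.
SAMPLE_RATE = 16_000       # audio samples per second
BUFFER_SECONDS = 2         # "a few seconds" of audio
SCANS_PER_SECOND = 4       # "a few times a second"
SCAN_INTERVAL = SAMPLE_RATE // SCANS_PER_SECOND


def make_buffer():
    """Rolling buffer: once full, each new sample pushes out the oldest one."""
    return deque(maxlen=SAMPLE_RATE * BUFFER_SECONDS)


def listen(samples, detector):
    """Feed samples into the rolling buffer and scan it at a fixed interval.

    `detector` is a hypothetical stand-in for the KWS model: it receives the
    buffered audio and returns True only if the key phrase is present.
    Returns the sample counts at which a scan detected the key phrase.
    """
    buffer = make_buffer()
    hits = []
    for count, sample in enumerate(samples, start=1):
        buffer.append(sample)
        if count % SCAN_INTERVAL == 0 and detector(buffer):
            hits.append(count)
            buffer.clear()  # avoid re-triggering on the same audio
    return hits
```

You could try it with a toy detector such as `lambda audio: 1 in audio` on a stream of zeros containing a single `1`. The point is that everything the detector doesn't recognize simply falls off the end of the buffer; nothing is transcribed or stored.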

[–] Onihikage@beehaw.org 5 points 1 week ago* (last edited 1 week ago) (3 children)

How can you catch the right fish, unless you’re routinely casting your fishing net?

It's a technique called Keyword Spotting (KWS). https://en.wikipedia.org/wiki/Keyword_spotting

This uses a tiny speech recognition model that's trained on very specific words or phrases which are (usually) distinct from general conversation. The model being so small makes it extremely efficient even before any further optimization steps like quantization, so it requires very little computation to process the audio stream and detect whether the keyword has been spoken. Here's a 2021 paper where a team of researchers optimized a KWS to use just 251 µJ (0.00007 milliwatt-hours) per inference: https://arxiv.org/pdf/2111.04988
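For anyone who wants to check the unit conversion, here's a quick sketch of the arithmetic. Only the 251 µJ figure comes from the paper. The 4 inferences per second and the 15 Wh battery are numbers I'm assuming purely for a sense of scale, and this counts only the model's inference energy, not the microphone or the rest of the audio pipeline.

```python
JOULES_PER_MWH = 3.6            # 1 mWh = 0.001 W * 3600 s = 3.6 J
ENERGY_PER_INFERENCE_UJ = 251   # figure quoted from the paper

# Assumptions for illustration only, not from the paper:
INFERENCES_PER_SECOND = 4       # hypothetical scan rate
BATTERY_MWH = 15_000            # hypothetical 15 Wh phone battery


def microjoules_to_mwh(microjoules):
    """Convert microjoules to milliwatt-hours."""
    return microjoules * 1e-6 / JOULES_PER_MWH


def daily_energy_mwh(inferences_per_second=INFERENCES_PER_SECOND):
    """Energy spent on inference alone over 24 hours of constant listening."""
    inferences_per_day = inferences_per_second * 24 * 60 * 60
    return inferences_per_day * microjoules_to_mwh(ENERGY_PER_INFERENCE_UJ)


per_inference = microjoules_to_mwh(ENERGY_PER_INFERENCE_UJ)
per_day = daily_energy_mwh()
print(f"{per_inference:.5f} mWh per inference")
print(f"{per_day:.1f} mWh per day, {per_day / BATTERY_MWH:.2%} of the battery")
```

Under those assumptions, a full day of constant listening works out to roughly 24 mWh, or about 0.16% of the hypothetical battery.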

The small size of the KWS model, required for the low power consumption, means it alone can't be used to listen in on conversations; it outright doesn't understand anything other than what it's been trained to identify. This is also why you usually can't customize the keyword to just anything, and instead have to pick from a limited set of words or phrases.

This all means that if you're ever given an option for completely custom wake phrases, you can be reasonably sure that device is running full speech detection on everything it hears. This is where a smart TV or Amazon Alexa, which are plugged in, have a lot more freedom to listen as much as they want with as complex of a model as they want. High-quality speech-to-text apps like FUTO Voice Input run locally on just about any modern smartphone, so something like a Roku TV can definitely do it.

[–] Onihikage@beehaw.org 5 points 2 weeks ago

Intellectual property as a concept ultimately stifles progress every time it's been tried. Information wants to be free, and we prosper far more when we accept that reality.

Everyone should read Against Intellectual Monopoly by Michele Boldrin and David K. Levine. It's on David's website, Internet Archive, Anna's Archive, and various bookstores. Feel free to buy or print some copies and distribute them to your favorite people, libraries, bookstores, and congress critters~

[–] Onihikage@beehaw.org 7 points 3 weeks ago

Always has been.

[–] Onihikage@beehaw.org 2 points 1 month ago

I would recommend against pairing Battlemage with a low-spec CPU. As shown by Hardware Canucks, Hardware Unboxed, and others, Intel's Arc graphics driver overhead is currently much higher than competitors', which means Arc cards are disproportionately affected by a weaker CPU. This causes the B580 to lose significantly more performance when paired with low-end CPUs than a roughly equivalent Nvidia or AMD card. At the very low end, the difference is especially stark. In some games, the B580 goes from neck-and-neck with a 4060 on a high-end CPU to losing half its performance with a low-end older CPU, while the 4060 only loses about 25%.

If you're really stuck with a lower-end CPU, it would be far better to get a used midrange AMD or Nvidia GPU from an older product generation for the same price and use that.

[–] Onihikage@beehaw.org 3 points 1 month ago (1 children)

I'm late to the party but have you seen Linux Journey? https://linuxjourney.com/

[–] Onihikage@beehaw.org 1 points 1 month ago* (last edited 1 month ago)

Have you ever seen Linux Journey? It's a very informative set of tutorials on how Linux fundamentally works under the hood; all the separate systems that together create an operating system. The concepts you learn there will apply to almost any distro in some way, even if some distros (like Atomic ones) don't let you mess with all of it.

For more top-level transition concerns, given that you're coming from stock Debian running KDE... Bazzite can also run KDE, so provided you select KDE when you download it, your GUI experience should be pretty much identical. Some minor but important differences would include themes, but there are guides for that, too.

When it comes to package management, the intent on Atomic systems is you basically don't install traditional packages (Flatpaks are the preferred option), but Bazzite has frameworks in place such that you can install pretty much any package from any distro, as laid out in their documentation I linked in my previous post and just now. Work is also ongoing to make traditional package-based software installations more seamless with an incoming switch from rpm-ostree to bootc, but that's getting into the weeds. If you have a deb file for a GUI program that's not available as a Flatpak, you'll be using a Distrobox to install it.

If you have any specific concerns about the differences, let me know and I can hopefully give you more details.

[–] Onihikage@beehaw.org 2 points 1 month ago (2 children)

I can highly recommend Bazzite for your needs. It has a KDE version, which is clearly your favorite Desktop Environment (DE), it's extremely safe/stable due to being an Atomic distro (you can always boot into the previous image if a system update broke something), has incredible documentation, supports almost any traditional app through Distrobox (VPN requires rpm-ostree for now), has a scripted easy install of Waydroid for native Android emulation, and has a few tweaks preconfigured to ensure the desktop gaming experience is a little more seamless out of the box than a stock distro. It really seems to tick all the boxes for what you're looking for.

If you want more focus on development and less on gaming, the Universal Blue team also makes Aurora for more developer-focused workloads, but Steam not being included in the image does introduce some usability regressions - Steam running via Flatpak or Distrobox is just plain less capable than a native install, though work is ongoing to make native installs Just Work even on Atomic systems.

[–] Onihikage@beehaw.org 6 points 2 months ago

People really out here treating their web browser like it's a mainframe

[–] Onihikage@beehaw.org 4 points 2 months ago (1 children)

I agree, investing in a company is fine. It's when you have the ability to trade your investment without any consequence whatsoever that the madness begins. Investment is supposed to be risky for both the company and the investor! But we've managed to externalize that risk into a market in which no single actor can be held responsible when a company is looted and destroyed by greed. Publicly-traded shares are now an entirely tax-free substitute for money - but only for the rich who have turned this system into a game to enrich themselves.

[–] Onihikage@beehaw.org 3 points 2 months ago

My favorite response to that currently is "Okay, send me your email password and show me all your credit cards. Oh, why not? You've done nothing wrong, so you have nothing to hide, right?"

[–] Onihikage@beehaw.org 12 points 2 months ago

Audile is on F-Droid, though it uses AudD for the actual music recognition backend. I'm not sure it's possible to have a FOSS backend for this kind of service.

 

Innovations summarized:

  • Accurate, accessible weather forecasts to help optimize planting and harvesting in mid/low-income regions
  • Microbial fertilizers to reduce the need for nitrogen fertilizers
  • Reducing or eliminating methane from livestock, which accounts for about 20% of human greenhouse gas emissions
  • Helping farmers and communities implement better rainwater harvesting
  • Lowering the cost of digital agriculture that can help farmers use irrigation, fertilizer and pesticides most efficiently
  • Encouraging production of alternative proteins to reduce demand for livestock
  • Providing insurance and other social protections to help farmers recover from extreme weather events

I would have liked to see more focus on finding ways to avoid monocropping, and a callout to the heavy risks of the steady corporate consolidation of the agriculture industry, but breaking up corporations isn't exactly an innovation so I can see why it wouldn't get a mention. Some of these seem fairly weak as innovations go, and some sound so inexpensive that it's a wonder they aren't already done, but all of them sound like decent steps to take.

Which among this list do you think governments should focus on the most?
