Chrome and Chromium 80 Have A New Direct-To-Text-Snippet Linking Feature

The latest version of Chromium (and Chrome) lets you link directly to parts of web pages using a shiny new #:~:text= URL syntax. It works similarly to the existing #anchor feature, formally a URI fragment, which has been around for decades. It is only worth mentioning because some larger publications are making it out to be some kind of gigantic security concern. It's not. The most logical explanation for why they claim it is a concern is that it brings them clicks and attention and therefore profit. The only other possible explanation is ignorance and utter stupidity.

written by 윤채경 (Yoon Chae-kyung) 2020-02-22 - last edited 2020-06-19. © CC BY

Here is how URL anchors have worked since forever:

Mark a portion of a web page by placing <span id="foo"></span> somewhere in the page.
Make a link to yourpage.html with #foo at the end - as in yourpage.html#foo
Those who click that yourpage.html#foo link go directly to the foo section of yourpage.html
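The three steps above can be sketched in a few lines of Python. This is only an illustration of the lookup a browser performs, not how any real browser is implemented: the class name IdCollector, the function anchor_target_exists and the sample page are all made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class IdCollector(HTMLParser):
    """Collects every id attribute found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def anchor_target_exists(link, html):
    """Return True if the link's #fragment names an id present in the HTML."""
    _, fragment = urldefrag(link)  # "yourpage.html#foo" -> ("yourpage.html", "foo")
    collector = IdCollector()
    collector.feed(html)
    return fragment in collector.ids


# Hypothetical page content, for illustration only
page = '<h1>Title</h1><p>Intro</p><span id="foo"></span><p>The foo section</p>'

print(anchor_target_exists("yourpage.html#foo", page))  # True
print(anchor_target_exists("yourpage.html#bar", page))  # False
```

A browser that finds the id scrolls to that element. If it does not find it, the page is simply shown from the top.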

MediaWiki has built-in functionality which makes it possible to link directly to any headline on any page. The page HOWTO make GIMPs interface colorful and happy (again) has a section titled "HOWTO restore GIMPs interface to happiness" which can be linked directly using a #anchor as in HOWTO make GIMPs interface colorful and happy (again)#HOWTO restore GIMPs interface to happiness.

Chromium (and Chrome) 80 lets you link to any part of any web page by adding #:~:text= to the end of any URL. It works with any part of any website since it relies on the client filtering the web page for the given text. This feature is defined in a WICG specification draft titled ScrollToTextFragment.
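Building such a link by hand mostly comes down to percent-encoding the text you want to point at. The sketch below is a minimal example: text_fragment_link is a hypothetical helper, example.org is a placeholder, and the extra encoding of "-" reflects my reading of the draft (which reserves "-", "," and "&" inside a text directive) rather than anything confirmed in this article.

```python
from urllib.parse import quote, urldefrag


def text_fragment_link(page_url, text):
    """Append a #:~:text= directive for `text` to `page_url`."""
    base, _ = urldefrag(page_url)  # drop any fragment already on the URL
    encoded = quote(text, safe="")  # encodes spaces, "," and "&" among others
    # quote() never encodes "-", so handle it separately (assumption: the
    # draft treats "-" as a separator inside the text directive)
    encoded = encoded.replace("-", "%2D")
    return f"{base}#:~:text={encoded}"


print(text_fragment_link("https://example.org/article.html", "privacy concerns"))
# https://example.org/article.html#:~:text=privacy%20concerns
```

This only covers the simplest form of the directive, a single exact phrase. The draft also describes ranges and prefix/suffix context, which are left out here.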

A few larger web publications are currently presenting this new feature as some kind of huge privacy issue.

Why It's Cool

Let's get my personal bias for this new Chromium feature out of the way before addressing the imagined privacy concerns. Being able to link directly to a specific part of an article on another website can be really useful for anyone who writes web pages. What do you do if you cite a long article on another website in an article you write? You link to that article and hope your readers can use the ctrl-f search function to find the paragraph you are referring to. The new ScrollToTextFragment feature makes it possible to cite a paragraph and link directly to that paragraph in the article you are citing. That's cool and it is potentially really useful.

Why It's NOT Cool

The web relies on standards. You can put a .jpg image on a web page and be fairly confident that the vast majority of your website's readers are able to see it. Those using a text-only browser like lynx won't, but the majority will. That is not the case if you put a .xwd (X image dump) image or a .xcf.bz2 (compressed GIMP) image on a web page.

Chrome/Chromium seems to be the only web browser with support for ScrollToTextFragment. Some will accept it as a universally supported standard because Chrome has a huge market share and uses it. Those of us who refuse to use standards that are not supported by Firefox, Pale Moon and other even lesser-known web browsers won't use it unless and until all the browsers support it. A link to a web page with #:~:text=something will, of course, work regardless of browser support - browsers without support will load the target page and simply skip jumping to the part of the page where the text something appears.
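The difference between a browser with and without support can be sketched roughly as follows. This is a simplified illustration and not any browser's actual code: split_fragment and its supports_text_fragments flag are hypothetical, and it only handles a single text= directive.

```python
from urllib.parse import unquote, urlsplit

DELIMITER = ":~:"  # marks the start of the directive part of a fragment


def split_fragment(url, supports_text_fragments):
    """Return (anchor_id, search_text) roughly as a browser might see them."""
    fragment = urlsplit(url).fragment
    if not supports_text_fragments or DELIMITER not in fragment:
        # The whole fragment is treated as an ordinary anchor name. No
        # element has an id like ":~:text=something", so nothing scrolls.
        return unquote(fragment), None
    anchor, _, directive = fragment.partition(DELIMITER)
    search_text = None
    if directive.startswith("text="):
        search_text = unquote(directive[len("text="):])
    return unquote(anchor), search_text


link = "https://example.org/page.html#:~:text=something"
print(split_fragment(link, supports_text_fragments=True))   # ('', 'something')
print(split_fragment(link, supports_text_fragments=False))  # (':~:text=something', None)
```

Either way the same page is loaded. The only thing that changes is whether the client goes on to search the page for the text and scroll to it.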

It would be, and perhaps it will be, a cool and useful basic web feature if all the web browsers had support for it. And they should: it is a really simple and basic feature.

Why Google Wants It

ScrollToTextFragment support allows the company behind the world's most used search engine to link directly to the portion of a web page's text where the keywords users search for appear. It is not hard to understand why they want browsers to have such a feature.

A Loss Of Control

The concept of web page anchors has been around since forever. The MediaWiki content management system this site uses makes every heading an anchor you can link to. Most content management systems do not do that. Adding a client-side anchor feature takes some perceived control away from website owners who are left unable to control what parts of their web pages people link to.

The simple truth is that website owners never had any control to begin with. How a web page looks and acts is, and always has been, up to the client visiting a web server. You cannot really visit this website by typing curl https://linuxreviews.org/News. That will not show the News page "as intended" and it will not serve you advertisements from our Google AdSense partner (which you should be blocking with uBlock Origin or a similar filter anyway). curl will show you the News page's raw HTML code. If that's how someone wants to visit that page then that's up to them.

The Privacy Implications

The otherwise reliable British publication The Register has an article titled "Chrome deploys deep-linking tech in latest browser build despite privacy concerns" and the always unreliable American rag Forbes has a similar blog post titled "Google Just Gave Millions Of Users A Reason To Quit Chrome". Both cite a Peter Snyder, supposedly a "privacy" researcher from the Brave Web Browser, who apparently gave this quote:

"Consider a situation where I can view DNS traffic (e.g. company network), and I send a link to the company health portal, with [the anchor] #:~:text=cancer. On certain page layouts, I might be able [to] tell if the employee has cancer by looking for lower-on-the-page resources being requested."

Peter Snyder, Brave Web Browser researcher

Well, let's do consider a situation where you can view a victim's DNS traffic. Loading a regular web page will result in the same DNS queries regardless of there being a #:~:text= snippet at the end of the link. Claiming that there is a practical difference is just laughable. There's not. It is, in theory, possible to do timing attacks which could reveal if someone is visiting https://someplace.tld/ or https://someplace.tld/#:~:text= since the order of the DNS queries could, in some cases, vary. Someone will probably write a research paper on how they can see a difference if they already know what specific page on a site someone will visit and they know exactly when it will be visited. The obvious question is: Why would you need to do some attack to confirm what you already know if you already have that information? There is no practically useful attack there.

As for the premise that someone visiting a page by following a link with #:~:text=cancer means that they have cancer: It doesn't. You can click fitness.fandom.com/wiki/Main_Page#I_Bench_Four_Plates as many times as you want but you won't be able to bench four plates unless you go to the gym three times a week for several years. And it's a non-issue anyway.

Web browsers do not send web servers the #anchor part of a URL, or any part starting with #, because #... is a client-side tag. Hence UTM URL codes use ? and & (as in ?utm_source=Newsletter&utm_campaign=Update_10) instead. A link to page.html#headline will result in a GET request for page.html. The web browser client will handle the #headline part without any server-side interaction or awareness. That is why the Brave Web Browser "researcher" talks about silly theoretical DNS / resource-loading attacks; you can't look at web server logs and see any difference in the real world.
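A small sketch makes the point concrete. The helper what_leaves_the_client is hypothetical and example.org is a placeholder. It derives the hostname a client would look up and the request line it would send, and neither contains anything after the #.

```python
from urllib.parse import urlsplit


def what_leaves_the_client(url):
    """Return (hostname to resolve, HTTP request line) for a URL."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query  # the query string IS sent to the server
    # parts.fragment is deliberately never used: it stays in the browser
    return parts.hostname, f"GET {target} HTTP/1.1"


print(what_leaves_the_client("https://example.org/page.html#headline"))
# ('example.org', 'GET /page.html HTTP/1.1')
print(what_leaves_the_client("https://example.org/page.html#:~:text=cancer"))
# ('example.org', 'GET /page.html HTTP/1.1')
print(what_leaves_the_client("https://example.org/page.html?utm_source=Newsletter#headline"))
# ('example.org', 'GET /page.html?utm_source=Newsletter HTTP/1.1')
```

The first two links produce identical DNS lookups and identical requests, which is why a server log cannot tell them apart.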

The Forbes claim that the new ScrollToTextFragment support in Chromium/Chrome is somehow "A Reason To Quit Chrome" is laughable. There are many good reasons to use other more privacy-respecting web browsers. This is not one of them. ScrollToTextFragment has absolutely no practical privacy problems whatsoever. None. It is simply a cool new web feature which, if standardized, would be quite useful.

The only problem with it is, as of today, that you can't use it and expect all your visitors to have web browsers that support it any time soon. The scroll down to the specified text will simply not happen in Firefox; the top of the page will be shown instead. So the link does not fully work as intended for most users, which makes it kind of futile to even bother posting such a link.

Oyvinds

50 months ago

All the Linux distributions call "Ungoogled Chrome" Chromium with some name specifying what kind of build it is. The Fedora rpmfusion repositories have "chromium-freeworld" (with VAAPI patches, AVC and HEVC codecs and some other patches) and "chromium-browser-privacy" (VAAPI patches, extra codecs, "ungoogled" patches). The Arch Linux package is called "ungoogled-chromium". The ungoogled-chromium patches raise some interesting questions, like why there is site-specific Chromium code for several Google-owned websites.
