Setting Element Ordering With HTML Rewriter Using CSS

After shipping my work transforming HTML with Netlify's edge functions, I realized I had a little bug: the order of the icons specified in the URL didn't match the order in which they were displayed on screen.

Why’s this happening?

I have a bunch of links in my HTML document, like this:

<icon-list>
  <a href="/1/"></a>
  <a href="/2/"></a>
  <a href="/3/"></a>
  <!-- 2000+ more -->
</icon-list>

I use html-rewriter in my edge function to strip out the HTML for icons not specified in the URL. So for a request to:

/lookup?id=1&id=2

My HTML will be transformed like so:

<icon-list>
  <!-- Parser keeps these two -->
  <a href="/1/"></a>
  <a href="/2/"></a>
  
  <!-- But removes this one -->
  <a href="/3/"></a>
</icon-list>

Resulting in less HTML over the wire to the client.

But what about the order of the IDs in the URL? What if the request is to:

/lookup?id=2&id=1

Instead of:

/lookup?id=1&id=2

In the source HTML document containing all the icons, they’re marked up in reverse chronological order. But the request for this page may specify a different order for icons in the URL. So how do I rewrite the HTML to match the URL’s ordering?

The problem is that html-rewriter doesn’t give me a fully-parsed DOM to work with. I can’t do things like “move this node to the top” or “move this node to position x”.

With html-rewriter, you only "see" each element as it streams past. Once it passes by, your chance to modify it is gone. (It seems that's just the way these edge function tools are designed to work; it keeps them lean and performant, and keeps me from shooting myself in the foot.)

So how do I change the icon’s display order to match what’s in the URL if I can’t modify the order of the elements in the HTML?

CSS to the rescue!

Because my markup is just a bunch of <a> tags inside a custom element and I’m using CSS grid for layout, I can use the order property in CSS!

All the IDs are in the URL, and their position as parameters has meaning, so I assign an order to each element, based on its ID's position in the URL, as it streams past html-rewriter. Here's some pseudo code:

// Get all the IDs in the URL, in order
// (`url` and `rewriter` come from the edge function's request handling)
const ids = url.searchParams.getAll("id");

// Select all the icons in the HTML
rewriter.on("icon-list a", {
  element: (element) => {
    // Derive the icon's ID from its href, e.g. "/1/" -> "1"
    // (the markup carries the ID in the href, not an id attribute)
    const href = element.getAttribute("href") ?? "";
    const id = href.replaceAll("/", "");

    // If it's in our list, set its order
    // position from the URL
    if (ids.includes(id)) {
      const order = ids.indexOf(id);
      element.setAttribute("style", `order: ${order}`);
    // Otherwise, remove it
    } else {
      element.remove();
    }
  },
});
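
For example, with the sketch above, a request to /lookup?id=2&id=1 produces markup like this: the source order is unchanged, but the inline styles flip the visual order:

<icon-list>
  <!-- id=2 is first in the URL, so it gets order: 0 -->
  <a href="/1/" style="order: 1"></a>
  <a href="/2/" style="order: 0"></a>
</icon-list>

And because <icon-list> is a grid container, order takes effect; it would do nothing in normal block flow.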

Boom! I didn't have to change the order in the source HTML document, but I can still get the display order to match what's in the URL.

I love shifty little workarounds like this!



Jim Nielsen's Blog

02 Jul 2025 at 20:00

An Analysis of Links From The White House’s “Wire” Website

A little while back I heard about the White House launching their version of a Drudge Report-style website called White House Wire. According to Axios, a White House official said the site's purpose was to serve as "a place for supporters of the president's agenda to get the real news all in one place".

So a link blog, if you will.

As a self-professed connoisseur of websites and link blogs, I got to thinking: "I wonder what kind of links they're considering as 'real news' and what they're linking to?"

So I decided to do a quick analysis using Quadratic, a programmable spreadsheet where you can write code and return values to a 2D interface of rows and columns.

I wrote some JavaScript to:

  • Fetch the HTML page at whitehouse.gov/wire
  • Parse it with cheerio
  • Select all the external links on the page
  • Return a list of links and their headline text
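
Here's a minimal sketch of those steps, assuming cheerio's load API and a runtime with fetch (the selector and the whitehouse.gov filter are illustrative):

import * as cheerio from "cheerio";

const res = await fetch("https://www.whitehouse.gov/wire/");
const $ = cheerio.load(await res.text());

// Grab every absolute link plus its headline text,
// dropping links back to whitehouse.gov itself
const links = $("a[href^='http']")
  .toArray()
  .map((el) => ({
    url: $(el).attr("href"),
    headline: $(el).text().trim(),
  }))
  .filter((link) => !link.url.includes("whitehouse.gov"));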

In a few minutes I had a quick analysis of what kind of links were on the page:

Screenshot of the Quadratic spreadsheet, with rows and columns of data on the left, and on the right a code editor containing the code which retrieved and parsed the data on the left.

This immediately sparked my curiosity to know more about the meta information around the links, like:

  • If you grouped all the links together, which sites get linked to the most?
  • What kind of interesting data could you pull from the headlines they’re writing, like the most frequently used words?
  • What if you did this analysis, but with snapshots of the website over time (rather than just the current moment)?

So I got to building.

Quadratic doesn't yet have the ability for your spreadsheet to run in the background on a schedule and append data. So I had to look elsewhere for a little extra functionality.

My mind went to val.town, which lets you write little scripts that can 1) run on a schedule (cron), 2) store information (blobs), and 3) retrieve stored information via their API.

After a quick read of their docs, I figured out how to write a little script that’ll run once a day, scrape the site, and save the resulting HTML page in their key/value storage.

Screenshot of 9 lines of code from val.town that fetches whitehouse.gov/wire, extracts the text, and stores it in blob storage.
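
The gist of that script, as a sketch: this assumes val.town's std/blob helper, and the exact names may differ from what's in the screenshot.

import { blob } from "https://esm.town/v/std/blob";

// Scheduled to run once a day via val.town's cron
export default async function () {
  const res = await fetch("https://www.whitehouse.gov/wire/");
  const html = await res.text();

  // One snapshot per day, keyed by date, e.g. "wire-2025-05-08"
  const key = `wire-${new Date().toISOString().slice(0, 10)}`;
  await blob.setJSON(key, html);
}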

From there, I was back to Quadratic writing code to talk to val.town’s API and retrieve my HTML, parse it, and turn it into good, structured data. There were some things I had to do, like:

  • Fine-tune how I select all the editorial links on the page from the source HTML (I didn’t want, for example, to include external links to the White House’s social pages which appear on every page). This required a little finessing, but I eventually got a collection of links that corresponded to what I was seeing on the page.
  • Parse the links and pull out the top-level domains so I could group links by domain occurrence.
  • Create charts and graphs to visualize the structured data I had created.

Selfish plug: Quadratic made this all super easy, as I could program in JavaScript and use third-party tools like tldts to do the analysis, all while visualizing my output on a 2D grid in real time, which made for a super fast feedback loop!
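
For the domain grouping, a sketch with tldts looks something like this (reusing the links array from earlier):

import { parse } from "tldts";

// Tally links by registrable domain,
// e.g. "https://www.foxnews.com/politics/..." -> "foxnews.com"
const counts = {};
for (const link of links) {
  const domain = parse(link.url).domain;
  if (domain) counts[domain] = (counts[domain] ?? 0) + 1;
}

// Sort descending for a "top 10 by occurrence" list
const top10 = Object.entries(counts)
  .sort((a, b) => b[1] - a[1])
  .slice(0, 10);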

Once I got all that done, I just had to sit back and wait for the HTML snapshots to begin accumulating!

It's been about a month and a half since I started this, and I have about fifty days' worth of data.

The results?

Here’s the top 10 domains that the White House Wire links to (by occurrence), from May 8 to June 24, 2025:

  1. youtube.com (133)
  2. foxnews.com (72)
  3. thepostmillennial.com (67)
  4. foxbusiness.com (66)
  5. breitbart.com (64)
  6. x.com (63)
  7. reuters.com (51)
  8. truthsocial.com (48)
  9. nypost.com (47)
  10. dailywire.com (36)

A pie chart visualizing the top ten links (by domain) from the White House Wire

From the links, here’s a word cloud of the most commonly recurring words in the link headlines:

  1. “trump” (343)
  2. “president” (145)
  3. “us” (134)
  4. “big” (131)
  5. “bill” (127)
  6. “beautiful” (113)
  7. “trumps” (92)
  8. “one” (72)
  9. “million” (57)
  10. “house” (56)

Screenshot of a word cloud with “trump” being the largest word, followed by words like “bill”, “beautiful” and “president”.
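
Judging by entries like "trumps", the headlines were likely lowercased and stripped of punctuation before counting. A rough sketch of that kind of tally, again reusing the links list:

const wordCounts = {};
for (const { headline } of links) {
  const words = headline
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "") // "Trump's" -> "trumps"
    .split(/\s+/)
    .filter(Boolean);
  for (const word of words) {
    wordCounts[word] = (wordCounts[word] ?? 0) + 1;
  }
}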

The data and these graphs are all in my spreadsheet, so I can open it up whenever I want to see the latest data and re-run my script to pull the latest from val.town. In response to the new data that comes in, the spreadsheet automatically parses it, turns it into links, and updates the graphs. Cool!

Screenshot of a spreadsheet with three different charts and tables of data.

If you want to check out the spreadsheet — sorry! My API key for val.town is in it ("secrets management" is on the roadmap). But I created a duplicate where I inlined the data from the API (rather than the code which dynamically pulls it), which you can check out here at your convenience.

Update: 2025-07-03

After publishing, I realized that I wasn't de-duplicating links. Because this works by taking a snapshot of the website's HTML once a day, if the same link stayed up for multiple days it was getting counted multiple times.

So I tweaked my analysis to de-duplicate links, because I want a picture of all the links shared over time. It didn't really change the proportions of which sites were shared most frequently; it just lowered their occurrence counts, since links were no longer counted more than once.
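
A minimal sketch of that de-duplication, assuming each daily snapshot has already been parsed into link objects with a url property (allSnapshotLinks is a hypothetical name for the combined list):

// Keep only the first occurrence of each URL across all snapshots
const seen = new Set();
const uniqueLinks = [];
for (const link of allSnapshotLinks) {
  if (seen.has(link.url)) continue;
  seen.add(link.url);
  uniqueLinks.push(link);
}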

Given that, here’s an update of the “top 10 links by domain” from May 8th to July 3rd.

  1. youtube.com (73)
  2. foxnews.com (36)
  3. x.com (31)
  4. breitbart.com (29)
  5. nypost.com (28)
  6. thepostmillennial.com (26)
  7. foxbusiness.com (22)
  8. truthsocial.com (20)
  9. washingtontimes.com (16)
  10. dailywire.com (15)

A pie chart visualizing the top ten links (by domain) from the White House Wire



Jim Nielsen's Blog

30 Jun 2025 at 20:00


