#
 I've been making more changes to the site, nothing user facing.

I decided to create a dummy GitHub repository with a few core files and let Codex make some recommendations.

I'm always looking for ways to reduce database queries so implemented a couple of basic JSON file-based caches:

  • one for the 10 most recent posts (rebuilt when I add/delete/edit a post), and
  • one for site options (rebuilt if I log in to refresh the session info)
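
A minimal version of such a cache might look like the sketch below — the file names and the rebuild callback are illustrative, not the actual implementation:

```php
<?php
// Read cached JSON if present; otherwise run the rebuild callback,
// write the result to the cache file, and return it.
function cache_get(string $file, callable $rebuild): array {
    if (is_readable($file)) {
        $data = json_decode(file_get_contents($file), true);
        if (is_array($data)) {
            return $data;
        }
    }
    return cache_rebuild($file, $rebuild);
}

// Rebuild the cache file from the database
// (called after adding/editing/deleting a post, or on login).
function cache_rebuild(string $file, callable $rebuild): array {
    $data = $rebuild(); // e.g. SELECT the 10 most recent posts
    file_put_contents($file, json_encode($data), LOCK_EX);
    return $data;
}
```

On a normal page view only `cache_get()` runs, so the database is never touched unless the file is missing or unreadable.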

I've been having issues with certain IP addresses hammering the site overnight — on one occasion, I had over 160,000 hits from the same IP in just a few minutes.¹ Codex suggested a simple file-based IP throttle control.

I'll see if the changes help to 1) handle load a bit better, and 2) rate limit problematic IPs. I don't want to block them outright.
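
A file-based throttle along those lines could be sketched as follows — the window and limit values are illustrative, and the caller would respond with 429 Too Many Requests rather than blocking outright:

```php
<?php
// Allow at most $limit requests per $window seconds per IP,
// tracked in a small JSON file per address.
function throttle_allow(string $ip, string $dir, int $limit = 60, int $window = 60): bool {
    $file = $dir . '/' . md5($ip) . '.json';
    $now = time();
    $hits = [];
    if (is_readable($file)) {
        $hits = json_decode(file_get_contents($file), true) ?: [];
    }
    // Keep only the hits inside the current window.
    $hits = array_values(array_filter($hits, fn($t) => $t > $now - $window));
    if (count($hits) >= $limit) {
        return false; // over the limit: caller sends a 429, not a hard block
    }
    $hits[] = $now;
    file_put_contents($file, json_encode($hits), LOCK_EX);
    return true;
}
```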

As well as a couple of CSS changes, there are some additional recommendations that I might look into in the future, but this is enough to be getting on with.


  ¹ No, it wasn't an AI scraper bot.

#
 Five years ago today, I started journaling properly.

I was still using WordPress at the time and reinstalled a private posting plugin I wrote a couple of years earlier but never used.

It took another three months to migrate that to the current site.

Over those five years, I have written an average of 270 entries per year, but with some big gaps in places.

For the past month, I've gotten back in the habit of writing something every day, often just a recap of events, and it feels good to get this down. I've always thought of the blog as an external memory, but the journal is rapidly taking over that function as I blog less frequently.

Since I stopped using The Garden, the public/private duality of /reader has spread to the rest of the site — blog and journal. Different sides of the same coin, operating in similar ways. While it might seem like they have different purposes, they are more aligned than they appear, with considerable cross-talk between the two.

They are just intended for different audiences.

#
 I realised there was a fundamental flaw with my podcasts view for /reader:

tracking progress.

If I were to close the browser or refresh the page each audio tag would reset to 0 and I'd have to seek through until I found where I was.

Not any more...

Various attributes of the audio tag are accessible via script including, importantly, currentTime.

The page, like /reader itself, is public so I run a check to see if I'm logged in. If so, a script will periodically (every 10 seconds) check the value of currentTime and, if it has changed, write that to the database entry for that item.

Likewise, if logged in, I check against the database for the last recorded time and set that on page load. This means I will only ever be a maximum of 10 seconds out should I accidentally close the browser or navigate away.
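
On the server side, the handler that script posts to might look roughly like this sketch — the `$fetch`/`$store` callbacks stand in for the real database queries, which are assumptions here:

```php
<?php
// Endpoint sketch: save the reported currentTime for an episode,
// but only when logged in and only if the value actually changed.
function save_progress(bool $loggedIn, int $episodeId, float $currentTime,
                       callable $fetch, callable $store): bool {
    if (!$loggedIn) {
        return false; // the page is public: only my own session writes
    }
    $last = $fetch($episodeId);          // last recorded time, or null
    if ($last !== null && abs($last - $currentTime) < 0.001) {
        return false;                    // unchanged since last check: skip the write
    }
    $store($episodeId, $currentTime);    // e.g. UPDATE ... SET progress = ...
    return true;
}
```

On page load the same check runs in reverse: if logged in, fetch the stored time and assign it to the audio element's currentTime.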

#
 Now that I've got everything working properly (and shouldn't be hitting any rate limits) I wanted to detail how the Bluesky integration is all set up.

As I mentioned before, I'm using Clark Rasmussen's simple BlueskyAPI library but have now forked it – more on that later.

Posting

When posting to the blog (except under certain conditions) I send the post over to Bluesky – I strip tags and decode any HTML entities, then check the post length. For anything over 275 characters (Bluesky's limit is 300), the post gets truncated at the nearest full stop and a link card generated:

$args = [
  'collection' => 'app.bsky.feed.post',
  'repo' => $bluesky->getAccountDid(),
  'record' => [
    'text' => $htmlcontent,
    'createdAt' => date('c'),
    '$type' => 'app.bsky.feed.post',
    'embed' => [
      '$type' => 'app.bsky.embed.external',
      'external' => [
        'uri' => $postlink,
        'title' => $title,
        'description' => 'Read the full post on the blog...',
      ],
    ],
  ],
];
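
The strip/decode/truncate step described above might be sketched as follows — the helper name is mine, not the actual code:

```php
<?php
// Strip tags, decode entities, and truncate at the nearest full stop
// before $limit characters (falls back to a hard cut if no full stop
// appears before the limit).
function prepare_bsky_text(string $html, int $limit = 275): string {
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES);
    if (mb_strlen($text) <= $limit) {
        return $text;
    }
    $cut = mb_substr($text, 0, $limit);
    $stop = mb_strrpos($cut, '.');
    return $stop !== false ? mb_substr($cut, 0, $stop + 1) : $cut;
}
```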

If I decide to have a featured image (add the 'feat' class to it) this gets uploaded via the uploadBlob endpoint:

$body = file_get_contents($imageUrl);
$response = $bluesky->request('POST', 'com.atproto.repo.uploadBlob', [], $body, 'image/'.$imgExt);
$image = $response->blob;

$embed = [
  'embed' => [
    '$type' => 'app.bsky.embed.images',
    'images' => [
      [
        'alt' => $imageAlt,
        'image' => $image,
      ],
    ],
  ],
];

I can then add it as the thumbnail for the link card:

$args['record']['embed']['external']['thumb'] = $image;

Or, for a short post, have it as a normal image:

$args['record'] = array_merge($args['record'], $embed);

The post is submitted to the com.atproto.repo.createRecord endpoint with the relevant body ($args); I then take the at:// address of the item on Bluesky and attach it to the blog post in the database.

Replies

When viewing the blog, if a post has an at:// address attached this will get passed to the app.bsky.feed.getPostThread endpoint with the required thread depth (currently 2). If the returned data includes replies I pull out the array and reverse it (Bluesky has them newest first) then recursively get the author, avatar and content of each to be displayed under the post comments.
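
The recursive walk over the returned thread could be sketched like this — the input shape follows the getPostThread response, while the flattened output format is my own illustration:

```php
<?php
// Walk a getPostThread reply tree, oldest first, collecting the
// author handle, avatar and text at each depth.
function collect_replies(array $replies, int $depth = 0): array {
    $out = [];
    foreach (array_reverse($replies) as $reply) {   // API returns newest first
        $post = $reply['post'];
        $out[] = [
            'depth'  => $depth,
            'author' => $post['author']['handle'],
            'avatar' => $post['author']['avatar'] ?? null,
            'text'   => $post['record']['text'],
        ];
        if (!empty($reply['replies'])) {
            $out = array_merge($out, collect_replies($reply['replies'], $depth + 1));
        }
    }
    return $out;
}
```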

Limits

I was having a problem hitting the rate limits on the com.atproto.server.createSession endpoint so needed to be able to reuse sessions. This is where forking the library comes in.

When creating a session I now save the API refresh token (refreshJwt) to a PHP session variable. Instead of going straight to createSession, I check if the session variable exists and pass it to com.atproto.server.refreshSession to get a new apiKey. I've read that AT Protocol refresh tokens last for two months, so I could technically save them to a more permanent location (the database), but this is easiest for now.

If no refresh token exists (or one has expired – unlikely) then a new session is created as normal. Much better.
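
The fallback logic amounts to something like this sketch — the `$refresh`/`$create` callbacks stand in for the forked library's calls to refreshSession and createSession:

```php
<?php
// Reuse a saved refresh token instead of creating a new session
// every time. Returns the session data (accessJwt, refreshJwt, ...).
function bsky_session(callable $refresh, callable $create): array {
    if (!empty($_SESSION['refreshJwt'])) {
        // Token saved from a previous run: try com.atproto.server.refreshSession.
        $session = $refresh($_SESSION['refreshJwt']);
        if ($session !== null) {
            $_SESSION['refreshJwt'] = $session['refreshJwt'];
            return $session;
        }
    }
    // No token (or it expired): fall back to com.atproto.server.createSession.
    $session = $create();
    $_SESSION['refreshJwt'] = $session['refreshJwt'];
    return $session;
}
```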

#
 Finished removing jQuery, feels good to streamline things a bit more.

I found a couple of things that weren't working (even with jQuery) so removed them – must have been stuff I introduced before some of the more recent changes.

The only place jQuery is still used is in my SPARKS installation. It's not part of the site itself (and I don't actually use it) so there's no point taking the time to remove it.