MB

▄▄▄·▪ ▐ ▄ ▄▄▄ . ▐█ ▄███ •█▌▐█▀▄.▀· ██▀·▐█·▐█▐▐▌▐▀▀▪▄ ▐█▪·•▐█▌██▐█▌▐█▄▄▌ .▀ ▀▀▀▀▀ █▪ ▀▀▀

func (Image) Hooks() []ent.Hook {
	return []ent.Hook{
		hook.On(
			func(next ent.Mutator) ent.Mutator {
				return hook.ImageFunc(func(ctx context.Context, m *gen.ImageMutation) (ent.Value, error) {
					_, hasRelease := m.ReleaseID()
					_, hasArtist := m.ArtistID()
					// Exactly one of release or artist must be set.
					if hasRelease == hasArtist {
						if hasRelease {
							return nil, fmt.Errorf("but both were provided")
						} else {
							return nil, fmt.Errorf("neither was provided")
						}
					}
					return next.Mutate(ctx, m)
				})
			},
			ent.OpCreate|ent.OpUpdate|ent.OpUpdateOne,
		),
	}
}

I have spent far too much time designing a custom ID type for my current project. I wanted to use it as the primary key in a SQLite database, which imposes some constraints. Specifically, the ID must fit within 63 bits, since SQLite only supports signed integers and I want to avoid negative values. (Technically, negative IDs would work, but they’re not ideal.)

You might be thinking, “Why not just use a BLOB as the primary key? That gives you much more flexibility.” And that’s a fair point, but I am intentionally avoiding that because of how SQLite handles its hidden rowid. When you use an integer as the primary key, SQLite internally aliases it to the rowid, which makes operations significantly faster. Using a BLOB would remove that performance advantage and make the database larger.

So the next step is choosing the bit layout. The first bit is unused to prevent negative values. Then I went with 43 bits for a Unix millisecond timestamp. This gives me a range of about 278 years, which should be plenty. Using the default Unix epoch, this will work until the year 2248. It will outlive me, so that is more than enough.
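The range is easy to double-check. A small sketch, with helper names (lastTimestamp, rangeYears) made up for illustration, that asks Go where a 43-bit millisecond counter starting at the Unix epoch runs out:

```go
package main

import (
	"fmt"
	"time"
)

// maxMillis is the largest value that fits in 43 bits.
const maxMillis = 1<<43 - 1

// lastTimestamp returns the last moment a 43-bit Unix millisecond
// timestamp can represent.
func lastTimestamp() time.Time {
	return time.UnixMilli(maxMillis).UTC()
}

// rangeYears returns the covered range in (average Julian) years.
func rangeYears() float64 {
	const millisPerYear = 365.25 * 24 * 60 * 60 * 1000
	return float64(maxMillis) / millisPerYear
}

func main() {
	// prints "278.7 years, last representable year: 2248"
	fmt.Printf("%.1f years, last representable year: %d\n",
		rangeYears(), lastTimestamp().Year())
}
```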

The remaining 20 bits are random, which gives 1_048_576 possible values. I am using random values because I don’t want to keep track of state (as with an autoincrement), and my current system can handle collisions. It is still possible to swap approaches down the road while keeping the already generated IDs. 1_048_576 sounds like a lot, but there is already a 1% chance of a collision when generating only 146 IDs. Then again, those IDs would need to be generated within the same millisecond. I am not expecting that much volume.
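The 146 figure comes from the birthday problem. A minimal sketch of the calculation, assuming the random part is uniformly distributed (the function names are only for illustration):

```go
package main

import (
	"fmt"
	"math"
)

// collisionProbability returns the chance that at least two of n IDs get
// the same random part, when each is drawn uniformly from `space` values.
func collisionProbability(n int, space float64) float64 {
	// Sum the logs of the per-draw "no collision" probabilities.
	logNoCollision := 0.0
	for k := 1; k < n; k++ {
		logNoCollision += math.Log1p(-float64(k) / space)
	}
	return 1 - math.Exp(logNoCollision)
}

// idsForProbability returns the smallest number of IDs for which the
// collision probability reaches p.
func idsForProbability(p, space float64) int {
	n := 1
	for collisionProbability(n, space) < p {
		n++
	}
	return n
}

func main() {
	const space = 1 << 20 // 20 random bits

	fmt.Println(idsForProbability(0.01, space))            // prints 146
	fmt.Printf("%.4f\n", collisionProbability(146, space)) // prints 0.0100
}
```

This only applies to IDs created within the same millisecond, since the timestamp part already separates everything else.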

Bit  | 63 (MSB) | 62 ... 20 | 19 ... 0 |        some        1JTWRZPBJ4DSE
-----|----------|-----------|----------|        example     1JTWRZSPA1NBS
Use  | Unused   | Timestamp | Random   |        IDs         1JTWRZTQSE9G6
Size | 1 bit    | 43 bits   | 20 bits  |        ->          1JTWRZVY5R7RT
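To make the layout concrete, here is a rough sketch of how such an ID could be put together in Go. It is not the actual implementation: compose, timestampOf and newID are hypothetical names, and using crypto/rand for the random part is an assumption.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
	"time"
)

const (
	randomBits    = 20
	timestampBits = 43
	randomMask    = 1<<randomBits - 1
	timestampMask = 1<<timestampBits - 1
)

// compose packs a Unix millisecond timestamp (bits 62..20) and a random
// value (bits 19..0) into one ID. Bit 63 stays zero, so the result is
// never negative.
func compose(unixMilli int64, random uint32) int64 {
	ts := unixMilli & timestampMask
	return ts<<randomBits | int64(random&randomMask)
}

// timestampOf extracts the creation time from an ID.
func timestampOf(id int64) time.Time {
	return time.UnixMilli(id >> randomBits)
}

// newID builds an ID from the current time and 20 random bits.
func newID() (int64, error) {
	var buf [4]byte
	if _, err := rand.Read(buf[:]); err != nil {
		return 0, err
	}
	return compose(time.Now().UnixMilli(), binary.BigEndian.Uint32(buf[:])), nil
}

func main() {
	id, err := newID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id, timestampOf(id).UTC())
}
```

Because the timestamp sits in the high bits, comparing two IDs as plain integers orders them by creation time (down to the millisecond).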

The reason for using a timestamp in the leading bits is to minimize B-tree rebalances. As time advances, the generated IDs grow in sequence, allowing the B-tree to insert new entries without reorganizing older pages. By contrast, a completely random primary key (like a UUID v4) forces the B-tree to rebalance frequently, which can significantly degrade database performance.

Finally, the string representation: I chose Crockford’s Base32 (without the check digit). Just 13 characters to represent an int64. To me it’s practically perfect from a technical standpoint, and I like how the IDs look. I know aesthetics shouldn’t matter, but this is my project. So I set the rules, and I want things to look cool and have some aesthetic appeal. Looks way better than those stupid UUIDs.
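A minimal sketch of what the encoding could look like, assuming a fixed width of 13 characters with zero padding on the left. The encode and decode names are hypothetical, and the sketch skips the parts of Crockford’s spec that accept lowercase input and map I, L and O to digits:

```go
package main

import (
	"fmt"
	"strings"
)

// Crockford's Base32 alphabet: no I, L, O or U.
const alphabet = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

// 13 characters of 5 bits each cover all 64 bits of an int64.
const encodedLen = 13

// encode writes the ID as 13 Base32 characters, most significant first.
func encode(id int64) string {
	var out [encodedLen]byte
	v := uint64(id)
	for i := encodedLen - 1; i >= 0; i-- {
		out[i] = alphabet[v&0x1F]
		v >>= 5
	}
	return string(out[:])
}

// decode is the inverse of encode for non-negative IDs.
func decode(s string) (int64, error) {
	if len(s) != encodedLen {
		return 0, fmt.Errorf("want %d characters, got %d", encodedLen, len(s))
	}
	var v uint64
	for i := 0; i < len(s); i++ {
		idx := strings.IndexByte(alphabet, s[i])
		if idx < 0 {
			return 0, fmt.Errorf("invalid character %q", s[i])
		}
		// The first character only carries 3 usable bits of a 63-bit ID.
		if i == 0 && idx > 7 {
			return 0, fmt.Errorf("value does not fit in 63 bits")
		}
		v = v<<5 | uint64(idx)
	}
	return int64(v), nil
}

func main() {
	id := int64(1)<<62 | 12345
	s := encode(id)
	back, err := decode(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(s, back == id)
}
```

The alphabet is in ascending order and the width is fixed, so the strings sort the same way the integers do.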

One final note: please store your IDs in a binary representation (BLOB or integer). It hurts me every time I see an ID stored as its string representation. It is way slower and wastes storage. It mainly happens with UUIDs; most people don’t realize it actually is a binary ID and not a string. Even the spec states it, but I guess people just don’t read it.
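To put a number on the waste: a UUID in its text form is 36 characters, while the value itself is only 16 bytes. A small sketch using just the standard library (parseUUID is a hypothetical helper and the UUID is an arbitrary example value):

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// parseUUID converts the 36-character text form into its 16 raw bytes.
func parseUUID(s string) ([16]byte, error) {
	var out [16]byte
	raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil {
		return out, err
	}
	if len(raw) != len(out) {
		return out, fmt.Errorf("want %d bytes, got %d", len(out), len(raw))
	}
	copy(out[:], raw)
	return out, nil
}

func main() {
	text := "123e4567-e89b-12d3-a456-426614174000"
	bin, err := parseUUID(text)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(text), len(bin)) // prints "36 16"
}
```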

Looking back, I’m really happy that I used web scraping for my Spotify embeds, seems to be quite resilient. This post has a track that doesn’t work on Spotify anymore, yet the embed still works fine. This isn’t a guarantee though, one day it will break, but not today. That’s why I am thinking about downloading all Spotify assets, so if it works now, it’ll work forever. It always feels hard to add these kinds of things while trying to stay minimalistic… maybe I should.

I am still looking for another way to integrate some kind of music embed. Maybe with Navidrome, since I’m currently using that for my ever-growing offline music library. I could share a track and then scrape that share to make an embed; that would work the same way Spotify works for now. It would work with a minimal amount of code, which aligns with the philosophy of this blog. But there are two small problems with this: having a full-length song publicly available (basically illegal distribution), and it depends on the Navidrome server and that specific share never going down. I could make the share low quality so people won’t download it, but this is not a perfect fix.

This is running from a share on my Navidrome instance, and it works: (KLOUD - DEFECT)

Another solution is using a custom music library management tool (which I am planning on making, soon™) that exports a music snippet to be uploaded to the media management of MB (automatically or manually) and a piece of HTML that will become the “embed”. I could already do this now manually, but I want to keep posts low-friction and this would help. It’s an almost perfect solution: no changes for MB and none of the Navidrome problems.

But a lot of code is needed for that music library management tool first. So it won’t be soon, while I could bang out the Navidrome solution in an evening or two. Maybe I should first make MB download all Spotify assets to lock them in place forever (I know it’s against the TOS but IDC). Still, it would increase the complexity of MB. It can’t be perfect, but what’s more important… I am going to download them, can’t trust Spotify with anything.

1 Year of MB!!

The first post was on 04/02/2024. I missed the anniversary, but better late than never. My goal was to make 1 post per week. I have made 70 posts this year (not including this one). So it looks like I hit my goal on average, but it was not really consistent, so that could be better. Nevertheless, the goal of this blog was to make posting as easy as possible, and I think it succeeded in that regard.

Development has slowed down but I still have some plans to add things like syntax highlighting for code blocks. There is nothing that I am missing at the moment, only some small nice-to-haves. I also want to keep it small, so I am keeping feature creep in check.

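offset := 0 // position of the current chunk within allTracks
for chunk := range slices.Chunk(allSimpleTracks, 100) {
	ids := make([]spotify.ID, len(chunk))
	for i, a := range chunk {
		ids[i] = a.ID
	}

	f, err := client.GetAudioFeatures(ctx, ids...)
	if err != nil {
		return nil, err
	}

	// Start empty with capacity len(ids); a non-zero length would leave
	// nil entries in front of the appended tracks.
	fullTracks := make([]*spotify.FullTrack, 0, len(ids))
	for subChunk := range slices.Chunk(ids, 50) {
		full, err := client.GetTracks(ctx, subChunk, spotify.Limit(50))
		if err != nil {
			return nil, err
		}
		fullTracks = append(fullTracks, full...)
	}

	// Assumes allTracks is pre-sized to len(allSimpleTracks); without the
	// offset every chunk would overwrite the first 100 entries.
	for i := range len(ids) {
		allTracks[offset+i] = &FullerTrack{
			Track:    fullTracks[i],
			Features: f[i],
		}
	}
	offset += len(ids)
}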