https://pine32.be - © pine32.be 2025
Welcome! - 74 total posts. [RSS]
A funny little cycle. [LATEST]




#1740225625


[ dev ]

Turns out that HTTP/0.9 is a thing. No headers, no status codes, and GET is the only method. It is so simple that you can easily make a request by hand. Apparently Firefox still supports it.

echo -e "GET /\r" | nc pine32.be 80
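The same request can be sketched in Python with a raw socket. This is a minimal sketch, not a robust client: the function names `build_request` and `fetch_http09` are made up for illustration, and it assumes the server still answers HTTP/0.9 by sending the document and closing the connection.

```python
import socket


def build_request(path: str = "/") -> bytes:
    """Return an HTTP/0.9 request: the GET method, a path and a line ending."""
    return f"GET {path}\r\n".encode("ascii")


def fetch_http09(host: str, path: str = "/", port: int = 80, timeout: float = 5.0) -> bytes:
    """Send an HTTP/0.9 request and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(path))
        chunks = []
        # HTTP/0.9 has no headers, so there is no Content-Length:
        # the response ends when the server closes the connection.
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch_http09("pine32.be")` would mirror the `nc` one-liner. Whether a given server or proxy still accepts a request without a version string is not guaranteed.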

#1738273490


[ dev ]

Still alive, just busy. First post of the year… yay I guess.

I have been working on my fork of Navidrome that will have some audio and music analysis with the help of Essentia. The MVP is almost done. Because fuck Spotify, I am done with their bullshit. But I still want my cool data, so I am making it myself, all open source of course.

Navidrome is written in Golang but most analysis libraries are written in C/C++ or Python, so gRPC was the solution because I am not writing a wrapper. It is my first time working with gRPC, and it’s really nice once you get it all set up. But the setup can be a pain, like WTF are .pyi files. I had never seen them before. They are stub files that make your types available (this is needed because Python Protobuf builds its message classes at runtime from the compiled proto descriptors, so editors and type checkers cannot see the fields otherwise). It has been nice to use, apart from the setup. The end-to-end types are just a huge plus, you just know that they will work.

Python multiprocessing has also been a journey. I tested all types of pools but they always just deadlocked or didn’t run. In the end I just made my own worker pool with each worker having its own process. And they share a multiprocessing-safe queue for input and output. Sort of like I would do in Golang, but with queues instead of channels. And this dead simple approach worked first try of course, after everything I tried. But I am just happy that it works now. Still feels weird to see Python use almost 100% of your CPU on all cores.
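A minimal sketch of that kind of pool, under assumptions: one `multiprocessing.Process` per worker, one shared queue in and one out, and a `None` sentinel to stop each worker. The names `worker_loop`, `run_pool` and the placeholder job `square` are made up for illustration; the real analysis code is not shown in this post.

```python
import multiprocessing as mp

STOP = None  # sentinel telling a worker to exit


def square(x):
    """Placeholder job; the real work would be the audio analysis."""
    return x * x


def worker_loop(in_q, out_q, fn):
    """Pull items from in_q until the sentinel arrives, push (item, result) to out_q."""
    while True:
        item = in_q.get()
        if item is STOP:
            break
        out_q.put((item, fn(item)))


def run_pool(fn, items, workers=4):
    """Run fn over items with one dedicated process per worker."""
    items = list(items)
    in_q, out_q = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker_loop, args=(in_q, out_q, fn)) for _ in range(workers)]
    for p in procs:
        p.start()
    for item in items:
        in_q.put(item)
    for _ in procs:
        in_q.put(STOP)  # one sentinel per worker
    # Collect results before joining so no worker blocks on a full pipe.
    results = [out_q.get() for _ in items]
    for p in procs:
        p.join()
    return results  # (item, result) pairs, in completion order
```

Something like `run_pool(square, range(100))` should be called from under an `if __name__ == "__main__":` guard, since some start methods re-import the main module in every child. This sketch has no error handling: if `fn` raises, that worker dies and the collecting loop waits forever.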

One year anniversary coming up for MB.

#1731188466


[ dev | meta_raid ]

I am making a scraper for Spotify metadata. My testing numbers indicate that I could scrape 100% of Spotify in less than a week, something feels wrong.

INFO Stats per minute id=0 request=204 tracks=2451
INFO Stats per minute id=2 request=193 tracks=2086
INFO Stats per minute id=1 request=212 tracks=2392
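The back-of-the-envelope math behind that estimate can be sketched as follows. It assumes the logged minute is representative and that the three workers keep running around the clock. The catalog size is left as a parameter because the post does not state one.

```python
RATES = [2451, 2086, 2392]  # tracks per minute, from the three log lines above


def tracks_per_day(per_minute_rates):
    """Combined throughput of all workers, extrapolated to 24 hours."""
    return sum(per_minute_rates) * 60 * 24


def days_to_scrape(catalog_size, per_minute_rates):
    """Days needed to cover catalog_size tracks at the given combined rate."""
    return catalog_size / tracks_per_day(per_minute_rates)
```

With these numbers the three workers together log 6,929 tracks per minute, which extrapolates to 9,977,760 tracks per day. Plugging a catalog size into `days_to_scrape` gives the estimate in days.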

#1730630725


[ dev | golang | meta_raid ]

Golang’s new iterators came in handy for request pagination. I know the code is not optimal but it is very readable and it is just for a proof of concept. I am trying to get Spotify metadata in bulk. Hopefully I won’t get IP banned, fingers crossed.

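// offset marks where the current chunk starts in allTracks, so later chunks
// do not overwrite earlier ones (assumes len(allTracks) == len(allSimpleTracks)).
offset := 0
for chunk := range slices.Chunk(allSimpleTracks, 100) {
	// Collect the IDs of this chunk (at most 100 per audio-features request).
	ids := make([]spotify.ID, len(chunk))
	for i, a := range chunk {
		ids[i] = a.ID
	}

	f, err := client.GetAudioFeatures(ctx, ids...)
	if err != nil {
		return nil, err
	}

	// Length 0, capacity len(ids): append fills from index 0 instead of
	// landing behind len(ids) nil entries.
	fullTracks := make([]*spotify.FullTrack, 0, len(ids))
	// The tracks endpoint takes at most 50 IDs per request.
	for subChunk := range slices.Chunk(ids, 50) {
		full, err := client.GetTracks(ctx, subChunk, spotify.Limit(50))
		if err != nil {
			return nil, err
		}
		fullTracks = append(fullTracks, full...)
	}
	// Pair each track with its audio features by position.
	for i := range len(ids) {
		allTracks[offset+i] = &FullerTrack{
			Track:    fullTracks[i],
			Features: f[i],
		}
	}
	offset += len(ids)
}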
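The batching pattern in the loop above (100 IDs per audio-features call, 50 per tracks call) can be sketched independently of the Spotify client. In this Python sketch `fetch_features` and `fetch_tracks` are hypothetical stand-ins for the API calls, and it assumes both return one result per ID in request order.

```python
def chunked(seq, size):
    """Yield consecutive slices of seq with at most size items each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def fetch_all(ids, fetch_features, fetch_tracks, feature_batch=100, track_batch=50):
    """Fetch features and tracks in differently sized batches and pair them up."""
    merged = []
    for batch in chunked(ids, feature_batch):
        features = fetch_features(batch)
        tracks = []
        # The tracks call accepts fewer IDs, so split the batch again.
        for sub in chunked(batch, track_batch):
            tracks.extend(fetch_tracks(sub))
        # Both lists follow the order of batch, so zip pairs them by position.
        merged.extend(zip(tracks, features))
    return merged
```

Appending to one result list avoids index bookkeeping across batches. The price is that a single missing result from either call would shift every pairing after it.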

#1728318672


[ dev | corap ]

Apart from a few minor glitches (which have been fixed) the scraper and scheduler run fine. The frontend is also coming along nicely, it needs a few more pages and then the CSS. And I also need to figure out how to load a dynamic number of columns from a materialized view, shouldn’t be too hard but I want to make it fault tolerant. I don’t know what I want to do regarding design. But I know somebody who maybe wants to help me, fingers crossed.
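One possible way to handle a result set whose columns are not known in advance is to read the column names from the query result itself instead of hardcoding them. The sketch below is an assumption, not how Corap does it: it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, and the columns `battery` and `uptime` are hypothetical.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Map each row to {column_name: value} using the names the cursor reports."""
    if cursor.description is None:  # statement produced no result set
        return []
    names = [col[0] for col in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


def demo():
    """Build a tiny stand-in table and read it back without naming its columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE device_analysis_summary (deveui TEXT, battery TEXT, uptime TEXT)"
    )
    conn.execute("INSERT INTO device_analysis_summary VALUES ('A1', 'ok', NULL)")
    cur = conn.execute("SELECT * FROM device_analysis_summary")
    rows = rows_as_dicts(cur)
    conn.close()
    return rows
```

A template can then loop over the keys of the first row to render the header. The same idea carries over to Go, where `database/sql` exposes the names through `Rows.Columns()`.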

#1727195369


[ dev | corap ]

Python (scraper) rewrite is done. Almost no dependencies now. Reduced the Docker image from 1.2 GB to less than 100 MB. Feels a lot better to update and modify too. Now time for the frontend web server.

beautifulsoup4==4.12.3
requests==2.32.3
python-dotenv==1.0.1
psycopg==3.2.2
psycopg-binary==3.2.2

#1726927465


[ dev | corap | database ]

The amount of cursed SQL that I am writing just to keep it in pure SQL. It would be way faster to just make the query in Python. Anyway… Corap rewrite is coming along nicely.

DO $$
DECLARE
    cols text;
    query text;
BEGIN
    SELECT string_agg(quote_ident(name) || ' text', ', ')
    INTO cols
    FROM (
        SELECT name
        FROM (SELECT DISTINCT name, priority FROM device_analyses) AS o
        ORDER BY priority DESC
    ) AS o;
    
    BEGIN
        EXECUTE 'DROP MATERIALIZED VIEW IF EXISTS device_analysis_summary';
        query := format('
            CREATE MATERIALIZED VIEW device_analysis_summary AS
            SELECT *
            FROM crosstab(
                ''SELECT d.deveui, da.name, da.value
                FROM devices d
                LEFT JOIN device_analyses da ON d.deveui = da.device_id
                ORDER BY d.deveui, da.name'',
                ''SELECT name
                FROM (SELECT DISTINCT name, priority FROM device_analyses) AS o
                ORDER BY priority DESC''
            ) AS ct(deveui text, %s);
        ', cols);
        EXECUTE query;
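    EXCEPTION
        WHEN OTHERS THEN
            -- Changes made in this inner block (including the DROP) are rolled back automatically;
            -- an explicit ROLLBACK is not allowed inside a block with an exception handler.
            RAISE NOTICE 'Error creating materialized view: %', SQLERRM;
            RETURN;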
    END;
END $$;
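For comparison, here is a sketch of what building the same statement in Python could look like. The helper names are made up for illustration, and the hand-rolled `quote_ident` only mimics PostgreSQL's quoting rule (wrap in double quotes, double any embedded quote). In a real project psycopg's `sql.Identifier` would be the safer choice.

```python
def quote_ident(name: str) -> str:
    """Quote a SQL identifier: wrap in double quotes, double any embedded quote."""
    return '"' + name.replace('"', '""') + '"'


def crosstab_columns(names):
    """Build the column definitions for the crosstab AS ct(...) clause."""
    return ", ".join(f"{quote_ident(n)} text" for n in names)


def build_view_sql(names):
    """Assemble the CREATE MATERIALIZED VIEW statement for the given analysis names."""
    # Dollar quoting ($src$...$src$) avoids doubling the single quotes
    # around the two queries passed to crosstab.
    return (
        "CREATE MATERIALIZED VIEW device_analysis_summary AS\n"
        "SELECT * FROM crosstab(\n"
        "    $src$SELECT d.deveui, da.name, da.value\n"
        "    FROM devices d\n"
        "    LEFT JOIN device_analyses da ON d.deveui = da.device_id\n"
        "    ORDER BY d.deveui, da.name$src$,\n"
        "    $cat$SELECT name\n"
        "    FROM (SELECT DISTINCT name, priority FROM device_analyses) AS o\n"
        "    ORDER BY priority DESC$cat$\n"
        f") AS ct(deveui text, {crosstab_columns(names)});"
    )
```

The `names` list would come from the same `SELECT DISTINCT name, priority ... ORDER BY priority DESC` query, run once from Python, so the column list and the category query stay in the same order.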

#1726662585


[ dev | corap ]

Time to rewrite Corap finally, starting with the scheduler. The current Docker image is more than 1 GB. Going to remove a lot of dependencies. Also going to rewrite the frontend, learned a lot about Golang since starting that project.

#1720948971


[ dev | rant ]

I am officially done with ORMs. My latest experiment was ent, a codegen-based ORM for Golang. Works fine, I like the API, and then you want to do something slightly complex and it just doesn’t work. I wanted a many-to-many relation with extra data in the join table. So far so good, this did work. Until I wanted to make it not unique. I needed this because I wanted to add one track multiple times to a playlist in my current project. But this was not allowed, the codegen would not build. Other people have the same issue but no solution is known. So my solution is to rewrite my code again, this time with pgx. I also have tried and used sqlc in some projects but it won’t scale for my current project. But I do like it a lot for smaller projects, this blog uses it for example.
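In plain SQL the non-unique join table is not hard to express. The sketch below is an assumption about what such a schema could look like, not the actual project schema: table and column names are hypothetical, and Python's built-in `sqlite3` stands in for PostgreSQL. The join table gets its own key and a `position`, so the pair (playlist, track) is allowed to repeat.

```python
import sqlite3

SCHEMA = """
CREATE TABLE playlists (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE playlist_tracks (
    id INTEGER PRIMARY KEY,
    playlist_id INTEGER NOT NULL REFERENCES playlists(id),
    track_id INTEGER NOT NULL REFERENCES tracks(id),
    position INTEGER NOT NULL,
    UNIQUE (playlist_id, position)  -- unique per slot, not per track
);
"""


def playlist_titles(conn, playlist_id):
    """Return the track titles of a playlist in playlist order."""
    rows = conn.execute(
        "SELECT t.title FROM playlist_tracks pt "
        "JOIN tracks t ON t.id = pt.track_id "
        "WHERE pt.playlist_id = ? ORDER BY pt.position",
        (playlist_id,),
    )
    return [r[0] for r in rows]


def demo():
    """Add the same track to one playlist twice and read the playlist back."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO playlists (id, name) VALUES (1, 'loop')")
    conn.execute("INSERT INTO tracks (id, title) VALUES (1, 'Intro'), (2, 'Song')")
    conn.executemany(
        "INSERT INTO playlist_tracks (playlist_id, track_id, position) VALUES (?, ?, ?)",
        [(1, 1, 0), (1, 2, 1), (1, 1, 2)],  # track 1 appears twice
    )
    titles = playlist_titles(conn, 1)
    conn.close()
    return titles
```

Here `demo()` returns the titles in playlist order with "Intro" listed twice, which is exactly the case the ORM refused to generate.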

I have tried a lot of ORMs over the years but I am finally done, not a chance I am going back. They are great until they are not, then they are just a pain.