Building a container tracker for four ports
Port agents in Karachi handling import containers have to check four separate terminal websites every day to find out where each container is. For a broker tracking 200 containers across a week, this was quite time consuming.
I built CTS to encapsulate that into one workflow which allows a broker to drop an Excel file in, get an enriched Excel back, and a WhatsApp message the moment a container actually leaves the port.
Four APIs, four problems
KGTL is an old ASP.NET site with __VIEWSTATE, __EVENTVALIDATION, and an UpdatePanel that returns HTML fragments wrapped in a custom delimiter format. You can’t hit it as a clean POST. You have to keep the viewstate alive across requests and parse the fragment payload by hand.
SAPT and KICT both do a GET-then-POST dance. The first request fetches a history page, the second posts back to get details. The detail response is technically HTML but the actual data is a JSON blob embedded in a script tag. So you parse HTML to get a string, then parse the string as JSON.
QICT was the only one that behaved like an API in the modern sense: a JSON REST endpoint with proper auth. I appreciated this more than I expected to.
I wrapped all four behind a single ContainerTracker class so the rest of the codebase doesn’t have to know which terminal it’s talking to. That bit’s standard. The adapter pattern earning its keep.
The concurrency problem
When I first tried running a batch through, I used a flat thread pool per terminal with whatever number felt sensible. KGTL would happily eat about ten concurrent requests, then start returning 500s. SAPT would tolerate fifty. QICT could go higher. Worse, the limits drifted: fine in the morning, choking by 4pm when other traffic picked up.
A static thread count was always going to be wrong. Either I’d be conservative and slow, or aggressive and getting myself blocked.
So I wrote an adaptive limiter. The rules are simple enough to fit in a paragraph:
- Each terminal has a configured ceiling (KGTL 25, SAPT 75, KICT 50, QICT 100).
- Start at 20% of the ceiling. Don’t open the taps on day one.
- Every window, count errors. If the error rate is above 10%, scale down (divide by somewhere between 1.33 and 2 depending on severity) and enter “drain mode” — stop accepting new work until in-flight requests finish.
- On a clean window, add 3 workers. Additive, not multiplicative. Multiplicative growth overshoots and gets you blocked again two seconds later.
- Track the highest level you’ve seen errors at. Don’t probe above that ceiling until you’ve had ten clean windows in a row.
The last point is the one I’m most proud of, because it’s the one I got wrong twice before figuring out. Without the failure-ceiling memory, the limiter would happily climb back into the same wall it had just bounced off, every 30 seconds, forever.
SQLite, of all things
I picked SQLite because the deployment target is one VM, the data is small, and Postgres felt like overkill for what is essentially a glorified spreadsheet processor. It mostly works. Two things were not obvious going in.
The default journaling mode locks the whole database during a write. When the background refresh thread is bulk-updating thousands of containers and the dashboard is trying to read, you get database is locked errors that show up as ugly 500s in the UI. WAL mode (PRAGMA journal_mode=WAL) fixes this. Readers don’t block on the writer. One pragma, big quality-of-life win.
The other thing: I had a progress bar updating during batch jobs by writing the current row count to the batch_jobs table on every iteration. That was 200 writes per second during a fast batch, and SQLite was not enjoying it. I added a small throttle helper that batches progress writes to at most one per second using time.monotonic(), and the symptoms went away. The progress bar still feels live. The database stops being the bottleneck.
The Excel problem nobody warns you about
Every terminal exports its own Excel format. So do the auction houses. So do the brokers. The container number column might be called “Container No”, “Cntr No.”, “CONTAINER NUMBER”, or just “BL” with the container in some weird substring. Sometimes the headers are on row 4 because rows 1–3 are a logo, an address, and a blank.
I tried writing rules for each variant and gave up after about the fifth file format. The current pipeline has two passes:
- Look for known column-name patterns. If one matches, use it.
- If nothing matches, scan every cell in the file for the ISO 6346 container regex (
[A-Z]{4}\d{7}). Whichever column has the most matches is the container column.
Pass two is ugly and I love it. It has correctly handled every weird file the client has thrown at it, including one where someone had pasted container numbers into the notes column of a sales template. The regex doesn’t care about your spreadsheet hygiene.
What I’d do differently
If I were starting over, I’d put a real message queue between the API and the workers. Right now the batch processor uses ThreadPoolExecutor directly, which is fine for one machine but means horizontal scaling is a rewrite. I went with the simpler thing because the deployment target was always going to be one VM, and adding Redis felt like ceremony for no payoff. That call was probably right for the constraint, but it’s the first thing I’d revisit if usage grew.
The other thing: I have unit tests for the parsing layer (because the parsing layer is where bugs hide), but I don’t have any tests for the adaptive limiter. The limiter is the most novel piece of the system and the one I’m least sure I haven’t subtly broken. Writing a test harness that simulates a flaky terminal under load is on the to-do list.
Where it runs
Docker Compose, one VM, a named volume for the SQLite file. APScheduler runs the daily refresh at a configurable hour. Evolution API handles the WhatsApp side: open-source gateway, reasonable ergonomics, fire-and-forget calls from a background thread so a flaky webhook can’t block the API.
Currently doing the job, quietly. The brokers stopped complaining, which is the only metric that really matters here.