2023, week 03

returned from India trip

Created: by Pradeep Gowda Updated: Jan 22, 2023 Tagged: weekly

Monday, 2023-01-16 to Sunday, 2023-01-22

Returned from India after a trip from Dec 16, 2022 to Jan 19, 2023. Met a few people from Twitter and some old friends; visited our newly acquired farm and planted a few saplings. Flew Emirates via Dubai. The second leg of the journey, to ORD, was long and boring but otherwise quite nice. My seat neighbour wanted me to swap my seat for his family’s middle seat... not for a 16-hour flight, my friend. My head ached, likely from poor circulation and higher CO2? I’ve seen many people carry CO2 monitors on planes and complain about high levels (1000 ppm? counting as high, with some planes showing 3000 ppm? values; not sure about the units rn).


Links from around the web:

Alvin York … “stormed a German machine gun nest by himself and was such a dead eye that 132 Germans surrendered to him just so he’d stop smoking them with headshots.” via. His “After the War” section on Wikipedia is very interesting and impressive.

Utopian for Beginners - The New Yorker; via

Tollense valley battlefield

My favorite archeologic discovery when it comes to ancient central Europe is the Tollense Valley Battle. Around 3300 years ago in central Europe two groups, estimated 4000 men total, have clashed on and around a 120 meters long stone bridge crossing the river and valley of Tollense. By the time those men engaged, this bridge was some 500 years old already, which means it was build 3800 years ago. This battle tells us that there were organized forces in central Europe thousands of years ago, and that there was important road going through the region that required construction and maintenance of such structure for this many years. But we currently have no idea who were those people or even what points of interests was this road supposed to connect. via HN


Programming related:

  • The Metapict Blog – small programs using Metapict to draw figures and images using Racket Programming Language.

Machine learning

Started the “Introduction to Machine Learning in Production” course on Coursera by deeplearning.ai, taught by Andrew Ng et al.

Week 1 Optional References

Hidden Technical Debt in Machine Learning Systems (PDF) by Sculley et al.

  • explore several ML-specific risk factors to account for in system design. These include boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, configuration issues, changes in the external world, and a variety of system-level anti-patterns.

  • technical debt, a metaphor introduced by Ward Cunningham in 1992 to help reason about the long term costs incurred by moving quickly in software engineering.

  • Technical debt may be paid down by refactoring code, improving unit tests, deleting dead code, reducing dependencies, tightening APIs, and improving documentation [8]. The goal is not to add new functionality, but to enable future improvements, reduce errors, and improve maintainability.

  • ML tech debt may be difficult to detect because it exists at the system level rather than the code level.

  • Complex Models erode boundaries:

    • Entanglement: CACE principle: Changing Anything Changes Everything. CACE applies not only to input signals, but also to hyper-parameters, learning settings, sampling methods, convergence thresholds, data selection, and essentially every other possible tweak. One possible mitigation strategy is to isolate models and serve ensembles. However, in many cases ensembles work well because the errors in the component models are uncorrelated. Relying on the combination creates a strong entanglement: improving an individual component model may actually make the system accuracy worse if the remaining errors are more strongly correlated with the other components. A second possible strategy is to focus on detecting changes in prediction behavior as they occur.
    • Correction Cascades – There are often situations in which a model m_a for problem A exists, but a solution for a slightly different problem A′ is required. In this case, it can be tempting to learn a model m′_a that takes m_a as input and learns a small correction as a fast way to solve the problem.
    • Undeclared consumers – Without access controls, some consumers of a model’s predictions may be undeclared, silently using the output of a given model as an input to another system. This is a form of “visibility debt”.
  • Data dependencies are more expensive than code dependencies

    • Unstable data dependencies – even “improvements” to input signals may have arbitrary detrimental effects in the consuming system that are costly to diagnose and address. A common mitigation strategy for unstable data dependencies is to create a versioned copy of a given signal.
    • Underutilized data dependencies – legacy features, bundled features, ε-features, correlated features.
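
A rough sketch of the “versioned copy of a given signal” idea from the notes above, in Python. Everything here is made up for illustration and not taken from the paper: the `SignalRegistry` class, the `topic_score` signal and its two toy scoring functions. The point is only that a published version is frozen, and the consuming model pins the version it was trained against.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Hypothetical names throughout: SignalRegistry and "topic_score" are
# illustrative assumptions, not something defined in the paper.
Signal = Callable[[str], float]


@dataclass
class SignalRegistry:
    """Stores frozen, versioned copies of input signals."""

    _signals: Dict[Tuple[str, int], Signal] = field(default_factory=dict)

    def publish(self, name: str, version: int, fn: Signal) -> None:
        key = (name, version)
        if key in self._signals:
            # A published version is never overwritten; an "improvement"
            # has to be released under a new version number.
            raise ValueError(f"{name} v{version} is already published")
        self._signals[key] = fn

    def get(self, name: str, version: int) -> Signal:
        # Consumers always ask for an explicit version, never "latest".
        return self._signals[(name, version)]


registry = SignalRegistry()
registry.publish("topic_score", 1, lambda text: len(text) / 100.0)
# The "improved" signal ships as v2 and leaves v1 untouched.
registry.publish("topic_score", 2, lambda text: min(len(text) / 50.0, 1.0))

# The consuming model keeps using v1 until it has been retrained
# and validated against v2.
pinned = registry.get("topic_score", 1)
print(pinned("hello world"))
```

The paper notes that versioning has its own costs, such as potential staleness and having to maintain multiple versions of the same signal over time, so this trades one kind of debt for another.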

See also: MLOps, System design