Calibrating Confidence

If you’re here for the game, visit this. Calibration Everyone knows that being overconfident can often lead you to making reckless, unnecessarily aggressive decisions, and being underconfident leads to not taking enough opportunities. This duality shows up everywhere in real life - from aspects like investing/business, where real money is at stake, to more personal matters like career progression and navigating interpersonal relationships. If we were always right about things, we could blindly believe in ourselves, and overconfidence would not exist. If we were completely clueless (for some definition of completely clueless), we would be better off asking a stochastic parrot to take our decisions for us. ...

January 2, 2025 · 8 min · 1605 words · nor

LLMs and dQw4w9WgXcQ

If you have been on the internet for a while, you probably know by now that LLMs (large language models) are a thing, and talking to them feels pretty close to talking to a human. If you have been on the internet for a decade or so, you can probably guess what this blog is going to be about. A harmless query… or was it? The task at hand was downloading an interesting video from YouTube that I happened to come across in my old link archive using the ever-handy yt-dlp. ...

November 9, 2024 · 8 min · 1602 words · nor

The intuition and the math behind Simpson's paradox

Introduction When we reason qualitatively, we often rely on our intuition. However, intuition is often loaded with certain meta-biases that cloud our judgment; one of these biases comes into play when we think about “local” and “global” statements. What is a local or a global statement? One way of distinguishing between them is how many conditions we must guarantee hold before we can talk about the statement - so this terminology is relative. A local statement is a statement that uses many more conditions (a higher amount of specificity) than a global statement. For example, any statement about a certain group of people in the past few years is a local statement relative to a statement that is about all species that have ever existed on the earth. ...

October 2, 2024 · 12 min · 2492 words · nor

On Probabilistic Thinking

As one can easily guess, this blog is about a mental model of thought. I chose to write about it since I feel that introspection about ways of thinking (and consequently what “thinking before doing something” means) is greatly lacking among people, and that it is critical to make better decisions (and realizing when there is no “better” decision). I won’t bore you with the philosophical details, so I’ll approach it from a probabilistic perspective, which is closer to what I personally choose to think in terms of. A word of caution: I will sometimes oversimplify things in order to drive home the point, but sometimes it might seem to contradict with what I say in the later parts of this post. The key here is context, and if you keep track of it, things will make more sense. To understand probability concepts that I mention here in a bit more detail, I recommend reading my post on probability. If you don’t understand/don’t want to go through the math examples here, don’t worry - I’ll intersperse it with a general idea of what we are trying to do, so looking for those explanations should help. Wherever you find something that’s not something you already know, you should probably just make a note of it and go ahead (and read more about it later). If it is still not clear, you can let me know and I’ll try to clarify that part (for you and the other readers of this post). Or even by just looking up the thing on your favorite search engine/LLM, you’ll likely learn a lot, even if in a less pointed manner. ...

September 6, 2024 · 45 min · 9545 words · nor

Thoughts on the present and future of AI and problem-solving

This is a collection of my thoughts on AI and problem-solving, which have remained relatively constant over the past few years, and some comments on recent efforts in this direction. Recently, Deepmind claimed that they had made significant progress on the IMO Grand Challenge, and their automated systems called AlphaProof and the buffed version of their original AlphaGeometry, called AlphaGeometry 2 now perform at the IMO silver level. Having monitored this field in the past on-and-off, this set off some flags in my head, so I read through the whole post here, as well as the Zulip discussion on the same here. This post is more or less an elaboration on my comment here, as well as some of my older comments on AI/problem-solving. This is still not the entirety of what I think about AI and computer solvability of MO problems. ...

July 28, 2024 · 35 min · 7342 words · nor

Implementing FFT

The other day I had a discussion with someone about how to implement FFT-based convolution/polynomial multiplication - they were having a hard time squeezing their library implementation into the time limit on this problem, and it soon turned into a discussion on how to optimize it as much as possible. It turned out that the bit-reversing part of their iterative implementation was taking a pretty large amount of time, so I suggested not using bit-reversal at all, as is done in a few libraries. Since not a lot of people turned out to be familiar with it, I decided to write a post on some ways of implementing FFT and deriving these ways from one another. ...

June 1, 2024 · 19 min · 4001 words · nor

On using RSS feeds

I first got to know about the existence of RSS feeds back when I was in middle school, but didn’t figure out their appeal and promptly forgot about them. Fast-forward to around a year ago - I was starting to realize that I was reading way too many interesting tech blogs, so I set out to find a way to aggregate all these updates together. A quick search later, it seemed like RSS feeds were the perfect fit for the job. In fact, it also ended up helping me keep up with research in some niche fields that I was following on arxiv! ...

May 12, 2024 · 3 min · 523 words · nor

Convenient and near-optimal binary search on floating point numbers

This post was originally written on Codeforces; relevant discussion can be found here. TL;DR Use the following template (C++20) for efficient and near-optimal binary search (in terms of number of queries) on floating point numbers. Template template <std::size_t N_BITS> using int_least_t = std::conditional_t< N_BITS <= 8, std::uint8_t, std::conditional_t< N_BITS <= 16, std::uint16_t, std::conditional_t< N_BITS <= 32, std::uint32_t, std::conditional_t< N_BITS <= 64, std::uint64_t, std::conditional_t<N_BITS <= 128, __uint128_t, void>>>>>; // this should work for float and doubles, but for long doubles, std::bit_cast will fail on most systems due to being 80 bits wide. // to handle this, consider using doubles instead or std::bit_cast the long double to an 80-bit bitset and convert it to a 128 bit integer using to_ullong. /* * returns first x in [a, b] such that predicate(x) is false, conditioned on * logical_predicate(a) && !logical_predicate(b) && logical_predicate(-inf) && * !logical_predicate(inf) * here logical_predicate is the mathematical value of the predicate, not the * machine value of the predicate * it is guaranteed that non-nan, non-inf inputs are passed into the predicate * if NaNs or infinities are passed to this function as argument, then the * inputs to the predicate will start from smallest/largest representable * floating point numbers of the input type - this can be a source of errors * if you multiply the input by something > 1 for example * strictly speaking, the predicate should also be perfectly monotonic, but if * it gives out-of-order booleans in some small range [a, a + eps] (and the * correct order elsewhere), then the answer will be somewhere in between * the same holds for how denormals are handled by this code */ // template <bool check_infinities = false, bool distinguish_plus_minus_zero = false, bool deal_with_nans_and_infs = false, std::floating_point T> T partition_point_fp(T a, T b, auto&& predicate) { static constexpr std::size_t T_WIDTH = sizeof(T) * CHAR_BIT; using Int = int_least_t<T_WIDTH>; static constexpr auto is_negative = [](T x) { return static_cast<bool>((std::bit_cast<Int>(x) >> (T_WIDTH - 1)) & 1); }; if constexpr (distinguish_plus_minus_zero) { if (a == T(0.0) && b == T(0.0) && is_negative(a) && !is_negative(b)) { if (!predicate(-T(0.0))) { return -T(0.0); } else { // predicate(0.0) is guaranteed to be true because b = 0.0 return T(0.0); } } } if (a >= b) return NAN; if constexpr (deal_with_nans_and_infs) { // get rid of NaNs as soon as possible if (std::isnan(a)) a = -std::numeric_limits<T>::infinity(); if (std::isnan(b)) b = std::numeric_limits<T>::infinity(); // deal with infinities if (a == -std::numeric_limits<T>::infinity()) { if constexpr (check_infinities) { if (predicate(-std::numeric_limits<T>::max())) { a = -std::numeric_limits<T>::max(); } else { return -std::numeric_limits<T>::max(); } } else { a = -std::numeric_limits<T>::max(); } } if (b == std::numeric_limits<T>::infinity()) { if constexpr (check_infinities) { if (!predicate(std::numeric_limits<T>::max())) { b = std::numeric_limits<T>::max(); } else { return std::numeric_limits<T>::infinity(); } } else { b = std::numeric_limits<T>::max(); } } } // now a and b are both finite - deal with differently signed a and b if (is_negative(a) && !is_negative(b)) { // check 0 once if constexpr (distinguish_plus_minus_zero) { if (!predicate(-T(0.0))) { b = -T(0.0); } else if (predicate(T(0.0))) { a = T(0.0); } else { return T(0.0); } } else { if (!predicate(T(0.0))) { b = -T(0.0); } else { a = T(0.0); } } } // in the case a and b are both 0 after the above check, return 0 if (a == b) return T(0.0); // start actual binary search auto get_int = [](T x) { return std::bit_cast<Int, T>(x); }; auto get_float = [](Int x) { return std::bit_cast<T, Int>(x); }; if (b > 0) { while (get_int(a) + 1 < get_int(b)) { auto m = std::midpoint(get_int(a), get_int(b)); if (predicate(get_float(m))) { a = get_float(m); } else { b = get_float(m); } } } else { while (get_int(-b) + 1 < get_int(-a)) { auto m = std::midpoint(get_int(-b), get_int(-a)); if (predicate(-get_float(m))) { a = -get_float(m); } else { b = -get_float(m); } } } return b; } It is also possible to extend this to breaking early when a custom closeness predicate is true (for example, min(absolute error, relative error) < 1e-9 and so on), but for the sake of simplicity, this template does not do so. ...

March 5, 2024 · 8 min · 1611 words · nor

PSA: target pragmas won't work on the new g++-13 compiler on CF

This post was originally written on Codeforces; relevant discussion can be found here. As of May 2024, the bug has been fixed in GCC 14, but has not been ported to Codeforces yet. MikeMirzayanov added a new compiler in response to the bug mentioned here. However, it does not come without a catch. Namely, any pragma that is of the form #pragma GCC target(...) would NOT work with this new compiler. The issue is well-known by people who use up-to-date compilers but there has not been much progress towards a fix. ...

March 5, 2024 · 2 min · 232 words · nor

Write recursive DP without thinking about memoization

This post was originally written on Codeforces; relevant discussion can be found here. Someone asked me about my template that used a “cache wrapper” for lambdas, so I decided to write a post explaining how it works. For reference, here is a submission of mine from 2021 that uses that template. Here’s what you will find implementations for in this post: Generalized hashing (for tuple types, sequence types and basic types) Convenient aliases for policy based data structures Wrappers for recursive (or otherwise) lambdas that automatically do caching (memoization) for you. Jump to the Usage section if you only care about the template, though I would strongly recommend reading the rest of the post too since it has a lot of cool/useful things in my opinion. ...

January 14, 2024 · 11 min · 2236 words · nor
>