(Context: I was working as a research assistant; my main task was to find efficient algorithms to update connected components in a dynamic graph whenever edge insertions and deletions happen).
Algorithm description. Pseudocode. Proof of correctness. I had everything in my written using LaTeX in PDF format. Now it’s the time to implement it using C++ on my Windows Visual Studio.
Despite having the pseudocode to refer from time to time, the implementation is itself a toilsome process. It wasn’t that difficult in the beginning: I just had to modify my code from the previous version (that dealt only with static graphs) to fit the changes added. I breezed through the first few sections which handle easier cases: they shouldn’t be that difficult to code, isn’t it? But the gist of the algorithm, as usual, handles the non-trivial cases that incurs countless of mutations in the data stored. With my own pseudocode detailing “do this, do that” but not “how to do them”, I found myself stuck and not too sure what I am doing.
As if this is not enough, the never-ending debugging process did equally well to drive me crazy. On one time there was a subtle mistake that took me two whole days to find out; on other times the fix is simply by removing one optimization part without me knowing why it ruined the whole program.
In fact, it would be generous for the program to let you know whether your output is correct in minutes.
If you think that this is all, no. The highest record was 115 hours and I killed it after determining that it’s not worth a try.
The piece that best relates to the status aforementioned is none other than Chopin’s Fantaise Impromptu. The running notes in the beginning of the piece complements the beautiful pedal point of the phrase coming a few bars after that to impress the audience. Nevertheless, all the audience knows is that “hey, the rapid finger movement is awestruck!” without understanding what’s going on–they simply have to enjoy the music.
Sometimes we just had to accept what we are doing without knowing why does it work (that said, it’s always good to be curious on these matters).
For some weird reason I visited HackerRank and tried the Malaysian Computing Olympiad Qualifying Round 3 again, and Question 4 captured my attention:
Given a tree, denote the score of a vertex v as the sum of pairwise product of the weight of all vertex pairs of the subtree of which v is the root. Find the score of each vertex.
Basically, my way of solving this goes as:
For each edge we identify which one is the parent and which one is the child. (1 is the root of the main tree).
For each vertex v we identify the sum of weight of the vertices in the subtree of which v is the root, and the sum of square of weight of the vertices in the same subtree by depth-first search. This is because 2*sum of xy=(sum of x)^2-sum of (x^2)
Now it’s the implementation that is worth a mention. The fact that the number of children of each vertex is unknown beforehand suggests that we need a flexible container, so a 2D vector was chosen. I tried hard to make the running time proportional to the number of vertices by making sure that each computation of weight takes constant time. As I submitted…
“Status: terminated due to timeout for subtask 3”
Mindblown. Confused. I tried going back and investigate which part took the most time to run, and noticed the following: the running time was significantly decreased once I removed the part where I update the adjacency vector via the push_back function (of course my program isn’t functional properly in this case so a “wrong answer” flag is raised, but at least it shed some light on the running time).
I tried the problem again by modifying my code from adjacency vector to adjacency linked list. Now it said:
“Status: Accepted. Time: 1.8s” (time limit is 2s)
Seriously? I guess the problem stems from the fact that the push_back function has amortized (not worst case) constant runtime (I think), which implies a (probably) high constant factor. Alex once explained (in a lunch-together session) that a the STL vector implementation involves dynamic allocation of a chunk of memory, then reallocation to a bigger chunk of memory as necessary. Reallocation, though only happens occasionally, takes running time proportional to the number of elements (hence increasing the amortized constant factor).
Is linked list better? Yes, in this case. But I can’t tell much about the general case.
Fine. I have been trolled.
Here’s my pseudocode in any case:
Function dfs (integer i, vector weight, vector sum, vector square, vector desc) is:
For each v in desc[i] do:
if sum[v] not known do:
dfs(v, weight, sum, square, desc)
Given: edgelist, weight
For each edge <u,v> do:
prepend v to adj[u]
prepend u to adj[v],
where adj[i] is an adjacency list containing the neighbours of i
queue q, push 1 into it
while q not empty do:
let w be the front element.
For each x in adj[w] do:
if x hasn't been in the queue:
prepend x to desc[w]
where desc[i] is the list containing the children of i
Perform dfs(1, weight, sum, square, desc)
For each vertex v do:
“Stretching beyond your limits without over-stressing yourself”, a task I pondered upon when preparing for A2 exams and university applications last year. Here’s the classical responses:
“It’s fine as long as you don’t go beyond elastic limit” (Zhi Khang)
“Trying something new is considered stretching beyond your limits yet wouldn’t hurt you” (Carin)
“Are you talking to a rubber band?! :O ” (Jean)
The whole feel came to me again right now–one year after I wrote this.
Suffices to say, the main part of it is because I am venturing into CS 145, the so called advanced first term computer science course. I can’t tell how hard it is compared to the normal CS classes, but the time consumption is real: the recent assignments just morphed themselves into a marathon.
What explains that?
The finding of efficient algorithms and debugging. Depressed, when the judging site displayed “wrong answer” repeatedly despite every effort to troubleshoot, yet couldn’t prevent contract error from invading frequently. Sometimes I am amused that determining whether to assign (second of first of x), (first of second of x) or (the remaining of first of x) can take me hours to figure out (x=nested list).
It’s great idea to test programs offline before submission, but at one point I am too strained to do every testing and submitted my code once I thought it as “working well”, resulting in smeared record:
This isn’t new. Back then, coding in C++ gave me enough trouble when there were edge cases that I never fully consider of. Here’s a problem that took me 4 hours to discover that my code got into the abysmal of infinite loop:
(The problem has a grid as a setting, with each square given direction to move to one of four directions. We are to find the number of squares that can be used as a starting point to get to the target square eventually).
Here’s my solution:
Another example: I tried to think over this problem, break it into sub-problems and construct the solution to each of them meticulously (basically, finding the gcd and inverse). After days of thought, done were the problem and I threw it into the submission machine, only to witness my frustration come live when the judging server jeered “wrong answer at test 8” repeatedly.
I searched for my composure, trying to see some borderline cases and noticed an error, and it took me just an extra line to eliminate this superbug.
Finally another submission that’s too crazy to mention:
They shared a common feature, however: my submission passed, eventually. Did it always happen? No–I wrote a contest without having any submission accepted, only to realize that a very subtle case was forgotten. But I believe that working insanely hard towards code acceptance was what programmers are looking for, and it’s worth the effort. 🙂
Things ‘began’ in December 2013 when I attended the Malaysian Computing Olympiad training camp, whereby I learned about how dumb the computer system is, why we should tell everything to our beloved device, what are the rules on declaring or doing iterations. Looks simple, but the progress up to December 2015 was minimal at the very best (or rather, negligible).
That’s when I picked up my books again, trying to understand data types, if…else selection, and try ‘if/else’ selection in order to write at least some baby code like asking for grades in A-Level by inputting your uniform score (e.g. scoring 90 will earn you an A*). As I progressed into learning arrays and loops, senses of inadequacy wafted into my mind and I kept questioning if I had been slow.
Little did I understand that in competitive programming, ideas and knowledge have to come hand in hand until I explored into actual competitions, ranging from the easiest junior category of Canadian Computing Competition (CCC) to some mind-blowing Malaysian Computing Olympiad (MCO) 2016 qualifier rounds. Coincidentally, all the selection rounds to the MCO were presented on Hackerrank, which attracted me to have a try on them and subsequently, involving myself in hacking techniques like graph theory and dynamic programming (all thanks to its baiting of badges to me, which required me to solve at least a question under these sections).
Hackerrank is right: the application of these two topics are very widespread (in the context of what I’ve done now).
As if time doesn’t exist for graphs and debugs
That’s the day when I learned graph theory (in late March), which took me a day or two to understand what is breadth-first search (connecting all edges as fast as possible) and its applications. As practice, I started off with snake and ladder game, where the goal was to locate the minimum number of turns needed to travel from 1 to 100 (warning on greedy approach: you won’t get the right answer!) Simple application of graph theory, but it took me 6 submissions before passing all test cases and realizing that my whole afternoon was gone.
Extending the idea to simple Dijkstra tormented my mind yet another round, and till now I could only claim that I got 30% out of it. Basically, programmers employ this idea to conquer the following real-life problem:
Several houses are in a town, and some of the houses are connected by roads. Given the pairs of houses connected by roads and length of roads, and given starting point. Find:
1. Whether it is possible to reach each of the houses by roads.
2. If yes to the previous question for some houses, what is the minimum length of travel to each of them?
Here I went, bearing in mind that this problem is ‘harder’ with 3000 towns (compared to 1000 previously). My algorithm as follows:
1. create an two-dimensional array such that the i-th column and j-th row is the length of road if the road exists, and zero if no road exists.
2. using the data above, keep updating the shortest path available at each step (e.g. Town B is recorded 5, and the road between Town A and Town B has length 1, then record Town A as 6 unless it already has a value lower than 6).
For some weird reasons my first few submissions returned zero, probably because of conflicts in naming variables. There I changed one of the variables from ‘i’ to ‘c’ before getting “all passed except one”, no thanks to segmentation fault.
From problem solving to fault finding
I then inspected several possible fault lines by consulting Google. Accessing ranges of arrays that don’t even exist? It couldn’t be: adding some spare spaces (e.g. allocating n+100 spaces instead of just n spaces) would end me up scoring 42.35/60 again. I gave up and escaped to another problem: this time to determine whether a number is prime by creating an array and by employing prime sieve.
As I gave up and moved on to determining whether a number is prime, a discovery bemused me: testing for numbers beyond 8 million will cause segmentation fault as well! Then Zi Song came to my computer and said “it must be memory limitations.”
I tried fixing it by reducing the memory needed for each element in the array by changing ‘int’ to ‘short’ (since I know too well that the array elements cannot exceed 350). As I submitted my edits, I starred at the feedback from several testcases, relieved as the code passed some test cases but couldn’t be satisfied until the grading system performed its last run. ‘Yay!’ I exclaimed as the scoreboard recorded 60/60.
To summarize, it was an accomplishment to be able to solve these problems, but there will be more to come later!
IMO Camp 1 was over! While waiting for one more month for the second camp, there was another camp awaiting me: MCO camp. Related, but different in nature (handwritten vs typed, math vs programming, etc).
Qualification: MCC individual gold/silver (hmm……perhaps the only gold I got since 2011 besides AMC 2012??), and only 36 applicants accepted.
I quickly applied 1 day after the announcement with alacrity (many thanks to Mr. Yeoh Yean Cheong, Zi Song’s father for informing me) and was selected to the camp. Felt great, isn’t it? My classmate (Xin En), and IMO teammate (Yi Kye), who were gold medalist in MCC 2013 were selected, too. I was lucky to have my IMO peers to be facilitators – they were expert and performed decently in MCO 2013.
Similar to IMO camp, I had to arrange everything, but it was a little different from IMO camp:
1. Accommodation. Unlike IMO camp where participants stayed at “Permata”, for this camp accommodation was not arranged and we needed to arrange ourselves. Since we started at 9am and ended the session at 5pm, we decided to book for the date one day earlier and check out one day later.
How about hotels? Prescott (just beside Job Street) was full, so the only candidate was Tune Hotel.
2. Transport. Since there was only 2 of us, Xin En bought bus ticket few days before the camp, with Puduraya as our destination. Still 2km away from Tune! Rather than exhausting ourselves by walking, we studied the route of KL Integrated Public Transport System (or simply Rapid KL) beforehand: the only thing you need to know is the route from Rapid KL station and your destination.
Friday: “Hello!” to KL again.
Exhausted from car lesson (where I “accidentally” overtook an MPSP truck), I “recharged” myself before heading to bus station. Upon my arrival, I found Xin En (with his parents), and soon informed my him that we were boarding into Plusliner bus, 2:45 p.m. “Tick tock tick tock!” I looked at my watch, the double-decked bus arrive at 3pm and departed at 3:15 pm, 30 minutes late. What’s more annoying was, our tickets showed seats FB, FC, and after being directed to the upper deck we had difficulties in finding the seats-the row number jumped from 4, 5 , M,…, luckily most passengers have boarded the bus, so we spotted the only row with 2 empty seats in the front row (F for “front”?) Plus the front row was not air-conditioned! We should have claimed compensation from the bus company.
(Two passengers who were irate at the poor service of bus company)
We ended up arriving Puduraya late at 8:45 pm (partially because of traffic jam), with our stomachs complaining for hunger. My father is right: Consortium is better (it would have arrived at 7:30 pm despite heavy jam in KL). We quickly settled our dinner and proceeded to the nearest Rapid KL station: Plaza Rakyat, where few people approached me:
“Sir, could u pls …… ” I had no patience in listening their begs, but I saw a piece of paper recording names, and a column recording figures (perhaps, donation?)
“I am short of time,” with stony expression, I replied, then tried to move forward to approach Xin En.
“Sir! Sir!” Their voice gradually faded from my ears.
Oops! I should have kept quiet and shown a “No” sign to them. I might have hurt them (if they were sincerely asking for donation), but I was not culpable for this: nowadays scammers are everywhere so we really have to be careful to prevent ourselves from being cheated.
Finally, after a smooth journey from Plaza Rakyat to Medan Tuanku (before 5 minutes walk to Tune Hotel), we settled down and prepared ourselves for tomorrow. Luckily, the ventilation was satisfactory and we did not need air-conditioner.
Saturday-Sunday: MCO Camp
First order of business was always introduction, no exception for this. Good thing: that was anticipated by the “Hour of Code”, which captivated all novices: http://learn.code.org/hoc/1. Dealing with angry bird and zombie wasn’t all: this was the baby step to learn “moveForward”, “turnRight” and “turnLeft” commands. We were soon introduced the basics of C++ programming by Mr. Shawn Tan — components of commands in C++, like declaration of integer vs floating, arithmetic and logic operations, etc. Rocket flying! In 2 hours he already covered a lot of topics, and I was still scratching my head, unable to digest the essence.
Practice: all of us signed up at Coderbytes and utilized Ideone to code. Note that coding in Coderbytes was not identical as in Ideone: you need to maintain “gets(stdin)”, so it took me a long time to edit code from Ideone. I really need to practice and study first before taking these challenges. No point of getting 5/10 or below, isn’t it?
Apart from this, two facilitators, Steven and Sher Minn taught us something out of the box, fake logo and HTML+CSS language respectively. To be honest I didn’t really understand how to link Fake Logo with your contacts such as Facebook and Twitter, and even Wikipedia! However, I really enjoyed the guide of HTML (homepage making) and CSS (decoration). Feel proud to have a homepage 🙂
16 hours (8 hours *2) passed, and the camp ended, leaving me question marks on challenging C++. Umm, MCO 2014 as a motivation. IOI2013 Malaysia team leader, Dr. Ong Shien Jin who also conducted the MCO workshop, state that a prerequisite to take part in MCO is to solve 1 full problem in USACO and COCI.
(facilitators who helped us a lot, though sometimes it was a little hard to understand)
What’s after 5pm?
Shawn warned us during the Hour of Code session that introduction and C++ would be boring. Simply because, C++ was new to us and we could not absorb 100% from the lesson. Evening time was supposed to be used as revision, right? But we did the other way: Xin En and I “lofted” in KLCC for both evenings. Actually, I was finding a C++ book, where I wished to order from MPH but there was no stock, unfortunately. Luckily I found it at Kinokuniya bookstore in KLCC the first day.
Another contrast with IMO Camp Experience, right? For IMO camp we had to be in Permata Campus all time, but for MCO camp we had freedom. Furthermore, no point staying in hotel for the entire evening and night.
Again, the best (only) way to reach there was through Rapid KL system. Have a look at the map: http://www.myrapid.com.my/rail/routes. Having known that the route of Medan Tuanku->Bukit Nanas-> Dang Wangi -> KLCC required only 3 stops, we were surprised at RM 4.40 for one-way! Moreover, since Bukit Nanas and Dang Wangi were connecting stations, we were surprised when we had to exit the station and repurchase tokens to KLCC using the direct route. After inquiring the officers, here goes our plan: Bukit Nanas->Titiwangsa (+ 3 minutes walk) -> Masjid Jamek (+5 minutes walk) -> KLCC, and it was 7pm upon our arrival (Saturday evening). Gosh! 😦 Rapid KL would win the efficiency contest hands down.
Once bitten, twice shy. We decided to split our tokens into 2: Medan Tuanku -> Bukit Nanas and Dang Wangi -> KLCC for subsequent trips, and it’s just 6-7 minutes from Bukit Nanas station to Dang Wangi station.
Scrumptious sub-of-the-day for dinner on Saturday (at Avenue K). Guess what? It (as well as stuff at KLCC food court) was cheaper than food at Lotus Nasi Kandar near Job Street where we had our lunch for both days. Taking one type of vegetables could cost you RM 2, let alone my adventurous feel of trying out “Briyani” rice. Sigh! Gonna have a frugal meal after back home.
We spent our Saturday night in Kinokuniya book store, where I tried to find a C++ book and asked a Counter Service Assistant about its availability, here’s what he showed me:
Yeah! I got it! The rest of the time (including second day in KLCC) was to loiter in Kunokuniya, story books searching, and of course, walking from A to Z in KLCC. After this practical research, we came out with this report: 95% of shops in KLCC (mostly boutiques) were irrelevant for students like us. However, road sign on “The Banana Republic” captured our attention, and we had a look on what it is before knowing that it was a boutique brand.
Monday: Farewell to KL.
In contrast with the bus to KL, this coach was punctual (departure time at 10 am). Xin En and I were then amazed by its speed where we arrived here at 3 pm (plus there was 30 minutes lunch time for us at Sungai Perak rest area. Main reason, I suppose, was better traffic condition.
What’s next? Since Mr. Suhaimi issued homework as practice for problem solving, that was the time for me to solve the problems. What’s the most important: learn C++ and be active in Coderbytes and Codeforces! Time to get myself equipped with the basics.