Week 12 update

After rereading my last post, I was still frustrated by not having a constant-time solution.

Well… after spending a couple more hours on it, I have one!

from math import sqrt
def ideal_i(n: int):
    return round(sqrt((2 * n) + 1)) - 1

I knew that in my previous solution, the number of times we loop must be derivable from n itself without actually performing the loop. There was still some “naivety”. Even though we’d avoided brute force calculation of i, we were still waiting to satisfy a condition.

I realized that this problem is very similar to that story where Gauss figures out that to find the sum of the numbers from 1 to x, you do a fancy thingy with lining them up. We’ll omit the part where my math sucks and I spend a couple of hours fiddling with that. I eventually found this: sum = x(y + x)/2, where x is the number of terms to add and y is the first element. That lets us start at 2: n = x(2 + x)/2. Luckily, a “solve for x” website gave me the equivalent x = sqrt(2n + 1) - 1,* which we then round.

Now we know how many times we would have incremented i, which means we have i.

Just to be sure, I tested the two versions using this helpful old strategy:

all(ideal_i(x) == ideal_i_2(x) for x in range(1,10000))

Satisfaction! Someone should update Wikipedia with how to calculate i given four stools. I’m sure that’s where all of us looked first, only to realize that it doesn’t know the answer either.

That said, we should remember that removing the naivety of the original formula gives us a huge advantage in speed, but means we have to have faith that the pattern repeats infinitely.

(P.S. My real last post! Adieu.)

*zingarod later showed me the algebraic steps to get from one to the other. Thanks!
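
(If you’re curious about those steps: n = x(2 + x)/2 means 2n = x^2 + 2x, i.e. x^2 + 2x - 2n = 0, and the positive root from the quadratic formula is x = sqrt(2n + 1) - 1. Rounding that is exactly what the function at the top of this post computes.)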

Week 12

I loved the lecture on memoization, mostly because it related to something I’ve been thinking about on the side.

Think back to assignment 1. The four-stool solution is:

(1) move i cheeses to an intermediate stool using this method;

(2) move n-i cheeses to the destination stool using the 3-stool method, without the intermediate stool;

(3) move i cheeses from the intermediate stool to the destination stool using this method.

And the question is, okay, we’ve never been told the value of i or how to calculate it. So how do we pick it?

It took me a while to come up with a good answer to this question, and even then I wasn’t satisfied with it.

Solution number 1: make a helper function that finds the minimum number of moves for n cheeses; then make a helper function that tries all the options for an i value and picks the one that, at each level, returns the minimum number of moves.

Code:

def moves_req(n: int) -> int:
    '''
    Return the min number of moves to move n cheeses on 4 stools.'''
    
    if n == 1:
        return n
    else:
        return min((2*moves_req(n-i) + 2**i - 1) for i in range(1, n))
    
def ideal_i(n: int) -> int:
    '''
    Return the value i that produces the min number of moves to
    move n cheeses on 4 stools.'''
    
    ideals = [1]
    while len(ideals) < n:

        # Build a dict mapping each possible i to the moves required using it.
        temp = {}
        for i in range(1, n):
            temp[i] = 2*moves_req(n-i) + 2**i - 1

        # Pick the i that requires the fewest moves.
        ideals.append(min(temp, key=temp.get))

    return ideals[n-1]

One thing you’ll notice is the redundancy. Both moves_req and ideal_i are trying values of i and picking the smallest one. So why not just call moves_req with n+1 and grab the i it picked for its call to itself at n?! The whole structure of ideal_i is a glorified version of what moves_req does much more simply.
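
Something like this hypothetical moves_and_i (the name is mine, just for illustration) is what I mean. It’s every bit as slow as moves_req, but it hands back both answers at once:

def moves_and_i(n: int) -> tuple:
    '''
    Return (min moves, ideal i) for moving n cheeses on 4 stools.
    Tuples compare by their first element, so min() picks the fewest moves.'''

    if n == 1:
        return (1, 1)
    return min((2 * moves_and_i(n - i)[0] + 2**i - 1, i) for i in range(1, n))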

But anyway, the biggest problem I found was that this is massively slow. Try it yourself in a blank Python file. Call ideal_i(10). It returns. Call ideal_i(20). It returns, a little slower. Call ideal_i(25). Noticeably slow. Call ideal_i(30)… why don’t you go put on the kettle?

Luckily, for the purposes of the assignment, there don’t seem to ever be that many cheeses, so I had no problems with actually executing it according to my final report. But still, I knew it had to be possible to do this faster…

My frustration built for a little while. One day — a couple of weeks later — I realized I had a sort of solution. Every time you call moves_req with n == 5, it calculates the answer for n == 4, n == 3… etc. So why not capture its solutions at each level and reuse them the next time? I had, of course, school and work to do, but I kept trying to write code in my mind all day. Each time I got a chance to try it at a computer, something went wrong.

Finally, while getting a ride home, I took out pen and paper and worked out a version that made sense to me. (Good thing we have all that practice from doing tests by hand, eh?) When I got home, I tried it out, and it worked! Here it is:

def ideal_i(n: int, ideals: list=[(1, 1)]):
    
    while len(ideals) < n:
        options = {}
        for i in range(1, n):
            options[i] = 2 * ideal_i(n-i, ideals)[1] + 2**i - 1
        best = min(options, key=options.get)
        ideals.append((best, options[best]))
    return ideals[n-1]

As you can see, it’s very similar to the previous one, but there are a few significant differences. For one thing, I collapsed the two functions into one.

Now, this one function takes a second parameter called ideals. Note that it has an equals sign after it, which means that it has a default value. Any calls that actually specify a second parameter will use the specified value; any calls that only specify one parameter will use the default value. Very useful feature of parameters.

So what is that parameter? It’s a list of tuples; the index corresponds to n (the first index is n == 1), and in each tuple, the first element is the ideal value of i for that n and the second element is the number of moves for that n.

Our strategy is that, given n, we check whether we already have a tuple for it in our list. If not, we try each i value we need, and for each one we look up the answer for n-i… which, thanks to the recursive call, is already sitting in a handy tuple in our list. Then we record the best i and the number of moves it takes, and append that as the entry for n. Once our list of ideals is long enough, we look up the entry for n and return it.

So, great! This one can calculate n at comfortable speeds up to a much larger number. Note that it actually has to be done in blocks of 800 or so, though… because it hits the maximum recursion limit. (Python seems to remember the value of ideals, so as long as the shell doesn’t restart, a call of ideal_i(1500) will fail but a call of ideal_i(750) followed by a call of ideal_i(1500) will work.) So now I could calculate an i value for an n in the low thousands.
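
As far as I can tell, the reason Python “remembers” is that a default argument value is created once, when the def statement runs, and the very same object is reused by every call that doesn’t pass that argument. A tiny demo of the effect, unrelated to cheeses:

def collect(item, acc=[]):
    acc.append(item)
    return acc

print(collect(1))  # [1]
print(collect(2))  # [1, 2] -- the same default list survives between calls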

I sent this to zingarod and he introduced me to dynamic programming via Wikipedia. That’s when I learned that I’d come up with something along the lines of memoization. Cool!

I put this to bed for a while, but one Saturday I woke up and decided to fiddle with it. And suddenly, I noticed a pattern. (Note: Due to a minor error in my code that I just noticed and fixed five minutes ago, the pattern I noticed was off. Everything I write after this point will use the correct pattern and code, since the methods are the same anyway, even though the actual history is slightly different.)

What was this pattern? Well, printing the i value for each n sequentially, starting at n == 1, we get…

1 1 2 2 2 3 3 3 3 4 4 4 4 4 ...

Wow! Pretty simple, eh? A run of length 2 where the value is 1; then a run of length 3 where the value is 2; then a run of length 4 where the value is 3 …

So I came up with this next solution, which has the advantage, in this case, of not being recursive:

def ideal_i(n: int):
    
    ideals, i, step = [], 1, 2
    
    while len(ideals) < n:
        ideals += [i] * step
        i += 1
        step += 1
    return ideals[n-1]

As you can see, it starts off with an empty ideals list. Then it appends values to that: first it appends [1] * 2 to it, then [2] * 3, etc, just as we saw above. Then it picks the correct value for n.

This is a good solution in some ways because of another concept in dynamic programming: naivety. The solution in which we actually calculate i for each n by testing possible values is a “naive” one, meaning we treat it as though we don’t yet know the answer. But once we’ve seen the pattern, we in fact do know the answer by a much simpler, faster means.

Hence, I saved this in a file called Tour_radical.py. It just felt so much like cheating. After all, that pattern could technically not hold at, say, n == 10,000. We can’t be sure, for example, that the length of each run really increases by 1 instead of by 1.00001 or something, meaning that eventually our prediction of the length of a run would be off. And, of course, the naive version is too slow to test it and compare at every value. So I tested it up to 800 or so in a separate file using code like this:

all(ideal_i(x)[0] == ideal_i_2(x) for x in range(1, 800))

*Note: If you’re following along and you use this to compare all the versions later in the post, it will not always return True. This is because for some values of n, there are actually two values for i that yield the same number of moves required, and the algorithms might not choose the same one.

So that’s as far as we can test the pattern… The rest is based on faith.

But there was still one optimization I could make to the code. If you test it, you’ll find that it only works up to about n == 125,000,000, and that probably depends on your machine. What happens after that? Well, first I got Python aborting. Then, with smaller numbers, I started to get MemoryErrors.

That’s because that list is, of course, getting to be utterly massive at large n. I tried writing it to a text file with a very large n once, and the text file was over 200mb. The TEXT FILE. DOT T X T. So, yeah. Not good.

Then I realized something more. Why do I have to keep an actual list of the values up to that point? I can calculate how long it would be if we did keep an actual list. So why don’t I create a variable to capture that hypothetical length, keep track of what the ideal i would be at each step, and return that value when the length is long enough?

def ideal_i(n: int):

    i, pseudo_len = 1, 2
    while pseudo_len < n:
        i += 1
        pseudo_len += i+1
    return i

This one is pretty self-explanatory.

And very quick. It’s more or less instant up to about n == 1,000,000,000,000, and it runs tolerably quickly up to about n == 1,000,000,000,000,000. Since the loop only runs about sqrt(2n) times, the running time grows much more slowly than n does (sublinear, I believe, though I’m not quite sure how to state the complexity properly).

What does frustrate me, though, is that I’d probably be able to make it instant if I’d taken math further than grade 12… Maybe someone else can help out here? My suspicion is that with such an obvious pattern, there must be a way to predict in a single line, given a value n, how many times we would run the while loop if we did run it. And that number would, of course, be the same as our ideal i minus 1.

Oh well. I’ll have to be satisfied with this. After all, Anne will probably never be able to fit more than a quadrillion cheeses on a single stool, no matter how much she reinforces the stool’s legs.

But back up a second. I took this solution to zingarod, following up from my previous email, and he liked it. It clearly takes less time than other algorithms. But he invited me to his office to show me his own solution as well.

What was his? Well, it was basically my memoization one. Except that instead of recursion, he just used iteration to extend the list of ideals. In terms of runtime, they should be about equal (mine essentially iterates through calls to the function, since the base-case check, len(ideals) < n, stays updated), but remember that mine kept hitting the maximum recursion depth and needed to be “primed” in blocks of 800 or so. Switching recursion to iteration fixes that.

Here it is, as best as I remember it and edited for uniformity with my style (I don’t have a copy of what he showed me in his office):

def ideal_i(n):
    
    ideals = [(0, 1), (1, 1), (1, 3)] # list of (i, min_moves)
  
    for value in range(3, n + 1):
        options = {}
        for i in range(1, n):
            if value - i >= 0:
                options[i] = 2 * ideals[value - i][1] + 2 ** i - 1
        best = min(options, key=options.get)
        ideals.append((best, options[best]))
    
    return ideals[n]

Now, here’s something to think about: Why might we use zingarod’s solution, which still only runs on an n in the low thousands, rather than my ultra-fast one?

The answer: Remember that ideals was a list of tuples in that version of the function. The second element in each tuple was the moves required. Keeping track of the moves required could be a very valuable behaviour, depending on what you need the function for. (You don’t technically need it for assignment 1, but that’s because we keep track of the required moves by actually making them. Speaking of which… tour_of_four_stools is only tolerably fast for up to about 120 cheeses anyway, so all this ideal i stuff is very much hypothetical!)

The other reason zingarod’s solution is good is that, if you remember it up there, it uses “naive” calculation, i.e. we can be sure that each i really is the one that requires the fewest moves instead of having to have faith in the pattern.

But before I came to the end of my journey with this function, I realized there could be one more optimization to the code, and offered a hybrid solution to zingarod. We use a bit of naive calculation and a bit of the pattern.

Here’s how it works. The original formula for the four-stool solution required picking some value i where 0 < i < n. That means that for n == 1,000,000, we have to try 999,999 values of i before we find the one that generates the fewest moves. This fact is also what makes the algorithm take quadratic time: it goes once through every value < n, and for every value, it has to try n possible values of i.

But in reality, we can make much, much better guesses right off the bat. Specifically, we observe from our pattern that each i value will either be (a) equal to the value preceding it, or (b) exactly one more than the value preceding it. That limits our choices at every value to only 2.

Which means it stays linear. Which means it is, as zingarod put it, “fast as heck”.

Here’s the implementation (his code, plus edits for uniformity with my style to make it easier to compare with the other versions):

def ideal_i(n):
    
    ideals = [(0, 1), (1, 1), (1, 3)] # list of (i, min_moves)
  
    for value in range(3, n + 1):
        options = {}
        anchor = ideals[value - 1][0]
        for i in (anchor, anchor + 1):
            if value - i >= 0:
                options[i] = 2 * ideals[value - i][1] + 2 ** i - 1
        best = min(options, key=options.get)
        ideals.append((best, options[best]))
    
    return ideals[n]

This can handle n up to about 1,000,000, which is still very impressive with half-naive calculations. Woo!

So that’s that. I had a lot of fun working on this problem! Moral: Think outside the box and try new things. Follow a lead, even weeks after the due date, if it really interests you. Always use your professor as a resource outside of class to further your understanding. And so on.

Welp… This is my last post on this here slog. And one of my last three pieces of schoolwork in university! I’ll miss y’all, and thanks for reading along. And good luck to you all in future years!

>>> BerkeyColortran.quit()

****

UPDATE: I worked out a constant-time solution.

Week 11

The part of assignment 2 part 2 that I had the biggest problem with was all_regex_permutations.

Not because it’s hard to generate permutations and filter them through a working is_regex. That’s easy — we saw it in class with searching for anagrams. But we were discouraged from using a model like that. Why? Because the number of permutations to generate and filter gets larger and larger (factorially, in fact, with the length of the string). It wasn’t necessary to speed it up, as zingarod said on the board, but I really wanted to.

The first thing I knew was that there are some very easy cases. Base case is that your string is a leaf, in which case it has only one regex permutation, itself. And the star is also easy. Just subtract a star from the string, recursively make all the permutations you can make out of what’s left, and stick a star on the end of each one. Because this was going to be my approach in a couple of cases, I wrote a helper function called complement:

 

def complement(whole: str, s: str) -> str:
    '''
    Helper function for all_regex_permutations and all_combinations.
    Return a string equal to whole with one occurrence of each character of s removed.
    '''

    for char in s:
        whole = whole.replace(char, '', 1)
    return whole
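
For example, subtracting a star or a handful of brackets looks like this:

>>> complement('(0.1)*', '*')
'(0.1)'
>>> complement('(1.0*)', '()*')
'1.0'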

Now, and not mutually exclusively (that is, with an if rather than an elif), I had to deal with the bracketed variations.

My approach started off the same: use complement to subtract the brackets from the string, and then, with each operator, subtract it and make valid regexes out of what’s left.

The difficulty is that you have to split the string across the operator in every possible way in order to find every possible valid regex arrangement of them. (I’m sure there’s a better way, but I haven’t figured it out yet.) And that sounds like… yup… permutations.

Well, it’s actually a little better than permutations. It’s combinations, which means the order doesn’t matter. For example, permutations ’01’ and ’10’ are just a single combination, ’01’. So you do have many fewer options.

And specifically, I needed combinations of every length. That is, a base string ‘123’ should yield length 1: ‘1’, ‘2’, ‘3’, length 2: ’12’, ’13’, ’23’, length 3: ‘123’. Then, I could put each possible combination on the left side and use my complement function again to find what would go on the right side for each combination.

So the question was: how could I efficiently find all combinations?

And that was what was so hard, for some reason. I kept trying to build it through iterators. My idea was: take every length 1 string, then add each remaining character once. Take all these length 2 strings, and add each remaining character once. Take all these length 3 strings… etc. But I wasn’t quite sure how to do that. I had multiple iterators iterating through it with inner loops. But I realized that this only worked for strings of certain lengths. Once the string was long enough, I needed another iterator to capture everything.

Now… full disclosure: I hadn’t written complement yet, in fact! I had been replicating its effects with longer code, but it was through this process that I eventually realized a lot of what I was doing was solvable with complement.

I actually grabbed someone else’s code from stackoverflow.com, since I’d seen on the board that we can use others’ code if we cite it. I found one that generated all combinations of length n:

 

def _combinations_of_length(s: str, n: int) -> 'generator':
    '''
    Yield all combinations of s of length n.
    '''

    ## Combinations generation adapted from code found at
    ## http://stackoverflow.com/a/2837693
    for i in range(len(s)):
        if n == 1:
            yield s[i]
        else:
            for rest in _combinations_of_length(s[i+1:len(s)], n-1):
                yield s[i] + rest

Then I built a wrapper function to feed every length into this function and return a set of all of them:

def all_combinations_theirs(s: str) -> set:
    '''
    Helper function for all_regex_permutations.
    Return a set of all combinations of s of length 1 to len(s).
    '''

    combinations = []

    for n in range(1, len(s)+1):
        combinations.extend(_combinations_of_length(s, n))

    return set(combinations)

This works wonderfully fast, but I don’t fully understand what it does. Mostly because we haven’t been taught what yield is. (It has something to do with a generator object… it’s odd. If you look at the actual object returned by a function that yields instead of returning, you get <generator …>. But if you cast that to a list, you get what you’d expect. Need to do more reading.)
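
Here is the little bit I do understand, boiled down: a function that yields hands back a generator object, and you only see the values once you iterate over it or cast it to a list.

def count_up_to(n):
    i = 1
    while i <= n:
        yield i  # hand back one value, then pause here until asked for the next
        i += 1

print(count_up_to(3))        # <generator object count_up_to at 0x...>
print(list(count_up_to(3)))  # [1, 2, 3]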

Later on, I felt like I could in fact understand what the other person’s code was doing, and I realized something: in order to generate every combination of length n, it’s recursively generating everything up to that length anyway! But it’s throwing it out because the person designed that function only to produce those of length n. So I monkeyed around with it a bit, trying to alter it to retain the smaller lengths as well. But I couldn’t do it, and (again) I’m pretty sure it relates to yield. So I decided to write my own version, because I felt like if I could capture those smaller lengths as I generate them, I could actually be faster.

Well, I wasn’t quite right. I can’t fight the yield. But I did produce my own version at last, after having spent days on it and then given up. I wrote complement in order to do more or less what I described above, and produced this:


def all_combinations(s: str) -> set:
    '''
    Helper function for all_regex_permutations.
    Return a set of all combinations of s, from length 0 (the empty string) up to len(s).
    '''

    def _helper(s: str, n: int, combinations: list=[['']]) -> list:
        '''
        Return a list of lists of combinations of s of length 1...n.
        '''

        while len(combinations) < n+1:
            new_length = []
            for combo in _helper(s, n-1, combinations)[-1]:

                for new_char in complement(s, combo):

                    if not any(''.join(sorted(combo+new_char)) ==
                      ''.join(sorted(old)) for old in new_length):
                        new_length.append(combo+new_char)

            combinations.append(new_length)

        return combinations

    result = set()
    for length in _helper(s, len(s)):
        for combo in length:
            result = result.union({combo})
    return result

Not too pretty but it gets the job done.

A couple of clarifications. One, note that the embedded helper here separates the overall list into sublists, one for each length. This is just to help identify the anchors to build the combinations that are one character longer than the previous one. Then they’re all joined in the last bit of code in the main function.

Two, note this “any” business. The problem with this code is that it essentially overgenerates, almost generating all permutations (which is bad! bad! bad!). If a character is doubled, as you will often have in regexes, you’ll have variations of the same combination, i.e. permutations. So I used the signature technique zingarod showed us in lecture: I alphabetize the combination and see if any other combinations already generated match it when alphabetized, and I only add it if it’s unique. At first this might just seem like a filter that doesn’t increase efficiency. It would be — if we only ran it at the end. But if at each stage we cut the permutations down to the unique combinations, then we won’t have as many in our anchor and therefore won’t generate as many permutations in the next stage, so the total number of permutations generated and filtered will be far fewer.

Since I had a working alternative to the borrowed code, I decided to ask zingarod if it was okay that I was using this much cited code. He said he’d prefer if I used my own. So I had to suck up the fact that my code was not fast. But at least it was my own solution!

There was one point of slowdown that I could address, though. After generating all these combinations, remember that my solution for binary regexes in all_regex_permutations (oh yeah… all the way back up there!) required that I find all permutations and stick them on the left and right sides of the operator. Running all_regex_permutations recursively on a large list of combinations was going to be very slow. So I did a final optimization by putting many short-circuits in all_regex_permutations, i.e. quick heuristics that allow me to predict when a string will never make a good regex no matter how many permutations I make of it. These included: Does it lack at least one leaf? Is the number of leaves anything but the number of operators plus one? Is the number of brackets not exactly double the number of operators? And so on. This did save a lot of time.
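
To give a flavour of what those checks look like, here is a reconstruction with a made-up helper name (not the exact code I submitted), assuming the leaves are '0', '1', '2' and 'e' and the binary operators are '.' and '|':

def could_become_regex(s: str) -> bool:
    '''
    Return False if no permutation of s can possibly be a valid regex.'''

    leaves = sum(s.count(c) for c in '012e')
    binary = s.count('.') + s.count('|')
    brackets = s.count('(') + s.count(')')
    return (leaves >= 1 and
            leaves == binary + 1 and
            brackets == 2 * binary)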

So… long story short: my solution does essentially involve generating lots and lots of permutations, despite my best efforts, but it does cut down on them notably by (1) focusing on combinations rather than permutations, (2) using a signature to eliminate extras, and (3) short-circuiting most of the recursions of all_regex_permutations on the final list of combinations. Yay, compromise!

 

P.S. Sorry no week 10. But I did so much reading for week 9’s list of slogs I like, that, you know…. ah, what am I saying. There’s no excuse.

Slogs I like

I just finished reading one or two posts of every slog (okay, a couple I skimmed). Here are the ones that, for one reason or another, I particularly like: they’re funny, they’re well-designed, they’re informative, they’re well-written, etc.

http://sajidcsc148.blogspot.ca/

http://ilovecsc148.wordpress.com/

http://ddc-csc148slog.blogspot.ca/

http://compscikitty.wordpress.com/

http://whatshappeningcsc148.blogspot.ca/

http://killerapp.4nh.pw/

http://fatmothinmycode.wordpress.com/

http://thegreatmoth.blogspot.ca/

http://pythonmenthol.tumblr.com/

http://csc148sharon.wordpress.com/

http://pythonbivittatus.wordpress.com/

http://pcsc148slog.wordpress.com/

http://hannay148.wordpress.com/

On the other hand, repeating keywords rules a slog out for me. Not updating in over a month also makes it a non-visit, even if that first and only post was brilliant. :/

Also, observation: All poor slogs are alike; each good slog is good in its own way.

Hope the list leads you to an enjoyable read!

Week 8

Assignment 2 part 1 took me forever.

Not because the coding itself was hard. It wasn’t. Nor because I couldn’t figure out how to do it at a design level. I could. It was because no particular design seemed better than any other. Or it did but it contradicted the instructions.

My first idea was to make three nodes: BinaryNode, UnaryNode, and LeafNode. Each took its appropriate number of children and a head. They each had their own methods. There was no inheritance at all, actually, until I wanted error-checking and then I made an empty Node class that any child parameter had to be an instance of.

But typing in these classes in the shell to construct them was so annoying. Three different classes to type and you had to remember how many children for each one?

So I decided to make a RegexTree class that moderated them all, taking a proper regex string as its one parameter and parsing it, distributing it to the various Node classes. It was beautiful! Then I realized that this was part of part 2 and abandoned it.

For a while I continued with that RegexTree wrapper. I just made it take a head and two child parameters, made optional by giving them a default value of None. Then, based on how many children it was passed, it would create BinaryNodes, UnaryNodes, and LeafNodes, like it had done before.

But I realized at this point that RegexTree wasn’t really serving the purpose it had used when I made it parse a full regex string. Now it was just kind of an extra layer. So I returned it to just having those three nodes.

I still wasn’t happy. Remember that I didn’t like how you had to type them so carefully into the shell. Then I realized: wait, if RegexTree was able to sort out its parameters and make the proper Nodes… why not just have it be all those Nodes? That is, why not just make it behave like a BinaryNode if both children are given it, like a UnaryNode if only one child is given it, and like a LeafNode if no children are given it? Make it adaptable. Also, I could have all my error-checking in one place, right in the __init__ method, rather than scattered through various classes’ __init__ methods.

So I implemented that, and it worked fine. And now you just wrote RegexTree('0') or RegexTree('*', RegexTree('0')) or RegexTree('.', RegexTree('0'), RegexTree('1')), and it sorted everything out for you. Much easier typing in the shell! I even considered renaming it ReTree to save time.

That was what I submitted first.

But then I read the thread that said you shouldn’t use just a single class, even if it gets the job done. I was dismayed. I didn’t know why this wasn’t preferable. (Since then, zingarod has pointed out that building the behaviours of three classes into one class isn’t too tricky, but what if there were more basic Node types? Having one class adapt to handle all of them would start to look very confusing.)

At first I resisted it and thought, “Eh, let them do with it what they want.” But then I decided I wanted the marks, and I wanted to understand why more than one class was better, so I sorted through all my regex_design, regex_design_old, regex_design_old_2, regex_design_tentative, etc. files and reinstated BinaryNode, UnaryNode, and LeafNode.

I also remembered zingarod saying that we should use inheritance, of which I’d used none except that empty Node class I described above. So I took a leap of faith and decided to give that Node class something to do.

The first thing was realizing that __init__ was very similar for all three of them. But in order to justify having more than one class (i.e., finding a reason not to just turn Node into the RegexTree class again), I decided to implement a new __init__ in each child class that nevertheless benefited from Node’s __init__. So I had Node’s __init__ be able to take both optional children. And then I made each child class’s __init__ call Node’s __init__ inside it, passing it None for each child parameter that class shouldn’t have. That way, every node still has attributes for both children (even the ones it doesn’t use), but each class still behaves uniquely and appropriately.

Having the same number of children, even if some were None, made __eq__ totally inheritable. It just required that instead of isinstance(other, …), I used type(self) == type(other).

__repr__ could easily have been inheritable too, except that I decided I didn’t want the output to have ‘None’ where there were no children. That is to say, if Node had the only __repr__ and it was inherited, the outputs would have looked like “BinaryNode(‘.’, LeafNode(‘0’), LeafNode(‘1’))” but also “UnaryNode(‘*’, LeafNode(‘0’), None)” and “LeafNode(‘0’, None, None)”, which is hideous. So I just gave each one a unique __repr__.

I also instituted a method called “checker” that gets called during __init__. The method makes sure that the Node has the right kind of head for its number of children and that all children are nodes. It handles errors less specifically than my previous means did, but I could still distinguish a HeadError and a ChildError, and most importantly it could now be inherited.

The final note is that zingarod explained to me the design that he’ll post soon and that we’ll use as our base for part 2. He made a separate Node class for every possible regex item! The advantage of mine is that it can be scaled. I have three constants: BINARY, UNARY, and LEAVES, all tuples that list the possible heads of each node type. If we extended our alphabet, all you have to do is add the new values into my tuples. For example, if we were given a second unary operator, you wouldn’t have to create a new class for it; you’d just add it to UNARY and UnaryNode would handle it.
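
Here is a stripped-down sketch of the structure, from memory (the real files have fuller docstrings, real HeadError and ChildError classes instead of plain Exceptions, and whatever the assignment’s actual alphabet is in the tuples):

BINARY = ('.', '|')
UNARY = ('*',)
LEAVES = ('0', '1', '2', 'e')

class Node:
    def __init__(self, head, left=None, right=None):
        self.head, self.left, self.right = head, left, right
        self.checker()

    def checker(self):
        '''Raise an error unless head suits this node's number of children
        and every child is a Node.'''

        children = [c for c in (self.left, self.right) if c is not None]
        allowed = {0: LEAVES, 1: UNARY, 2: BINARY}[len(children)]
        if self.head not in allowed:
            raise Exception('HeadError: {} with {} children'.format(
                self.head, len(children)))
        if not all(isinstance(c, Node) for c in children):
            raise Exception('ChildError: every child must be a Node')

    def __eq__(self, other):
        return (type(self) == type(other) and self.head == other.head and
                self.left == other.left and self.right == other.right)

class LeafNode(Node):
    def __init__(self, head):
        Node.__init__(self, head, None, None)

    def __repr__(self):
        return "LeafNode('{}')".format(self.head)

class UnaryNode(Node):
    def __init__(self, head, child):
        Node.__init__(self, head, child, None)

    def __repr__(self):
        return "UnaryNode('{}', {})".format(self.head, self.left)

class BinaryNode(Node):
    def __init__(self, head, left, right):
        Node.__init__(self, head, left, right)

    def __repr__(self):
        return "BinaryNode('{}', {}, {})".format(self.head, self.left, self.right)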

The advantage of zingarod’s is, I’m guessing, that each separate Node can have very specific behaviour, which I think will help with matching strings. That’d be kind of hard to do right now with my design—you’d have to know which binary operator you want, not just that your operator is binary, which is all my class tells you. So in terms of simplicity of class expansion, my approach is easier, but when you actually want to use these regexes for something, zingarod’s is going to be much better.

Looking forward to part 2. Especially since I’ve got the parsing down already. I just have to rescue it from regex_design_variation_3_old.py.

Week 7

Linked lists. Fun! I’d say the hardest thing about understanding them is the lack of a real index, a consequence of the lack of a real container. It just feels so odd, you know? Having all the elements floating around out there, as far as Python is concerned. Each element only aware of the next one’s location via that rest attribute. The next element not even aware of the element that is aware of it. So tenuous.

In fact, it’s that unidirectional awareness that screwed us in labs. That damn prepend method and that damn remove_first method. I mean, queues are first in, first out. That means you’re using one side to put things on and the other to take things off. But both “pre” and “first” deal with just the one side. That was misleading. In order to deal with one side of a conveyor belt (putting things on sometimes and taking them off sometimes from the same side), you would either need a container or you would need the elements to be aware of themselves in both directions (i.e., each element would have an attribute “next” and another attribute “previous”, such that two adjacent elements would be pointing to each other, using up even more memory).

Anyway, once we realized that the methods were misleading and decided that we had to ignore at least one of them, the problem became easier to solve, and the benefit of a linked list implementation for a queue ADT became clearer. After all, if you have no index… you have no index to update for each of the potentially millions of elements left in the queue whenever you dequeue.
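
To make that concrete, here is a rough sketch of a linked-list queue in the spirit of the lab (the names are mine): enqueue at one end, dequeue at the other, and nothing ever gets shifted.

class _Node:
    def __init__(self, value, rest=None):
        self.value = value
        self.rest = rest  # the only "awareness": where the next node lives

class LinkedQueue:
    def __init__(self):
        self.front = None  # dequeue from this end
        self.back = None   # enqueue onto this end

    def enqueue(self, value):
        node = _Node(value)
        if self.back is None:
            self.front = self.back = node
        else:
            self.back.rest = node
            self.back = node

    def dequeue(self):
        node = self.front
        self.front = node.rest
        if self.front is None:
            self.back = None
        return node.value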

What else? The midterm. Considering the fact that we only have two lectures this week and one of them hasn’t happened yet, I consider this a chance to unwind rather than to reflect on, well, things we’re about to learn. Therefore, enjoy this hilarious review of a terrible old game.

Week 6

The handout for the lab this week mentioned that we use idioms in a specific language because they’re more elegantly written and their implementation can save time. Intrigued by the idea, I decided to compare the running times of the functions we wrote in the lab vs. the ones we were provided.

First, the functions. You can get the originals on the labs page of the course site. Here are my non-idiomatic implementations:

def ascending_pythagorean(t: (int, int, int)) -> bool:
    return (t[0] <= t[1] <= t[2]) and (t[0]**2 + t[1]**2) == t[2]**2

def dot_prod_2(u: [float, ...], v: [float, ...]) -> float:
    total = 0
    for i in range(len(u)):
        total += u[i] * v[i]
    return total

def matrix_vector_prod_2(m: [[float, ...], ...], v: [float, ...]) -> [float, ...]:
    final = []
    for u in m:
        final.append(dot_prod_2(u, v))
    return final

def pythagorean_triples_2(n: int) -> [(int, int, int), ...]:
    final = []
    for i in range(1, n+1):
        for j in range(1, n+1):
            for k in range(1, n+1):
                if ascending_pythagorean((i, j, k)):
                    final.append((i, j, k))
    return final

def any_pythagorean_triples_2(start: int, finish: int) -> bool:
    for i in range(start, finish+1):
        for j in range(start, finish+1):
            for k in range(start, finish+1):
                if ascending_pythagorean((i, j, k)):
                    return True
    return False

So then I wrote a little script called timeloops.py. It imports all the functions — original and my own — and then runs them a lot of times. It does two tests on each. In the first, we run the function tens or hundreds of thousands of times, giving it easy parameters. In the second, we run the function a few times, giving it difficult parameters. (For example, we could make dot_prod’s two list parameters have thousands of elements each.) This lets us test both how fast the function normally runs and how it responds to more complex cases.
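
The core of timeloops.py looks roughly like this (reconstructed from memory; the real script also builds the “difficult” parameters for each function and prints the report below):

import time
from random import random

def time_calls(f, args, times):
    '''Return the seconds taken to call f(*args) the given number of times.'''

    start = time.perf_counter()
    for _ in range(times):
        f(*args)
    return time.perf_counter() - start

# e.g. the difficult-parameters test for dot_prod_2 (defined earlier in this post)
u = [random() for _ in range(2500)]
v = [random() for _ in range(2500)]
print(time_calls(dot_prod_2, (u, v), 10000))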

Here’s what it spits out with a few test values:

<dot_prod_1...> ran 2000000 times on basic difficulty in 3.46 seconds.
<dot_prod_2...> ran 2000000 times on basic difficulty in 2.60 seconds.

<matrix_vector_prod_1...> ran 1000000 times on basic difficulty in 4.43 seconds.
<matrix_vector_prod_2...> ran 1000000 times on basic difficulty in 3.42 seconds.

<pythagorean_triples_1...> ran 80000 times on basic difficulty in 9.47 seconds.
<pythagorean_triples_2...> ran 80000 times on basic difficulty in 9.30 seconds.

<any_pythagorean_triples_1...> ran 50000 times on basic difficulty in 12.65 seconds.
<any_pythagorean_triples_2...> ran 50000 times on basic difficulty in 12.58 seconds.

<dot_prod_1...> ran 10000 times on difficulty 2500 in 6.80 seconds.
<dot_prod_2...> ran 10000 times on difficulty 2500 in 11.36 seconds.

<matrix_vector_prod_1...> ran 10000 times on difficulty 800 in 11.79 seconds.
<matrix_vector_prod_2...> ran 10000 times on difficulty 800 in 9.18 seconds.

<pythagorean_triples_1...> ran 2500 times on difficulty 22 in 20.56 seconds.
<pythagorean_triples_2...> ran 2500 times on difficulty 22 in 18.25 seconds.

<any_pythagorean_triples_1...> ran 500 times on difficulty 120 in 15.31 seconds.
<any_pythagorean_triples_2...> ran 500 times on difficulty 120 in 15.26 seconds.

As you can see, all the non-idiomatic implementations are a little faster, except that (1) the difference between the two versions of any_pythagorean_triples is non-significant, and (2) dot_prod_2 runs basic difficulty faster than dot_prod but is far worse at handling complex parameters.

Curious…

Makes me wish I knew how to investigate and compare the underlying implementations!

Week 5

What would recursion look like the wrong way round?

A few nights ago I had a dream. I was on an airplane. I don’t remember the details all that well, but I do have a few visual images of it. Then, at one point, I realized it was a dream, and I woke up. I went about my life for a while, lived various episodes. I was slightly puzzled about having realized that my dream was a dream—I’m not the kind of person who becomes aware of that fact while still dreaming. Then… I woke up again. I realized that I had actually still been having a dream that first time that I woke up and realized I had been dreaming. It was so Inceptiony. It was practically Inception 2.

Anyway, so this obviously has a parallel with recursion in that you go deeper and deeper into the same function. The problem is that the base case here feels like it should be waking life, not any of the dreams. That is to say, unlike in proper recursion, where you go deeper and deeper until you hit a solvable case, in the case of my dreams, the deeper you go into instances of the same function, the further you get from resolving it… so you will go infinitely deep.

I don’t know, I guess we could represent it in Python like this:

def recursive_dream(x):
    if x == 0:
        return x
    larger = recursive_dream(x + 1)
    smaller = larger - 1
    return smaller

Here we’re trying desperately to exit the dream by backing out to our base case of 0, but unfortunately our one operation on x is to make it larger. (The analogy might not be perfect.) You might feel like this will run the function x times. In fact it will recurse forever. Actually, you can’t really recurse infinitely in Python, because it enforces a maximum recursion depth (1,000 by default, and in practice you hit the limit a little sooner, since some of that depth is already in use), so you’ll get a RuntimeError.

How do we prove that this is infinite? Well, let’s trace it. What happens if we plug in 1 for x? Well, it’s not our base case of 0, so it calls the function with x + 1, i.e. 2. That calls it with 3… which calls it with 4… etc. So the function never gets to set the value of larger at any depth, nor subtract 1 from it to get smaller, nor return smaller.

So now let’s modify our code a little bit and add an upper limit, our own “maximum dream depth”. Think about it: what will the following code do?

def recursive_dream(x):
    if x == 0:
        return x
    if x == 50:
        return x
    larger = recursive_dream(x + 1)
    smaller = larger - 1
    return smaller

Okay, so now x will presumably == 50 at some point and we’ll hit our upper limit. So try to reason this out: what will recursive_dream(20) return? Trace it if you have to.

If you guessed an infinite loop, good thought but wrong. If you guessed 50, good thought but also wrong. It’ll return 20.

Tracing it again (by the “computer method”), if we pass 20 to our function, it’ll call itself with x + 1, i.e. 21. It’ll keep doing that until it calls itself with x = 50, which will trigger our upper limit case. That case returns x itself. Which means we now have a value for larger, i.e. 50. Next we subtract 1 from that to get smaller, i.e. 49, and we return that. Now, remember that this call is itself what’s setting larger for the next call in the stack, i.e. the next less deep call. Said less deep call will then subtract 1 from that to get its value for smaller, and return that… etc. Until the original call we made, with 20, receives the value 21 for larger, subtracts 1 to get its smaller value, and returns 20.

If it helps, remember that we can also trace this by the method zingarod calls the “human method” (in my opinion not necessarily more human, just the opposite direction). Don’t forget that we’re still working with a perversely inverted recursive function here, so our simpler case is nearer to the upper limit, not to the base case. So we should first ask ourselves, what does recursive_dream(50) return? Clearly 50 itself. So what does recursive_dream(49) return? Well, it calls recursive_dream(50). We know that that returns 50, which the recursive_dream(49) call stores as larger, then subtracts 1 from to get 49, which it returns. Then you work out recursive_dream(48), plugging in recursive_dream(49) when it’s time to make that call, and pretty soon you realize that your tracing job is done.

Okay, we get that. Before we finish, let’s do one more odd thing for the sake of fun with our awful recursion. So, making our thoughts more convoluted, what will happen if we do… this?

def recursive_dream(x):
    if x == 0:
        return x
    if x == 50:
        return x + 1
    larger = recursive_dream(x + 1)
    smaller = larger - 1
    return smaller

What will recursive_dream(20) return? Again, try to trace it before you read the next sentence.

The answer, you’ll think, is that surely here it’ll be infinite. Surely! After all, we’ll never get a value for larger that allows us to exit the depth at the upper limit call and start working our way down the stack. That’s totes my own conclusion. Right?

Wrong. It returns 21.

If you’re frustrated, trace it again using the “human method”. So we try recursive_dream(50). Clearly this returns 51. Next, we work out recursive_dream(49). That will bypass our base and upper limit conditions and call itself with x + 1, i.e. recursive_dream(50). We know it returns 51, so 51 is the value for larger back at the recursive_dream(49) call. So the value for smaller is 50, and is returned. Now, again with recursive_dream(48). This calls recursive_dream(49), which we know returns 50. So 50 is the value for larger at the level of recursive_dream(48), and thus smaller is 49, which is what’s returned. It’s easy to extrapolate why recursive_dream(20) returns 21.

Okay, one last one. But I’ll leave you with this one, because it should be cake by now. So far we’ve had the +1 and -1 operations be exactly even. What happens if we unbalance them? That is to say, what does recursive_dream(20) do if we use the following version?

def recursive_dream(x):
    if x == 0:
        return x
    if x == 50:
        return x + 10
    larger = recursive_dream(x + 1)
    smaller = larger - 1
    return smaller

Enjoy! Don’t forget to trace! 🙂

Week 4

The problems we studied in lecture while talking about recursion reminded me of something I tried to do last year around this time. I wanted to make a little program that evaluates the validity of modern symbolic logic arguments for the user. A version of the program that I translated to Javascript is here. (It mostly works, but it doesn’t do predicate logic and its order of operations sometimes needs brackets to be correct.)

One of the problems in establishing validity in Modern Symbolic Logic is the idea of the hypothetical. “Valid” means that if all the premises are true, then the conclusion must be true, and “invalid” means that all the premises can be true and yet the conclusion be false. In this wording, it’s hard to tell how we check that. So in MSL these situations are translated into “worlds” or “models”, each of which is an arrangement of truth values across the irreducible names involved in the argument. Then, “valid” means that in every world in which the premises are true, the conclusion is also true, and “invalid” means that there is at least one world in which the premises are true and the conclusion is false.

Well, that’s good news for my program, because that starts to sound less like a thought experiment and more like something a computer can do. So how do we check the criteria? First, we need to generate all the possible worlds. When humans want to do this, they can make a truth table. A glance at that image should remind you of our first recursion problem. We want to generate every possible assignment of binary values to certain names that form the basis of any proposition in MSL. That’s the problem we’ll look at here.

Typically, names are one-character strings, starting at ‘P’. So what we want is:

>>> get_worlds(['P'])
[{'P': True}, {'P': False}]

>>> get_worlds(['P', 'Q'])
[{'Q': True, 'P': True}, {'Q': False, 'P': True}, {'Q': True, 'P': False}, {'Q': False, 'P': False}]

If we have only one name, ‘P’, there are only two possible models: one where it’s True and one where it’s False. If we have two names we have 2**2 models. Etc.

So here was my attempt last January. I’ll let you look at it and figure out how it works through the comments. (P.S. Because we’re using dictionaries, it’s important to note that the program guarantees that each element in names is unique.)

def get_worlds(names):
    '''
    names is a list of immutable objects. Return a list of dictionaries of all
    possible different worlds. A world is an assignment of True or False to
    each name.
    '''
    
    # Set up the worlds list. Prepare it by filling it with empty dictionaries.
    
    worlds = []
    rows = 2 ** len(names)
    for i in range(rows):
        worlds.append({})
    
    # Iterate through the names. Each name is put in all the worlds before
    # moving on to the next one. The value it is assigned alternates every x,
    # where x is determined by how many names there are and this one's position.
    
    for pos in range(len(names)):
        name = names[pos]
        alternate = rows / (2 ** (pos + 1))
        
        row_outer = 0
        value = True
        
        # Start a deployment round (a range through all the worlds).
        
        while row_outer < rows:
            
            # Start an alternation round (an inner segment of the whole round).
                
            row_inner = row_outer
            
            # Iterate through it, deploying value until it's time to alternate.
            
            while row_inner < row_outer + alternate:
                worlds[row_inner][name] = value
                row_inner += 1
            
            # Update our position in the outer round, and reverse the value.
            
            row_outer = row_inner
            value = not value
            
    return worlds

Okay, looks complicated but plausible. And indeed, if you run it, it works.

It’s not at all elegant, though. Its method is charmingly similar to how a human would fill out a truth table: write in all the rows and then write in the values row by row for each column. In code, it just looks silly. Surely we can improve on it.

So! Here’s the point of this blog post. Based on what we’re learning this week, why don’t we try rewriting it so that it’s cleaner and benefits from recursion?

Step one, as always, is to establish our base case: a list with no names. This should turn up a list with an empty dictionary.

def get_worlds_2(names):
    if len(names) == 0:
        return [{}]

Next, let’s figure out how to get from n-1 to n. Here they are again:

>>> get_worlds(['P'])
[{'P': True}, {'P': False}]

>>> get_worlds(['P', 'Q'])
[{'Q': True, 'P': True}, {'Q': False, 'P': True}, {'Q': True, 'P': False}, {'Q': False, 'P': False}]

The answer seems to be that to find get_worlds(['P', 'Q']), we…

(1) Take the list returned by get_worlds([‘P’])
(2) Make a copy of each dictionary and add a key ‘Q’ and value True
(3) Make another copy of each one and add a key ‘Q’ and value False

So here’s how we might do that basic operation:

for world in smaller:
    plus_True = world.copy()
    plus_False = world.copy()
        
    plus_True.update({names[len(names)-1]: True})
    plus_False.update({names[len(names)-1]: False})

First we copy the world dictionary twice. Then we update* each copy with a new key:value pair, taking the last item in names for our key and linking it to one of each boolean value. Great!

So let’s put it all together with the proper recursion structure:

def get_worlds_2(names):
    
    if names == []:
        return [{}]
    
    smaller = get_worlds_2(names[:len(names)-1])
    bigger = []
    
    for world in smaller:
        plus_True = world.copy()
        plus_False = world.copy()
        
        plus_True.update({names[len(names)-1]: True})
        plus_False.update({names[len(names)-1]: False})
        
        bigger.append(plus_True)
        bigger.append(plus_False)
    
    return bigger

Yay! A much cleaner function! And if we do our tests on it, we’ll find that it works just as well.
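
(If you want to check the two versions against each other the same way I’ve been checking the cheese functions, something like this works, assuming both are defined in the same file; I compare contents rather than order just to be safe.)

names = ['P', 'Q', 'R']
old, new = get_worlds(names), get_worlds_2(names)
print(len(new) == 2 ** len(names))                          # True
print(len(old) == len(new) and all(w in old for w in new))  # True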

Thank you, recursion! Surprisingly straightforward! And thank you, zingarod, for teaching it to us!

*Concatenating dictionaries in Python is not as easy as using +. One reason that could be problematic is if the keys are not unique. This thread should clarify why I chose dict.update().

Side note: queue solution

Thought I might as well mention my solution to the queue efficiency problem. My idea was basically to not actually remove anything from the list (the implementation we’ve seen for this ADT so far). Instead, simply start a counter at 0 and make the dequeue method (a) return the item at the index of the counter, and (b) increment the counter by 1. Now none of the remaining elements need to have their indexes shifted, which is what took so awfully long in this implementation.
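
In code, the idea looks something like this (a sketch with my own names; the course’s Queue interface may differ slightly):

class LazyQueue:
    def __init__(self):
        self._items = []
        self._front = 0  # index of the next item to hand out

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        item = self._items[self._front]
        self._front += 1  # no elements shift, so there are no indexes to update
        return item

    def is_empty(self):
        return self._front == len(self._items)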

The issue is that your queue could get to be massive, particularly if you have hundreds of thousands of elements to enqueue and dequeue, like zingarod showed us in lecture! I just assume that in a real computing environment, you could check CPU load and quietly slice off the already-dequeued items (resetting the counter) in the background whenever the computer has the time and processing power.