Assignment 2 part 1 took me forever.
Not because the coding itself was hard. It wasn’t. Nor because I couldn’t figure out how to do it at a design level. I could. It was because no particular design seemed better than any other. Or it did but it contradicted the instructions.
My first idea was to make three nodes: BinaryNode, UnaryNode, and LeafNode. Each took its appropriate number of children and a head. They each had their own methods. There was no inheritance at all, actually, until I wanted error-checking and then I made an empty Node class that any child parameter had to be an instance of.
But typing in these classes in the shell to construct them was so annoying. Three different classes to type and you had to remember how many children for each one?
So I decided to make a RegexTree class that moderated them all, taking a proper regex string as its one parameter and parsing it, distributing it to the various Node classes. It was beautiful! Then I realized that this was part of part 2 and abandoned it.
For a while I continued with that RegexTree wrapper. I just made it take a head and two child parameters, made optional by giving them a default value of None. Then, based on how many children it was passed, it would create BinaryNodes, UnaryNodes, and LeafNodes, like it had done before.
But I realized at this point that RegexTree wasn’t really serving the purpose it had used when I made it parse a full regex string. Now it was just kind of an extra layer. So I returned it to just having those three nodes.
I still wasn’t happy. Remember that I didn’t like how you had to type them so carefully into the shell. Then I realized: wait, if RegexTree was able to sort out its parameters and make the proper Nodes… why not just have it be all those Nodes? That is, why not just make it behave like a BinaryNode if both children are given it, like a UnaryNode if only one child is given it, and like a LeafNode if no children are given it? Make it adaptable. Also, I could have all my error-checking in one place, right in the __init__ method, rather than scattered through various classes’ __init__ methods.
So I implemented that, and it worked fine. And now you just wrote RegexTree(‘0’) or RegexTree (‘*’, RegexTree(‘0’)) or RegexTree(‘.’, RegexTree(‘0’), RegexTree(‘1’)), and it sorted everything out for you. Much easier typing in the shell! I even considered renaming it ReTree to save time.
That was what I submitted first.
But then I read the thread that said you shouldn’t use just a single class, even if it gets the job done. I was dismayed. I didn’t know why this wasn’t preferable. (Since then, zingarod has pointed out that building the behaviours of three classes into one class isn’t too tricky, but what if there were more basic Node types? Having one class adapt to handle all of them would start to look very confusing.)
At first I resisted it and thought, “Eh, let them do with it what they want.” But then I decided I wanted the marks, and I wanted to understand why more than one class was better, so I sorted through all my regex_design, regex_design_old, regex_design_old_2, regex_design_tentative, etc. files and reinstated BinaryNode, UnaryNode, and LeafNode.
I also remembered zingarod saying that we should use inheritance, which I’d use none of except that empty Node class I described above. So I took a leap of faith and decided to give that Node class something to do.
The first thing was realizing that __init__ was very similar for all three of them. But in order to justify having more than one class (i.e., finding a reason not to just turn Node into the RegexTree class again), I decided to implement a new __init__ in each child class that nevertheless benefited from Node’s __init__. So I had Node’s __init__ be able to take both optional children. And then I made each child class’s __init__ call Node’s init inside it, passing it None for the each child parameter I didn’t want it to have. That way, the child classes could still have parameters for children they didn’t have, but still behave uniquely and appropriately.
Having the same number of children, even if some were None, made __eq__ totally inheritable. It just required that instead of isinstance(other, …), I used type(self) == type(other).
__repr__ could easily have been inheritable too, except that I decided I didn’t want the output to have ‘None’ where there were no children. That is to say, if Node had the only __repr__ and it was inherited, the outputs would have looked like “BinaryNode(‘.’, LeafNode(‘0’), LeafNode(‘1’))” but also “UnaryNode(‘*’, LeafNode(‘0’), None)” and “LeafNode(‘0’, None, None)”, which is hideous. So I just gave each one a unique __repr__.
I also instituted a method called “checker” that gets called during __init__. The method makes sure that the Node has the right kind of head for its number of children and that all children are nodes. It handles errors less specifically than my previous means did, but I could still distinguish a HeadError and a ChildError, and most importantly it could now be inherited.
The final note is that zingarod explained to me the design that he’ll post soon and that we’ll use as our base for part 2. He made a separate Node class for every possible regex item! The advantage of mine is that it can be scaled. I have three constants: BINARY, UNARY, and LEAVES, all tuples that list the possible heads of each node type. If we extended our alphabet, all you have to do is add the new values into my tuples. For example, if we were given a second unary operator, you wouldn’t have to create a new class for it; you’d just add it to UNARY and UnaryNode would handle it.
The advantage of zingarod’s is, I’m guessing, that each separate Node can have very specific behaviour, which I think will help with matching strings. That’d be kind of hard to do right now with my design—you’d have to know which binary operator you want, not just that your operator is binary, which is all my class tells you. So in terms of simplicity of class expansion, my approach is easier, but when you actually want to use these regexes for something, zingarod’s is going to be much better.
Looking forward to part 2. Especially since I’ve got the parsing down already. I just have to rescue it from regex_design_variation_3_old.py.