computer science thoughts before the fall

This post covers these three parts:

  1. The hierarchy of computer science problems
  2. Parts of computer science which I will/might be taking courses in
  3. Parts of computer science which I might want to work in in the future

Disclaimer: This post is just my attempt to connect all these pieces together using made-up stories from my head, in a very crude and atrociously inaccurate manner.

Hierarchy of computer science problems

The hierarchy is as follows

  1. Computability Theory
  2. Computational Complexity Theory
  3. Design and Analysis of Algorithms
  4. Improving implementation

The first thing to understand before I delve in is what computer science actually does. A definition I found particularly insightful comes from Philip J. Guo, the author of The PhD Grind, who says computer science is "efficiently implementing automated abstractions."

What people need to understand about computer science is why it is revolutionary: we learned how to control voltages in circuits and built layer upon layer of abstraction on top of that, so that we could have machines/computers that carry out instructions perfectly, with almost no chance of error (with some exceptions). Building machines and writing instructions for them to compute has been far more efficient and productive than teaching humans and other organisms to compute, because machines need no food or motivation, only electricity. (Although there has been some headway into cellular computation.)

But this automation has a limit. Some problems are undecidable/non-computable. I must emphasize that there is a difference between questions that are undecidable, questions that are merely unsolved, and questions that simply have no answer. This is where Computability Theory steps in: Turing formalized the notion of computability and showed the undecidability (with respect to Turing machines) of certain problems around WWII, when computers barely even existed. That was the beginning of theoretical computer science, I suppose you could say.

I think I will be taking PHIL1680A, Topics in Logic: Incompleteness. It's related to this topic but focuses more on metalogic, or basically how logic is limited by itself. This class is more of an experiment for me, since I don't really have any formal training in logic, but the topic seems very interesting and relevant to computer science. I hope to get some formal logic training out of it, along with some mind-blowing insights into logic itself.

Then computers/mainframes started becoming popular, and we started to tackle harder and more difficult problems. For each of these problems, there is an algorithm, a step-by-step procedure for calculation. Algorithms are not a construct unique to computer science. (Many may disagree, but) in fact, most of the math we've learned consists of algorithms. Take finding the sum of two large numbers: you start from the rightmost digit, add those two digits, carry a 1 over if needed, then proceed to the digits on the left and add the carry. That, I think, is also an algorithm. So people started developing algorithms and realized that each has different speeds (runtimes) in the best, worst, and average cases. This is Analysis of Algorithms. There are more than a dozen ways to sort an unordered list of numbers/strings into an ordered one, each with a different best/average/worst case big-O runtime, as it is called in computer science.
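The grade-school addition procedure described above can be written down as code directly (a small sketch in Python, purely for illustration; Python's built-in + already handles big integers, so this is only to show the algorithm):

```python
def add_by_hand(a: str, b: str) -> str:
    """Add two non-negative integers given as decimal strings,
    digit by digit from the right, carrying a 1 when needed."""
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        digit_a = int(a[i]) if i >= 0 else 0
        digit_b = int(b[j]) if j >= 0 else 0
        total = digit_a + digit_b + carry
        result.append(str(total % 10))  # the digit we write down
        carry = total // 10             # the 1 we carry over
        i -= 1
        j -= 1
    return "".join(reversed(result))

print(add_by_hand("987", "46"))  # → 1033
```

Notice the algorithm runs in time proportional to the number of digits, which is exactly the kind of runtime statement Analysis of Algorithms makes precise.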

I am hopefully going to take cs157 in the spring, which is our algorithms class, covering the different classes of algorithms, most likely the deterministic polynomial-time ones. This class will give me familiarity with most of the algorithms I will ever come across as a commercial software developer, if I decide to work as one. It will also be a class heavy in proof writing and help build more mathematical maturity.

But then, during this process of writing algorithms and proving their runtimes, larger patterns started emerging. Computer scientists realized that the runtimes of some problems grew much faster than the size of the input, e.g. the very famous "Traveling Salesman Problem" that some of you may have heard of: given a list of cities and their pairwise distances, find the shortest route that visits each city exactly once and returns to the origin city. Different complexity classes started emerging, with different capital letters and subscripts denoting the differences, e.g. P, NP (non-deterministic polynomial time), EXP (standing for exponential). The big question of this century is whether P = NP. It's one of the Clay Millennium Prize Problems, and a proof would have huge implications ranging from cryptography to the efficiency of our economic markets to protein structure prediction. I am going to paraphrase something my research professor heard from a talk he attended: "Computer scientists are in hell, dealing with the complex intricacies and faultiness of computer systems, while mathematicians are in heaven, hidden away behind beautiful abstractions. But once in a while, a question from hell, from computer science, like P = NP, rises up to heaven and baffles even the mathematicians."
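To make the Traveling Salesman Problem concrete, here is the obvious brute-force approach: try every ordering of the cities and keep the cheapest tour. The distance matrix is made up for illustration; the point is that this takes factorial time, which is exactly why the problem sits outside the "easy" complexity classes as far as anyone knows.

```python
from itertools import permutations

def shortest_tour(dist):
    """Brute-force TSP: dist[i][j] is the distance between
    cities i and j. Returns (best cost, best tour)."""
    n = len(dist)
    best_cost, best_tour = float("inf"), None
    # Fix city 0 as the start so rotations of the same tour
    # aren't counted repeatedly.
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        cost = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour

dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(shortest_tour(dist))  # → (18, (0, 1, 3, 2, 0))
```

With 4 cities this checks 6 tours; with 20 cities it would check roughly 10^17, which is the whole trouble.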

I am going to take CS51 this upcoming fall, which is the Theory of Computation class in our department. From what I understand, there will be discussion of the aforementioned complexity classes, and a lot of work using reductions to go from one formulation of a problem to another that can be identified as a clear member of one of the complexity classes. There will also be work on automata theory.

Finally, computer science boils down to the implementation of these algorithms, programming, which is what I think many people mistakenly believe computer science is all about. But it must be said that algorithms are only useful if someone implements them and applies them to some problem. And even the implementation itself varies greatly depending on the language and on one's understanding of computer systems. The bulk of the work in systems and software engineering is about the differences between implementations and the related performance gains.
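Even a tiny example shows how much implementation choices matter: the exact same "is x in the collection?" question costs wildly different amounts depending on the data structure underneath (a toy sketch in Python; the sizes and values are arbitrary):

```python
import timeit

# The same 100,000 integers, stored two different ways.
items_list = list(range(100_000))
items_set = set(items_list)

# Looking up the last element: O(n) scan through the list,
# O(1) average-case hash lookup in the set.
list_time = timeit.timeit(lambda: 99_999 in items_list, number=100)
set_time = timeit.timeit(lambda: 99_999 in items_set, number=100)

print(f"list lookup: {list_time:.4f}s, set lookup: {set_time:.6f}s")
```

Same problem, same answer, very different performance; that gap is what systems and software engineering work spends so much effort on.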

I'm TA-ing cs33 in the fall, which is basically this: an intro to computer systems and low-level programming. I admit my training in this area has been weak and lacking for the most part, so this will be a good chance for me to work on it.

As for which part of computer science, or which level of the hierarchy, I want to work in after graduation: that is a very good question I'm trying to answer as well. The first two parts, 1) Computability and 2) Computational Complexity, pretty much confine me to academia, which doesn't appeal to me as much. Somehow, I'm not too enamored with the idea of commercial software development either; software development interviews are very intimidating, and being a code monkey is not really what I want to do. This sort of leaves 3) Design and Analysis of Algorithms, which can be done outside academia in certain computational science fields, though that will and must include some software development. I am also including subfields that require more mathematics, e.g. machine learning, in this category. I'm not too sure where I stand on the spectrum yet, but hopefully these few courses this year will help me answer those questions. Of course, this completely leaves out things like product manager and business-side roles that involve less computer science.

Jobs is a question slowly looming in my head, like a dark cloud overhead, and soon the question and the application forms will consume me.


coder

just finished my final presentation for cs32, the notorious software engineering course at the institution where i study.

i’ve always been aware of the fact that i’m not actually a very good programmer. and that statement is truer than ever after this semester of 32. and i wanted to look at this statement from a variety of angles and tear it apart.

cause 1: i'm not a pre-planning coder (at all).

i tend to see a problem, think for a while, decide i have a good enough solution and then try to write it up in code.
i get a bug, then change my code to fix it, then iterate this process until it “works”.
this is bad on multiple levels.
i have a good idea of what the structure is in my head, but never a complete picture.

so it's very easy to say i'll just write this method, or give this variable a dumb name, or copy and paste this code here and there, and then i build my future code on this sort of messy hack, and so forth, until the whole piece of code spirals out of control.
this gets exponentially worse when you’re writing medium sized projects, especially without an ide which lets you refactor automatically.

so i'm a scrappy programmer. i write things really quickly but they usually don't work at first, and things spiral out of control and become exponentially more difficult to maintain and debug. and i add more if var != null statements, and include more checks, and eventually it works. but by now, it looks like a giant mess which someone pooped on.

i can't really tell if that in itself makes me a shitty programmer, or just a trait that other people have had success with. contrast this against other coders (who i need to give credit to) who meticulously spend lots of time laying the foundations of the project, thinking and rethinking for a better solution. they don't start until they're absolutely sure there's nothing better.

to be fair, i don't do all of the aforementioned crap without some reasoning, however faulty.

i have a very short attention span and like to jump between algorithms, and problems in my head. and i find it much more rewarding to see connections between different algorithms and problems than try to solve the same problem three times in progressively cleaner, and more elegant ways. if i have a solution, i usually keep it rather than spending more time to come up with a better one.

i also have a hard time believing that when you're working on a complex program you can ever have the whole program in your head. i'm at the peak of my biological cognitive abilities; people are sharpest and smartest in their early 20s. so it's ok for me now to practice tracing through an insane loop with 10 changing variables and 4 break statements, but i know that in 20 years, there will be no bloody way for me to execute a task of this cognitive complexity.

i also generally don't believe you can predict all the complicated interactions between the different parts of a program. you inevitably have to make modifications, so why spend so much time building this perfect little castle if it's going to have to be torn down anyway?

any of you who have made it this far down the post are probably interested in this question. so help me out a little and tell me: what should i do? is it always better to take a painstakingly long time to lay out the foundations and formulate all the problems in my head before i write anything down? do i gain anything from re-attacking the same problem multiple times for a more elegant solution? would that be considered premature design? but those kinds of design questions are especially important when you start looking at things like threading, concurrency, and scalability. what is the distinction between design and programming? is there a divide at all?

is programming inherently an act where you have to keep modifying and adding to the same piece of code again and again as you make it more complex? is there any gain for me in trying to get a perfect solution in the beginning? it seems to me somewhat related to the difference between writing a very verbose novel and deliberating over the word choice of a poem.

cs and finding jobs


this came up on the top visualizations of 2011 from visual.ly.

it’s really exciting to talk about all these fields and discuss how they are changing our lives with friends, but to be honest, i am still inside the center, desperately trying to learn all the pre-requisite knowledge before i can talk of being up to date.

i had a really awesome time learning about computer systems this semester under pvh. it gave me a much deeper understanding of computers, something which i've always wanted. but a recent talk with mbwong has led me to look into this exploding field centered around data: machine learning, comp bio, artificial intelligence. i found some information and have subsequently decided to make my own brainstorm. participation is encouraged.

http://www.mindmeister.com/maps/public_map_shell/130808206/computer-science?width=600&height=400&z=auto
and then i realized that i had always loved big data all along.

as a young kid, my favorite books were the huge ones with cross sections of buildings and structures with gorgeous illustrations explaining every part. i no longer have it on my shelf but i loved this star wars book so so much. (why don’t they make good ones anymore?) it was the perfect book, combining nerdiness and big data.

i’ve been listening to some lectures on ml-class.org, one of three classes that were offered by stanford to the public. i currently plan to take machine learning next semester but i’ll decide after the first few weeks.

the real problem is i still haven’t really decided what kind of field i want to specialize in. i’ve been watching the first lecture of  many courses on opencourseware to get a feel for each subfield inside cs but i’m still unsure.

the other thing that’s been on my mind is jobs/internships.

after getting into college, my naivety led me to believe that studying cs at a good university would automatically bring me tons of internship offers. so i haven't really been on my game in terms of preparing myself and my resume for jobs. most of the things i've done have been short-lived and random, so they look terrible on a cv.

but i've recently started to remember how many candidates out there are better qualified than me. the comforting thing to think is that they always say they will hire great talent as soon as they see it, but the problem is becoming that great talent. with the limited knowledge i have under my belt, i don't think i stand out as an applicant compared to many of my peers.

my uncertainty of my interest is also a concern since i don’t really know what job to apply for if i don’t know what i want to do.

i can't decide whether for the rest of my winter break i should devote more of my time to watching lectures and learning things, or to looking into ways to beef up my resume. advice?