(lord)Henry's Handy Haversack... of SCHEME! (very much in progress) [Archive]

Dr Bwaa

2009-10-29, 02:41 AM

Inspired by this thread (http://www.giantitp.com/forums/showthread.php?t=129768), I've decided to write up a Newb's Guide to Scheme (the programming language, not the development of masterful plots). So, I'm posting it here while I work on it, so that you guys can give me feedback without cluttering up the other thread, and then it can be linked to from there once it's done. Here's what's done (pending review by you, the playground) so far, vs what is yet to come:

Done:
-Intro
-Glossary
-What is Scheme?
-"Hello, world"

To Do:
-Basic Math
-Scheme Primitives (with car & cdr!)
-saving your code in DrScheme
-conditionals
-my first function (with "my first local variable"!)
-my first recursive function
-more...

And, without further ado...

Scheme!
This is my own tutorial. There are others out there, but I'm bored, I want to write, and I want to try my hand at writing a tutorial for someone with virtually no experience in functional programming (or programming in general). Please leave me feedback; I am sure I've gotten some things wrong (my glossary, for instance, is really, really badly-defined), and I would love to fix them. I would, however, ask that you not correct the glossary unless my definition is actually wrong, missing, or misleading (definitely possible); I'm aware that it's not rigorous, but it's meant to be a help, not an overwhelming amount of information. That's what Wikipedia is for.

0.0 Introduction: How to Use this Tutorial
This is an extremely basic guide to functional programming in Scheme. I highly recommend downloading DrScheme (http://download.plt-scheme.org/drscheme/), which is a very intuitive Scheme interpreter to mess around with. For the rest of the tutorial, I'll assume you're working with DrScheme, although any Scheme interpreter should work.

For the purposes of writing this guide, I'm going to assume you know next to nothing about programming, functional or otherwise. On that note, I'm going to assume that you can catch up very fast, and that you can stick with me. To that end, I am providing a glossary of computing terms here, which I will use freely, expecting you to know them. If you think you need to, please go over these terms before moving on; if you've programmed before, you should be able to skip ahead (but keep in mind that I'm trying to Gloss and/or wiki-link any unfamiliar terms, so if you find something you don't understand, check back here for it). I do assume that you have some experience with basic algebra and possibly a little calculus, though I don't expect this guide to go that far.

0.1 Glossary of Computer Science Terminology

Please note that these definitions are by no means rigorous; I just want to get the point across enough for people with no background in the material to understand it. For good, rigorous definitions, a quick Google and a click on the Wikipedia page of any of these terms will provide great benefits.

Algorithm
An algorithm, simply, is a function, or formula, for taking input and producing certain output. For example, the quadratic formula is a simple algorithm to determine the roots of a quadratic equation. Algorithms can be as simple as this, or they can be phenomenally complex, like the notorious traveling salesman problem. Note: an algorithm need not be efficient (but it is nice)!

Base Case
In recursion, like proofs by induction, a base case is a simple input case where the input needs no further reduction in order for the answer to be clear. For instance, if a recursive function were to count how many items were in a list (the length of the list), the base case would be the input of () (the empty list), where the length is, clearly, 0. Note that the base case here could also be a list with one element, but then it would not be able to count lists with length 0 (as it would crash or loop forever without ever reaching a base case). By convention, the empty list is generally chosen for problems like this because it is easier to see that it is the base case, avoids bugs like this, and works out more simply for certain calculations.

Compiler
Computer programs can be either interpreted or compiled, though some languages do a combination of both. A compiled program is gone over by a Compiler, which checks for syntactical correctness, and then translates the file into machine language. This new code is written to a separate file, which can then be run by the computer. Compare to interpreter.

Computer Science
Computer Science (CS, CompSci, etc) is the study of (among many other things) what is computable, how easily/efficiently, and the practical implementation of this theory in modern (or not-so-modern) computer systems.

Computable/Computability
To say that a problem is computable is to say that there exists (at least in theory) an algorithm that can determine the correct solution any example of the problem. A couple example computable problems are: What is the nth prime number? or With perfect play, who wins a tic-tac-toe game, starting from a given position? Noncomputable problems, like the Halting Problem (http://en.wikipedia.org/wiki/Halting_problem), do exist (see Noncomputable for details).

Data Structure
A data structure is a specialized way of holding and/or manipulating data. Many different data structures exist, for many different purposes. Each has its own distinct functions that it is particularly adept at performing.

Efficiency
In computer science, efficiency generally refers to the speed and/or memory usage of an algorithm. or program. This is an enormous topic, but the general idea is that the less memory a program requires, and the less the computation time depends on the size of the data set, the more efficient it is. For example, here are two ways to determine if a (positive) number is even or odd. The first is inefficient, the second is much more efficient:

1) Set a value, x, to 0. Count backwards from the given number, n, to 0. Each time you take one step backwards, set x to 0 if it was one, or 1 if it was 0. When you reach 0, if x = 1, n is odd. If x = 0, n is even.
2) Divide n by two. If the result is an integer, n is even. Otherwise, n is odd.
The first approach takes longer the bigger n is. This can make it take a very long time if n is very large. The second algorithm takes exactly the same amount of time no matter what n is: this makes it technically O(1) (http://en.wikipedia.org/wiki/Big_O_notation), which is very efficient.

Function
A function is a list of commands to be carried out. It may or may not take input, called arguments, and it may or may not produce some sort of output, normally called [i]return values[i]. Unlike in mathematics, a function need not always produce the same output on identical input (in the case that the function employs a random number generator, for instance). Generally, a function is given a name in the program it resides in, so it may be called (used) repeatedly, but this is not always the case.

Heap
A heap is a tree-based data structure, often used to implement priority queues. To be a heap, the tree must be constructed such that if a node A is another node B's parent, then the key of A ≥ key of B. What precisely the key is depends on what is being placed in the heap, but in general it assures that whatever is “on top” of the heap is the most important, most recent, biggest, or so on.

Implementation
An implementation is a specific way of concretely setting out an abstract idea or theory. This is valid at many levels of abstraction; for instance, the Python programming language is an implementation of Object-Oriented Programming (high-level abstraction), and a heap is the usual way to implement a priority queue, while an array is a common way to implement a heap.

Inefficiency
An inefficient algorithm is one whose processing time takes excessively long, or uses “too much” memory. This generally includes algorithms that take exponentially longer to run based on an input increased linearly, but also includes many others. Compare to efficiency.

Interpret/Interpreter
Any programming language can be either interpreted or compiled, though some modern languages like Python and Java use a combination of the two. Very simply, an interpreter goes through a program, usually line-by-line, translates the code to instructions in machine language, and executes the instructions as it translates them. Compare to compiler.

Iteration
Iteration is the process of looping, or going through data one-by-one. Usually, this is done in order to perform some kind of operation (counting, altering, identifying) on each item in the data set. It is all done within one function call, if functions are being used. Compare to recursion.

Java
Java is a programming language built by Sun Microsystems.

Lisp
Lisp is a programming language featuring a fully-parenthesized syntax, still in widespread use today despite its origination in the 1950s.

Machine Language/Machine Code
Machine code is a programming language that a specific hardware setup can understand. The rules of this language can differ widely between machines and operating systems. Programming in machine language is basically never done anymore, because it is literally just strings of 0s and 1s, and it is fantastically easier and more efficient to code in a modern programming language and then either compile or interpret the code to allow the computer to actually run it, as these operations can be done algorithmically.

Noncomputable
A noncomputable or undecidable problem is, intuitively, one that cannot be solved algorithmically (or, sometimes, at all). Often, problems can be proven to be noncomputable, which seems counterintuitive, but comes from the fact that there is no “proof algorithm” (if there were, the mathematician who discovered it would have hidden it and kept it for his own personal use). Simply, it is a problem no computer (or equivalent computational system, including Lambda Calculus and Turing Machines, among others) can solve. The classic example of an undecidable problem is Alan Turing's Halting Problem (http://en.wikipedia.org/wiki/Halting_problem).

Object
An object, broadly speaking, is a chunk of memory that can store data and, usually, other objects and/or functions. The concept of the object is (intuitively) central to Object-Oriented Programming.

Object-Oriented Programming
Object-Oriented Programming makes use of objects, rather than simply functions or individual data structures. The programs work by creating, deleting, and manipulating objects. This makes OOP very useful for simulations (say, a game in which several discrete bodies must be accounted for at once, like The Sims. Each Sim is its own object, with its own functions and properties) and other complex programming tasks. Java and Python are both examples of Object-Oriented programming languages.

Pointer
A pointer is a memory reference to another area in memory, usually representing an object or some other value. It is not the object or value itself, but simply a “pointer” to where the computer will find the intended target in memory.

Priority Queue
A priority queue is an abstract type of structure that, simply, lists things in order of their priority (for a given kind of priority). This can mean various things (smallest value, most recent function call, etc), and is usually implemented in a heap.

Program
A program is a list of commands that can be run and executed by the computer, in one way or another.

Programming Language
A language (of any kind) is a set of rules (a grammar) that set out syntactically-appropriate ways of combining symbols. Programming languages specifically use that grammar to interpret the programmer's input, and then task the computer with following the directions set out by the rules of the language as they apply to what was written. Put simply, a programming language is a set of rules for telling a computer what to do.

Python
Python is a programming language originated by Guido van Rossum.

Recursion
Recursion is like iteration, in that it allows one to process a data set one (or more) elements at a time. However, unlike iteration, recursion does not remain within one function call, but does its work by calling the recursive function repeatedly on smaller and smaller data sets until it reaches a base case; then returning the values back up through the function calls and putting all the data together before ultimately returning the final results. As this is a Scheme tutorial, you will grow very accustomed to recursion.

Return/Return statement
A return, return value, or return statement is whatever output a function produces. Return values can be almost anything, depending on the flexibility of the programming language and what the programmer desires. Common return values could be things like words or integers, but even other functions can be returned as output.

Scheme
Scheme is a dialect of Lisp, and it's the programming language you will learn a little bit of from this tutorial.

Tree
A tree is a broad category of data structure, generally defined as one or more Nodes, each with zero or more Children (often two, making a Binary Tree), which are also Nodes. Parents are connected (usually through [i]pointers[i]) to their Children, and each Node may store any amount of other data. Trees are widely used in all sorts of data and graph manipulation problems.

Undecidable
See Noncomputable.

0.2 What is Scheme?
Scheme is a functional language (http://en.wikipedia.org/wiki/Functional_programming). I'm not going to go into huge detail about what that means here, because it's a big topic and I just linked you to the wiki, anyway. Strictly speaking, in a purely functional language, everything is a function, and I mean a pure mathematical function that takes once input and gives one output, and has no side-effects. This makes it incredibly difficult to program, as things like “display on the screen” are side-effects. Haskell is a purely functional language, so we're not using Haskell; we're using Scheme. Scheme can do things like show you what it's doing, which is really nice, but it also has most of the great and fun features of functional languages that many CS students don't get to see. So, it's easy to learn, fun to play with, and very elegant to code in. What's not to love? (Correct answer: nothing)

One reason Scheme is not a common intro language is that it, like all functional languages, makes heavy use of recursion, which can sometimes be difficult to wrap the mind around. However, recursion is awesome, and besides, you don't have to do any of that for a while, so let's just move on and we'll get there when we get there. :smallsmile:

1.0 ”Hello, world.”
Conventionally, the first computer program an intro student writes is called “Hello, world” and simply prints those words to the screen. There's some sort of history behind this, I'm sure, but I don't remember it. You're on the internet; go find it there.

I'm going to cheat a little with this program, and just use it to get you into Scheme a little bit. Open DrScheme (or whatever you're using; this is the last time I'll say this). The first thing you'll need to do is set the language, so press Ctrl+L (or go to Language-> Choose Language), and select “Pretty Big” under “Legacy Languages.” You should now have a split window, with a top, blank area, and a bottom area with a prompt and a welcome message.

As I mentioned before, everything in Scheme is a function, and if you read the Glossary, you'll have seen the Scheme is a dialect of Lisp, which I described as “fully-parenthesized.” What this means is that everything you write will be a function, and must start with (with a couple exceptions, which we'll get to right away) a left-paren '(' which indicates the start of a function call. So, the general format for a Scheme function is: ([function name] [arguments]). The first function I'll have you use (this is making Haskell programmers spin in their seats right now) is the display function, which writes what you tell it to to the screen. So, for “Hello, world”, simply type (at the > prompt in the bottom area)

(display “Hello, world.”)
...and press enter. The result should be “Hello, world.” on the screen underneath, a little indented. My computer shows it in purple. If that didn't happen, make sure you've gotten your parentheses right, and that your quotation marks line up. If all else fails, copy/paste from here :smallsmile:

And that's it! Next, we'll play around a little with the Scheme interpreter.

No more updates tonight; need to sleep before class :smalltongue: