2008-09-28

Scala for Pythonistas: Part 2

This is the second installment in the series that examines features that a beginner to intermediate level Pythonista would use, and their Scala equivalents. I am not an expert-level Pythonista, so I won't cover any advanced features.

For-loops

We have already seen the for-comprehensions (see part 1) in Scala. The for-loop construct is similar, but without the yield bit.

# Python
acc = 0
for x in xs:
  acc += x

// Scala
var acc = 0
for (x <- xs) {
  acc += x
} 

No break or continue

Scala does not support break and continue. Additional if-else must be used to simulate continue. To simulate break, an additional flag has to be maintained.

Scala does this to keep the syntax of the core language small and to avoid the complications of adding another control mechanism. Some would point out that this is related to the whole "goto considered harmful" business, because break and continue are restricted forms of goto. However, I think, Scala is trying to push functional-style programming, and that this isn't a misguided attempt to enforce "goto-less programming".

Before I describe how to simulate break in Scala, I want to point out that using break and continue judiciously can make code clearer and more concise than its convoluted work around. So, if your language provides a break construct, do not avoid it just for fanatical adherence to "anti-goto-ism".

Donald Knuth pointed out that fanatically avoiding all goto-like constructs might lead to unnecessarily complex and obscure code[Knuth].

Simulating break

Consider how you will simulate break in Scala. You will notice that the use of break is more concise and clearer. The work-around has to maintain an additional condition variable (think of C-style for-loop):

# Python
acc = 0
for x in xs:
  acc += x
  if test(x, acc):
     break

// Scala
var acc = 0
var continue = true
for (x <- xs.takeWhile(e => continue)) {
  acc += x
  continue = !test(x, acc)
} 

There are cases, when we can avoid the additional condition variable. This is when our test depends only on the values accessible from the takeWhile's predicate, and when the values accessible from the takeWhile's predicate are the appropriate values, and when the initial case would not be false. And we can also rephrase the test case to avoid the negation.

Depending on which would be clearer, you are most likely to use other constructs or break the computation into smaller functions (Scala's compile time and Java VM's runtime optimizations should, hopefully, keep the function call cost a bit lower than that of Python).

Simulating continue

Simulation of continue is described in the discussion of while loops below (which applies to for-loops as well).

While Loops

While loop construct is similar to that of Python (and most other languages). Consider the following Python code:

res = 0
while test(res):
  res += work(res)

Here is the Scala equivalent:

var res = 0
while (test(res)) {
  res += work(res)
}
Simulating break
# Python
res = 0
while test1(res):
  res += work(res)
  if test2(res):
     break

// Scala
var res = 0
var continue = test1(res)
while (continue) {
  res += work(res)
  continue = test1(res) && !test2(res)
}

Scala does not provide break or continue, so we need to use a “continue” flag and update the flag. Alternatively, we can use functional constructs or express it in another way; which may or may not be clearer.

Of course, we can do this kind of programming in Python as well. In fact, some, especially those in academia, might insist on programming the above Python as follows:

res = 0
do_continue = test1(res)
while do_continue:
  res += work(res)
  do_continue = test1(res) and not test2(res)

In this particular case, I think I will stick to using break in Python.

Simulating continue

Similarly, Python's continue can be simulated by an additional if-else block.

# Python (I find this particular case more confusing)
res = 0
while test1(res):
  res += work1(res)
  if test2(res):
     continue
  work2(res)

// Scala
var res = 0
while (test1(res)) {
  res += work1(res)
  if (!test2(res)) {
     work2(res)
  }
}

# Python; better
res = 0
while test1(res):
  res += work1(res)
  if not test2(res):
     work2(res)

In this case, in Python, Java, and Scala, I prefer to use a nested if-else block over continue. The only advantage continue has over the nested-if-else is that you can avoid another level of nesting, which isn't worth it in this case. However, the continue construct might be more appropriate when using a fail/exit-early programming style.

Simulating enumerate

Although the for-each constructs in Python and Scala makes iterating over items in an iterable convenient, it becomes a bit annoying when we also need to keep track of the index. In Python, the solution is to use a wrapper generator, like the built-in enumerate. In Scala, the zipWithIndex method of the iterable is used instead:

# Python
for i, x in enumerate(xs):
  print(i, x)

// Scala
for ((x, i) <- xs.zipWithIndex) {
  println(i + " " + x)
}

Wrap-up

Hopefully, I can continue cover more of this fun stuff in later installments.

References

Notes

[Knuth: Structured Goto]
In "Structured Programming with go-to Statements" describes cases where goto-like constructs are clearer and more concise than convoluted code to avoid goto-like constructs.

2008-09-13

Scala for Pythonistas: Part 1

This examines features that a beginner to intermediate level Pythonista would use, and their Scala equivalents. I am not an expert-level Pythonista, so I won't cover any advanced features.

This is the first installment.

Tuples

Scala has tuples, and supports some of the tuple unpacking tricks familiar to Pythonistas.

-- tuple literals
# Python
resp = (42, "The Answer")
// Scala
val resp = (42, "The Answer")

-- tuple member access (access first item)
# Python
resp[0]
// Scala
resp._1

-- multiple assignments using tuples
# Python
num, desc = 42, "The Answer"
// Scala (parentheses are always required)
val (num, desc) = (42, "The Answer")

Pythonistas sometimes get into the habit of using and abusing the iterable nature of Python's tuples. This won't work in Scala, because, tuples are not considered an iterable in Scala. Also, Tuples are not defined as a general sequence type in Scala and is only defined for up to 20 members.

The idea of a tuple, is that of something like a C struct instance or a database row (see PyFAQ1). This is how we usually use tuples in Python, e.g., to return multiple values, parameters, etc. Python, being a dynamic language, and because Python decided to make tuple a sequence type, tuples start looking like immutable lists, and sometimes we just abuse it. I plead guilty to this; several times.

In Scala, tuples can only be used as, tuples in the C struct-like sense. At least, there won't be endless flame-wars on the nature of tuples. :-)

Comprehensions

Every Pythonista's favorite, comprehensions, has its analogue in Scala. It is called for-comprehension.

Scala's for-comprehensions are more like Python's generator comprehensions, than list comprehensions, in that they build generators. Furthermore, the generators built this way can be invoked multiple times (unlike generators created by Python's comprehensions). Also, the type of generator created by a for-comprehension depends on the type of sequence the for-comprehension was iterating over.

Here are a few examples:

-- simple comprehension
# Python
sqs = (x*x for x in range(1, 10))

// Scala
val sqs = for (x <- (1 until 10)) yield x*x


-- comprehension with filter
# Python
esqs = (x*x for x in range(1, 10) if x % 2 == 0)

// Scala
val esqs = for (x <- (1 until 10) if x % 2 == 0) yield x*x


-- nested comprehensions (cross-product-style results)
# Python
ccps = ((x, y, x+y) for x in range(1, 5) 
                    for y in range(5, 10))

// Scala
val ccps = for (x <- (1 until 5)
                y <- (5 until 10)
           ) yield (x, y, x+y)


-- chained comprehensions
# Python
dds = ((y, y+1) for y in (x*2 for x in range(1, 5)) 
            if y < 5)

// Scala
val dds = for (y <- (for (x <- (1 until 5)) yield x*2) 
               if y < 5
              ) yield (y, y+1)

In the case of the example with nested comprehensions, semi-colons are required if multiple x <- xs expressions are to appear on the same line.

Ranges

Python 2.x's xrange() or 3.x's range() built-in is represented by Scala's until method available to Scala's Int. Just as it is in Python 2.x, ranges are only available for 32-bit integer types. The behavior is very similar (the following examples assumes Python 3.x built-ins, substitute xrange for range for 2.x equivalent):

-- simple range: 0, 1, 2, 3, 4
# Python 3.x (use xrange for 2.x)
xs = range(0, 5)

// Scala
val xs = 0 until 5


-- range with custom increment: 0, 2, 4
# Python 3.x
xs = range(0, 5, 2)

// Scala
val xs = 0 until 5 by 2


-- reversed range: 5, 4, 3, 2, 1
# Python 3.x
xs = range(5, 0, -1)

// Scala
val xs = 5 until 0 by -1


-- invalid ranges => empty
# Python 3.x
range(5, 1)
range(1, 5, -1)

// Scala
5 until 1
1 until 5 by -1

Note that, the behavior is almost identical, including the response to invalid (nonsensical) range specifications like counting from 5 up to 1.

Also, note that the generator created by until can be invoked multiple times, similar to the generators created by Scala's for-comprehensions.

The equivalent of Python 2.x's range() built-in is the range factory method of the List object. The parameters and behavior would match the behavior of Python's range exactly. See the example below:

# Python 2.x (in 3.x use list(range(...)))
xs = range(0, 5, 2)

// Scala
val xs = List.range(0, 5, 2)

Scala also provides inclusive range constructor and provides all the options (Thanks to Seth Tisue for pointing this and the "by" method out). In Python, there is no built-in inclusive range, but you can simulate it by adding 1 to (for positive ranges) or subtracting 1 from (for negative ranges) the range end.

// simple range ; gives 1, 2, 3, 4, 5
val xs = 1 to 5 
# Python
xs = range(1, 6)

// range with custom step; gives 1, 3, 5
val cs = 1 to 5 by 2
# Python
cs = range(1, 6, 2)

// reversed range; gives 5, 4, 3, 2, 1
val rs = 5 to 1 by -1
# Python
rs = range(5, 0, -1)

Note that until and to are not language keywords, but methods available on Int objects through implicit conversion to RichInt objects. For instance, we could have written 0 until 5 as 0.until(5). by is also not a keyword, but a method on Inclusive and Range (the inclusive range object created by to).

Wrap-up

Hopefully, I can cover more of this fun stuff in later installments.

References

The following were used to check the facts:

  • Programming in Scala
  • Python documentation 2.5, 2.6beta and 3.0beta

Notes

[PyFAQ1]
Why are there separate tuple and list data types Python FAQ
2.6 and above will have a factory method namedtuple, that will allow you to build tuples that work more like structs
[Seth Tisue]
Seth Tisue is the lead developer of NetLogo at the Northwestern University, Evanston, IL.