2008-09-13

Scala for Pythonistas: Part 1

This series examines features that a beginner-to-intermediate Pythonista would use, along with their Scala equivalents. I am not an expert-level Pythonista, so I won't cover any advanced features.

This is the first installment.

Tuples

Scala has tuples, and supports some of the tuple unpacking tricks familiar to Pythonistas.

-- tuple literals
# Python
resp = (42, "The Answer")
// Scala
val resp = (42, "The Answer")

-- tuple member access (access first item)
# Python
resp[0]
// Scala
resp._1

-- multiple assignments using tuples
# Python
num, desc = 42, "The Answer"
// Scala (parentheses are always required)
val (num, desc) = (42, "The Answer")

Pythonistas sometimes get into the habit of using and abusing the iterable nature of Python's tuples. This won't work in Scala, because tuples are not iterable there. Nor are tuples a general sequence type in Scala: they are defined only for up to 22 members (Tuple1 through Tuple22).

The idea behind a tuple is something like a C struct instance or a database row (see PyFAQ1). This is how we usually use tuples in Python, e.g., to return multiple values or to pass groups of parameters. But because Python is a dynamic language and made tuple a sequence type, tuples start looking like immutable lists, and sometimes we just abuse them. I plead guilty to this, several times over.

In Scala, tuples can only be used as tuples in the C struct-like sense. At least there won't be endless flame wars on the nature of tuples. :-)
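For Pythonistas who want the struct-like reading of tuples on the Python side too, the namedtuple factory mentioned in the notes (Python 2.6 and later) gets close. Here is a minimal sketch; the Answer type and its field names are made up for illustration:

```python
from collections import namedtuple

# Hypothetical record type, for illustration only.
Answer = namedtuple("Answer", ["num", "desc"])

resp = Answer(42, "The Answer")

# Fields are reachable by name, much like members of a C struct ...
by_name = (resp.num, resp.desc)

# ... but the value is still an ordinary tuple, so indexing and
# unpacking keep working.
first = resp[0]
num, desc = resp
```

Because the result is still a tuple, it remains iterable in Python, so this only nudges you toward the struct-like style; it does not enforce it the way Scala does.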

Comprehensions

Every Pythonista's favorite feature, the comprehension, has an analogue in Scala: the for-comprehension.

Scala's for-comprehensions are more like Python's generator expressions than list comprehensions, in that they build generators. Furthermore, the generators built this way can be traversed multiple times (unlike the generators created by Python's generator expressions). Also, the type of generator created by a for-comprehension depends on the type of sequence the for-comprehension iterates over.
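To make the "only once" behavior on the Python side concrete, here is a small sketch (the variable names are mine). It contrasts a generator expression, which is exhausted after one pass, with a list comprehension, which can be traversed as often as you like:

```python
# A generator expression can be consumed only once.
sqs = (x * x for x in range(1, 4))
first_pass = list(sqs)    # consumes the generator
second_pass = list(sqs)   # nothing left to produce

# A list comprehension builds a real list, which can be traversed repeatedly.
sq_list = [x * x for x in range(1, 4)]
list_first = list(sq_list)
list_second = list(sq_list)
```

Here first_pass holds the three squares while second_pass comes back empty; the two list traversals give the same result both times.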

Here are a few examples:

-- simple comprehension
# Python
sqs = (x*x for x in range(1, 10))

// Scala
val sqs = for (x <- (1 until 10)) yield x*x


-- comprehension with filter
# Python
esqs = (x*x for x in range(1, 10) if x % 2 == 0)

// Scala
val esqs = for (x <- (1 until 10) if x % 2 == 0) yield x*x


-- nested comprehensions (cross-product-style results)
# Python
ccps = ((x, y, x+y) for x in range(1, 5) 
                    for y in range(5, 10))

// Scala
val ccps = for (x <- (1 until 5);
                y <- (5 until 10)
           ) yield (x, y, x+y)


-- chained comprehensions
# Python
dds = ((y, y+1) for y in (x*2 for x in range(1, 5))
            if y < 5)

// Scala
val dds = for (y <- (for (x <- (1 until 5)) yield x*2)
               if y < 5
              ) yield (y, y+1)

In the nested example, the generators (the x <- xs expressions) are enclosed in parentheses, so they must be separated by semicolons, whether they appear on the same line or on separate lines. Enclosing them in curly braces instead lets a newline act as the separator.
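The "cross-product" reading of the nested form may be easier to see when it is expanded into plain loops. A small Python sketch of that expansion (the name expanded is mine, and list() is used only so the two results can be compared):

```python
ccps = list((x, y, x + y) for x in range(1, 5)
                          for y in range(5, 10))

# The same thing written as explicit nested loops: the first "for"
# is the outer loop, the second "for" is the inner loop.
expanded = []
for x in range(1, 5):
    for y in range(5, 10):
        expanded.append((x, y, x + y))
```

Both produce the same 20 triples in the same order, starting with (1, 5, 6).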

Ranges

Python 2.x's xrange() or 3.x's range() built-in corresponds to Scala's until method, available on Int. Just as Python 2.x's xrange() is limited to native C long values, Scala's ranges here are built over Int, a 32-bit integer type. The behavior is very similar (the following examples assume Python 3.x built-ins; substitute xrange for range to get the 2.x equivalent):

-- simple range: 0, 1, 2, 3, 4
# Python 3.x (use xrange for 2.x)
xs = range(0, 5)

// Scala
val xs = 0 until 5


-- range with custom increment: 0, 2, 4
# Python 3.x
xs = range(0, 5, 2)

// Scala
val xs = 0 until 5 by 2


-- reversed range: 5, 4, 3, 2, 1
# Python 3.x
xs = range(5, 0, -1)

// Scala
val xs = 5 until 0 by -1


-- invalid ranges => empty
# Python 3.x
range(5, 1)
range(1, 5, -1)

// Scala
5 until 1
1 until 5 by -1

Note that the behavior is almost identical, including the response to invalid (nonsensical) range specifications such as counting from 5 up to 1.

Also note that the generator created by until can be traversed multiple times, like the generators created by Scala's for-comprehensions.
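The Python half of these claims is easy to check for yourself. A quick sketch, assuming Python 3.x (where range() returns a range object rather than a list):

```python
xs = range(0, 5)
once = list(xs)
twice = list(xs)   # a range object can be traversed again

# Nonsensical specifications give empty ranges rather than errors.
empty_up = list(range(5, 1))
empty_down = list(range(1, 5, -1))
```

Both traversals of xs yield 0 through 4, and both of the nonsensical ranges come back empty.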

The equivalent of Python 2.x's range() built-in is the range factory method of the List object. Its parameters and behavior match those of Python's range exactly. See the example below:

# Python 2.x (in 3.x use list(range(...)))
xs = range(0, 5, 2)

// Scala
val xs = List.range(0, 5, 2)

Scala also provides an inclusive range constructor, to, which supports all the same options (thanks to Seth Tisue for pointing out this and the "by" method). Python has no built-in inclusive range, but you can simulate one by adding 1 to the range end (for a positive step) or subtracting 1 from it (for a negative step).

-- simple inclusive range: 1, 2, 3, 4, 5
# Python
xs = range(1, 6)
// Scala
val xs = 1 to 5

-- inclusive range with custom step: 1, 3, 5
# Python
cs = range(1, 6, 2)
// Scala
val cs = 1 to 5 by 2

-- reversed inclusive range: 5, 4, 3, 2, 1
# Python
rs = range(5, 0, -1)
// Scala
val rs = 5 to 1 by -1
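If you find yourself doing the add-or-subtract-one adjustment often in Python, it can be wrapped in a small helper. This is only a sketch: inclusive_range is a hypothetical name, not a built-in, and it assumes integer arguments.

```python
def inclusive_range(start, end, step=1):
    """Hypothetical helper: like range(), but `end` is included
    whenever the step lands on it."""
    if step == 0:
        raise ValueError("step must not be zero")
    # Push the stop value one past `end`, in the direction of travel.
    stop = end + 1 if step > 0 else end - 1
    return range(start, stop, step)
```

With it, inclusive_range(1, 5), inclusive_range(1, 5, 2) and inclusive_range(5, 1, -1) mirror the three Scala expressions above.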

Note that until and to are not language keywords, but methods available on Int objects through implicit conversion to RichInt objects. For instance, we could have written 0 until 5 as 0.until(5). Likewise, by is not a keyword, but a method on Range and on Range.Inclusive (the inclusive range object created by to).

Wrap-up

Hopefully, I can cover more of this fun stuff in later installments.

References

The following were used to check the facts:

  • Programming in Scala
  • Python documentation 2.5, 2.6beta and 3.0beta

Notes

[PyFAQ1]
"Why are there separate tuple and list data types?", Python FAQ.
Python 2.6 and above provide a factory function, namedtuple, for building tuple types that work more like structs.
[Seth Tisue]
Seth Tisue is the lead developer of NetLogo at Northwestern University, Evanston, IL.