Python to Rust: Types

August 24, 2017

This post is part of the Python to Rust series.

Python is a dynamically typed language. It is also a fairly strongly typed language, with some edge cases. Rust is a statically and very strongly typed language. What do these terms mean, and how do they affect you?

In a strongly, dynamically typed language, it is the object that holds the type information, not the variable. In Python, strong typing means that we would not expect a string to become a number without some work from us. Compare this with JavaScript, which is both dynamically and weakly typed, and we see behaviors we might not expect.

Strong vs Weak

Python does not support adding a number and string:

>>> 1 + "1"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'

If we try that in JavaScript, we may not know how the interpreter will automatically convert the values. When using ScratchPad in Firefox, I got the following:

    1 + "1";
    
    /*
    11
    */

So JavaScript converted the number 1 to a string "1" and then appended the strings together. Since this happened automatically, we call the typing weak. It does not warn you before types change.

As a side note: If you have not seen the Wat lightning talk by Gary Bernhardt on crazy automatic type conversions, you need to take 4 minutes and enjoy it.

There is no complete definition of what “strong” means when referring to types. It is a loose grouping, and the languages within the group still differ from each other.

One notable place where Python is not strongly typed is that everything evaluates to a boolean for use in while, if or other similar locations. This is a common shortcut for doing things if a list exists and is not empty. It is one of the time savers in Pythonic code, but it does open up bugs when you don’t fully think about how truthiness operates.

There is functionality in Rust similar to this, but it is explicit. See the discussion of the boolean type below.

If we try to do this same addition in Rust, it won’t compile.

fn main() {
    println!("{}", 1 + "1");
}

The compiler complains to us, much like the Python interpreter did while running.

error[E0277]: the trait bound `{integer}: std::ops::Add<&str>` is not satisfied
 --> src/main.rs:2:22
  |
2 |     println!("{}", 1 + "1");
  |                      ^ no implementation for `{integer} + &str`
  |
  = help: the trait `std::ops::Add<&str>` is not implemented for `{integer}`

Rust has traits that can be shared between types. The compiler is complaining that the trait std::ops::Add<&str> (the functionality for adding a &str on the right-hand side) is not implemented for the type {integer}. (&str is one of the two string types in Rust.) In Python terms, this would be similar to saying the __add__ method for a given type isn’t implemented.
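To make the __add__ comparison concrete, here is a minimal sketch of implementing the Add trait for a type of our own. The Meters type is hypothetical and exists only for this illustration.

```rust
use std::ops::Add;

// `Meters` is a hypothetical type, made up for this example.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(i32);

// Implementing `Add` is what makes `+` work for a type,
// much like defining `__add__` on a Python class.
impl Add for Meters {
    type Output = Meters;

    fn add(self, other: Meters) -> Meters {
        Meters(self.0 + other.0)
    }
}

fn main() {
    let total = Meters(1) + Meters(1);
    println!("{:?}", total); // prints "Meters(2)"
}
```

Since there is no such implementation for adding a &str to an integer, the compiler rejects 1 + "1".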

You will generate many compile errors starting out. Much work has been done to make these errors understandable, and they often suggest fixes.

Static vs Dynamic

Python strong types are associated with the object, not the variable. This is perfectly valid Python:

count = 12         # Strong Number Type

count = "Dracula"  # Strong String Type

Rust looks like it can do the same thing at first.

fn main() {
    let count = 12;
    let count = "Dracula";
}

This compiles, although Rust warns about the two unused variables we created. Unlike Python, Rust did not reuse the first count. The second let created a brand new variable that shadows the first one, so the original count can no longer be reached by that name. The let keyword is used only for variable definition. If assigning to a variable already defined, a normal var = value; is used.
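Rust calls this pattern shadowing. A small sketch (the function name is made up for the example) shows that a second let inside a block hides the outer variable without changing it:

```rust
// Hypothetical helper: returns the inner string length and the outer integer.
fn shadow_in_block() -> (usize, i32) {
    let count = 12;
    let inner_len = {
        // This `count` is a new variable that exists only inside the block.
        let count = "Dracula";
        count.len()
    };
    // The block has ended, so `count` is the original integer again.
    (inner_len, count)
}

fn main() {
    let (inner_len, count) = shadow_in_block();
    println!("{} {}", inner_len, count); // prints "7 12"
}
```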

In the following code, we try to reassign count.

fn main() {
    let count = 12;
    count = "Dracula";
}

We have two problems, but only get the first one from the compiler.

error[E0308]: mismatched types
 --> src/main.rs:3:13
  |
3 |     count = "Dracula";
  |             ^^^^^^^^^ expected integral variable, found reference
  |
  = note: expected type `{integer}`
             found type `&'static str`
  = help: here are some functions which might fulfill your needs:
          - .len()

We have mismatched types. It tells you the type it expected {integer} and the type you gave it &'static str. The compiler won’t allow non-compatible data to be stored there.

Note: the 'static indicates that this will live for the life of the program. So the location where "Dracula" is stored will stay valid for the life of the program. This is related to both Strings and Lifetimes, which are coming later. You will continue to notice that Rust has much lower-level control over data structures than Python.

We have a help message that gives us possible things we missed. Maybe we wanted to set it equal to the length of the string, with the .len() method. These won’t always say what you wanted to do, but a surprising number of times it is exactly right.

Mutability

Let’s go ahead and use the len() method to get a valid integer to assign.

fn main() {
    let count = 12;
    count = "Dracula".len();
}

error[E0384]: re-assignment of immutable variable `count`
 --> src/main.rs:3:5
  |
2 |     let count = 12;
  |         ----- first assignment to `count`
3 |     count = "Dracula".len();
  |     ^^^^^^^^^^^^^^^^^^^^^^^ re-assignment of immutable variable

We no longer get the type error. But now we are trying to re-assign the value of an immutable variable. (This is the second error I mentioned above.)

Mutable means changeable and immutable means unchangeable. In Rust, variables are immutable by default. You only allow changes if you need to. In this code, let count = 12; means that within this scope, count can never equal anything other than 12.

This should be similar to what you have seen in Python with the immutable tuple. Once a tuple is created, it cannot be changed. You would need to generate a new tuple with the new data, as we did at first with the new let count variable definition.

We tell Rust that the variable is mutable with the mut keyword. I also added calls to the println! macro to display the two values as they change.

fn main() {
    let mut count = 12;
    println!("{}", count);
    count = "Dracula".len();
    println!("{}", count);
}

Output:

12
7

Scalar Types

Rust has 4 scalar types: integers, floating-point numbers, booleans, and characters. These are variables that hold one value, as opposed to a more complex type like an array or list.

Integers

The integer type is defined by bit size and by being signed or unsigned. A signed integer type is prefixed by an i, as in i8, i16, i32, i64 or isize. An unsigned integer is prefixed by u, as in u8, u16, u32, u64 or usize. The isize and usize types depend on the machine’s addressing. On most modern PCs and operating systems, this is 64-bit. However, an embedded microcontroller might be an 8-bit or 16-bit system. Older OSes are 32-bit systems.

A signed integer of bit size n will have a number range from -(2^(n - 1)) to 2^(n - 1) - 1. So i8 would be -128 to 127. The method of storing the bits for signed integers is called two’s complement. We lose one power of 2 in number range between signed and unsigned, as that bit is used to show positive or negative.

An unsigned integer of bit size n will have a number range from 0 to 2^n - 1. For u8 we have 0 to 255.
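The two range formulas can be checked against the constants Rust provides on each integer type. This is only a sketch: signed_range and unsigned_range are helper names made up here, and they compute in i64, so they are only meant for sizes up to 32 bits.

```rust
// Range of a signed integer with `bits` bits: -(2^(n-1)) to 2^(n-1) - 1.
// Hypothetical helper; computed in i64, so intended for bits <= 32.
fn signed_range(bits: u32) -> (i64, i64) {
    let half = 2_i64.pow(bits - 1);
    (-half, half - 1)
}

// Range of an unsigned integer with `bits` bits: 0 to 2^n - 1.
fn unsigned_range(bits: u32) -> (i64, i64) {
    (0, 2_i64.pow(bits) - 1)
}

fn main() {
    // The formulas agree with the constants built into each type.
    assert_eq!(signed_range(8), (i8::MIN as i64, i8::MAX as i64));
    assert_eq!(unsigned_range(8), (u8::MIN as i64, u8::MAX as i64));

    println!("{:?}", signed_range(16)); // prints "(-32768, 32767)"
    println!("{:?}", unsigned_range(16)); // prints "(0, 65535)"
}
```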

Python 2 has two types of integers, a normal signed integer and a long number type (Python 3 merges them into a single int). Where a Rust integer can overflow (a panic in debug builds, a silent wraparound in release builds by default), Python just converts to the long type for an unlimited-size integer (with a performance hit). This lets Python solve some problems with very large numbers, but slowly. The same capability is available in Rust, but requires an external big-number crate.
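When overflow is a real possibility, Rust's integer types have methods that make the behavior explicit rather than leaving it to the build mode. A minimal sketch, where checked_total is a made-up helper name:

```rust
// Hypothetical helper: sums the values, returning None if the total
// would not fit in an i32.
fn checked_total(values: &[i32]) -> Option<i32> {
    values
        .iter()
        .try_fold(0_i32, |total, &value| total.checked_add(value))
}

fn main() {
    println!("{:?}", checked_total(&[1, 2, 3])); // prints "Some(6)"
    println!("{:?}", checked_total(&[i32::MAX, 1])); // prints "None"

    // wrapping_add wraps around to the other end of the range.
    println!("{}", i32::MAX.wrapping_add(1) == i32::MIN); // prints "true"
    // saturating_add stops at the largest value.
    println!("{}", i32::MAX.saturating_add(1) == i32::MAX); // prints "true"
}
```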

In our examples so far, we have not explicitly stated the Rust types we are using. The default integer type is i32, as this is usually the fastest (even on 64-bit systems). Rust would have made the first count variable above an i32, but once we assigned the len() of the string to it, the compiler inferred usize instead. Lengths and indexes are always usize, because the address range depends on the bit size of the machine’s memory addressing.
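A short sketch of what this means in practice: an i32 cannot be used as an index directly, so it has to be converted to usize first. The value_at helper is hypothetical.

```rust
// Hypothetical helper: looks up a value using an i32 position.
// Slices are indexed with usize, so the i32 needs an explicit conversion.
fn value_at(values: &[i32], position: i32) -> Option<i32> {
    if position < 0 {
        return None; // a negative number cannot be an index
    }
    values.get(position as usize).copied()
}

fn main() {
    let values = [10, 20, 30];
    let length = values.len(); // inferred as usize, because len() returns usize
    println!("{}", length); // prints "3"
    println!("{:?}", value_at(&values, 1)); // prints "Some(20)"
    println!("{:?}", value_at(&values, 7)); // prints "None"
}
```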

Floats

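Python only has a single type of float. It is implemented as a C double, which is a 64-bit value on nearly every platform, whether you use 32-bit or 64-bit Python. In Rust, the size is explicit in the f32 and f64 types, with f64 as the default when nothing else is specified.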
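A quick way to see the difference between the two float types is to check their sizes and compare the same fraction stored in each. This is a small sketch; widen is a made-up helper name.

```rust
use std::mem::{size_of, size_of_val};

// Hypothetical helper: widening an f32 to an f64 never loses information,
// so the standard `From` conversion is available for it.
fn widen(value: f32) -> f64 {
    f64::from(value)
}

fn main() {
    println!("{} {}", size_of::<f32>(), size_of::<f64>()); // prints "4 8"

    let third_32: f32 = 1.0 / 3.0;
    let third_64: f64 = 1.0 / 3.0;
    // The f32 holds fewer digits, and widening it does not bring them back.
    println!("{}", widen(third_32) == third_64); // prints "false"

    let inferred = 2.5; // no annotation, so Rust falls back to f64
    println!("{}", size_of_val(&inferred)); // prints "8"
}
```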

Booleans

Python’s bool type was added in 2.3. Prior to this, the way other types evaluate to a boolean was used instead, so 1 and 0 were the most common boolean values. Python’s boolean values are True and False, with the Rust equivalents being true and false.

Let’s look at booleans in Python a little closer:

>>> type(False)
<type 'bool'>

Ok, so Python has a true boolean type, right?

>>> False + 2
2
>>> type(False + 2)
<type 'int'>

Well, sort of. It is a subclass of the int type, with True equal to 1 and False equal to 0.

>>> False == 0
True
>>> True == 1
True
>>> True - 1 == False
True

If you want to make enemies, add True = 0 and False = 1 into someone’s Python 2 code. (Python 3 made True and False keywords, so this is a syntax error there.)

As mentioned before, Python will evaluate the following as false: None, False, zero in various numerical formats, and empty sequences and collections: '', (), [], {}, set() and range(0).

In Rust, bool is a proper type with no automatic type changes:

fn main() {
    println!("{}", true + 1);
}

error[E0369]: binary operation `+` cannot be applied to type `bool`
 --> src/main.rs:2:20
  |
2 |     println!("{}", true + 1);
  |                    ^^^^^^^^
  |
  = note: an implementation of `std::ops::Add` might be missing for `bool`

For testing empty collections, we must explicitly call methods on the type. However, since the type is both known and static, this is not an issue. An example of this is a vector (which is Rust’s version of a Python list). To get the equivalent of Python’s truth test, we would just use !vector_variable.is_empty(), since is_empty() returns true when there is nothing in the vector. So you don’t lose functionality or increase code too much, but you gain very explicit code.
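As a sketch, the Python shortcut "if items:" becomes an explicit is_empty() check in Rust. The describe function is a made-up name for the example.

```rust
// Hypothetical helper showing the explicit emptiness test.
fn describe(items: &[i32]) -> &'static str {
    // Writing `if items { ... }` would not compile: `if` only accepts a bool.
    if items.is_empty() {
        "empty"
    } else {
        "has items"
    }
}

fn main() {
    let full = vec![1, 2, 3];
    let empty: Vec<i32> = Vec::new();
    println!("{}", describe(&full)); // prints "has items"
    println!("{}", describe(&empty)); // prints "empty"
}
```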

I have introduced bugs into Python code by forgetting an explicit boolean expression I wanted in an if, and the automatic boolean conversion ran fine with incorrect logic. These subtle issues are not possible in Rust, due to the truly strong typing of boolean.

Characters

There is no character type in Python. There exist only strings of length 1. I don’t fully understand Unicode in either Python 3 or Rust. Someone much smarter than me has made it work, and I’ll just use it. But I know I can’t expect characters to be ASCII. A char in Rust is a 4-byte Unicode Scalar Value. If you know it is an ASCII value, you can convert to and from ASCII codes for the typical number-to-character conversion sometimes used in A-Z looping.
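Here is a rough sketch of that A-Z looping in Rust, going through ASCII byte values. The alphabet function is a name made up for the example.

```rust
// Hypothetical helper: builds "A" through "Z" from ASCII codes.
fn alphabet() -> String {
    // b'A' is the ASCII code 65 stored as a u8.
    // Casting a u8 to char is always valid.
    (b'A'..=b'Z').map(|code| code as char).collect()
}

fn main() {
    println!("{}", 'A' as u32); // prints "65"
    println!("{}", 66_u8 as char); // prints "B"
    println!("{}", alphabet()); // prints "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    println!("{}", std::mem::size_of::<char>()); // prints "4"
}
```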

It is also worth noting that a single character, as the reader sees it, in a Rust String may not fit into a single char. An example used on the Rust char documentation page is the emoticon ❤️, which is made of two chars: \u{2764}\u{fe0f}.
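We can confirm this by counting the chars in that heart. In this small sketch, char_count is a made-up helper name.

```rust
// Hypothetical helper: counts the chars (Unicode Scalar Values) in a string.
fn char_count(text: &str) -> usize {
    text.chars().count()
}

fn main() {
    // The heart from the Rust char documentation, written with escapes.
    let heart = "\u{2764}\u{fe0f}";
    println!("{}", char_count(heart)); // prints "2"
    println!("{}", heart.len()); // prints "6", the length in UTF-8 bytes
    println!("{}", char_count("Dracula")); // prints "7"
}
```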

Once I start talking about iterators, we will see that we can iterate by chars(), similar to how Python allows iterators on a string with 1 character strings.

I should also point out that for most things in Python, " and ' are interchangeable. The main difference is which quote you have to escape inside the string, so they can usually be swapped without any issue. In Rust, a char is represented by the single quote, such as 'A'. A string must use a double quote as in "This is a string.".

Complex Types

I will not go into much detail on the complex types, but I want to mention a few and their equivalents.

Tuple

Python and Rust’s tuples are almost exactly the same: a fixed-length sequence with the capability of storing different types. A Python tuple is always immutable, while a Rust tuple is immutable by default, like any other variable. The definition is also the same, with parentheses and commas. I will go into special cases of tuples in Rust in a separate post.
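A brief sketch of a Rust tuple in use. Members are read by position with .0, .1 and so on, or unpacked much like in Python. The swap function is a made-up example.

```rust
// Hypothetical helper: takes a tuple apart and returns it in reverse order.
fn swap(pair: (i32, &str)) -> (&str, i32) {
    let (number, text) = pair; // unpacking, as in Python
    (text, number)
}

fn main() {
    let pair = (12, "Dracula");
    println!("{} {}", pair.0, pair.1); // prints "12 Dracula"
    println!("{:?}", swap(pair)); // prints "("Dracula", 12)"

    // With `mut`, individual members can be changed in place.
    let mut point = (1, 2);
    point.0 = 10;
    println!("{:?}", point); // prints "(10, 2)"
}
```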

Rust Array

Rust has C-style fixed length arrays. These are of a single type and the length is defined at creation. Nothing exists for this in Python.

While variable type annotations did come to Python with 3.6, they are optional hints there. In Rust, a type annotation is required whenever the type cannot be inferred. For variable declaration, this comes with a colon after the variable name. For example, the code we wrote above with type inference would be the same as writing the explicit usize type after a :.

fn main() {
    let mut count: usize = 12;
    count = "Dracula".len();
}

For the Rust Array, we use square brackets similar to C, with the type and number. Below we have x equal to a 5 member array of i32 integers, initialized to 5 different values. The array y holds 1000 booleans, and we initialize all 1000 of these values to true.

let x: [i32; 5] = [-1, 0, 1, 2, 3];
let y: [bool; 1000] = [true; 1000];
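Once created, these arrays can be read by index and passed around as slices. A minimal sketch, with sum as a made-up helper:

```rust
// Hypothetical helper: adds up the members of any i32 array or slice.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let x: [i32; 5] = [-1, 0, 1, 2, 3];
    let y: [bool; 1000] = [true; 1000];

    println!("{}", x.len()); // prints "5"
    println!("{}", x[0]); // prints "-1"
    println!("{}", sum(&x)); // prints "5"
    println!("{}", y.iter().all(|&flag| flag)); // prints "true"
}
```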

Python Array/List and Rust Vector

While most will get started with a list, Python does have an array type. This is a data structure with a single type, but dynamically allocated. This allows growth after creation, but only members of a given type.

While it is common to use a Python list with a single data type, this is not a requirement of the list. It can capture all types of objects. This is not possible in Rust, without a little more advanced work.
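For the curious, one common form that "more advanced work" takes is an enum that wraps each kind of value, so the vector still holds a single type. This is only a sketch with made-up names (Item and mixed), not the only way to do it.

```rust
// Hypothetical enum: each variant wraps a different kind of value.
#[derive(Debug, PartialEq)]
enum Item {
    Number(i32),
    Text(String),
    Flag(bool),
}

// Every member of the vector has the type `Item`, so the compiler is happy.
fn mixed() -> Vec<Item> {
    vec![
        Item::Number(12),
        Item::Text(String::from("Dracula")),
        Item::Flag(true),
    ]
}

fn main() {
    println!("{:?}", mixed()); // prints "[Number(12), Text("Dracula"), Flag(true)]"
}
```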

For simplicity, let’s assume a list full of integers. (For better performance, you could use the Python array instead.) In Python this is:

l = [1, 2, 3]
print(l)

Output

[1, 2, 3]

In Rust, we would have:

let v = vec![1, 2, 3];
println!("{:?}", v);

Output

[1, 2, 3]

We have a macro vec! that allows us to define a vector in a notation very similar to Python’s. Since we did not specify a type, Rust falls back to the default i32.

Notice the {:?} in the println!. This is a debug formatter that is implemented with the Debug trait. This is similar to Python’s __repr__. We will cover this more in traits. However, it is useful to know if you want an object representation printed out.
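As a small preview, here is the difference between the two formatters, plus a struct that gets a Debug implementation generated for it. The Count struct and debug_text function are made up for this sketch.

```rust
// Hypothetical struct; `derive(Debug)` asks the compiler to write
// the Debug implementation for us.
#[derive(Debug)]
struct Count {
    name: String,
    total: usize,
}

// Hypothetical helper: the Debug representation as a String.
fn debug_text(count: &Count) -> String {
    format!("{:?}", count)
}

fn main() {
    // For a string, {} prints the text and {:?} prints it with quotes,
    // like str() and repr() in Python.
    println!("{}", "Dracula"); // prints "Dracula"
    println!("{:?}", "Dracula"); // prints ""Dracula""

    let count = Count { name: String::from("Dracula"), total: 7 };
    println!("{}", debug_text(&count)); // prints "Count { name: "Dracula", total: 7 }"
}
```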

Creating an empty list works in Python, because type isn’t important.

l = []
print(l)

If we wanted to make an empty vector, we would use a static method on the Vec struct: Vec::new().

let v = Vec::new();

However, with an empty vector the Rust compiler has no data from which to infer type.

 --> src/main.rs:2:13
  |
2 |     let v = Vec::new();
  |         -   ^^^^^^^^ cannot infer type for `T`
  |         |
  |         consider giving `v` a type

For many types, you will see the <T> notation for the type they hold. In the line below, we are saying that v is a Vec holding type i32.

let v: Vec<i32> = Vec::new();

It would work to call the macro with no values, but best practice is to call the new constructor when you want an empty object. The double colon :: is used to call static methods on the type itself. Once you have a valid instance, you call its methods with a dot.

let v: Vec<i32> = Vec::new();
println!("{}", v.len());

Output

0
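An empty vector is usually created so it can be filled later, which means it also needs mut. A minimal sketch, with squares as a made-up function:

```rust
// Hypothetical helper: collects the squares of 1 through `limit`.
fn squares(limit: i32) -> Vec<i32> {
    // The vector must be `mut` to grow, like any other variable we change.
    let mut v: Vec<i32> = Vec::new();
    for n in 1..=limit {
        v.push(n * n); // similar to list.append() in Python
    }
    v
}

fn main() {
    let v = squares(3);
    println!("{:?}", v); // prints "[1, 4, 9]"
    println!("{}", v.len()); // prints "3"
}
```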

Python Dictionary and Rust HashMap

The functionality is similar between a Python Dictionary and the Rust HashMap. However, the big difference is that the types of the key and value must be defined for Rust. Again, it is possible to work around this for values, but it is a little too complex to look at now and uses some things we haven’t learned yet.

HashMap isn’t automatically available in Rust. We will be using our first use statement, which is similar to Python’s import. This is part of Rust and not an external crate, it just isn’t automatically included.

use std::collections::HashMap;

fn main() {
    let mut hm: HashMap<u32, &str> = HashMap::new();
    hm.insert(1, "Joe");
    hm.insert(8, "Amy");
    println!("{:?}", hm);
}

From std::collections we are using HashMap. We create an empty HashMap<K, V> with a key type of u32 and a value type of &str, which is a reference to string data. Look for the Strings post to cover strings in more detail. Since we have no macro for creation, like the vector’s vec!, we cannot use a simple notation for a single line dictionary creation and loading. So we make two calls to the HashMap insert method and provide keys and values. Then we use the Debug trait of the HashMap to display the structure.

Output is similar to Python’s dictionary notation. A HashMap does not keep its entries in any particular order, so the order you see may differ.

{8: "Amy", 1: "Joe"}
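Two things a dictionary user will want next are lookups and a shorter way to load data. The sketch below uses get, which returns an Option instead of raising a KeyError, and builds the map by collecting a vector of (key, value) tuples. The build function is a made-up name, and the approach relies on iterators, which come later in the series.

```rust
use std::collections::HashMap;

// Hypothetical helper: builds the same map as above from (key, value) tuples.
fn build() -> HashMap<u32, &'static str> {
    vec![(1, "Joe"), (8, "Amy")].into_iter().collect()
}

fn main() {
    let hm = build();
    // get() takes a reference to the key and returns an Option.
    println!("{:?}", hm.get(&8)); // prints "Some("Amy")"
    println!("{:?}", hm.get(&2)); // prints "None"
    println!("{}", hm.len()); // prints "2"
}
```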

There are many more data structures in both languages, but hopefully this gives you a feel for the most common ones. We still have a few more topics to cover before we can start really writing code, but we are getting there.


Part 3 of 4 in the Python to Rust series.
