Python to Rust: PIP to Cargo

August 23, 2017

This post is part of the Python to Rust series.

This post will discuss Cargo, compare it a little with Python’s processes for dependencies, and explain how to get started with it.

Many of the differences between pip and cargo come from Python being interpreted and Rust being compiled. To make separate environments that deal with incompatible dependencies in Python, you either create a virtual environment or use a Docker container. Rust is compiled, so you just need a way to point to the proper source code you depend on.

An import in Python and a use in Rust play a similar role. The packages to make available are defined in the Cargo.toml file, and they are brought into source code via extern crate [name] at the top of the crate's root source file. Cargo will automatically fetch the versions to build, in contrast to the more manual pip process. For those who set up Python projects correctly, this is most like using requirements.txt to set up the virtual environment.
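As a rough illustration of the use side of this, here is a minimal sketch that needs only the standard library, so nothing has to be listed in Cargo.toml. The word_lengths function is made up for this example. For an external crate on 2017-era Rust, the crate root would also carry an extern crate line; the 2018 edition later made that line optional.

```rust
// Roughly what `from collections import ...` does in Python:
// `use` brings a name from another module into scope.
use std::collections::HashMap;

/// Maps each word to its length (hypothetical helper for this example).
fn word_lengths(words: &[&str]) -> HashMap<String, usize> {
    words
        .iter()
        .map(|word| (word.to_string(), word.len()))
        .collect()
}
```

Calling word_lengths(&["pip", "cargo"]) returns a map with one entry per word.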

One of the biggest differences, and a potential source of issues, is that Python allows only a single version of a package to be installed in an environment. Rust can have dependencies that reference different versions of the same package, because cargo resolves each dependency chain separately.

The only real problem I see with crates.io currently is the large amount of package name squatting that is going on. Discussions have shown that those in control currently do not consider it a problem. However, we are seeing more people announce the name of a library as something strange “because other good names were being squatted.” Many of the squatted names are empty and were registered 2 years ago. This will most likely need to be addressed in the future.

Starting a new project

When installing Rust, you also get the cargo command-line utility on your path. While you can make a standalone .rs source file and compile it manually with rustc, I find it much easier to use cargo for everything, even one-file script programs.

In the terminal, navigate to the folder you want the source inside and type:

cargo new my_library

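Older versions of Cargo, including the one used here, generate library code by default, as you will do that more than executables. Since Cargo 1.25 the default is an executable, and you pass --lib to get a library. A library crate compiles to a Rust library file (.rlib) by default. Producing a .dll on Windows, a .so on Linux, or a .dylib on Mac requires setting crate-type in Cargo.toml.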

This command will create the following in the current directory:

my_library
├─── Cargo.toml
└─── src
    └─── lib.rs

To make an executable, you would add --bin to the end.

cargo new my_program --bin

This command will create the following in the current directory:

my_program
├─── Cargo.toml
└─── src
    └─── main.rs

lib.rs and main.rs are your library and program source code files, respectively. You will see a main() function already created in main.rs and a testing module with a single empty test function in the lib.rs file.
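To give a feel for where lib.rs goes from there, here is a minimal sketch of a library file with one public function and a test module next to it. The add function and the test name are invented for this example; cargo only generates the empty test shell.

```rust
// src/lib.rs (sketch): `add` is a hypothetical function, not generated by cargo.

/// Adds two numbers. `pub` makes it visible to crates that depend on this one.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

// This module is only compiled when running `cargo test`.
#[cfg(test)]
mod tests {
    use super::add;

    #[test]
    fn adds_two_numbers() {
        assert_eq!(add(2, 3), 5);
    }
}
```

Running cargo test in the project folder builds the library and runs every function marked with #[test].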

Cargo.toml

This is what replaces the functionality of Python’s virtual environment and requirements.txt. It manages your links to dependencies and the versions of that code. It can also contain the data that you need if you decide to share your library on crates.io. It works as a one-file version of the several setuptools files used for packaging a PyPI project.

The Cargo.toml automatically created for my_library above looks like this:

[package]
name = "my_library"
version = "0.1.0"
authors = ["sacherjj <sacherjj@gmail.com>"]

[dependencies]

The one for my_program is exactly the same, except for the name. These three fields are mandatory, so if we removed the [dependencies] section, this file would be the absolute minimum valid Cargo.toml. Using external dependencies is very common, so the section is included to save the programmer work.

Cargo uses Semantic Versioning with three numbers: [major].[minor].[patch]. In 0.x.x versions, anything goes. Try to be nice, but the interface is not baked in. Most developers increment the minor version on changes to the interface and the patch version on non-breaking changes.

At 1.0.0, you cannot make breaking changes without incrementing the major version. You also cannot add new public interfaces (structs, functions, types, etc.) without incrementing at least the minor version. Rust itself follows this. For the most part, everything that compiled with Rust 1.0.0 still compiles, which is impressive given how rapid the release schedule is for the language. With every release, a tool is used that compiles all the code for all crates. This makes for a large test sample for prior version compatibility.

You may also only care about pointing to the major or minor versions. A bare version is treated as a caret requirement, which allows any update that Semantic Versioning considers compatible. Using 1 would give you the latest version below 2.0.0. 1.1 would give you the latest version that is at least 1.1.0 and below 2.0.0.

Tilde requirements pin more tightly. ~1.0.0 would use any version >= 1.0.0 but less than 1.1.0, while ~1 would use any version >= 1.0.0 but less than 2.0.0. With a leading zero, ~0.2.3 would get any patch update >= 0.2.3 but less than 0.3.0.

Wildcards can also be used, where 1.* has the same meaning as ~1. You can also use inequality operators, alone or combined with a comma, such as > 1.2 or >= 1.2, < 1.5.
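The range arithmetic is easier to see in code. Below is a small sketch of the caret and tilde rules as Cargo documents them, limited to fully specified versions (so ~1 and pre-release tags are not handled). The type alias and function names are my own for this example and are not part of Cargo.

```rust
/// A fully specified semantic version: (major, minor, patch).
type Version = (u64, u64, u64);

/// Exclusive upper bound for a caret requirement such as `^1.2.3`,
/// which is also what a bare `1.2.3` means in Cargo.toml.
/// The leftmost non-zero number is the one that may not change.
fn caret_upper_bound((major, minor, patch): Version) -> Version {
    if major > 0 {
        (major + 1, 0, 0)
    } else if minor > 0 {
        (0, minor + 1, 0)
    } else {
        (0, 0, patch + 1)
    }
}

/// Exclusive upper bound for a tilde requirement such as `~1.2.3`:
/// only the patch number may grow.
fn tilde_upper_bound((major, minor, _patch): Version) -> Version {
    (major, minor + 1, 0)
}

/// True when `lower <= candidate < upper`.
/// Tuples compare field by field, which matches version ordering here.
fn in_range(candidate: Version, lower: Version, upper: Version) -> bool {
    lower <= candidate && candidate < upper
}
```

For example, a requirement of 1.0.11 has a caret upper bound of (2, 0, 0), so 1.4.2 is in range and 2.0.0 is not.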

Adding Dependencies

If our project uses serialization and deserialization, we might want to include the serde library. You will notice at the top of that linked page, there is a Cargo.toml block with a copy button. You would then paste this into your file, so our library Cargo.toml file would look like:

[package]
name = "my_library"
version = "0.1.0"
authors = ["sacherjj <sacherjj@gmail.com>"]

[dependencies]
serde = "1.0.11"

What happens when you need to reference a library that you are developing, or one that for some reason is not on crates.io?

Let's use my two cargo projects as an example. I created both my_library and my_program in the same folder. So if I want to reference my_library from my_program, I would need to add the dependency like this:

[dependencies]
my_library = { path = "../my_library" }

If I build my_program, I see that the serde dependency of my_library compiles first, then my_library compiles, and finally my_program compiles. (More on how to build in the next section.)

Updating registry `https://github.com/rust-lang/crates.io-index`
Compiling serde v1.0.11
Compiling my_library v0.1.0 (file:///C:/Users/joesacher/Documents/Repositories/Personal/joesacher_com/scratch/my_library)
Compiling my_program v0.1.0 (file:///C:/Users/joesacher/Documents/Repositories/Personal/joesacher_com/scratch/my_program)
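To make the path dependency concrete, here is a sketch of my_program calling into my_library. The greeting function is hypothetical, and the library is inlined as a module so the snippet compiles as a single file. In the real two-crate layout the module body would be my_library/src/lib.rs, and on 2017-era Rust the program would also start with extern crate my_library;.

```rust
// Stand-in for my_library/src/lib.rs. In the real layout this code
// lives in the library crate, not inside the program.
mod my_library {
    /// Hypothetical public function exported by the library.
    pub fn greeting(name: &str) -> String {
        format!("Hello from my_library, {}!", name)
    }
}

/// The text my_program would print from its main() function.
/// The call path is the same whether my_library is a module or a crate.
fn program_output() -> String {
    my_library::greeting("my_program")
}
```

In src/main.rs, the main() function would simply print the result of program_output().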

You may also reference a library directly from git (if this existed on GitHub):

my_library = { git = "https://github.com/sacherjj/my_library.git", rev = "1f8e324" }

This can be useful if you are using features from a not-yet-published version, or just something that the author has not yet uploaded to crates.io. Notice how we can specify a rev to call out a specific commit.

Additional package fields

The documentation field under package can be set to the documentation URL for the package.

exclude or include allows you to eliminate or include source files, using an array of paths with optional wildcards. exclude will be seeded from your .gitignore file. include must be exhaustive if it is used, or source code will not be included.

Read through the entire manifest documentation for all the possible fields and configurations of your Cargo.toml file. This becomes more important as you fully flesh out a library for inclusion into crates.io. I have contributed to PyPI, but not yet to crates.io.

Building and Running

To compile a program, you type cargo build. This builds in debug mode, which compiles much faster because performance optimizations are skipped. To both build and run, type cargo run. Either of these will fetch the dependencies and then compile everything.

Below is the output from a Lode Runner clone I’ve been working on to learn Rust. The first time a package is referenced, you will see a ‘Downloading’ step for the package before compiling.

   Compiling rayon-core v1.2.1
   Compiling heapsize v0.4.1
   Compiling sdl2-sys v0.27.3
   Compiling ole32-sys v0.2.0
   Compiling kernel32-sys v0.2.2
   Compiling shell32-sys v0.1.1
   Compiling bzip2-sys v0.1.5
   Compiling miniz-sys v0.1.9
   Compiling gfx_gl v0.3.1
   Compiling flate2 v0.2.19
   Compiling app_dirs v1.1.1
   Compiling time v0.1.38
   Compiling cpal v0.4.5
   Compiling sdl2 v0.29.1
   Compiling euclid v0.15.1
   Compiling rayon v0.8.2
   Compiling rodio v0.5.1
   Compiling msdos_time v0.1.5
   Compiling bzip2 v0.3.2
   Compiling lyon_bezier v0.7.1
   Compiling lyon_core v0.7.0
   Compiling zip v0.2.5
   Compiling jpeg-decoder v0.1.13
   Compiling lyon_path_iterator v0.7.0
   Compiling lyon_path_builder v0.7.0
   Compiling lyon_svg v0.7.0
   Compiling lyon_path v0.7.0
   Compiling lyon_extra v0.7.0
   Compiling lyon_tessellation v0.7.1
   Compiling image v0.15.0
   Compiling lyon v0.7.1
   Compiling gfx_device_gl v0.14.1
   Compiling gfx_window_sdl v0.6.0
   Compiling ggez v0.3.3 (file:///C:/Users/joesacher/Documents/Repositories/ggez)
   Compiling lode_ruster v0.1.0 (file:///C:/Users/joesacher/Documents/Repositories/lode_ruster)

My only dependency is ggez, but cargo automatically downloaded and compiled the dependencies of ggez and everything further up the chain.

If you are working on a small program for solving a problem on something like Project Euler, you might want good performance each time. This means using the release option: cargo run --release. This takes longer to compile, but runs significantly faster. This is of course what you want when you are done testing and are releasing your library or program.
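As a small sketch of the difference, the code below reports which profile it was built with and solves Project Euler problem 1. It relies on debug_assertions being on by default for cargo build and cargo run and off by default with --release. Both function names are invented for this example.

```rust
/// Reports which Cargo profile the code was built with, assuming the
/// default profile settings (debug assertions on in debug, off in release).
fn build_profile() -> &'static str {
    if cfg!(debug_assertions) {
        "debug"
    } else {
        "release"
    }
}

/// Project Euler problem 1: sum of the multiples of 3 or 5 below `limit`.
fn sum_of_multiples(limit: u64) -> u64 {
    (1..limit).filter(|n| n % 3 == 0 || n % 5 == 0).sum()
}
```

sum_of_multiples(1000) gives 233168 under either profile. Only the speed changes, which starts to matter once the loops get much larger than this one.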

Cargo.lock

Once you have built your project, cargo writes a Cargo.lock file next to Cargo.toml. While Cargo.toml allows a little wiggle room, as far as versions of dependencies, Cargo.lock is a blueprint of exactly how the binary was created.

For our my_program from above, I get this Cargo.lock when compiling:

[root]
name = "my_program"
version = "0.1.0"
dependencies = [
 "my_library 0.1.0",
]

[[package]]
name = "my_library"
version = "0.1.0"
dependencies = [
 "serde 1.0.11 (registry+https://github.com/rust-lang/crates.io-index)",
]

[[package]]
name = "serde"
version = "1.0.11"
source = "registry+https://github.com/rust-lang/crates.io-index"

[metadata]
"checksum serde 1.0.11 (registry+https://github.com/rust-lang/crates.io-index)" = "f7726f29ddf9731b17ff113c461e362c381d9d69433f79de4f3dd572488823e9"

I could have 1.* as my version for serde, but Cargo.lock would still record the full version. Also notice the checksum, which is used to verify the downloaded crate. This is all designed to give you exactly what you got before if your code is exactly the same.
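To show how exact the pinning is, here is a naive sketch that pulls the name and version pairs out of Cargo.lock-style text using only string operations. Both functions are hypothetical helpers written for this post. A real tool would use a TOML parser, and this line scan assumes each name line is followed by its version line, as in the file above.

```rust
/// Returns the quoted value of a `key = "value"` line, if the line has that key.
fn quoted_value(line: &str, key: &str) -> Option<String> {
    let rest = line.strip_prefix(key)?.trim_start();
    let rest = rest.strip_prefix('=')?.trim();
    let inner = rest.strip_prefix('"')?.strip_suffix('"')?;
    Some(inner.to_string())
}

/// Collects (name, version) pairs from Cargo.lock-style text.
/// Illustration only: a plain line scan, not a TOML parser.
fn pinned_versions(lock_text: &str) -> Vec<(String, String)> {
    let mut pins = Vec::new();
    let mut current_name: Option<String> = None;
    for line in lock_text.lines() {
        let line = line.trim();
        if let Some(value) = quoted_value(line, "name") {
            current_name = Some(value);
        } else if let Some(value) = quoted_value(line, "version") {
            // Pair the version with the most recent name line.
            if let Some(name) = current_name.take() {
                pins.push((name, value));
            }
        }
    }
    pins
}
```

Fed the Cargo.lock above, this would list my_program, my_library, and serde, each with one exact version, no matter how loose the requirement in Cargo.toml was.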

Binary Crates

Many of the issues I run into with Python on Windows are due to packages with associated C or other native dependencies. Most crates are pure Rust source code, which cargo compiles on your machine. Cargo can also install executable programs with the cargo install command. I have not used this yet, and just mention it in case you come across something that needs it. It downloads a crate that has a binary target, compiles it, and places the executable in cargo's bin directory, so it sits closer to pip installing a command-line tool than to the prebuilt wheels on PyPI. Binaries are used in Python for speed-ups or for interfaces not easily accomplished in Python. There is much less need for them in Rust for speed improvement.


Part 2 of 4 in the Python to Rust series.

