Running ML models in Elixir using Pythonx

I just discovered Pythonx, which runs a Python interpreter in the same OS process as Elixir.

It also wraps uv, so it sets up a virtual environment with all the Python dependencies your code needs.

Here's how I got the MLX version of the recently released SmolVLM2 model running:

Mix.install([
  {:pythonx, "~> 0.3.0"}
])

Pythonx.uv_init("""
[project]
name = "project"
version = "0.0.0"
requires-python = "==3.10.*"
dependencies = [
  "mlx-vlm == 0.1.14",
  "transformers @ git+https://github.com/huggingface/transformers@v4.49.0-SmolVLM-2-release",
  "torch == 2.6.0",
  "num2words == 0.5.14"
]
""")

Pythonx.eval(
  """
  import mlx.core as mx
  from mlx_vlm import load, generate
  from mlx_vlm.prompt_utils import apply_chat_template
  from mlx_vlm.utils import load_config

  # Load the model
  model_path = "mlx-community/SmolVLM2-500M-Video-Instruct-mlx"
  model, processor = load(model_path)
  config = load_config(model_path)

  # Prepare input
  image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
  prompt = "Describe this image."

  # Apply chat template
  formatted_prompt = apply_chat_template(
      processor, config, prompt, num_images=len(image)
  )

  # Generate output
  output = generate(model, processor, formatted_prompt, image, verbose=False)
  print(output)
  """,
  %{}
)

And run it using:

> elixir smolvlm.exs

Fetching 12 files: 100%|##########| 12/12 [00:00<00:00, 1906.57it/s]
Fetching 12 files: 100%|##########| 12/12 [00:00<00:00, 1606.04it/s]
 The image depicts two tabby cats lying on a pink blanket on a red couch. The cats are both in a relaxed state, with one cat lying on its side and the other cat curled up in a ball. The blanket appears to be a soft, plush material, and it covers the entire area of the couch.

In the background, there is a white remote control placed on the couch, close to the camera. The remote control has several buttons, including a red button, a green button, and a yellow button. The remote control is relatively small in size compared to the cats.

The cats are positioned in such a way that they appear to be enjoying their time on the couch. The cat on the left is lying on its side, with its head resting on the blanket, while the cat on the right is curled up in a ball, with its head tucked under its body. Both cats are facing the same direction, which is towards the left side of the image.

The overall scene suggests a comfortable and relaxed environment for the cats. The presence of the remote control indicates that the cats might be enjoying a leisurely moment on the couch, possibly watching TV or engaging in some other activity.

In summary, the image

Caching the loaded model

The example above runs the entire Python script in one go, which is only marginally better than shelling out to a Python process. If you wanted to run it for different images, for example, the model would get loaded multiple times.

Pythonx lets us do better. Because Pythonx runs in the same OS process as Elixir, we can load the model once, keep the resulting Python globals in an Elixir variable, and invoke it multiple times:

{_, globals} = Pythonx.eval(
  """
  import mlx.core as mx
  from mlx_vlm import load, generate
  from mlx_vlm.prompt_utils import apply_chat_template
  from mlx_vlm.utils import load_config

  # Load the model
  model_path = "mlx-community/SmolVLM2-500M-Video-Instruct-mlx"
  model, processor = load(model_path)
  config = load_config(model_path)

  def describe_image(image_url):
    # Prepare input
    image = [image_url]
    prompt = "Describe this image."

    # Apply chat template
    formatted_prompt = apply_chat_template(
        processor, config, prompt, num_images=len(image)
    )

    # Generate output
    return generate(model, processor, formatted_prompt, image, verbose=False)
  """,
  %{}
)

{desc1, _} = Pythonx.eval(
  """
  describe_image("http://images.cocodataset.org/val2017/000000039769.jpg")
  """,
  globals
)

{desc2, _} = Pythonx.eval(
  """
  describe_image("http://images.cocodataset.org/test-stuff2017/000000000001.jpg")
  """,
  globals
)

IO.inspect(desc1)
IO.inspect(desc2)
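The `globals` value that gets threaded between the `Pythonx.eval` calls behaves like a shared namespace dict in plain Python: definitions made by an earlier snippet are visible to later ones. A minimal pure-Python sketch of the same pattern (the `describe_image` body here is a hypothetical stand-in, not the real model call):

```python
# Plain-Python analogue of threading `globals` between eval calls:
# exec/eval share one namespace dict, so a function defined by an
# earlier snippet can be called by later ones.
env = {}

# First "eval": define a function in the shared namespace
# (stand-in for loading the model and defining describe_image).
exec(
    "def describe_image(url):\n"
    "    return 'description of ' + url\n",
    env,
)

# Later "evals": call the function the earlier snippet defined.
desc1 = eval("describe_image('a.jpg')", env)
desc2 = eval("describe_image('b.jpg')", env)
```

The expensive work (here, defining the function; in the real example, loading the model) happens once, and every subsequent snippet reuses it through the shared namespace.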

Almost any newly released model can be run using Python, and since it takes some effort to get a model running on Bumblebee, I think this could be a useful way of running models in Elixir. Of course, Pythonx isn't limited to running just ML models: you get access to the entire Python ecosystem.

However, note that this approach does have limitations. From the README:

The goal of this project is to better integrate Python workflows within Livebook and its usage in actual projects must be done with care due to Python's global interpreter lock (GIL), which prevents multiple threads from executing Python code at the same time.

[...]

In other words, if you are using this library to integrate with Python, make sure it happens in a single Elixir process or that its underlying libraries can deal with concurrent invocation. Otherwise, prefer to use Elixir's System.cmd/3 or Ports to manage multiple Python programs via I/O.
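In Python terms, the single-process pattern the README recommends amounts to giving one worker exclusive ownership of the model and serializing all requests through a queue. A hedged sketch of that pattern, where `fake_describe` is a hypothetical stand-in for the real `generate` call:

```python
import queue
import threading

def fake_describe(url):
    # Stand-in for the real (GIL-bound, not necessarily thread-safe)
    # generate() call.
    return "description of " + url

requests = queue.Queue()
results = {}

def worker():
    # One thread owns the model; callers are serialized through the
    # queue, mirroring "a single Elixir process" on the Elixir side.
    while True:
        item = requests.get()
        if item is None:  # shutdown sentinel
            break
        job_id, url = item
        results[job_id] = fake_describe(url)

t = threading.Thread(target=worker)
t.start()

requests.put((1, "a.jpg"))
requests.put((2, "b.jpg"))
requests.put(None)
t.join()
```

On the Elixir side, the equivalent would be routing all `Pythonx.eval` calls through one GenServer, so Python code is never invoked from multiple Elixir processes concurrently.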