My Elixir port of Hugging Face Hub APIs

IT Pornography
5 min read · Feb 21, 2022

As some of you may know, over the last few weeks my programmer's sixth sense has been drawn to Elixir projects like Axon, Nx and Evision. I know, all you young and impatient padawans can’t wait for the rest of my stories about them. In the meantime, though, I would like to write about another project. When talking about ML/AI today, we cannot ignore the great work done by Hugging Face.

Considering that they currently support a bunch of libraries, it would be a good idea to add the Axon library as well.

That way the Elixir Machine Learning community could take part in the Hugging Face effort to “democratize good machine learning” and take advantage of the tools they offer to make life easier for ML developers.

So, inspired by a thread in a channel dedicated to Elixir ML, I started to analyze the Huggingface_hub code in order to port it to Elixir.

At first sight it is a good project: complex, but not overly complicated. Reading the code a bit more deeply, you can see it is alive and kicking, hence some parts are not consistent yet. As you can see here:

the login function in the hf_api class says that this method is deprecated. But the login command in the CLI calls… the login method of the hf_api class…

Apart from these temporary glitches, as I said before, I think this code is quite good, and it will be fun to port it to Elixir.

REST API in Elixir: how to perform HTTP GET, POST, PUT and DELETE requests?

The APIs used by the hub scripts are REST APIs over HTTP(S). So I needed to handle JSON objects and four HTTP methods (i.e. GET, POST, PUT and DELETE).

I found the JSON library almost instantly. I chose Jason, which is very fast and easy to use. I can do everything I need with just the Jason.decode! and Jason.encode! functions. That’s a good piece of work!

It took me a little longer to find a complete and lightweight HTTP client library. During my search I read an interesting article about how the Tzdata library uses Hackney, a low-level Erlang library that manages HTTP(S) connections. So I added the Hackney dependency to my mix.exs file and started writing some higher-level functions to handle the HTTP methods easily. I started from the Tzdata code and learned how to use typespecs and behaviours.

Typespecs are a way to declare the shape of the values a function accepts and returns. So for example in the following code:

I am defining three custom types (rows 1–3) to be used in the signature of the get function (row 5), which specifies the “shape” of the get function parameters. Note that Elixir does not check these types at runtime, and the compiler does not enforce them either: typespecs serve as documentation and are used by static analysis tools such as Dialyzer, which can report a call that does not match the spec. For example, Dialyzer can flag a call where the headers parameter is not a list of tuples of two strings (i.e. the header name and the header value).
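A minimal sketch of what such a definition could look like. The module name HTTPSketch and the stubbed body are assumptions made for illustration (no HTTP request is performed), not the code from the repo:

```elixir
defmodule HTTPSketch do
  # Custom types describing the "shape" of the arguments and of the result.
  @type url :: String.t()
  @type headers :: [{String.t(), String.t()}]
  @type response :: {:ok, {pos_integer(), headers(), binary()}} | {:error, term()}

  # The spec documents the contract; tools such as Dialyzer and ExDoc read it.
  @spec get(url(), headers()) :: response()
  def get(url, headers) do
    # Stub (assumption): a real client would perform the HTTP request here.
    {:ok, {200, headers, "stub body for " <> to_string(url)}}
  end
end

IO.inspect(HTTPSketch.get("https://example.com", [{"accept", "application/json"}]))

# The spec is not enforced when the function is called: this call passes an
# atom instead of a string and still runs without raising.
IO.inspect(HTTPSketch.get(:not_a_string, []))
```

Running Dialyzer on a project containing the second call is what would surface the mismatch with the spec.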

Behaviours are Elixir's take on interfaces: a way to declare the signatures of a set of functions and let other modules provide the implementation:

Here the syntax is quite similar to the previous example. I used @callback instead of @spec. That little change allows me to create another module that implements the get function as described in the @callback definition. Ok, that’s cryptic, I can see your brain melting, let me show you an example:

The Hackney module adopts the @behaviour of the HTTPClient module. This means that it must @impl-ement at least the functions declared in the HTTPClient module; otherwise the compiler emits a warning. In the snip4 example the interface of the get function is declared in the HTTPClient module and its implementation is defined in the Hackney module. As with @spec, the types in the @callback definition are not checked at runtime: a tool such as Dialyzer can report an implementation or a call that does not match them. For a complete example of this structure you can take a look at the HTTPClient and Hackney code in my GitHub repo.
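To make the relation between the two modules concrete, here is a minimal sketch of a behaviour and one implementation. The module names HTTPClientSketch and StubClientSketch are hypothetical, and the implementation is a stub that does no network I/O:

```elixir
defmodule HTTPClientSketch do
  @type url :: String.t()
  @type headers :: [{String.t(), String.t()}]

  # The behaviour only declares the interface of get/2.
  @callback get(url(), headers()) :: {:ok, term()} | {:error, term()}
end

defmodule StubClientSketch do
  # Adopt the behaviour: the compiler warns if a callback is left unimplemented.
  @behaviour HTTPClientSketch

  # @impl marks this function as the implementation of a callback.
  @impl true
  def get(url, _headers) do
    # Stub (assumption): a real module would delegate to an HTTP library here.
    {:ok, "stub response for " <> url}
  end
end

IO.inspect(StubClientSketch.get("https://example.com", []))
# The callbacks declared by a behaviour can be listed at runtime.
IO.inspect(HTTPClientSketch.behaviour_info(:callbacks))
```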

Runtime dependency checks and Elixir macro meta-programming

Another nice feature I found in the Tzdata code is the ability to check whether a dependency is available. So I learned about the handy Code.ensure_loaded? function, which returns true or false depending on whether a module can be loaded. For Hackney the module to check is :hackney, the same atom used in the mix.exs file. Since the Hackney module uses the HTTPClient behaviour, the Elixir compiler expects an implementation of all the HTTPClient callbacks. Following the Tzdata implementation, in case the :hackney library has not been loaded (i.e. the Elixir runtime couldn’t load it for any reason), I decided to raise an exception with a message from each function of the HTTPClient behaviour. Since the body of every function is the same, I can use a macro to generate all the functions defined in HTTPClient at compile time.

Ok, your brain is melting again, allow me to show you the following code:

The core of this solution is at rows 27–30. Here I am defining a function whose name comes from the variable f and whose arguments come from the variable args. The unquote(f) call is needed to use the value of f as the function name instead of defining a function called f. To understand why I used unquote_splicing, you need to see how args is created at row 26. The Macro.generate_arguments function builds the abstract syntax tree (AST) of a generic list of function arguments with a given arity. So unquote_splicing at row 28 is needed to expand the list of AST nodes in args into the individual arguments of the function definition.
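The same technique can be sketched in a self-contained way. Everything named here is an assumption for illustration: ClientContractSketch and FallbackClientSketch are hypothetical modules, and :hypothetical_http_lib is a made-up module name chosen so that the check fails:

```elixir
defmodule ClientContractSketch do
  @callback get(String.t(), list()) :: term()
  @callback delete(String.t(), list()) :: term()
end

defmodule FallbackClientSketch do
  @behaviour ClientContractSketch

  # :hypothetical_http_lib does not exist, so Code.ensure_loaded?/1 is false here.
  if not Code.ensure_loaded?(:hypothetical_http_lib) do
    # One function per callback, generated while the module is being compiled.
    for {f, arity} <- ClientContractSketch.behaviour_info(:callbacks) do
      # AST nodes for `arity` generic arguments.
      args = Macro.generate_arguments(arity, __MODULE__)

      @impl true
      def unquote(f)(unquote_splicing(args)) do
        # Touch the generated arguments so the compiler sees them as used.
        _ = [unquote_splicing(args)]
        raise "missing :hypothetical_http_lib dependency, cannot run #{unquote(f)}/#{unquote(arity)}"
      end
    end
  end
end

# Both callbacks now exist and raise the same kind of error when called.
IO.inspect(function_exported?(FallbackClientSketch, :get, 2))
IO.inspect(function_exported?(FallbackClientSketch, :delete, 2))
```

Because the list of functions comes from behaviour_info(:callbacks), adding a callback to the behaviour is enough to get a matching fallback function without touching this module.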

The Hf_Api implementation

So now I have an HTTP client library and a JSON encoder/decoder, and implementing the HF APIs becomes pretty straightforward. I have implemented all the functions that use the APIs defined here. See for instance the whoami implementation:

As you can see, it is quite a simple function that takes a token and a path and performs a call to the HF servers using the Requests.auth_get function. The Requests module is an alias for Hackney so, should I decide to switch to another HTTP client library in the future, I won’t be forced to find and replace every occurrence of Hackney in my code.
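The alias trick can be sketched as follows. This is not the code from the repo: StubRequestsSketch stands in for the real HTTP client and performs no network call, the auth_get(url, token) signature is an assumption, and so are the endpoint, the default path and the Bearer header:

```elixir
defmodule StubRequestsSketch do
  # Hypothetical stand-in for the HTTP client module (no network call).
  def auth_get(url, token) do
    # Assumption: the token is sent as a Bearer authorization header.
    headers = [{"authorization", "Bearer " <> token}]
    {:ok, %{"name" => "stub-user", "requested" => url, "sent_headers" => headers}}
  end
end

defmodule HfApiSketch do
  # The only line to change when switching to another HTTP client.
  alias StubRequestsSketch, as: Requests

  # Assumptions: base URL and default path of the whoami endpoint.
  @endpoint "https://huggingface.co"

  def whoami(token, path \\ "/api/whoami-v2") do
    Requests.auth_get(@endpoint <> path, token)
  end
end

IO.inspect(HfApiSketch.whoami("hf_dummy_token"))
```

Every function in the API module calls Requests, so replacing the client means editing the alias line only.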

I know, you can’t wait to see the whole code in my Elixir HuggingfaceHub GitHub repo.

The Command Line Interface

Yes, the project also includes a CLI that just performs login/logout operations against the HF servers. I found an interesting “trick” to ask for a password in a terminal, but that’s something I’ll talk about in another story.

That’s all, folks… Or Not?

My work on this project is not finished. I think I will also add something like the Repository class to allow Elixir developers to push their models to HF repos.

Ok, that’s enough for today. Bye!
