Code and Life

Programming, electronics and other cool tech stuff

Supported by

Supported by Picotech

Combine Go (golang) and SvelteKit for GUI

This post outlines the basics of creating a project that combines Go (or "golang" as googling for "go" is a pain — why didn't the guys at Google think of this?) native backend serving a web UI / GUI running on SvelteKit.

In a nutshell, this involves creating a new go project, creating a simple web server program that supports serving files from a static folder, and finally creating a SvelteKit project and configuring it to produce static content into that folder. But let's do a short detour on why this might be useful!

Combining native executable with Web UI

Native graphical user interfaces are not easy on any platform, and after looking at Qt, WxWidgets, Electron etc. I decided all had either major shortcomings, huge learning curves or resulted in way too large packages.

Doing a native web server, on the other hand, is quite easy using Go. I also investigated C and C++, but at least on Windows you very quickly run into MinGW vs. Visual Studio issues, runtimes, build systems and all that chaos, whereas Go pretty much produces executables with minimum fuss.

Once you have a web server, you can just serve a web UI and the user can run the executable and open the UI in their browser.

Simple web server with Go

Once you are comfortable creating a "Hello world" level app in Go, making a simple app for web server is not too hard:

$ mkdir project
$ cd project
project$ go mod init example/project

Here's a simple web server you can paste into main.go

package main

import (
    "encoding/json"
    "log"
    "mime"
    "net/http"
)

func databases(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "application/json")
    w.Header().Set("Access-Control-Allow-Origin", "*") // for CORS
    w.WriteHeader(http.StatusOK)
    test := []string{}
    test = append(test, "Hello")
    test = append(test, "World")
    json.NewEncoder(w).Encode(test)
}

func main() {
    // Windows may be missing this
    mime.AddExtensionType(".js", "application/javascript")

    http.Handle("/test", http.HandlerFunc(databases))
    http.Handle("/", http.FileServer(http.Dir("static")))
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Read post

Use ChatGPT with golang and SvelteKit GUI, including GPT4

OpenAI came out with ChatGPT, and wow, that is quite something. What is also remarkable is the load the ChatGPT client is under, and how often it is "experiencing high demand". Or just requires you to prove you are human and log in again.

You can get ChatGPT Plus for $20 a month, but hey, you can also get chat experience for $0.002 per 1000 tokens. To hit that monthly fee, you need to use 10 M tokens, which is not that far from 10 M words. That is pretty heavy use...

Using OpenAI ChatGPT (gpt-3.5-turbo) through Python API

To use the ChatGPT API, at its simplest form with Python3 you just pip install openai and create a short script:

#!/usr/bin/python3

import openai
import sys

openai.api_key = 'sk-yourkeyhere'

if len(sys.argv) < 2:
    prompt = input('Prompt: ')
else:
    prompt = ' '.join(sys.argv[1:])

resp = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "system", "content": "You are a programming expert giving advice to a colleague."},
        {"role": "user", "content": prompt}
    ]
)

print(resp['choices'][0]['message']['content'])

print('Usage was', resp['usage'])

You need to create credentials at OpenAI platform, enter your credit card and set a warning and hard treshold for monthly billing (I set mine to $4 and $10, respectively). But after filling your API key to the script, you can just run it:

$ python3 chat.py What is the capital of Alaska
The capital of Alaska is Juneau. However, I believe you were looking for programming advice. What specifically are you working on and what kind of advice are you seeking?
Usage was {
  "completion_tokens": 34,
  "prompt_tokens": 30,
  "total_tokens": 64
}

Now that is pretty nice, but we can do better!

Golang client with SvelteKit frontend

In my previous Golang+SvelteKit GUI post I explored how to create a Go application acting as a web server and making a user interface with SvelteKit:

  1. Golang has high performance and excellent set of libraries to accomplish many tasks
  2. Cross-platform support out of the box with compact executables
  3. SvelteKit is fast to develop as a frontend, requiring very low amount of code for rich interactive UIs

OpenAI does not produce it's own Go library, but that API as well documented and shabaronov has made an excellent Golang OpenAI API library that makes calling the API simple. It even supports GPT4, so if you have access to that, you can create a GPT4 chat client as well.

Without further ado, here's the Github repository for my GoChatGPT client. You can basically git clone://github.com/jokkebk/gochatgpt and follow the instructions in README.md to get it running, it's about 4 commands all in all.

Let's look a bit what the code does!

Golang ChatGPT Backend

Main method of the backend is nothing too complex:

  1. Serve the SvelteKit GUI from static folder (including index.html when user requests /).
  2. Have a chat endpoint at /chat that takes a JSON object with chat messages and passes it to OpenAI API.
  3. Return the OpenAI [ChatGPT] response as a string to calling client.
func main() {
	// Get the API key from the environment variable OPENAI_API_KEY
	apiKey := os.Getenv("OPENAI_API_KEY")
	client := openai.NewClient(apiKey)

	http.Handle("/", http.FileServer(http.Dir("static")))
	http.HandleFunc("/chat", func(w http.ResponseWriter, r *http.Request) {
		var messages []openai.ChatCompletionMessage
		err := json.NewDecoder(r.Body).Decode(&messages)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}

		ans, shouldReturn := chatGPT(client, messages)
		if shouldReturn {
			http.Error(w, "ChatGPT error", http.StatusBadRequest)
			return
		}
		// Serialize the response as JSON and write it back to the client
		w.Header().Set("Content-Type", "text/plain")
		w.Write([]byte(ans))
	})

	address := "localhost:1234"
	log.Printf("Starting server, go to http://%s/ to try it out!", address)
	http.ListenAndServe(address, nil)
}

With the rather straightforward OpenAI, the chatGPT() function is nothing to write home about either – we get the chat messages, put them into ChatCompletion object, pass to API and (hopefully) return the top answer (or an empty error if it failed):

Read post

CORS middleware for julienschmidt/httprouter with Golang

Just a small note / Gist type of thing for today: I got tired of adding w.Header().Set("Access-Control-Allow-Origin", "*") to every handler function in my small Golang web app. I'm using Julien Schmidt's excellent httprouter module for simple routing. Turns out the Basic Authentication example is quite simple to adjust for a set-and-forget type of httprouter.Handle middleware:

// https://github.com/julienschmidt/httprouter middleware to set CORS header
func MiddleCORS(next httprouter.Handle) httprouter.Handle {
	return func(w http.ResponseWriter,
    r *http.Request, ps httprouter.Params) {
		w.Header().Set("Access-Control-Allow-Origin", "*")
		next(w, r, ps)
	}
}

Using the middleware is simple, just wrap your normal handler function:

router.GET("/someurl", MiddleCORS(SomeURLFunc))

Or both the middleware and the function it takes implement httprouter.Handle, you can just chain multiple middleware with MiddleCORS(AnotherMiddleware(SomeURLFunc)).

Read post

Mastering the Huggingface CLIP Model: How to Extract Embeddings and Calculate Similarity for Text and Images

Article image, neural network transforming images and text into vector data

Huggingface's transformers library is a great resource for natural language processing tasks, and it includes an implementation of OpenAI's CLIP model including a pretrained model clip-vit-large-patch14. The CLIP model is a powerful image and text embedding model that can be used for a wide range of tasks, such as image captioning and similarity search.

The CLIPModel documentation provides examples of how to use the model to calculate the similarity of images and captions, but it is less clear on how to obtain the raw embeddings of the input data. While the documentation provides some guidance on how to use the model's embedding layer, it is not always clear how to extract the embeddings for further analysis or use in other tasks.

Furthermore, the documentation does not cover how to calculate similarity between text and image embeddings yourself. This can be useful for tasks such as image-text matching or precalculating image embeddings for later (or repeated) use.

In this post, we will show how to obtain the raw embeddings from the CLIPModel and how to calculate similarity between them using PyTorch. With this information, you will be able to use the CLIPModel in a more flexible way and adapt it to your specific needs.

Benchmark example: Logit similarity score between text and image embeddings

Here's the example from CLIPModel documentation we'd ideally like to split into text and image embeddings and then calculate the similarity score between them ourselves:

from PIL import Image
import requests
from transformers import AutoProcessor, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = AutoProcessor.from_pretrained("openai/clip-vit-base-patch32")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(
    text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True
)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1)  # we can take the softmax to get the label probabilities

If you run the code and print(logits_per_image) you should get:

tensor([[18.9041, 11.7159]], grad_fn=<PermuteBackward0>)

The code calculating the logits is found in forward() function source

Acquiring image and text features separately

There are pretty promising looking examples in get_text_features() and get_image_features() that we can use to get CLIP features for either in tensor form:

from PIL import Image
import requests
from transformers import AutoProcessor, AutoTokenizer, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")

# Get the text features
tokenizer = AutoTokenizer.from_pretrained("openai/clip-vit-large-patch14")

inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="pt")
text_features = model.get_text_features(**inputs)

print(text_features.shape) # output shape of text features

# Get the image features
processor = AutoProcessor.from_pretrained("openai/clip-vit-large-patch14")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")

image_features = model.get_image_features(**inputs)

print(image_features.shape) # output shape of image features

Running this should yield the following output:

$ python cliptest.py
torch.Size([2, 768])
torch.Size([1, 768])

Looks pretty good! Two 768 item tensors for the two labels, and one similarly sized for the image! Now let's see if we can calculate the similarity between the two...

Read post

Fixing MIME type errors when using Golang net/http FileServer

Today as I was finishing my Go+SvelteKit article, I ran into frustrating Chrome error message:

Failed to load module script: Expected a JavaScript module script but the server responded with a MIME type of "text/plain". Strict MIME type checking is enforced for module scripts per HTML spec.

Don't you just love Chrome? It knows what it needs to do (load a JavaScript module), but utterly refuses to do that because of a wrong MIME type. This happened with a client-side SvelteKit application, when it tried to open some part of the .js code.

At the time of writing, it seemed I could not find the answer easily to this one, but there actually seems to be a StackOverflow solution discussing this. But to help others hitting the same issue:

The problem on my Windows install was likely that Windows 10 registry did not contain a MIME type definition for .js files. Informing user how to tweak registry to get your program working is not ideal, but thankfully you can augment the mime types:

import "mime"

func main() {
	// Windows may be missing this
	mime.AddExtensionType(".js", "application/javascript")

    // And then you create the FileServer like you normally would
	http.Handle("/", http.FileServer(http.Dir("static")))
}

After adding the mime fix, remember to force reload Chrome page (hold Control key down while you press refresh), otherwise the problem persists as Chrome does not really bother reloading the offending files.

Read post

$6 DIY bluetooth sheet music page turn pedal with ESP32

I've had an iPad Pro 12.9" for some time, and been happily using it for sheet music when playing the piano. However, having to interrupt your playing to swipe to the next page does get annoying. You can get a $100 commercial AirTurn pedal, but since one can get a microcontroller from Ebay/Aliexpress for $4 and a simple foot pedal switch for $2, I thought it would be a fun one evening hacking project. It turned out quite nice:

Getting started: ESP32 devkit

The sheet music applications on iPad (and Android) usually have bluetooth keyboard support, turning the page when user presses an arrow key or spacebar. So the minimum viable product is just a bluetooth-enabled microcontroller that can pair with the iPad and send a single key upon request.

The ESP32 chip has both WiFi and Bluetooth, and I chose it for this project, as it is readily available in a compact form factor, and it's easy to program with Arduino. Searching [AliExpress](https://www.aliexpress.com for ESP32 should give you plenty of options.

I had a ESP32 board labelled with "ESP32 DEVKITV1" in my parts box, and it was quite easy to set up with this ESP32 tutorial:

  1. Install the driver for USB-UART bridge
  2. Add source URLs for ESP32 to Arduino and install ESP32 support
  3. Select "DOIT ESP32 DEVKIT" from the board menu
  4. Hold down "Boot" button on the board while selecting "Upload", release when the console says "Connecting..."

Before you proceed with the tutorial, check that you can get the lights blinking or flash some other example code successfully for the board. There are plenty of resources around if you hit into any issues! I had to google the step 4 myself, although it would have sufficed to read the linked tutorial carefully...

Read post